Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood, however, that these descriptions are merely exemplary and are not intended to limit the scope of the present disclosure. In the following detailed description, numerous specific details are set forth for ease of explanation in order to provide a thorough understanding of the embodiments of the present disclosure. It will be apparent, however, that one or more embodiments may also be practiced without these specific details. In addition, descriptions of well-known structures and technologies are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure.
The terminology used herein is for the purpose of describing specific embodiments only and is not intended to limit the present disclosure. The terms "include", "comprise" and the like, as used herein, indicate the presence of the stated features, steps, operations and/or components, but do not exclude the presence or addition of one or more other features, steps, operations or components.
Unless otherwise defined, all terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression similar to "at least one of A, B and C, etc." is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (for example, "a system having at least one of A, B and C" should include, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C). Where an expression similar to "at least one of A, B or C, etc." is used, it should likewise be interpreted in accordance with the meaning commonly understood by those skilled in the art (for example, "a system having at least one of A, B or C" should include, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B and C). Those skilled in the art should also understand that virtually any disjunctive conjunction and/or phrase presenting two or more alternative items, whether in the specification, the claims or the drawings, should be understood to contemplate the possibility of including one of the items, either of the items, or both items. For example, the phrase "A or B" should be understood to include the possibilities of "A", "B", or "A and B".
An embodiment of the present disclosure provides a data warehouse information processing method, where the data warehouse includes a plurality of historical tables stored in association with one another. The method includes: obtaining at least one historical query statement, the historical query statement being used to query related data of multiple historical tables among the associatively stored historical tables; determining the multiple historical tables corresponding to the at least one historical query statement; and generating a target table based on specific historical tables among the multiple historical tables, the target table including related data from the specific historical tables.
Fig. 1 schematically illustrates a system architecture for the data warehouse information processing method and the data warehouse information processing system according to an embodiment of the present disclosure. It should be noted that Fig. 1 shows only an example of a system architecture to which embodiments of the present disclosure can be applied, in order to help those skilled in the art understand the technical content of the present disclosure. It does not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments or scenarios.
As shown in Fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients and social platform software (examples only).
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers and desktop computers.
The server 105 may be a server that provides various services, such as a back-end management server (example only) that supports websites browsed by users on the terminal devices 101, 102, 103. The back-end management server may analyze and otherwise process received data such as user requests, and feed the processing results (for example, web pages, information or data obtained or generated according to the user requests) back to the terminal devices.
It should be noted that the data warehouse information processing method provided by embodiments of the present disclosure may generally be executed by the server 105. Accordingly, the data warehouse information processing apparatus provided by embodiments of the present disclosure may generally be arranged in the server 105. The method may also be executed by a server or server cluster that is different from the server 105 and is able to communicate with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the apparatus may also be arranged in a server or server cluster that is different from the server 105 and is able to communicate with the terminal devices 101, 102, 103 and/or the server 105.
For example, the historical query statements and historical tables of embodiments of the present disclosure may be stored in the terminal devices 101, 102, 103, which send the query statements and historical tables to the server 105, and the server 105 creates the target table based on them. Alternatively, the terminal devices 101, 102, 103 may create the target table directly based on the query statements and historical tables. In addition, the query statements and historical tables may be stored directly in the server 105, which then creates the target table directly based on them.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks and servers, as required by the implementation.
Figs. 2A to 2C schematically illustrate application scenarios of the data warehouse information processing method and the data warehouse information processing system according to an embodiment of the present disclosure. It should be noted that Figs. 2A to 2C show only examples of scenarios to which embodiments of the present disclosure can be applied, in order to help those skilled in the art understand the technical content of the present disclosure. They do not mean that the embodiments of the present disclosure cannot be used in other devices, systems, environments or scenarios.
As shown in Figs. 2A to 2C, the application scenario may include, for example, several table schemas in a data warehouse, such as a star schema 210, a snowflake schema 220 and a wide table 230.
According to an embodiment of the present disclosure, the star schema 210 and the snowflake schema 220 may be, for example, table schemas used to store data in the data warehouse, each of which includes multiple associated tables. As shown in Fig. 2A, the star schema 210 includes, for example, table 211, table 212 and table 213. As shown in Fig. 2B, the snowflake schema 220 includes, for example, table 221, table 222, table 223, table 224, table 225 and table 226.
The data stored in these tables is relatively dispersed. Taking sales data as an example, the individual tables store user information, product information, merchant information and so on, respectively.
In embodiments of the present disclosure, related tables in the data warehouse are usually queried through query statements, and the data obtained from those tables is then analyzed and mined to support decision-making for various businesses.
Obtaining the relevant information by querying multiple related tables in the data warehouse through query statements is a relatively complex process: the required data can be queried effectively only when the information stored in the tables and the association relationships among them are understood. For example, querying sales data requires jointly querying the user information table, the product information table, the merchant information table and so on.
To make the data more convenient for business use, a wide table usually needs to be established, for example a wide table for sales data. The wide table contains the data of multiple tables, for example the data of the user information table, the product information table and the merchant information table, which makes business queries easier.
Embodiments of the present disclosure may obtain a query statement that queries multiple tables, for example a query statement that queries tables 221, 222, 223, 224, 225 and 226, and determine from the query statement the tables involved, in this example tables 221 to 226. A related wide table 230 is then created based on these tables. As shown in Fig. 2C, the wide table 230 includes, for example, the data of the multiple tables.
By determining multiple tables in the data warehouse from query statements and creating a data warehouse wide table based on those tables, embodiments of the present disclosure automate the construction of wide tables.
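The idea of materializing a wide table from several associated tables can be sketched as follows. This is a minimal illustration only, assuming an in-memory SQLite database; the table names (`user_info`, `product_info`, `sales`, `sales_wide`) and their columns are hypothetical and are not taken from the disclosure.

```python
import sqlite3

# Hypothetical source tables; names and columns are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_info (user_id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE product_info (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, user_id INTEGER,
                    product_id INTEGER, amount REAL);
INSERT INTO user_info VALUES (1, 'alice'), (2, 'bob');
INSERT INTO product_info VALUES (10, 'pen'), (11, 'book');
INSERT INTO sales VALUES (100, 1, 10, 2.5), (101, 2, 11, 12.0);
""")

# Materialize the wide table by joining the tables that queries use together.
conn.execute("""
CREATE TABLE sales_wide AS
SELECT s.sale_id, s.amount, u.user_name, p.product_name
FROM sales s
JOIN user_info u ON s.user_id = u.user_id
JOIN product_info p ON s.product_id = p.product_id
""")

# A business query now reads one table instead of joining three.
wide_rows = conn.execute(
    "SELECT sale_id, user_name, product_name, amount "
    "FROM sales_wide ORDER BY sale_id"
).fetchall()
```

After the wide table exists, the query against `sales_wide` no longer needs to know how the three source tables are associated.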
Fig. 3 A diagrammatically illustrates the flow chart of the data warehouse information processing method according to the embodiment of the present disclosure.
As shown in Figure 3A, this method includes operation S310~S330.
In embodiments of the present disclosure, the main function of a data warehouse is to systematically analyze and organize the large volume of data that business systems generate through online transaction processing (OLTP), using the data storage architecture specific to data warehouse theory, so that various analysis methods such as online analytical processing (OLAP) and data mining can be applied, which in turn serve systems such as decision support systems (DSS). A data warehouse helps decision makers quickly and effectively extract valuable information from large amounts of data, which supports decision-making and rapid response to changes in the external environment and helps build business intelligence solutions.
In embodiments of the present disclosure, when a data warehouse is built, a dimensional design can be mapped onto a set of relational tables. Two designs are commonly used: the star schema and the snowflake schema. A star schema can be described as a simple star: a central table holds the fact data, and multiple tables radiate from the central table and are connected to it through primary keys and foreign keys.
According to an embodiment of the present disclosure, the data warehouse includes a plurality of historical tables stored in association with one another. It will be understood that the tables described in embodiments of the present disclosure include data tables used to store data in the data warehouse, where a data table may include multiple data columns (or data fields), and the data in the individual columns may be of different field types. Specifically, the multiple historical tables may be, for example, fact tables, dimension tables or wide tables already created in the data warehouse.
In operation S310, at least one historical query statement is obtained, the historical query statement being used to query related data of multiple historical tables among the associatively stored historical tables.
According to an embodiment of the present disclosure, a historical query statement may be, for example, a query statement that business personnel used when querying data in the data warehouse. The query statement can be used to query related data of multiple historical tables among the associatively stored historical tables in the data warehouse. The query statement may be an SQL query statement.
For example, obtaining at least one historical query statement includes: obtaining historical run data that the data warehouse produced when related data of multiple historical tables among the associatively stored historical tables was queried through historical query statements, and determining the at least one historical query statement based on the historical run data.
In embodiments of the present disclosure, historical run data of the data warehouse is generated while related data of multiple historical tables is queried through historical query statements. The historical run data may be, for example, original run logs related to the base warehouse tables, base data mart tables and user-defined data tables in the data warehouse, recorded when related data in the data warehouse is queried.
Fig. 3 B diagrammatically illustrates the history data schematic diagram according to the data warehouse of the embodiment of the present disclosure.
Wherein, the raw operational data of data warehouse for example can be the original operation day of data warehouse shown in Fig. 3 B
Will, the original running log include the historical query sentence for related data in query data repository library.
In embodiments of the present disclosure, the at least one historical query statement is determined from the run data, for example by extracting clean, complete and ordered historical query statements, together with important related system run information (for example associated alerts and error messages), from the original run log of the data warehouse.
For example, custom regular expressions can be used to perform a simple cleaning of the relatively messy run log. Cleaning the original run log in Fig. 3B, for instance, yields the cleaning result shown in Table 1. The cleaning result includes the historical query statements, for example their SQL content.
Table 1
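A minimal sketch of such regular-expression cleaning is given here. The log layout in Fig. 3B is not reproduced in this text, so the raw lines, the `sql="..."` field and the `WARN`/`ERROR` levels are all assumptions made for illustration.

```python
import re

# Hypothetical raw log lines; the real layout of Fig. 3B is not known here.
RAW_LOG = """\
2019-01-01 10:00:00 INFO user=u1 sql="SELECT A, Z FROM ODS.FOO FOO INNER JOIN ODS.BAR BAR ON FOO.B=BAR.Y"
2019-01-01 10:00:01 WARN slow query detected
2019-01-01 10:00:02 INFO user=u2 sql="select   a from ods.foo"
2019-01-01 10:00:03 ERROR connection reset
"""

SQL_PATTERN = re.compile(r'sql="(?P<sql>[^"]+)"')
ALERT_PATTERN = re.compile(r"^\S+ \S+ (?P<level>WARN|ERROR) (?P<msg>.+)$")


def clean_log(raw):
    """Split a raw log into cleaned SQL statements and alert/error messages."""
    queries, alerts = [], []
    for line in raw.splitlines():
        sql_match = SQL_PATTERN.search(line)
        if sql_match:
            # Collapse repeated whitespace so each statement is one tidy line.
            queries.append(" ".join(sql_match.group("sql").split()))
            continue
        alert_match = ALERT_PATTERN.match(line)
        if alert_match:
            alerts.append((alert_match.group("level"), alert_match.group("msg")))
    return queries, alerts


queries, alerts = clean_log(RAW_LOG)
```

The function keeps the statements in log order, which preserves the "ordered" property mentioned above.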
In operation S320, the multiple historical tables corresponding to the at least one historical query statement are determined.
In embodiments of the present disclosure, the multiple historical tables may be, for example, the tables referenced by the historical query statement. That is, when related data in the data warehouse needs to be queried, the data of multiple tables in the data warehouse can be queried through a historical query statement, and those tables are the historical tables corresponding to that historical query statement.
Determining the multiple historical tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain association information of the at least one historical query statement, the association information including association fields and association conditions.
In embodiments of the present disclosure, a historical query statement can be parsed to obtain its association information, for example by an SQL parser. Here, an SQL parser is a tool that uses metadata information and the query SQL itself to analyze relationships in the query SQL, such as primary/foreign key relationships between fields and associations between tables.
For example, the historical query statement is shown in Table 2, and the association information obtained by analyzing it with the SQL parser is shown in Table 3.
The association information may include, for example, the association fields and association conditions in the query statement. The association fields may include, for example, aggregation fields, sort fields, condition fields and query fields, and the association conditions may include, for example, table associations and field associations.
Table 2
SELECT A, B, Z, COUNT(1) AS CT
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.A=BAR.X AND FOO.B=BAR.Y
GROUP BY A, B, Z
ORDER BY A, B, Z DESC;
Table 3
According to an embodiment of the present disclosure, the multiple historical tables corresponding to the at least one historical query statement are determined based on the association information.
For example, the multiple historical tables corresponding to the query statement may be determined from the association information in Table 3. The multiple historical tables are stored, for example, in the data warehouse, and the tables determined in this example are shown in Table 4 and Table 5.
The specific workflow of the SQL parser is briefly introduced below.
Fig. 3C schematically illustrates a visualization of an abstract syntax tree according to an embodiment of the present disclosure.
For ease of explanation, a relatively simple historical query statement, shown in Table 6, is used as an example here.
Table 4
Field | Field type | Field annotation
A | STRING | Column A
B | STRING | Column B
C | STRING | Column C
D | STRING | Column D
Table 5
Field | Field type | Field annotation
X | STRING | Column X
Y | STRING | Column Y
Z | STRING | Column Z
Table 6
SELECT A, Z
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.B=BAR.Y
WHERE BAR.Z LIKE 'LEO';
First, the original query SQL (the historical query statement) is parsed into an SQL abstract syntax tree, which is visualized in Fig. 3C.
Then, the association information is extracted from the constructed SQL abstract syntax tree, for example table associations (Join Clauses), field associations (Join Clauses Conditions), aggregation fields (Group By), sort fields (Order By), condition fields (Where Clauses) and query fields (Query Columns).
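The kind of association information listed above can be illustrated with a small extractor. This sketch does not build an abstract syntax tree as the disclosure describes; it substitutes regular expressions over a flat, single-level query, which is an assumption made to keep the example self-contained. The query is modeled on Table 2, written with comma-separated GROUP BY fields, and the function and key names are hypothetical.

```python
import re

# Query modeled on Table 2, written with comma-separated GROUP BY fields.
QUERY = """
SELECT A, B, Z, COUNT(1) AS CT
FROM ODS.FOO FOO
INNER JOIN ODS.BAR BAR
ON FOO.A=BAR.X AND FOO.B=BAR.Y
GROUP BY A, B, Z
ORDER BY A, B, Z DESC;
"""


def split_list(text):
    """Split a comma-separated clause into trimmed items."""
    return [item.strip() for item in text.split(",") if item.strip()]


def clause(pattern, flat):
    """Return the first captured group of pattern in flat, or an empty string."""
    match = re.search(pattern, flat, flags=re.IGNORECASE)
    return match.group(1) if match else ""


def extract_association_info(sql):
    """Regex-based stand-in for AST extraction; handles flat queries only."""
    flat = " ".join(sql.replace(";", " ").split())
    # A clause ends at the next keyword or at the end of the statement.
    end = r"(?=\s+(?:WHERE|GROUP BY|ORDER BY)\b|$)"
    on_text = clause(r"\bON\s+(.*?)" + end, flat)
    order_text = clause(r"\bORDER BY\s+(.*)$", flat)
    return {
        "tables": re.findall(r"\b(?:FROM|JOIN)\s+([\w.]+)", flat, flags=re.IGNORECASE),
        "join_conditions": re.findall(r"([\w.]+)\s*=\s*([\w.]+)", on_text),
        "query_fields": split_list(clause(r"\bSELECT\s+(.*?)\s+FROM\b", flat)),
        "group_by": split_list(clause(r"\bGROUP BY\s+(.*?)" + end, flat)),
        # Drop the ASC/DESC keyword and keep only the field name.
        "order_by": [item.split()[0] for item in split_list(order_text)],
    }


info = extract_association_info(QUERY)
```

A production parser would need a real SQL grammar to cope with subqueries, nested function calls and quoted identifiers, none of which this sketch handles.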
In embodiments of the present disclosure, the association information obtained with the SQL parser (for example as shown in Table 3) can be indexed and stored in a query engine so that subsequent queries can use it.
In operation S330, a target table is generated based on specific historical tables among the multiple historical tables, the target table including related data from the specific historical tables.
According to an embodiment of the present disclosure, the specific historical tables are, for example, all or some of the multiple historical tables. For example, the specific historical tables are those among the multiple historical tables that meet a second preset threshold. The second preset threshold may relate, for example, to how frequently a table appears among the multiple historical tables. That is, the specific historical tables may be the historical tables frequently referenced in the historical query statements. A high occurrence count indicates that a specific historical table is queried often, and therefore that business personnel have a strong demand for it.
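One way to read the frequency criterion is as a simple count threshold over the tables referenced by each historical query. The sketch below assumes this reading; the input data, the function name `select_frequent_tables` and the `min_count` parameter are hypothetical.

```python
from collections import Counter

# Tables referenced by each historical query (hypothetical parser output).
tables_per_query = [
    ["ODS.FOO", "ODS.BAR"],
    ["ODS.FOO", "ODS.BAR"],
    ["ODS.FOO", "ODS.BAZ"],
    ["ODS.FOO"],
]


def select_frequent_tables(tables_per_query, min_count):
    """Keep tables referenced by at least min_count historical queries."""
    # set() ensures a table is counted once per query, even if joined twice.
    counts = Counter(t for tables in tables_per_query for t in set(tables))
    return sorted(t for t, c in counts.items() if c >= min_count)


specific_tables = select_frequent_tables(tables_per_query, min_count=2)
```

Here `ODS.BAZ` appears in only one query and is therefore left out of the candidate set.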
In embodiments of the present disclosure, the specific historical tables are used to create a target table in the data warehouse, for example a data warehouse wide table. The target table includes the related data from the specific historical tables and is convenient for users. In other words, querying data from the wide table is easier for users, so valuable information can be queried and analyzed from the data warehouse effectively and quickly, which supports decision-making.
According to an embodiment of the present disclosure, multiple historical tables are determined based on historical query statements, and a target table is built based on the multiple historical tables. The target table includes related data of the multiple historical tables and is, for example, a data warehouse wide table. The scheme of embodiments of the present disclosure can thus optimize the construction of wide tables in the data warehouse, for example by achieving automated construction of wide tables.
Fig. 4 schematically illustrates a flowchart of a data warehouse information processing method according to another embodiment of the present disclosure.
As shown in Fig. 4, the method includes operations S310~S330 and operation S410. Operations S310~S320 are the same as or similar to the operations described above with reference to Fig. 3A and are not described again here.
In operation S410, query statements that meet a first preset condition are obtained from multiple initial historical query statements and used as the at least one historical query statement.
According to an embodiment of the present disclosure, the multiple initial historical query statements may be, for example, query statements used over a long period to query related data in the data warehouse. The first preset condition may be, for example, that the query statements have high similarity to one another. That is, highly similar query statements are obtained from the multiple initial historical query statements and used as the at least one historical query statement.
According to an embodiment of the present disclosure, obtaining query statements that meet the first preset condition from the multiple initial historical query statements as the at least one historical query statement includes: clustering the multiple initial historical query statements to obtain at least one query statement group, where the similarity between the historical query statements in each query statement group meets a first preset threshold.
In embodiments of the present disclosure, clusters of highly similar statements can be identified among the multiple historical query statements, for example by offline analysis with a clustering method. Such a high-similarity cluster includes at least one historical query statement and can be used to check whether redundant wide tables with excessively high similarity exist among the current data warehouse and data mart wide tables. For the existing data warehouse and data mart wide tables and for all ad hoc query statements defined by users, each is assigned a unique query ID, and a real-valued query index is built from its real-valued vector after vectorization and stored in a real-valued vector query engine for use by subsequent processes.
In embodiments of the present disclosure, the multiple initial historical query statements may also be preprocessed before they are clustered.
Data preprocessing of the initial historical query statements may, for example, use the SQL parser to process the initial historical query statements and obtain association information, and then build a semantic mapping based on that association information. Semantic mapping can be understood as follows: for the same concept (for example, a product ID) whose field name is A in Table 4 and B in Table 5, a unique identifier is determined according to the association information and used to replace the differing field names in the SQL. In addition, preprocessing includes normalizing SQL syntax, converting letter case and similar work, so that code fragments with the same semantics also have similar content.
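The preprocessing described above can be sketched as a case and whitespace normalization followed by a lookup that replaces qualified field names with a unified identifier. The mapping contents, the identifier `PRODUCT_ID` and the function name are assumptions for illustration; in the disclosure the mapping would come from the parser's association information.

```python
import re

# Hypothetical semantic mapping: qualified field name -> unified identifier.
SEMANTIC_MAP = {
    "FOO.B": "PRODUCT_ID",
    "BAR.Y": "PRODUCT_ID",
}


def normalize_sql(sql, semantic_map):
    """Uppercase, collapse whitespace, and unify semantically equal fields."""
    # Note: uppercasing also alters string literals, a simplification here.
    text = " ".join(sql.upper().rstrip("; \n").split())
    # Replace each qualified name found in the mapping; leave others as-is.
    return re.sub(
        r"\b\w+\.\w+\b",
        lambda m: semantic_map.get(m.group(0), m.group(0)),
        text,
    )


normalized = normalize_sql(
    "select a, z from ods.foo foo\n  inner join ods.bar bar on foo.b = bar.y;",
    SEMANTIC_MAP,
)
```

Two queries that join on the same concept through differently named fields end up with the same token after this step, which is what makes their vectors comparable later.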
In embodiments of the present disclosure, clustering the multiple initial historical query statements to obtain at least one query statement group includes the following.
The multiple initial historical query statements are processed to obtain the vectors corresponding to the multiple initial historical query statements.
For example, before the multiple initial historical query statements are clustered, they need to be vectorized, and the clustering is then performed on the vectorized query statements. For example, natural language vectorization methods such as Word2Vec, Sentence2Vec and Document2Vec are used to convert the initial historical query statements into real-valued vectors. An example of a converted statement is shown in Table 7.
Table 7
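To show what "converting a query statement into a real-valued vector" means in the simplest terms, the sketch below uses a token-count (bag-of-tokens) vector. This is deliberately a stand-in and not Word2Vec, Sentence2Vec or Document2Vec; those learn dense embeddings from a corpus, whereas this example only counts tokens over a shared vocabulary. All function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(sql):
    """Split an SQL string into upper-case identifier/keyword tokens."""
    return re.findall(r"[A-Z_][A-Z0-9_.]*", sql.upper())


def build_vocabulary(queries):
    """Sorted list of all distinct tokens; fixes the vector dimensions."""
    return sorted({tok for q in queries for tok in tokenize(q)})


def vectorize(sql, vocabulary):
    """Token-count vector of one query over the shared vocabulary."""
    counts = Counter(tokenize(sql))
    return [float(counts[tok]) for tok in vocabulary]


queries = ["SELECT A FROM FOO", "SELECT A, B FROM FOO"]
vocabulary = build_vocabulary(queries)
vectors = [vectorize(q, vocabulary) for q in queries]
```

The two example queries differ only in the dimension for `B`, so their vectors are close, which is the property the clustering step relies on.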
The vectors corresponding to the multiple initial historical query statements are clustered to obtain at least one query statement group, where each query statement group includes the vectors of its query statements.
For example, clustering the vectors corresponding to the multiple initial historical query statements yields at least one query statement group. Each query statement group includes the vectors of multiple query statements, and the query statements within a group have a certain similarity to one another, which may be, for example, a preset similarity set as required.
A query statement group that meets the first preset condition is determined from the at least one query statement group as the target query statement group, which includes the at least one historical query statement.
In embodiments of the present disclosure, the first preset condition may be, for example, a preset quantity. Each query statement group contains a certain number of query statements, and when that number meets the preset quantity, the query statement group can be used as the target query statement group. The target query statement group includes, for example, multiple historical query statements.
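The clustering and group-selection steps can be sketched together. The disclosure does not name a clustering algorithm, so this example assumes a simple greedy scheme based on cosine similarity to each group's first member; the thresholds, vectors and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold):
    """Greedy clustering: join the first group whose seed is similar enough."""
    groups = []  # each group is a list of vector indices; group[0] is the seed
    for idx, vec in enumerate(vectors):
        for group in groups:
            if cosine(vectors[group[0]], vec) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])  # no similar group found, start a new one
    return groups


def select_target_groups(groups, min_size):
    """Keep groups whose number of statements meets the preset quantity."""
    return [g for g in groups if len(g) >= min_size]


vectors = [[1.0, 1.0, 0.0], [1.0, 0.9, 0.0], [0.0, 0.0, 1.0], [0.9, 1.0, 0.1]]
groups = cluster(vectors, threshold=0.9)
target_groups = select_target_groups(groups, min_size=3)
```

The similarity threshold plays the role of the first preset threshold within a group, and `min_size` plays the role of the preset quantity used to pick the target group.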
Fig. 5 schematically illustrates a flowchart of a data warehouse information processing method according to yet another embodiment of the present disclosure.
As shown in Fig. 5, the method includes operations S310~S320, operation S410 and operation S510. Operations S310~S330 are the same as or similar to the operations described above with reference to Fig. 3A, and operation S410 is the same as or similar to the operation described above with reference to Fig. 4, so they are not described again here.
In operation S510, the target table is stored when the target table meets a second preset condition. For example, the similarity between the target table and other historical tables in the data warehouse is obtained, and the target table is stored when the similarity meets a third preset threshold.
According to an embodiment of the present disclosure, the second preset condition may be, for example, that the target table meets a preset similarity. For example, the similarity between the target table and other historical tables in the data warehouse is determined, and the target table is stored when the similarity meets the third preset threshold. The third preset threshold may be, for example, a specific value chosen so that a table too similar to those already in the data warehouse is not stored, since that would cause data redundancy.
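A possible reading of this redundancy check is sketched below. The disclosure does not define how table similarity is measured, so the example assumes Jaccard similarity over column names and treats "meets the third preset threshold" as "stays below a maximum similarity"; both are assumptions, as are the table names.

```python
def jaccard(a, b):
    """Jaccard similarity of two column-name collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def should_store(target_columns, existing_tables, max_similarity):
    """Store the target table only if no existing table is too similar."""
    return all(
        jaccard(target_columns, cols) < max_similarity
        for cols in existing_tables.values()
    )


# Hypothetical wide tables already present in the warehouse.
existing_tables = {
    "wide_sales": ["A", "B", "X", "Y"],
    "wide_users": ["C", "D"],
}
store_new = should_store(["A", "B", "Z"], existing_tables, max_similarity=0.8)
store_dup = should_store(["A", "B", "X", "Y"], existing_tables, max_similarity=0.8)
```

A candidate that duplicates the columns of `wide_sales` is rejected, while a candidate with only partial overlap is kept.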
Fig. 6 schematically illustrates a flowchart of building a data warehouse wide table according to an embodiment of the present disclosure.
As shown in Fig. 6, an embodiment of the present disclosure discloses a method for automatically building data warehouse wide tables based on information extraction. The overall construction method includes operations S610~S650.
In operation S610, unstructured data such as query SQL and query logs related to base warehouse tables, base data mart tables and user-defined data tables is collected from the warehouse layer, the data mart layer and the user layer, which are obtained by aggregating data from the integration layer.
In operation S620, a custom SQL parser is used to parse, from the query SQL and query logs, relationships among different tables such as primary/foreign key relationships, join queries and aliases, and a query engine for the association data is built.
In operation S630, a custom SQL vectorization method (SQL2Vec) is used to mine similar queries from the query SQL, the query logs and the association data index already built, and to establish the historical query SQL real-valued vector query engine and the query SQL similarity clustering result.
In operation S640, the association data engine, the historical query SQL real-valued vector query engine and the query SQL similarity clustering result built above are used to collect statistics on information such as data fields and data tables with high co-occurrence frequency, and to generate a candidate template for a new data warehouse wide table.
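The co-occurrence statistics in this step can be sketched as pair counting over the fields used by the queries in one similarity cluster. The input data, the `min_cooccurrence` parameter and the function name are hypothetical, and pairwise counting is only one plausible way to measure co-occurrence.

```python
from collections import Counter
from itertools import combinations

# Fields referenced by each query in one similarity cluster (hypothetical).
cluster_fields = [
    ["FOO.A", "FOO.B", "BAR.Z"],
    ["FOO.A", "BAR.Z"],
    ["FOO.A", "FOO.B", "BAR.Z"],
    ["FOO.C"],
]


def candidate_template(cluster_fields, min_cooccurrence):
    """Collect fields that appear together in enough queries of the cluster."""
    pair_counts = Counter()
    for fields in cluster_fields:
        # Sorting makes each unordered pair map to one canonical tuple.
        pair_counts.update(combinations(sorted(set(fields)), 2))
    selected = set()
    for (a, b), n in pair_counts.items():
        if n >= min_cooccurrence:
            selected.update((a, b))
    return sorted(selected)


template_fields = candidate_template(cluster_fields, min_cooccurrence=2)
```

The resulting field list is a candidate column set for the new wide table; `FOO.C` never co-occurs with another field and is left out.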
In operation S650, the final new data warehouse wide table is obtained from the candidate template in combination with advice from business experts, and is then finalized.
Fig. 7 schematically illustrates a flowchart of candidate template generation and review for a data warehouse wide table according to an embodiment of the present disclosure.
As shown in Fig. 7, in embodiments of the present disclosure, the last stage of the automated data warehouse wide table construction scheme covers the generation of the candidate template, the review by business experts, and the finalization of the new data warehouse wide table. This stage includes operations S710~S790.
In operation S710, a new user-defined query SQL statement is preprocessed as in the flow described above to obtain its vectorized result.
In operation S720, the vectorized result is added to the historical query SQL real-valued vector query engine.
In operation S730, the query SQL similarity clustering result is updated.
In operation S740, the updated historical query SQL real-valued vector data and the query SQL similarity clustering result are periodically passed to a trigger, which decides according to defined rules whether to generate a new data warehouse wide table template. The core trigger rule can be understood as follows: when a large number of new queries with high mutual similarity are grouped into one cluster, and their similarity to all queries on the existing data warehouse tables is below a certain value, information such as the data tables and data fields with high co-occurrence frequency is extracted from the new queries in that cluster.
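The core trigger rule can be written as a small predicate. The parameter names and the concrete threshold values in the usage lines are hypothetical, since the disclosure only speaks of "a large number" of queries and "a certain value" of similarity.

```python
def should_generate_template(cluster_size, similarities_to_existing,
                             min_cluster_size, max_similarity):
    """Trigger rule: the cluster is large and unlike all existing-table queries."""
    large_enough = cluster_size >= min_cluster_size
    # Every similarity to queries on existing tables must stay below the cap.
    novel = all(s < max_similarity for s in similarities_to_existing)
    return large_enough and novel


# Hypothetical inputs: cluster sizes and similarities to existing-table queries.
fires = should_generate_template(50, [0.2, 0.4], min_cluster_size=30, max_similarity=0.6)
too_similar = should_generate_template(50, [0.2, 0.7], min_cluster_size=30, max_similarity=0.6)
too_small = should_generate_template(5, [0.1], min_cluster_size=30, max_similarity=0.6)
```

Only when the predicate returns true would the co-occurrence extraction for the cluster be run.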
In operation S750, a template for the new data warehouse wide table is generated based on the extracted information, such as the data tables and data fields with high co-occurrence frequency.
In operation S760, after the new data warehouse wide table template is generated, a review process is triggered, in which business experts of the data warehouse review and revise the template.
In operation S770, the reviewed and revised wide table template is finalized as the new data warehouse wide table.
In operation S780, the information about the new data warehouse wide table is updated into the historical query SQL real-valued vector query engine.
In operation S790, the information about the new data warehouse wide table is updated into the query SQL similarity clustering result.
Fig. 8 schematically illustrates a block diagram of a data warehouse information processing apparatus according to an embodiment of the present disclosure.
As shown in Fig. 8, the data warehouse information processing apparatus 800 includes a first acquisition module 810, a determining module 820, and a generation module 830.
The first acquisition module 810 may obtain at least one historical query statement, where the historical query statement is used to query related data of multiple history tables among the history tables that are stored in association.
According to an embodiment of the present disclosure, obtaining the at least one historical query statement includes: obtaining the historical operation data of the data warehouse that was involved when related data of multiple history tables among the associatively stored history tables was queried through historical query statements, and determining the at least one historical query statement based on that operation data.
According to an embodiment of the present disclosure, the first acquisition module 810 may, for example, perform operation S310 described above with reference to Fig. 3A, which is not repeated here.
The determining module 820 may determine the multiple history tables corresponding to the at least one historical query statement.
According to an embodiment of the present disclosure, determining the multiple history tables corresponding to the at least one historical query statement includes: parsing the at least one historical query statement to obtain its association information, where the association information includes association fields and association conditions; and determining the multiple history tables corresponding to the at least one historical query statement based on the association information.
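As a minimal sketch of this parsing step, the snippet below extracts association fields and association conditions from the `ON` clauses of a query. It assumes, purely for illustration, that every join condition is a single equality between fully qualified `table.column` names with no aliases; the function name `parse_association` and the regular expression are hypothetical, and a production system would more likely rely on a real SQL parser.

```python
import re

# Matches "ON <table>.<column> = <table>.<column>" (assumed query shape).
ON_RE = re.compile(r"\bON\s+(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)", re.IGNORECASE)


def parse_association(sql):
    """Return the joined tables, association fields, and association
    conditions found in the ON clauses of one query statement."""
    tables, fields, conditions = set(), set(), []
    for left_table, left_col, right_table, right_col in ON_RE.findall(sql):
        tables.update((left_table, right_table))
        fields.update((f"{left_table}.{left_col}", f"{right_table}.{right_col}"))
        conditions.append(f"{left_table}.{left_col} = {right_table}.{right_col}")
    return {
        "tables": sorted(tables),
        "fields": sorted(fields),
        "conditions": conditions,
    }


info = parse_association(
    "SELECT users.name, orders.amount FROM users "
    "JOIN orders ON users.id = orders.user_id "
    "JOIN items ON orders.item_id = items.id"
)
```

Here `info["tables"]` lists the history tables touched by the joins, which is the input the determining module needs.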
According to an embodiment of the present disclosure, the determining module 820 may, for example, perform operation S320 described above with reference to Fig. 3A, which is not repeated here.
The generation module 830 may generate a target table based on particular history tables among the multiple history tables, where the target table includes the related data in the particular history tables.
According to an embodiment of the present disclosure, a particular history table is a history table among the multiple history tables that meets the second preset threshold.
According to an embodiment of the present disclosure, the generation module 830 may, for example, perform operation S330 described above with reference to Fig. 3A, which is not repeated here.
Fig. 9 schematically illustrates a block diagram of a data warehouse information processing apparatus according to another embodiment of the present disclosure.
As shown in Fig. 9, the data warehouse information processing apparatus 900 includes the first acquisition module 810, the determining module 820, the generation module 830, and a second acquisition module 910. The first acquisition module 810, the determining module 820, and the generation module 830 are the same as or similar to the modules described above with reference to Fig. 8 and are not described again here.
The second acquisition module 910 may obtain, from multiple initial historical query statements, the query statements that meet the first preset condition as the at least one historical query statement.
According to an embodiment of the present disclosure, obtaining, from the multiple initial historical query statements, the query statements that meet the first preset condition as the at least one historical query statement includes: clustering the multiple initial historical query statements to obtain at least one query statement group, where the similarity between the historical query statements in each query statement group meets the first preset threshold; and determining, from the at least one query statement group, a query statement group that meets the first preset condition as the target query statement group, where the target query statement group includes the at least one historical query statement.
According to an embodiment of the present disclosure, clustering the multiple initial historical query statements to obtain at least one query statement group includes: processing the multiple initial historical query statements to obtain the vectors corresponding to them; and clustering those vectors to obtain at least one query statement group, where each query statement group includes the vectors of its query statements.
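The vectorize-then-cluster step can be illustrated with the sketch below. The disclosure does not fix a vectorization method or a clustering algorithm, so the bag-of-tokens vectors, the cosine similarity, and the greedy single-pass grouping used here are assumptions chosen for brevity, and all function names are hypothetical. The `threshold` parameter plays the role of the first preset threshold.

```python
import math
import re
from collections import Counter


def build_vocabulary(queries):
    """Sorted list of all lowercase word tokens seen in the queries."""
    return sorted({tok for q in queries for tok in re.findall(r"\w+", q.lower())})


def vectorize(sql, vocabulary):
    """Real-valued token-count vector of one query over the vocabulary."""
    counts = Counter(re.findall(r"\w+", sql.lower()))
    return [float(counts[tok]) for tok in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold):
    """Greedy grouping: a vector joins the first group in which its similarity
    to every member is at least threshold; otherwise it starts a new group.
    Returns groups as lists of indices into vectors."""
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if all(cosine(vec, vectors[j]) >= threshold for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


queries = [
    "SELECT name FROM users JOIN orders ON users.id = orders.user_id",
    "SELECT amount FROM users JOIN orders ON users.id = orders.user_id",
    "SELECT sku FROM items",
]
vocabulary = build_vocabulary(queries)
groups = cluster([vectorize(q, vocabulary) for q in queries], threshold=0.8)
```

With this toy input the two join queries land in one group and the unrelated query forms its own group. A query statement group that also satisfies the first preset condition would then be taken as the target query statement group.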
According to an embodiment of the present disclosure, the second acquisition module 910 may, for example, perform operation S410 described above with reference to Fig. 4, which is not repeated here.
Fig. 10 schematically illustrates a block diagram of a data warehouse information processing apparatus according to yet another embodiment of the present disclosure.
As shown in Fig. 10, the data warehouse information processing apparatus 1000 includes the first acquisition module 810, the determining module 820, the generation module 830, the second acquisition module 910, and a storage module 1010. The first acquisition module 810, the determining module 820, and the generation module 830 are the same as or similar to the modules described above with reference to Fig. 8 and are not described again here. The second acquisition module 910 is the same as or similar to the module described above with reference to Fig. 9 and is not described again here.
The storage module 1010 may store the target table when the target table meets the second preset condition.
According to an embodiment of the present disclosure, storing the target table when the target table meets the second preset condition includes: obtaining the similarity between the target table and the other history tables in the data warehouse, and storing the target table when the similarity meets the third preset threshold.
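A minimal sketch of this storage check follows. It rests on two loudly stated assumptions that the disclosure does not confirm: table similarity is measured as the Jaccard similarity of column-name sets, and "meets the third preset threshold" is read as "stays below the threshold for every existing table", so that near-duplicate wide tables are not stored. The function names and the default threshold value are hypothetical.

```python
def jaccard(columns_a, columns_b):
    """Jaccard similarity of two collections of column names."""
    a, b = set(columns_a), set(columns_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def store_if_novel(name, columns, warehouse, threshold=0.7):
    """Store the target table in the warehouse dict only when its similarity
    to every existing table is below threshold (assumed reading of the
    third preset threshold). Returns True when the table was stored."""
    if any(jaccard(columns, existing) >= threshold for existing in warehouse.values()):
        return False
    warehouse[name] = list(columns)
    return True


warehouse = {"wide_orders": ["user_id", "amount", "sku"]}
near_duplicate = store_if_novel("wide_a", ["user_id", "amount", "sku", "price"], warehouse)
novel = store_if_novel("wide_b", ["user_id", "city"], warehouse)
```

If the intended reading of the threshold is the opposite one, only the comparison inside `store_if_novel` changes.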
According to an embodiment of the present disclosure, the storage module 1010 may, for example, perform operation S510 described above with reference to Fig. 5, which is not repeated here.
Any number of the modules, submodules, units, and subunits according to embodiments of the present disclosure, or at least part of the functions of any number of them, may be implemented in one module. Any one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be split into multiple modules. Any one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC); or may be implemented in hardware or firmware by any other reasonable way of integrating or packaging circuits; or may be implemented in any one of software, hardware, and firmware, or in a suitable combination of any of them. Alternatively, one or more of the modules, submodules, units, and subunits according to embodiments of the present disclosure may be implemented at least partially as a computer program module that, when run, performs the corresponding function.
For example, any number of the first acquisition module 810, the determining module 820, the generation module 830, the second acquisition module 910, and the storage module 1010 may be combined and implemented in one module, or any one of these modules may be split into multiple modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first acquisition module 810, the determining module 820, the generation module 830, the second acquisition module 910, and the storage module 1010 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on a chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC); or may be implemented in hardware or firmware by any other reasonable way of integrating or packaging circuits; or may be implemented in any one of software, hardware, and firmware, or in a suitable combination of any of them. Alternatively, at least one of the first acquisition module 810, the determining module 820, the generation module 830, the second acquisition module 910, and the storage module 1010 may be implemented at least partially as a computer program module that, when run, performs the corresponding function.
Fig. 11 schematically illustrates a block diagram of a computer system suitable for data warehouse information processing according to an embodiment of the present disclosure. The computer system shown in Fig. 11 is only an example and should not impose any limitation on the function or scope of use of embodiments of the present disclosure.
As shown in Fig. 11, the computer system 1100 according to an embodiment of the present disclosure includes a processor 1101, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage section 1108 into a random access memory (RAM) 1103. The processor 1101 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or a related chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)), and so on. The processor 1101 may also include on-board memory for caching purposes. The processor 1101 may include a single processing unit or multiple processing units for performing the different actions of the method flow according to embodiments of the present disclosure.
The RAM 1103 stores the various programs and data required for the operation of the system 1100. The processor 1101, the ROM 1102, and the RAM 1103 are connected to one another through a bus 1104. The processor 1101 performs the various operations of the method flow according to embodiments of the present disclosure by executing programs in the ROM 1102 and/or the RAM 1103. Note that the programs may also be stored in one or more memories other than the ROM 1102 and the RAM 1103. The processor 1101 may also perform the various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the system 1100 may also include an input/output (I/O) interface 1105, which is also connected to the bus 1104. The system 1100 may also include one or more of the following components connected to the I/O interface 1105: an input section 1106 including a keyboard, a mouse, and the like; an output section 1107 including a cathode-ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 1108 including a hard disk and the like; and a communication section 1109 including a network interface card such as a LAN card or a modem. The communication section 1109 performs communication processing via a network such as the Internet. A drive 1110 is also connected to the I/O interface 1105 as needed. A detachable medium 1111, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1110 as needed, so that a computer program read from it can be installed into the storage section 1108 as needed.
According to an embodiment of the present disclosure, the method flow according to embodiments of the present disclosure may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable storage medium, where the computer program contains program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1109, and/or installed from the detachable medium 1111. When the computer program is executed by the processor 1101, the above functions defined in the system of the embodiments of the present disclosure are performed. According to embodiments of the present disclosure, the systems, devices, apparatuses, modules, units, and the like described above may be implemented by computer program modules.
The present disclosure also provides a computer-readable storage medium. The computer-readable storage medium may be included in the device, apparatus, or system described in the above embodiments, or it may exist on its own without being assembled into that device, apparatus, or system. The computer-readable storage medium carries one or more programs, and when the one or more programs are executed, the method according to embodiments of the present disclosure is implemented.
According to an embodiment of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device.
For example, according to an embodiment of the present disclosure, the computer-readable storage medium may include the ROM 1102 and/or the RAM 1103 described above and/or one or more memories other than the ROM 1102 and the RAM 1103.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and any combination of blocks in the block diagrams or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Those skilled in the art will understand that the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in many ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in many ways without departing from the spirit and teaching of the present disclosure. All such combinations and/or incorporations fall within the scope of the present disclosure.
Embodiments of the present disclosure have been described above. These embodiments, however, are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the individual embodiments cannot be used together to advantage. The scope of the present disclosure is defined by the appended claims and their equivalents. Those skilled in the art may make various substitutions and modifications without departing from the scope of the present disclosure, and all such substitutions and modifications shall fall within the scope of the present disclosure.