CN103425780B - Data query method and device - Google Patents
- Publication number: CN103425780B (also listed: CN201310362238.4A)
- Authority: CN (China)
- Prior art keywords: data, query, unstructured data, component, unstructured
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Landscapes: Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data query method and device, belonging to the technical field of mass data processing. The method includes: obtaining an associated query request and decomposing it into multiple subquery requests; when the subquery requests include a query request on an unstructured data component, invoking the parsing mode corresponding to that component and parsing the component to obtain data with a schema; and performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query. By parsing unstructured data in a separate step to obtain data with a schema, the invention parses unstructured data automatically, without manual intervention, and thereby supports associated queries across unstructured and structured data.
Description
Technical field
The present invention relates to the technical field of mass data processing, and in particular to a data query method and device.
Background
As data services develop, a single business often holds both structured and unstructured data, and the two kinds of data usually correspond to each other and need to be processed together. Structured data is row data that is stored in a database and can be expressed logically with a two-dimensional table structure. Data that is hard to express with a two-dimensional logical table of a database is called unstructured data; it includes office documents in all formats, text, pictures, XML, HTML, reports of all kinds, images, and audio/video information.
In prior-art data processing, structured data can be stored directly in a relational database, where it is queried, filtered, or computed on. Unstructured data is batch processed with MapReduce, including querying, filtering, or computing. In the prior art the two kinds of data are processed separately: structured data is queried in association with other structured data, and unstructured data with other unstructured data, but associated queries between structured and unstructured data are not supported. How to implement an associated query across structured and unstructured data is therefore a problem to be solved.
Summary of the invention
To solve the prior-art problem that structured and unstructured data cannot be queried in association automatically, embodiments of the present invention provide a data query method and device. The technical solution is as follows:
In one aspect, a data query method is provided for performing an associated query on structured data and unstructured data. The method includes:
obtaining an associated query request, and decomposing the associated query request into multiple subquery requests;
when the multiple subquery requests include a query request on an unstructured data component, invoking the parsing mode corresponding to the unstructured data component and parsing the unstructured data component to obtain data with a schema;
performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In another aspect, a data query device is provided for performing an associated query on structured data and unstructured data. The device includes:
a task decomposition module, configured to obtain an associated query request and decompose the associated query request into multiple subquery requests;
an unstructured data parsing module, configured to, when the multiple subquery requests include a query request on an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
an associated query module, configured to perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
An associated query request is obtained and decomposed into multiple subquery requests. When the subquery requests include a query request on an unstructured data component, the parsing mode corresponding to that component is invoked and the component is parsed to obtain data with a schema. An associated query is then performed on the data with a schema and the structured data to obtain the result set of the associated query. Because unstructured data is parsed in a separate step to obtain data with a schema, no manual intervention is needed: unstructured data is parsed automatically, which makes associated queries across unstructured and structured data possible.
Brief description of the drawings
Specific embodiments of the present invention are described below with reference to the accompanying drawings, in which:
Fig. 1 is a flow chart of a data query method provided in Embodiment one of the present invention;
Fig. 2 is a flow chart of a data query method provided in Embodiment two of the present invention;
Fig. 3 is a schematic diagram of the platform after initialization, provided in Embodiment two of the present invention;
Fig. 4 is a schematic diagram of a data query device provided in Embodiment three of the present invention;
Fig. 5 is a schematic diagram of another data query device provided in Embodiment three of the present invention.
Detailed description
To make the technical solution and advantages of the present invention clearer, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not an exhaustive list of all embodiments.
A data schema, as used in this embodiment, is an explicit description of data. A database stores the schema of its data; it is because a schema exists that complex data structures can be built to establish the internal and complex relationships between data, which together form the global structural schema of the data. A data schema describes data at the "type" level on the basis of a chosen data model, while the corresponding "instance" describes data at the "value" level. The data model comes first; the corresponding data schema is discussed in terms of that model, and instances can be obtained only once a schema exists. Data that has well-defined fields and types, that is, a data schema, is called structured data. Data without a schema is unstructured data, such as pictures, video, and audio files.
The associated query in this embodiment is not limited to a join of two two-dimensional tables in a relational database. It also covers join, union, cascade, and similar operations between structured and unstructured data. Structured and unstructured data are treated as equal data objects, and the operations on both kinds of data objects are fused into a single unified operation.
The associated queries proposed here mainly include, but are not limited to, the following three: a join between structured data and unstructured data; a cascade of a SQL query on structured data with MapReduce processing of unstructured data; and a union of a structured data query result with unstructured data.
To help those skilled in the art understand the associated query involved in the present invention more clearly, three associated query scenarios are given below. Actual execution is not necessarily limited to these three scenarios, and this embodiment imposes no specific limitation.
Associated query scenario one:
A join between structured data and unstructured data. This kind of application falls into two cases:
First case: the structured and unstructured data belong to the same object, share an association field, and correspond one to one.
This scenario is illustrated with hospital medical records. Each patient's record contains structured data, including age, sex, medical history, date of the most recent visit, and a description of the condition, as well as unstructured data, including CT films, electrocardiograms, and waveform charts. Analyzing patients' medical record data is of great help for treatment follow-up, condition analysis, and analysis of pathogenic factors. For example, to analyze patients whose electrocardiograms fluctuate strongly, one must find the characteristics of those patients and then analyze them. This first requires an unstructured data analysis tool that picks out the strongly fluctuating electrocardiograms; the other information of the patients corresponding to those images is then output, and the patients are analyzed on the basis of that output. This process uses unstructured data (electrocardiograms) to index structured data (patient information).
In this scenario, structured and unstructured data are in a one-to-one relationship. The structured data can store the electrocardiogram path, the CT film path, and the waveform chart path, and the unstructured data path can contain the key value of the structured data, such as the patient's ID card number or another field that uniquely identifies the patient, so that the structured and unstructured data both store the same association column. An association column is a column present in both components (structured and unstructured), and the two parts of the data are associated through it. In this scenario, the key value is the association column stored by both the structured and the unstructured data.
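The one-to-one case can be sketched in Python as follows. This is only an illustration under stated assumptions: the file naming scheme `ecg_<patient_id>.txt`, the peak-to-peak `fluctuation` measure, the sample values, and the threshold are all hypothetical and do not come from the patent.

```python
# Hypothetical structured data: patient records keyed by a unique patient id.
patients = {
    "P001": {"age": 64, "sex": "F"},
    "P002": {"age": 41, "sex": "M"},
    "P003": {"age": 57, "sex": "M"},
}

# Hypothetical unstructured data: ECG samples per file; the path embeds the key.
ecg_files = {
    "ecg_P001.txt": [0.1, 1.9, -1.2, 2.0],
    "ecg_P002.txt": [0.2, 0.3, 0.1, 0.2],
    "ecg_P003.txt": [0.0, 1.5, -1.6, 0.4],
}


def key_from_path(path):
    """Recover the association column (patient id) from the file path."""
    return path[len("ecg_"):-len(".txt")]


def fluctuation(samples):
    """Toy stand-in for an ECG analysis tool: peak-to-peak amplitude."""
    return max(samples) - min(samples)


def patients_with_large_fluctuation(threshold):
    """Use the unstructured data to index the structured data."""
    result = []
    for path, samples in sorted(ecg_files.items()):
        if fluctuation(samples) > threshold:
            pid = key_from_path(path)
            result.append({"patient_id": pid, **patients[pid]})
    return result


rows = patients_with_large_fluctuation(threshold=3.0)
```

The key embedded in the path plays the role of the association column: the analysis runs on the unstructured side, and the matching structured rows are fetched by key.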
Second case: the structured data and the unstructured data belong to different objects and share an association field, but do not correspond one to one. The relationship is many to many, and the join computes the Cartesian product of the structured and unstructured data. For example: select col1, col2, col3 from obj1 where obj1.col1 = obj2.col1; where obj1 is unstructured data and obj2 is structured data. This association selects the entries of obj1 whose col1 value is also present in obj2.
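A many-to-many join of this kind can be sketched as a filtered Cartesian product. The sample rows below are invented for illustration, and `join_on_col1` is a hypothetical helper, not a function named in the patent.

```python
from itertools import product

# Hypothetical parsed unstructured entries (obj1) and structured rows (obj2).
obj1 = [
    {"col1": "a", "col2": 1, "col3": "x"},
    {"col1": "b", "col2": 2, "col3": "y"},
    {"col1": "a", "col2": 3, "col3": "z"},
]
obj2 = [{"col1": "a"}, {"col1": "a"}, {"col1": "c"}]


def join_on_col1(left, right):
    """Cartesian product of both inputs, keeping pairs whose col1 values match."""
    return [
        (left_row["col1"], left_row["col2"], left_row["col3"])
        for left_row, right_row in product(left, right)
        if left_row["col1"] == right_row["col1"]
    ]


joined = join_on_col1(obj1, obj2)
```

Because the relationship is many to many, each obj1 row appears once per matching obj2 row, so the two `a` rows in obj1 each appear twice in the result.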
Associated query scenario two:
A cascade of structured data with MapReduce processing of unstructured data. The unstructured data is first processed by MapReduce to output a result with a schema, which is then stored in the same platform as the structured data for the association operation.
select col1, col2, col3 from (select col1, col2, col4 from obj2) as t1 join (select col3, col4 from obj1) as t2 on t1.col4 = t2.col4;
The data object obj1 contains only a structured data component, and the data object obj2 contains only an unstructured data component. The query statement above implements a cascade operation that obtains an associated result set from unstructured and structured data.
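The cascade can be tried out with an in-memory SQLite database standing in for the shared storage platform. In this sketch the MapReduce job is replaced by a plain Python map step, and the raw line format `col1,col2,col4` and all sample values are assumptions made for illustration.

```python
import sqlite3

# Hypothetical raw unstructured lines behind obj2; the format is assumed.
raw_lines = ["u1,10,k1", "u2,20,k2", "u3,30,k9"]


def map_parse(line):
    """Map-step stand-in: turn one raw line into a (col1, col2, col4) row."""
    col1, col2, col4 = line.split(",")
    return (col1, int(col2), col4)


conn = sqlite3.connect(":memory:")
# obj1 holds the structured component; obj2 receives the parsed output.
conn.execute("create table obj1 (col3 text, col4 text)")
conn.execute("create table obj2 (col1 text, col2 integer, col4 text)")
conn.executemany("insert into obj1 values (?, ?)", [("s1", "k1"), ("s2", "k2")])
conn.executemany(
    "insert into obj2 values (?, ?, ?)", [map_parse(line) for line in raw_lines]
)

# The cascade query from the text, with an ORDER BY added for a stable result.
cascade_rows = conn.execute(
    "select col1, col2, col3 "
    "from (select col1, col2, col4 from obj2) as t1 "
    "join (select col3, col4 from obj1) as t2 on t1.col4 = t2.col4 "
    "order by col1"
).fetchall()
```

The row with key `k9` has no structured counterpart, so the inner join drops it.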
Associated query scenario three:
The structured data and the unstructured data belong to different objects, and a union operation is performed on the two. In general, such a query first extracts from the unstructured data the same data schema as that of the structured data (a series of column names and column types constitutes a data schema), and then performs the union with the structured data. The associated query hides this extraction process from the user. Here a data schema is an explicit description of data. A database stores the schema of its data; it is because a schema exists that complex data structures can be built to establish the internal and complex relationships between data, which together form the global structural schema of the data.
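A union of this kind can be sketched as follows. The text pattern `<name> paid <amount>`, the `(name, amount)` schema, and the sample data are hypothetical; the point is only that the extraction step yields rows with the same schema as the structured data before the union is taken.

```python
import re

# Structured rows with the assumed schema (name: str, amount: int).
structured_rows = [("alice", 120), ("bob", 75)]

# Hypothetical unstructured text; the "<name> paid <amount>" pattern is assumed.
notes = "carol paid 40 yesterday. bob paid 75 today."


def extract_rows(text):
    """Extract rows with the same (name, amount) schema as the structured data."""
    return [
        (name, int(amount))
        for name, amount in re.findall(r"(\w+) paid (\d+)", text)
    ]


def union(left, right):
    """SQL-style UNION: combine both inputs and drop duplicate rows."""
    return sorted(set(left) | set(right))


union_rows = union(structured_rows, extract_rows(notes))
```

The row for `bob` exists on both sides and appears once in the result, as with UNION (as opposed to UNION ALL) in SQL.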
In this embodiment, the query statement is grammatically no different from an associated query statement on structured data objects; the query on unstructured data is simply encapsulated by the underlying layer. The associated query scenarios for structured and unstructured data include, but are not limited to, the three above, which are given to better explain the associated query technique proposed here.
Embodiment one
This embodiment provides a data query method for performing an associated query on structured data and unstructured data. Referring to Fig. 1, the method flow includes:
101. Obtain an associated query request, and decompose the associated query request into multiple subquery requests;
102. When the multiple subquery requests include a query request on an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
103. Perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In this embodiment, data with a schema means the data parsed out of the unstructured data component according to a prescribed schema; this data conforms to that schema.
In another embodiment, before the associated query request is obtained, the method further includes:
initializing the structured data and/or the unstructured data. The initialization includes creating a data object and unified metadata. The unified metadata includes the metadata information of a data object that unifies a structured data component and an unstructured data component. The data object includes a structured data component and/or an unstructured data component, together with the parsing mode of the unstructured data component. The relevant metadata information includes the name of the data object, component information, component schemas, and other attribute information related to the object.
Preferably, the parsing mode of the unstructured data component includes the parsing class specified for the unstructured data component and the schema of the unstructured data component after parsing.
In another embodiment, before the parsing mode corresponding to the unstructured data component is invoked, the method further includes:
encapsulating the subquery request on the unstructured data component, so that the unstructured data component is processed independently.
Preferably, performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query includes:
importing the data with a schema into the same storage platform as the structured data;
performing, in that storage platform, an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
The beneficial effects of this embodiment include the following. An associated query request is obtained and decomposed into multiple subquery requests. When the subquery requests include a query request on an unstructured data component, the parsing mode corresponding to that component is invoked and the component is parsed to obtain data with a schema. An associated query is then performed on the data with a schema and the structured data to obtain the result set of the associated query. Because unstructured data is parsed in a separate step to obtain data with a schema, no manual intervention is needed: the parsing mode set at data definition time is invoked automatically to parse the unstructured data, which makes associated queries across unstructured and structured data possible.
Embodiment two
Referring to Fig. 2, the data query method provided in this embodiment supports automatic associated queries across structured and unstructured data. In this embodiment, structured data is mainly stored on a structured storage platform; if the volume is huge and scalability is limited, a distributed database, a distributed file system, or NoSQL storage software can also be used to store structured data. Massive unstructured data is stored in ways that include, but are not limited to, a distributed file system. Because structured and unstructured data may be stored differently, the two types of data may reside on different storage platforms. Associating them requires support for data association across different storage platforms, which involves data migration; it also requires hiding the different underlying storage platforms and encapsulating the migration process. The specific method flow includes:
201. Initialize the structured data and/or the unstructured data.
In this embodiment, the premise of the associated query is a specific data definition mode under which structured data storage and unstructured data storage are both represented uniformly as data objects. A data object here is a general name for data, a term used in data definition. For example, a table in a relational database can be called a data object. The data object in this embodiment has a broader meaning: it can represent a set of structured data, a set of unstructured data, or a mixed set containing both a structured and an unstructured data component.
In this embodiment, the data definition mode includes, but is not limited to: create data_obj struct_columns (schema), nonstruct_columns (schema); [location:path] fileformat inputformat, outputformat. Here create is the data definition keyword; data_obj is the name of the data object; struct_columns represents the structured data; nonstruct_columns represents the unstructured data; [location:path] is the file directory of the unstructured data; and fileformat is the keyword that specifies the format. inputformat refers to the class that encapsulates the parsing format for reading data on the Hadoop platform; by inheriting from it, data can be read from different data sources and data in different formats can be parsed. outputformat refers to the class that encapsulates the output format for writing data to the file system on the Hadoop platform; by inheriting from it, data can be output in a specific format to different storage engines. In this embodiment, if the unstructured data storage platform is HDFS or another distributed file system supported by Hadoop, the framework that the Hadoop platform provides for reading and outputting data can be reused: the different data parsing and output organizations inherit the inputformat and outputformat classes and implement their key functions, row parsing and column parsing. The user then only needs to supply the two parsing functions the framework requires in order to parse unstructured data on a distributed system.
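The idea that a framework owns the read loop while the user supplies only a row parser and a column parser can be sketched in Python as follows. The class and method names (`InputFormat`, `parse_rows`, `parse_columns`, `PipeDelimitedFormat`) are hypothetical stand-ins chosen for this sketch; they are not the Hadoop API, which is a Java interface with a different shape.

```python
class InputFormat:
    """Minimal stand-in for a read-side framework class (not the Hadoop API)."""

    def parse_rows(self, raw):
        """Split raw content into row strings. Supplied by the user."""
        raise NotImplementedError

    def parse_columns(self, row):
        """Split one row string into column values. Supplied by the user."""
        raise NotImplementedError

    def read(self, raw):
        """Framework loop: apply the two user-supplied parsing functions."""
        return [self.parse_columns(row) for row in self.parse_rows(raw)]


class PipeDelimitedFormat(InputFormat):
    """Hypothetical user-defined parser for '|'-delimited text."""

    def parse_rows(self, raw):
        # One row per non-empty line.
        return [line for line in raw.splitlines() if line.strip()]

    def parse_columns(self, row):
        # One column per '|'-separated field, with surrounding spaces removed.
        return tuple(field.strip() for field in row.split("|"))


parsed = PipeDelimitedFormat().read("a | 1\n\nb | 2\n")
```

A different unstructured format would be supported by another subclass that overrides the same two methods, leaving the `read` loop unchanged.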
The attributes of a data object in this embodiment include, but are not limited to, the schema of the data and the parsing mode. For unstructured data, the schema is the schema obtained after the unstructured data is parsed. A data object can contain only a structured data component, only an unstructured data component, or both an associated structured data component and unstructured data component; this embodiment imposes no specific limitation.
This embodiment therefore needs to initialize the unstructured and structured data. The initialization includes creating a data object and storing unified metadata, which produces the initialized platform shown in Fig. 3. The unified metadata includes the metadata information of a data object that unifies a structured data component and an unstructured data component. The data object includes a structured data component and/or an unstructured data component, together with the parsing mode of the unstructured data component. The relevant metadata information includes the name of the data object, component information, component schemas, and other attribute information related to the object.
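One possible in-memory shape for the data object and the unified metadata is sketched below. All names (`DataObject`, `MetadataStore`, the field names, the `EcgParser` class name, and the `/data/ecg` path) are hypothetical; the patent lists what the metadata contains but does not prescribe a representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Schema = List[Tuple[str, str]]  # list of (column name, column type)


@dataclass
class DataObject:
    """Hypothetical data object with optional structured/unstructured components."""
    name: str
    struct_columns: Optional[Schema] = None     # schema of the structured component
    nonstruct_columns: Optional[Schema] = None  # schema after parsing
    location: Optional[str] = None              # directory of the unstructured files
    parser_class: Optional[str] = None          # parsing class for the unstructured component

    def components(self) -> List[str]:
        """List which components this object contains."""
        found = []
        if self.struct_columns is not None:
            found.append("structured")
        if self.nonstruct_columns is not None:
            found.append("unstructured")
        return found


@dataclass
class MetadataStore:
    """Unified metadata: one catalog for every data object."""
    objects: Dict[str, DataObject] = field(default_factory=dict)

    def register(self, obj: DataObject) -> None:
        # The parsing mode must be known before any query touches the component.
        if obj.nonstruct_columns is not None and obj.parser_class is None:
            raise ValueError("an unstructured component needs a parsing class")
        self.objects[obj.name] = obj

    def lookup(self, name: str) -> DataObject:
        return self.objects[name]


store = MetadataStore()
store.register(DataObject(
    name="patient",
    struct_columns=[("patient_id", "text"), ("age", "int")],
    nonstruct_columns=[("patient_id", "text"), ("fluctuation", "float")],
    location="/data/ecg",
    parser_class="EcgParser",
))
```

Rejecting an unstructured component that has no parsing class at registration time is a design choice of this sketch; it mirrors the requirement that the parsing mode be fixed when the data object is defined.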
It is worth noting that in this embodiment structured data is stored in advance in the format specified by the storage platform and read in the prescribed way, whereas unstructured data is actually parsed only during the query. Unstructured data has no fixed format, and the same unstructured data yields different data under different parsing modes, so a fixed parsing mode cannot be used; only a fixed framework and data definition syntax can be provided to reduce the user's workload. At initialization, therefore, an external table can be created for the unstructured data, specifying the parsing class and the schema after parsing; a data object defined this way is suitable for associated queries across structured and unstructured data. Preferably, in this embodiment the parsing mode of the unstructured data component includes the parsing class specified for the unstructured data component and the schema of the unstructured data component after parsing.
It is worth noting that this step is performed when the structured and unstructured data are stored. If the data has already been stored by the time the associated query is carried out, this step need not be performed; this embodiment imposes no specific limitation.
202. Obtain an associated query request, and decompose the associated query request into multiple subquery requests.
In this step, the associated query request entered by the user is obtained. The request may belong to any one of the associated query scenarios above; this embodiment imposes no specific limitation.
In this embodiment, the associated query uses the query syntax of standard SQL. However, a data object may contain both structured and unstructured data, so a query may involve unstructured data; the original SQL parsing engine therefore needs to be extended to decompose a unified associated query request into multiple sub-SQL queries, that is, to decompose the associated query request into multiple subtasks. For example, if the associated statement is select col1, col2, col4 from obj where obj.structobj.col1 = obj.nonstructobj.col1, parsing this statement yields multiple subquery tasks. If the query statement is a cascaded nested statement, it is parsed according to the cascade nesting relationship to obtain multiple subquery tasks. The specific parsing process is not described further in this embodiment.
203. When the multiple subquery requests include a query request on an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema.
In this embodiment, if the queried object has an unstructured data component, then optionally, before the parsing mode corresponding to the unstructured data component is invoked, the method further includes: encapsulating the subquery request on the unstructured data component, so that the unstructured data component is processed independently. The parsing of unstructured data needs to be encapsulated as an independent task and handed to the unstructured data processing engine. Because the parsing mode was defined for the unstructured data component in advance, once the encapsulated parsing task for the unstructured component is obtained, the component can be parsed with the predefined parsing class to produce data with the specified schema. Data with a schema means data parsed out according to a prescribed schema; this data conforms to that schema.
In this embodiment, if the multiple subquery requests include a query request on a structured data component, the structured data component is parsed in the fixed way. This is similar to the prior art and is not described further in this embodiment.
204. Perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In this step, because the unstructured and structured data are stored on different storage platforms, the data with a schema, once obtained, needs to be migrated to the same storage platform as the structured data. The associated query on the data with a schema and the structured data is performed in that storage platform to obtain the final result set, which the user can analyze further.
It is worth noting that after the query task is decomposed, the unstructured and structured data can each be queried on their own local storage platforms. This is similar to the prior art and is not described further in this embodiment. After the unstructured data has been parsed into data with a schema, the associated query is performed on the parsed unstructured data and the structured data.
To help those skilled in the art understand the associated query technique of the present invention more clearly, a cascaded nested query is used as an example:
OBJ1.STRUCTOBJ data schema: (col1, col2, col3, col4)
OBJ2.NONSTRUCTOBJ data schema after parsing: (col3, col4)
The query statement is: select col1, col2, col4 from (select col1, col2, col3 from OBJ1.STRUCTOBJ) as t1 join (select col4, col3 from OBJ2.NONSTRUCTOBJ) as t2 on t1.col3 = t2.col3; it contains nested queries.
This query statement is decomposed into the following task flow:
Step 1: select col1, col2, col3 from OBJ1.STRUCTOBJ, a simple structured data query, producing result set R1;
Step 2: select col4, col3 from OBJ2.NONSTRUCTOBJ, an unstructured data query that includes the data parsing process, producing result set R2;
Step 3: transfer R2 to the same data platform as R1;
Step 4: select col1, col2, col4 from R1 join R2 on R1.col3 = R2.col3, which performs the join query on the two relational data sets.
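The four steps can be walked through end to end with an in-memory SQLite database playing the role of the structured storage platform. This is a sketch under stated assumptions: the raw line format `col3;col4`, the sample values, the table name `structobj`, and the `parse_unstructured` helper are hypothetical, and a real deployment would read the unstructured data from a distributed file system.

```python
import sqlite3


def parse_unstructured(lines):
    """Step 2 stand-in: parse raw lines into rows with schema (col3, col4)."""
    return [tuple(line.split(";")) for line in lines]


# Platform holding the structured component OBJ1.STRUCTOBJ.
platform = sqlite3.connect(":memory:")
platform.execute("create table structobj (col1, col2, col3, col4)")
platform.executemany(
    "insert into structobj values (?, ?, ?, ?)",
    [(1, "a", "k1", "unused"), (2, "b", "k2", "unused")],
)

# Step 1: simple structured query, materialized as R1.
platform.execute("create table R1 as select col1, col2, col3 from structobj")

# Step 2: unstructured query including the parsing process, giving R2.
raw_lines = ["k1;x", "k3;y"]
r2_rows = parse_unstructured(raw_lines)

# Step 3: transfer R2 onto the same platform as R1.
platform.execute("create table R2 (col3, col4)")
platform.executemany("insert into R2 values (?, ?)", r2_rows)

# Step 4: join the two relational data sets.
final_rows = platform.execute(
    "select col1, col2, col4 from R1 join R2 on R1.col3 = R2.col3"
).fetchall()
```

Only key `k1` occurs on both sides, so the join returns a single row that combines col1 and col2 from the structured data with col4 from the parsed unstructured data.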
The beneficial effects of this embodiment include the following. A unified data definition mode and a custom query parsing engine allow an associated query across structured and unstructured data to be issued as a single request. The underlying layer translates the request into a staged task flow, and a custom execution engine controls the completion of each stage so that the whole task flow finishes in the prescribed order and returns the correct query result. This solves the problem of automatic associated queries across structured and unstructured data.
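An execution engine that runs a staged task flow in the prescribed order can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). The task names, the sample rows, and the `run_task_flow` helper are hypothetical; the patent does not describe how its execution engine is implemented.

```python
from graphlib import TopologicalSorter


def run_task_flow(tasks, depends_on):
    """Run tasks in dependency order; each task sees the results produced so far."""
    results = {}
    order = []
    for name in TopologicalSorter(depends_on).static_order():
        results[name] = tasks[name](results)
        order.append(name)
    return results, order


# Hypothetical staged task flow for the cascaded query: two subqueries,
# a migration stage, and the final join.
tasks = {
    "R1": lambda done: [(1, "a", "k1"), (2, "b", "k2")],  # (col1, col2, col3)
    "R2": lambda done: [("k1", "x")],                      # (col3, col4)
    "migrate": lambda done: list(done["R2"]),
    "join": lambda done: [
        (col1, col2, col4)
        for (col1, col2, col3) in done["R1"]
        for (key, col4) in done["migrate"]
        if col3 == key
    ],
}

# Each task maps to the set of tasks that must finish before it starts.
depends_on = {"migrate": {"R2"}, "join": {"R1", "migrate"}}

results, order = run_task_flow(tasks, depends_on)
```

Declaring dependencies rather than a fixed sequence leaves the engine free to schedule R1 and R2 in either order while still guaranteeing that migration precedes the join.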
Embodiment three
Referring to Fig. 4, this embodiment provides a data query device for performing an associated query on structured data and unstructured data. The device includes a task decomposition module 301, an unstructured data parsing module 302, and an associated query module 303.
The task decomposition module 301 is configured to obtain an associated query request and decompose the associated query request into multiple subquery requests;
the unstructured data parsing module 302 is configured to, when the multiple subquery requests include a query request on an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
the associated query module 303 is configured to perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In another embodiment, referring to Fig. 5, the device further includes:
a data initialization module 304, configured to initialize the structured data and/or the unstructured data. The initialization includes creating a data object and storing unified metadata. The unified metadata includes the metadata information of a data object that unifies a structured data component and an unstructured data component. The data object includes a structured data component and/or an unstructured data component, together with the parsing mode of the unstructured data component. The relevant metadata information includes the name of the data object, component information, component schemas, and other attribute information related to the object.
Preferably, the parsing mode of the unstructured data component includes the parsing class specified for the unstructured data component and the schema of the unstructured data component after parsing.
In another embodiment, referring to Fig. 5, the device further includes:
an encapsulation module 305, configured to encapsulate the subquery request on the unstructured data component before the unstructured data parsing module 302 invokes the parsing mode corresponding to the unstructured data component, so that the unstructured data component is processed independently.
Referring to Fig. 5, the associated query module 303 preferably includes:
a migration unit 303a, configured to import the data with a schema into the same storage platform as the structured data;
an associated query unit 303b, configured to perform, in that storage platform, an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
The beneficial effects of this embodiment are as follows. An associated query request is obtained and decomposed into multiple sub-query requests. When the multiple sub-query requests include a query request for an unstructured data component, the parsing mode corresponding to the unstructured data component is called and the unstructured data component is parsed to obtain data with a schema. An associated query is then performed on the data with a schema and the structured data to obtain the result set of the associated query. Because the unstructured data is parsed in a separate step that yields data with a schema, no manual intervention is needed: the unstructured data is parsed automatically, which makes an associated query across unstructured data and structured data possible.
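The overall flow summarized above (decompose, parse, then associate) can be sketched end to end in a few lines. The request format, the `decompose` and `execute` functions, and the comma-separated raw format are all assumptions made for this example; the join is a plain hash join on one key.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def decompose(request: dict) -> List[dict]:
    """Split an associated query into one sub-query per referenced component."""
    return [{"component": name} for name in request["components"]]


def execute(
    request: dict,
    structured: Dict[str, List[Row]],
    raw: Dict[str, List[str]],
    parsers: Dict[str, Callable[[List[str]], List[Row]]],
) -> List[Row]:
    tables: Dict[str, List[Row]] = {}
    for sq in decompose(request):
        name = sq["component"]
        if name in parsers:
            # Unstructured component: parse it into rows with a schema.
            tables[name] = parsers[name](raw[name])
        else:
            tables[name] = structured[name]

    # Associate the two row sets with a hash join on the requested key.
    left, right = (tables[n] for n in request["components"])
    key = request["on"]
    index: Dict[object, List[Row]] = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def parse_status(lines: List[str]) -> List[Row]:
    """Toy parser for lines of the form '<order_id>, <status>'."""
    rows = []
    for line in lines:
        order_id, status = line.split(",")
        rows.append({"order_id": int(order_id), "status": status.strip()})
    return rows


result = execute(
    {"components": ["order_rows", "order_logs"], "on": "order_id"},
    structured={"order_rows": [{"order_id": 1, "amount": 25.0},
                               {"order_id": 3, "amount": 5.5}]},
    raw={"order_logs": ["1, paid", "2, shipped"]},
    parsers={"order_logs": parse_status},
)
```

Only order 1 appears on both sides, so the result set contains a single combined row; the caller never handles the raw log text directly.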
It should be noted that, when the data query device provided by the above embodiment performs a data query, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In addition, the data query device provided by the above embodiment and the data query method embodiment belong to the same concept. For its specific implementation process, refer to the method embodiment; the details are not repeated here.
The sequence numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (8)
1. A data query method for performing an associated query on structured data and unstructured data, characterized in that the method comprises:
Obtaining an associated query request, and decomposing the associated query request into multiple sub-query requests;
When the multiple sub-query requests include a query request for an unstructured data component, calling the parsing mode corresponding to the unstructured data component and parsing the unstructured data component to obtain data with a schema;
Performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query;
Before obtaining the associated query request, the method further comprises:
Initializing the structured data and/or the unstructured data, wherein the initialization operation includes creating a data object and storing unified metadata; the unified metadata includes the metadata information related to a data object that unifies structured data components and unstructured data components; and the data object includes a structured data component and/or an unstructured data component, together with the parsing mode of the unstructured data component.
2. The method of claim 1, characterized in that the parsing mode of the unstructured data component includes: the parsing class specified for the unstructured data component, and the schema of the unstructured data component after parsing.
3. The method of claim 1, characterized in that, before calling the parsing mode corresponding to the unstructured data component, the method further comprises:
Encapsulating the sub-query request for the unstructured data component, so that the unstructured data component is processed independently.
4. The method of claim 1, characterized in that performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query comprises:
Importing the data with a schema into the same storage platform as the structured data;
Performing, in the storage platform shared with the structured data, an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
5. A data query device for performing an associated query on structured data and unstructured data, characterized in that the device comprises:
A task decomposition module, configured to obtain an associated query request and decompose the associated query request into multiple sub-query requests;
An unstructured data parsing module, configured to, when the multiple sub-query requests include a query request for an unstructured data component, call the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
An associated query module, configured to perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query;
The device further comprises:
A data initialization module, configured to initialize the structured data and/or the unstructured data, wherein the initialization operation includes creating a data object and storing unified metadata; the unified metadata includes the metadata information related to a data object that unifies structured data components and unstructured data components; and the data object includes a structured data component and/or an unstructured data component, together with the parsing mode of the unstructured data component.
6. The device of claim 5, characterized in that the parsing mode of the unstructured data component includes: the parsing class specified for the unstructured data component, and the schema of the unstructured data component after parsing.
7. The device of claim 5, characterized in that the device further comprises:
An encapsulation module, configured to encapsulate the sub-query request for the unstructured data component before the unstructured data parsing module calls the parsing mode corresponding to the unstructured data component, so that the unstructured data component is processed independently.
8. The device of claim 5, characterized in that the associated query module comprises:
A migration unit, configured to import the data with a schema into the same storage platform as the structured data;
An associated query unit, configured to perform, in the storage platform shared with the structured data, an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310362238.4A CN103425780B (en) | 2013-08-19 | 2013-08-19 | The querying method of a kind of data and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310362238.4A CN103425780B (en) | 2013-08-19 | 2013-08-19 | The querying method of a kind of data and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103425780A (en) | 2013-12-04 |
CN103425780B (en) | 2016-08-17 |
Family
ID=49650519
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310362238.4A Active CN103425780B (en) | 2013-08-19 | 2013-08-19 | The querying method of a kind of data and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103425780B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103714173B (en) * | 2013-12-31 | 2017-08-01 | 深圳市华宝电子科技有限公司 | A kind of search method of video source, device and monitor terminal |
CN104008180B (en) * | 2014-06-09 | 2017-04-12 | 北京奇虎科技有限公司 | Association method of structural data with picture, association device thereof |
CN105590066B (en) * | 2015-12-02 | 2018-08-10 | 中国银联股份有限公司 | The safe fusion method of big data of privacy is not revealed |
CN108241627A (en) * | 2016-12-23 | 2018-07-03 | 北京神州泰岳软件股份有限公司 | A kind of isomeric data storage querying method and system |
CN108268512B (en) * | 2016-12-30 | 2020-07-31 | 中国移动通信集团上海有限公司 | Label query method and device |
CN106649863A (en) * | 2016-12-30 | 2017-05-10 | 天津市测绘院 | Non-structured data management method and apparatus |
CN106909689A (en) * | 2017-03-07 | 2017-06-30 | 山东浪潮云服务信息科技有限公司 | A kind of data fusion method and device |
CN107257511A (en) * | 2017-06-06 | 2017-10-17 | 苏州小雨伞网络科技有限公司 | A kind of striding equipment data query method, system |
CN109087707A (en) * | 2018-07-18 | 2018-12-25 | 上海理工大学 | It is a kind of for establishing the method and apparatus of lung image database |
CN110968615B (en) * | 2018-09-30 | 2023-05-23 | 北京国双科技有限公司 | Data query method and device |
CN109408689B (en) * | 2018-10-24 | 2020-11-24 | 北京金山云网络技术有限公司 | Data acquisition method, device and system and electronic equipment |
CN109710602A (en) * | 2018-12-26 | 2019-05-03 | 中科曙光国际信息产业有限公司 | Data model detection method and device |
CN109829073B (en) * | 2018-12-29 | 2020-11-24 | 深圳云天励飞技术有限公司 | Image searching method and device |
CN111831684B (en) * | 2019-04-15 | 2024-04-05 | 北京沃东天骏信息技术有限公司 | Data query method, device and computer readable storage medium |
CN111897824A (en) * | 2020-03-25 | 2020-11-06 | 上海云励科技有限公司 | Data operation method, device, equipment and storage medium |
CN117271562B (en) * | 2023-11-21 | 2024-01-19 | 成都凌亚科技有限公司 | Data acquisition processing method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101164039A (en) * | 2005-03-02 | 2008-04-16 | 谷歌公司 | Generating structured information |
CN101477568A (en) * | 2009-02-12 | 2009-07-08 | 清华大学 | Integrated retrieval method for structured data and non-structured data |
CN102096673A (en) * | 2009-12-11 | 2011-06-15 | 西软软件股份有限公司 | Full text retrieval method for structured data and unstructured data |
CN103154996A (en) * | 2010-10-25 | 2013-06-12 | 惠普发展公司,有限责任合伙企业 | Providing information management |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080263006A1 (en) * | 2007-04-20 | 2008-10-23 | Sap Ag | Concurrent searching of structured and unstructured data |
Non-Patent Citations (1)
Title |
---|
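Research and Implementation of Intelligent Query in Heterogeneous Data Integration Systems; Huang Hai (黄海); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》); 2004-04-30; I138-225 * |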
Also Published As
Publication number | Publication date |
---|---|
CN103425780A (en) | 2013-12-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103425780B (en) | The querying method of a kind of data and device | |
Singh et al. | Data management for developing digital twin ontology model | |
Clifford et al. | Tracking provenance in a virtual data grid | |
Morel et al. | The REBOOT environment (software reuse) | |
Eichler et al. | Modeling metadata in data lakes—A generic model | |
US9824128B1 (en) | System for performing single query searches of heterogeneous and dispersed databases | |
US8639712B2 (en) | Method and module for creating a relational database schema from an ontology | |
US10387401B2 (en) | Version control of records in an electronic database | |
Yuan et al. | A linked data approach for geospatial data provenance | |
Konstantinou et al. | Exposing scholarly information as linked open data: RDFizing DSpace contents | |
Lee et al. | An architecture for retaining and analyzing visual explorations of databases | |
US8271442B2 (en) | Formats for database template files shared between client and server environments | |
Thalheim | Component development and construction for database design | |
Munir et al. | Provision of an integrated data analysis platform for computational neuroscience experiments | |
Martin et al. | Semantic linking of research infrastructure metadata | |
Sevilmis et al. | Knowledge sharing by information retrieval in the semantic web | |
Novak et al. | Prototype of a Web ETL tool | |
Spaniol et al. | ATLAS: A web-based software architecture for multimedia e-learning environments in virtual communities | |
Murphy et al. | A web portal that enables collaborative use of advanced medical image processing and informatics tools through the Biomedical Informatics Research Network (BIRN) | |
Malaverri et al. | A Tool based on Web Services to Query Biodiversity Information. | |
Zoghlami et al. | Using a SKOS engine to create, share and transfer terminology data sets | |
Fosci et al. | Soft Querying Features in GeoJSON Documents: The GeoSoft Proposal | |
CN115774767B (en) | Geographic information metadata processing method and device | |
Aggarwal et al. | Employing graph databases as a standardization model for addressing heterogeneity and integration | |
Díaz et al. | Model-aware Wiki analysis tools: the case of HistoryFlow |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |