CN103425780B - Data query method and device

Data query method and device

Info

Publication number
CN103425780B
CN103425780B CN201310362238.4A CN201310362238A CN103425780B CN 103425780 B CN103425780 B CN 103425780B CN 201310362238 A CN201310362238 A CN 201310362238A CN 103425780 B CN103425780 B CN 103425780B
Authority
CN
China
Prior art keywords
data
inquiry
unstructured data
assembly
unstructured
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310362238.4A
Other languages
Chinese (zh)
Other versions
CN103425780A (en)
Inventor
王颖
李晋钢
宋怀明
苗艳超
刘新春
邵宗有
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Co Ltd
Original Assignee
Dawning Information Industry Co Ltd
Application filed by Dawning Information Industry Co Ltd
Priority to CN201310362238.4A
Publication of CN103425780A
Application granted
Publication of CN103425780B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data query method and device, belonging to the technical field of massive data processing. The method includes: obtaining an associated query request and decomposing it into multiple sub-query requests; when the sub-query requests include a query request for an unstructured data component, invoking the parsing mode corresponding to that unstructured data component and parsing the component to obtain data with a schema; and performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query. By parsing the unstructured data in a separate step to obtain data with a schema, the invention parses unstructured data automatically, without manual intervention, and thereby enables associated queries across unstructured and structured data.

Description

Data query method and device
Technical field
The present invention relates to the technical field of massive data processing, and in particular to a data query method and device.
Background
As data services develop, a single business typically holds both structured data and unstructured data, and the two kinds of data often correspond to each other and need to be processed together. Structured data is row data stored in a database that can be expressed logically in a two-dimensional table structure. Data that is hard to express in a database's two-dimensional logical tables is called unstructured data; it includes office documents of all formats, text, pictures, XML, HTML, reports of all kinds, images, and audio/video information.
In prior-art data processing, structured data can be stored directly in a relational database, where it is queried, filtered, or computed on. Unstructured data is batch-processed with MapReduce, which covers querying, filtering, or computing on the unstructured data. In the prior art the two kinds of data are processed separately: associated queries run within structured data and within unstructured data, but associated queries between structured and unstructured data are not supported. How to implement associated queries across structured and unstructured data is therefore a problem to be solved.
Summary of the invention
To solve the prior-art problem that structured data and unstructured data cannot be queried together automatically, embodiments of the present invention provide a data query method and device. The technical solution is as follows:
In one aspect, a data query method is provided for performing an associated query on structured data and unstructured data. The method includes:
obtaining an associated query request and decomposing the associated query request into multiple sub-query requests;
when the multiple sub-query requests include a query request for an unstructured data component, invoking the parsing mode corresponding to the unstructured data component and parsing the unstructured data component to obtain data with a schema;
performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In another aspect, a data query device is provided for performing an associated query on structured data and unstructured data. The device includes:
a task decomposition module, configured to obtain an associated query request and decompose the associated query request into multiple sub-query requests;
an unstructured data parsing module, configured to, when the multiple sub-query requests include a query request for an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
an associated query module, configured to perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
An associated query request is obtained and decomposed into multiple sub-query requests. When the sub-query requests include a query request for an unstructured data component, the parsing mode corresponding to that component is invoked and the component is parsed to obtain data with a schema. An associated query is then performed on the data with a schema and the structured data to obtain the result set of the associated query. Because the unstructured data is parsed in a separate step to obtain data with a schema, no manual intervention is needed: the unstructured data is parsed automatically, which enables associated queries across unstructured and structured data.
Brief description of the drawings
Specific embodiments of the present invention are described below with reference to the accompanying drawings, in which:
Fig. 1 is a flow chart of the data query method provided in embodiment one of the present invention;
Fig. 2 is a flow chart of the data query method provided in embodiment two of the present invention;
Fig. 3 is a schematic diagram of the platform after initialization, provided in embodiment two of the present invention;
Fig. 4 is a schematic diagram of the data query device provided in embodiment three of the present invention;
Fig. 5 is a schematic diagram of another data query device provided in embodiment three of the present invention.
Detailed description
To make the technical solution and advantages of the present invention clearer, exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not an exhaustive list of all of them.
The data schema referred to in this embodiment is an explicit description of data. A database stores the schema of its data; it is only because a data schema exists that complex data structures can be built to establish the internal and complex relationships among data, forming the global structural schema of the data. A data schema describes data at the "type" level on the basis of a chosen data model, while the corresponding "instance" describes data at the "value" level. The data model comes first; a data schema can then be discussed in terms of it, and with a data schema, the corresponding instances can be obtained. Data that has explicit fields and types, which together are its data schema, is called structured data. Data without a schema is unstructured data, such as pictures, video, and audio files.
The associated query referred to in this embodiment is not limited to a join of two two-dimensional tables in a relational database. It covers join, union, cascade, and similar operations between structured and unstructured data. Structured and unstructured data are treated as equal data objects, and operations on the two kinds of data object are merged into one unified operation.
Execution of the associated query proposed here mainly includes, but is not limited to, the following three cases: a join between structured data and unstructured data; a cascade of a SQL query on structured data with MapReduce processing of unstructured data; and a union of a structured-data query result with unstructured data.
So that those skilled in the art can understand the associated query involved in the present invention more clearly, three associated-query scenarios are given below. Actual execution is not necessarily limited to these three scenarios, and this embodiment places no specific limit on this.
Associated query scenario one:
Structured data and unstructured data are joined. Applications of this kind fall into two cases:
First case: the structured and unstructured data belong to one object, share the same associated field, and are in a one-to-one relationship.
This scenario is illustrated below with hospital medical records. Each patient's record has structured data, including age, sex, medical history, date of the last visit, and a description of the condition, and also unstructured data, including CT images, electrocardiograms (ECGs), and waveform charts. Analyzing patient record data is very helpful for follow-up treatment, illness analysis, and analysis of pathogenic factors. For example, to analyze patients whose ECG fluctuates strongly, the characteristics of those patients must be looked up and then analyzed. First an unstructured-data analysis tool must be provided to find the ECGs with large fluctuation; then the other information of the patients corresponding to those images is output, and those patients are analyzed on the basis of the output information. This is a process of using unstructured data (the ECG) to index structured data (patient information).
In this scenario the structured and unstructured data are in a one-to-one relationship. The structured data can store the ECG image path, the CT image path, and the waveform chart path, and the unstructured data path can carry the key value of the structured data, such as the patient's ID card number or another field that uniquely identifies the patient. Structured and unstructured data thus both store one identical association column. An association column is a column that exists in both components (structured and unstructured); the two parts of the data are associated through it. In this scenario the key value is the association column stored by both the structured and the unstructured data.
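The one-to-one case can be illustrated with a minimal Python sketch. The record layout, the `patient_id` keys, the file paths, and the `fluctuation` function (a stand-in for an ECG analysis tool that works on already-parsed sample values) are all assumptions made for illustration; the patent does not specify any of them.

```python
from statistics import pstdev

# Structured component: one row per patient (hypothetical fields).
patients = {
    "P001": {"age": 63, "sex": "F", "ecg_path": "/ecg/P001.dat"},
    "P002": {"age": 41, "sex": "M", "ecg_path": "/ecg/P002.dat"},
}

# Unstructured component, keyed by the same association column.
# Real ECG files would need a parser; here the samples are given directly.
ecg_samples = {
    "P001": [0.1, 1.9, -1.7, 2.2, -2.0],
    "P002": [0.1, 0.2, 0.1, 0.15, 0.1],
}

def fluctuation(samples):
    """Stand-in analysis tool: population standard deviation of the signal."""
    return pstdev(samples)

def patients_with_large_fluctuation(threshold):
    """Filter on the unstructured side, then look up the structured row by key."""
    result = []
    for patient_id, samples in ecg_samples.items():
        if fluctuation(samples) > threshold:
            row = patients[patient_id]
            result.append((patient_id, row["age"], row["sex"]))
    return result
```

The filter runs on the unstructured side first, and the shared key then indexes the structured record, which mirrors the direction of lookup described for the ECG example.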
Second case: the structured and unstructured data belong to different objects and share an associated field, but are not in a one-to-one relationship. This is the many-to-many case, in which the join computes the Cartesian product of the structured and unstructured data. For example: select col1, col2, col3 from obj1 where obj1.col1 = obj2.col1; where obj1 is unstructured data and obj2 is structured data. This association selects the data entries of obj1 whose col1 value is present in obj2.
Associated query scenario two:
A cascade of structured data processing with MapReduce processing of unstructured data. The unstructured data first passes through MapReduce processing, which outputs a result with a schema; that result is then stored on the same platform as the structured data, where the association is performed.
select col1, col2, col3 from (select col1, col2, col4 from obj2) as t1 join (select col3, col4 from obj1) as t2 on t1.col4 = t2.col4;
Data object obj1 contains only a structured data component, and data object obj2 contains only an unstructured data component. The query statement above implements a cascade operation that obtains an associated result set from unstructured and structured data.
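A rough sketch of this cascade, using Python's built-in `sqlite3` as a stand-in for the relational platform and a plain generator as a stand-in for the MapReduce parsing stage. The pipe-delimited raw format, the sample rows, and the function names are assumptions for illustration only.

```python
import sqlite3

# Unstructured component of obj2: raw text lines (hypothetical format).
raw_obj2 = ["a|10|k1", "b|20|k2", "c|30|k9"]

def parse_obj2(lines):
    """Stand-in for the MapReduce parsing stage: emit (col1, col2, col4) rows."""
    for line in lines:
        col1, col2, col4 = line.split("|")
        yield col1, int(col2), col4

def cascade_query():
    conn = sqlite3.connect(":memory:")
    # Structured component of obj1 already lives in the relational store.
    conn.execute("CREATE TABLE obj1 (col3 TEXT, col4 TEXT)")
    conn.executemany("INSERT INTO obj1 VALUES (?, ?)",
                     [("x", "k1"), ("y", "k2")])
    # Parsed output of obj2 is loaded into the same store.
    conn.execute("CREATE TABLE obj2 (col1 TEXT, col2 INTEGER, col4 TEXT)")
    conn.executemany("INSERT INTO obj2 VALUES (?, ?, ?)", parse_obj2(raw_obj2))
    rows = conn.execute(
        "SELECT col1, col2, col3 "
        "FROM (SELECT col1, col2, col4 FROM obj2) AS t1 "
        "JOIN (SELECT col3, col4 FROM obj1) AS t2 ON t1.col4 = t2.col4 "
        "ORDER BY col1"
    ).fetchall()
    conn.close()
    return rows
```

Once the parsed rows sit next to the structured table, the statement from the text runs unchanged apart from the added `ORDER BY`, which only makes the output order deterministic.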
Associated query scenario three:
The structured and unstructured data belong to different objects, and a union is performed on the two. In general, such a query first extracts from the unstructured data the same data schema as that of the structured data (a series of column names and column types constitutes a data schema), and then performs the union with the structured data. The associated query hides this extraction process from the user. As described earlier, a data schema is an explicit description of data: the database stores the schema of its data, and only because a data schema exists can complex data structures be built to establish the internal and complex relationships among data, forming the global structural schema of the data.
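The extract-then-union idea can be sketched as follows. The free-text line format, the regular expression, and the `(name, age)` schema are hypothetical; `sqlite3` again stands in for whatever platform holds the structured data.

```python
import re
import sqlite3

# Hypothetical free-text lines standing in for unstructured data.
raw_lines = ["user=carol age=35", "malformed line", "user=dave age=52"]
LINE_RE = re.compile(r"user=(\w+) age=(\d+)")

def extract(lines):
    """Extract rows matching the structured schema (name TEXT, age INTEGER)."""
    for line in lines:
        m = LINE_RE.fullmatch(line)
        if m:  # lines that do not fit the schema are skipped
            yield m.group(1), int(m.group(2))

def union_query():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE structured (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO structured VALUES (?, ?)",
                     [("alice", 30), ("bob", 44)])
    conn.execute("CREATE TABLE extracted (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO extracted VALUES (?, ?)", extract(raw_lines))
    rows = conn.execute(
        "SELECT name, age FROM structured "
        "UNION SELECT name, age FROM extracted ORDER BY name"
    ).fetchall()
    conn.close()
    return rows
```

The caller only sees `union_query`; the extraction step that aligns the unstructured data with the structured schema stays hidden, as the text describes.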
In this embodiment, the query statement does not differ grammatically from an associated query statement over structured data objects; the query on unstructured data is simply encapsulated by the underlying layer. Associated-query scenarios for structured and unstructured data include but are not limited to the three scenarios above, which are presented to better explain the associated query technique proposed here.
Embodiment one
This embodiment provides a data query method for performing an associated query on structured data and unstructured data. Referring to Fig. 1, the method flow includes:
101. Obtain an associated query request and decompose the associated query request into multiple sub-query requests;
102. When the multiple sub-query requests include a query request for an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
103. Perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In this embodiment, data with a schema means the data parsed out of the unstructured data component according to a prescribed schema; such data has the prescribed schema.
In another embodiment, before the associated query request is obtained, the method also includes:
initializing the structured data and/or the unstructured data. The initialization includes creating data objects and unified metadata. The unified metadata includes the metadata information related to a unified data object containing a structured data component and an unstructured data component. The data object includes a structured data component and/or an unstructured data component, and the parsing mode of the unstructured data component. The related metadata information includes the name of the data object, component information, component schema, and other attribute information related to the data object.
Preferably, the parsing mode of the unstructured data component includes the parsing class specified for the unstructured data component and the schema of the unstructured data component after parsing.
In another embodiment, before the parsing mode corresponding to the unstructured data component is invoked, the method also includes:
encapsulating the sub-query request for the unstructured data component, so that the unstructured data component is processed independently.
Preferably, performing an associated query on the data with a schema and the structured data to obtain the result set of the associated query includes:
importing the data with a schema into the same storage platform as the structured data;
performing, on the storage platform that holds the structured data, an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
The beneficial effects of this embodiment include the following. An associated query request is obtained and decomposed into multiple sub-query requests. When the sub-query requests include a query request for an unstructured data component, the parsing mode corresponding to that component is invoked and the component is parsed to obtain data with a schema. An associated query is then performed on the data with a schema and the structured data to obtain the result set of the associated query. Because the unstructured data is parsed in a separate step, no manual intervention is needed: the data parsing mode set at data definition time is invoked automatically to parse the unstructured data, which enables associated queries across unstructured and structured data.
Embodiment two
Referring to Fig. 2, the data query method provided in this embodiment enables automatic associated queries across structured and unstructured data. In this embodiment, structured data is mainly stored on a structured storage platform; if the volume is huge and scalability is limited, a distributed database, a distributed file system, or NoSQL storage software may also be used to store the structured data. Massive unstructured data is stored by means including but not limited to a distributed file system. Because structured and unstructured data may be stored in different ways, the two types of data may also reside on different storage platforms. Associating the two therefore requires support for data association across storage platforms, which involves data migration; at the same time, the different underlying storage platforms must be hidden and the data migration process encapsulated. The specific method flow includes:
201. Initialize the structured data and/or the unstructured data.
In this embodiment, the precondition for an associated query is a specific data definition mode under which structured data storage and unstructured data storage are both represented uniformly as data objects. A data object here is a general name for data, a term used in data definition. For example, a table in a relational database can be called a data object. The data object in this embodiment has a broader meaning: it can represent a set of structured data, a set of unstructured data, or a mixed set that contains both a structured and an unstructured component.
In this embodiment, the data definition mode includes but is not limited to: create data_obj struct_columns (schema), nonstruct_columns (schema); [location:path] fileformat inputformat, outputformat. Here, create is the keyword of data definition; data_obj is the name of the data object; struct_columns represents the structured data; nonstruct_columns represents the unstructured data; [location:path] is the file directory of the unstructured data; and fileformat is the keyword that indicates the file format. inputformat refers to the class on the Hadoop platform that encapsulates the parsing format for reading data; inheriting from it makes it possible to read data from different data sources and to parse data in different formats. outputformat refers to the class on the Hadoop platform that encapsulates the output format for writing data to the file system; inheriting from it makes it possible to output data in a specific format to different storage engines. In this embodiment, if the unstructured data storage platform is HDFS or another distributed file system supported by Hadoop, the data read and output framework encapsulated by the Hadoop platform can be used as a reference: the many different ways of parsing data and organizing output can inherit the two classes inputformat and outputformat and implement the most critical functions, row parsing and column parsing. The user then only needs to provide the two parsing functions this framework requires in order to parse unstructured data on a distributed platform.
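As an illustration of what such a definition carries, the sketch below mirrors the elements of the data definition statement in a small Python dataclass. The class, its field names, and the `parse_csv_line` parser are hypothetical; they are not Hadoop's InputFormat or OutputFormat API, only a model of the information the definition records.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Schema = Dict[str, str]  # column name -> type name

@dataclass
class DataObjectDefinition:
    """In-memory mirror of one data definition (hypothetical layout)."""
    name: str
    struct_columns: Optional[Schema] = None      # structured component
    nonstruct_columns: Optional[Schema] = None   # schema after parsing
    location: Optional[str] = None               # directory of raw files
    input_format: Optional[Callable[[str], List[tuple]]] = None  # parser

    def validate(self) -> None:
        """Reject definitions that could not be queried later."""
        if self.struct_columns is None and self.nonstruct_columns is None:
            raise ValueError("a data object needs at least one component")
        if self.nonstruct_columns is not None and self.input_format is None:
            raise ValueError("unstructured component requires a parser")

def parse_csv_line(line: str) -> List[tuple]:
    """Hypothetical user-supplied parser: one comma-separated line -> one row."""
    return [tuple(line.split(","))]
```

Requiring a parser whenever an unstructured component is declared reflects the point in the text that the parsing mode is fixed at definition time, so that queries can later invoke it without user involvement.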
In this embodiment, the attributes of a data object include but are not limited to the schema of the data and the parsing mode. For unstructured data, the schema is the schema obtained after the unstructured data is parsed. A data object may contain only a structured data component, only an unstructured data component, or both a structured and an unstructured data component that are related to each other; this embodiment places no specific limit on this.
This embodiment therefore needs to initialize the unstructured data and the structured data. The initialization includes creating data objects and storing unified metadata; Fig. 3 shows the platform after initialization. The unified metadata includes the metadata information related to a unified data object containing a structured data component and an unstructured data component. The data object includes a structured data component and/or an unstructured data component, and the parsing mode of the unstructured data component. The related metadata information includes the name of the data object, component information, component schema, and other attribute information related to the data object.
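One way to picture the unified metadata is a single catalog that records, per data object, each component's kind, schema, and (for unstructured components) parsing class. The class and dictionary layout below are assumptions for illustration, not the patent's storage format.

```python
class MetadataCatalog:
    """Single registry for both structured and unstructured components."""

    def __init__(self):
        self._objects = {}

    def register(self, name, components):
        """components maps a component name to {'kind', 'schema', 'parser'}."""
        if name in self._objects:
            raise KeyError(f"data object {name!r} already exists")
        for comp_name, info in components.items():
            if info["kind"] == "unstructured" and not info.get("parser"):
                raise ValueError(f"{comp_name}: parser class is required")
        self._objects[name] = components

    def component(self, obj_name, comp_name):
        """Return the stored metadata of one component."""
        return self._objects[obj_name][comp_name]

    def is_unstructured(self, obj_name, comp_name):
        """Tell the query engine whether a sub-query needs a parsing step."""
        return self.component(obj_name, comp_name)["kind"] == "unstructured"
```

A query engine could consult `is_unstructured` after decomposition to decide which sub-queries must be routed through a parsing task.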
It is worth noting that in this embodiment structured data is stored in advance in the format specified by the storage platform and is read in the prescribed manner, whereas the actual parsing of unstructured data is performed only during the query. The format of unstructured data is not fixed, and the same unstructured data can yield different data under different parsing modes, so a fixed parsing mode cannot be used; only a fixed framework and data definition grammar can be provided, to reduce the user's workload. At initialization, therefore, an external table can be created for the unstructured data, with the parsing class and the schema after parsing specified; such a data object is suitable for associated queries across structured and unstructured data. Preferably, in this embodiment the parsing mode of the unstructured data component includes the parsing class specified for the unstructured data component and the schema of the unstructured data component after parsing.
It is worth noting that this step needs to be performed when structured and unstructured data are stored. If the data has already been stored by the time the associated query is carried out, this step need not be performed; this embodiment places no specific limit on this.
202. Obtain an associated query request and decompose the associated query request into multiple sub-query requests.
In this step, the associated query request entered by the user is obtained. The request may be a request in any one of the associated-query scenarios above; this embodiment places no specific limit on this.
In this embodiment, associated queries use standard SQL query grammar. However, a data object may include both structured and unstructured data, and queries on unstructured data are involved, so the original SQL parsing engine needs to be extended to decompose the unified associated query request into multiple sub-SQL queries; that is, the associated query request is decomposed into multiple subtasks. For example, if the associated statement is select col1, col2, col4 from obj where obj.structobj.col1 = obj.nonstructobj.col1, this statement is parsed to obtain multiple sub-query tasks. If the query statement is a cascaded nested statement, it is parsed according to the nesting relationship to obtain multiple sub-query tasks. The specific parsing process is not described further in this embodiment.
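The patent does not describe the parsing process, so the following is only a toy illustration of decomposing one level of nesting with a regular expression. A real engine would extend a proper SQL parser; the function name and the returned structure are assumptions.

```python
import re

# Matches one non-nested "(select ...) as alias" fragment.
SUBQUERY_RE = re.compile(r"\(\s*(select\b[^()]*)\)\s+as\s+(\w+)", re.IGNORECASE)

def decompose(query):
    """Split one level of '(select ...) as alias' sub-queries from a query.

    Returns (subqueries, outer): subqueries maps alias -> SQL text, and
    outer is the query with each sub-query replaced by its alias.
    """
    subqueries = {}

    def replace(match):
        subqueries[match.group(2)] = match.group(1).strip()
        return match.group(2)

    outer = SUBQUERY_RE.sub(replace, query)
    return subqueries, outer
```

Each extracted sub-query can then be dispatched to the platform that holds its data, while the outer statement becomes the final association step over the intermediate results.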
203. When the multiple sub-query requests include a query request for an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema.
In this embodiment, if the query object has an unstructured data component, then optionally, before the parsing mode corresponding to the unstructured data component is invoked, the method also includes encapsulating the sub-query request for the unstructured data component so that the component is processed independently. The parsing of unstructured data needs to be encapsulated as an independent task and handed to the unstructured data processing engine. Because the parsing mode was defined for the unstructured data component in advance, when the encapsulated parsing task for the unstructured component is obtained, it can be parsed with the predefined parsing class, producing data with the specified schema. Data with a schema means data parsed out according to the prescribed schema; such data has the prescribed schema.
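A minimal sketch of an encapsulated parsing task follows. The `KeyValueLineParser` class, its "k=v;k=v" input format, and the `ParseTask` wrapper are hypothetical stand-ins for the predefined parsing class and the independent task described above.

```python
class KeyValueLineParser:
    """Hypothetical parsing class: 'k1=v1;k2=v2' lines -> rows in schema order."""

    def __init__(self, schema):
        self.schema = schema  # ordered list of column names

    def parse(self, lines):
        for line in lines:
            fields = dict(item.split("=", 1) for item in line.split(";"))
            yield tuple(fields.get(col) for col in self.schema)


class ParseTask:
    """Independent task wrapping one sub-query on an unstructured component."""

    def __init__(self, parser, lines, wanted_columns):
        self.parser = parser
        self.lines = lines
        self.wanted = wanted_columns

    def run(self):
        # Project the parsed rows onto the columns the sub-query asked for.
        idx = [self.parser.schema.index(c) for c in self.wanted]
        return [tuple(row[i] for i in idx)
                for row in self.parser.parse(self.lines)]
```

Because the task owns both the parser and the raw input, it can be handed to a separate processing engine and run without reference to the rest of the query.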
In this embodiment, if the multiple sub-query requests include a query request for a structured data component, the structured data component is parsed in a fixed manner, as in the prior art; this is not described further in this embodiment.
204. Perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In this step, because the unstructured data and the structured data are stored on different storage platforms, once the data with a schema has been obtained it needs to be migrated to the same storage platform as the structured data. On that platform, an associated query is performed on the data with a schema and the structured data to obtain the final result set of the associated query, which the user can analyze further.
It is worth noting that after the query task is decomposed, the unstructured data and the structured data can each be queried on their own local storage platform, as in the prior art; this is not described further in this embodiment. After the unstructured data has been parsed into data with a schema, the associated query is performed on the unstructured and structured data.
So that those skilled in the art can understand the associated query technique of the present invention more clearly, a cascaded nested query is used as an example:
Data schema of OBJ1.STRUCTOBJ: (col1, col2, col3, col4)
Data schema of OBJ2.NONSTRUCTOBJ after parsing: (col3, col4)
The query statement is: select col1, col2, col4 from (select col1, col2, col3 from OBJ1.STRUCTOBJ) as t1 join (select col4, col3 from OBJ2.NONSTRUCTOBJ) as t2 on t1.col3 = t2.col3; it contains nested queries.
This query statement is decomposed into the following task flow:
Step 1: select col1, col2, col3 from OBJ1.STRUCTOBJ. A simple structured data query, producing result set R1;
Step 2: select col4, col3 from OBJ2.NONSTRUCTOBJ. An unstructured data query that includes the data parsing process, producing result set R2;
Step 3: transfer R2 to the same data platform as R1;
Step 4: select col1, col2, col4 from R1 join R2 on R1.col3 = R2.col3, which implements the join query over the two relational data sets.
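The four steps can be traced end to end in a small sketch, with an in-memory `sqlite3` database standing in for the platform that holds R1. The comma-separated raw format used for the unstructured component and the function name are assumptions for illustration.

```python
import sqlite3

def run_task_flow(structured_rows, raw_lines):
    """Run the four-step task flow on sample data and return the joined rows."""
    conn = sqlite3.connect(":memory:")
    # Structured component with schema (col1, col2, col3, col4).
    conn.execute("CREATE TABLE structobj (col1, col2, col3, col4)")
    conn.executemany("INSERT INTO structobj VALUES (?, ?, ?, ?)",
                     structured_rows)

    # Step 1: structured sub-query -> R1.
    conn.execute(
        "CREATE TEMP TABLE R1 AS SELECT col1, col2, col3 FROM structobj")

    # Step 2: parse the unstructured component -> R2 rows (col4, col3).
    # The "col4,col3" line format is a hypothetical raw format.
    r2_rows = [tuple(line.split(",")) for line in raw_lines]

    # Step 3: move R2 onto the platform that holds R1.
    conn.execute("CREATE TEMP TABLE R2 (col4, col3)")
    conn.executemany("INSERT INTO R2 VALUES (?, ?)", r2_rows)

    # Step 4: join the two relational sets.
    rows = conn.execute(
        "SELECT col1, col2, col4 FROM R1 JOIN R2 ON R1.col3 = R2.col3 "
        "ORDER BY col1"
    ).fetchall()
    conn.close()
    return rows
```

In a deployment along the lines the text describes, step 2 would run on the unstructured data platform and step 3 would be an actual data migration; here both collapse into local operations.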
The beneficial effects of this embodiment include the following. With a unified data definition mode and a custom query parsing engine, an associated query across structured and unstructured data is issued as a single request. The query request is translated at the underlying layer into a staged task flow, and a custom execution engine controls the completion of each stage, ensuring that the whole task flow completes in the prescribed order and returns correct query results. This solves the problem of automatic associated queries across structured and unstructured data.
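The patent does not say how the execution engine orders the stages. One plausible sketch, assuming the stages form a dependency graph, uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+); the task names and callables are hypothetical.

```python
from graphlib import TopologicalSorter

def run_stages(tasks, deps):
    """Run staged tasks so that every prerequisite finishes first.

    tasks: name -> callable taking the dict of results produced so far.
    deps:  name -> set of prerequisite task names.
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        results[name] = tasks[name](results)
    return results
```

For the task flow above, R1 and R2 would have no prerequisites and the final join would depend on both, so the join stage can only run after both intermediate result sets exist.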
Embodiment three
Referring to Fig. 4, this embodiment provides a data query device for performing an associated query on structured data and unstructured data. The device includes a task decomposition module 301, an unstructured data parsing module 302, and an associated query module 303.
The task decomposition module 301 is configured to obtain an associated query request and decompose the associated query request into multiple sub-query requests;
The unstructured data parsing module 302 is configured to, when the multiple sub-query requests include a query request for an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
The associated query module 303 is configured to perform an associated query on the data with a schema and the structured data to obtain the result set of the associated query.
In another embodiment, referring to Fig. 5, the device further includes:
A data initialization module 304, configured to initialize structured data and/or unstructured data. The initialization operation includes: creating a data object and storing unified metadata, wherein the unified metadata includes metadata information related to a data object that unifies structured data components with unstructured data components, and the data object includes: a structured data component and/or an unstructured data component, and the parsing mode of the unstructured data component. The related metadata information includes: the name of the data object, component information, component schemas, and other object-related attribute information.
Preferably, the parsing mode of the unstructured data component includes: a parsing class specified for the unstructured data component, and the schema of the unstructured data component after parsing.
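One way to picture the unified metadata and the parsing mode (a parsing class plus the schema after parsing) is the following minimal sketch. The class names `CsvLineParser`, `UnstructuredComponent`, `DataObject` and `MetadataStore`, as well as the comma-separated input format, are assumptions for illustration and do not come from the patent.

```python
from dataclasses import dataclass, field

class CsvLineParser:
    """Hypothetical parsing class: splits a comma-separated line into fields."""

    def parse(self, raw, schema):
        values = [value.strip() for value in raw.split(",")]
        return dict(zip(schema, values))

@dataclass
class UnstructuredComponent:
    name: str
    parser_class: type    # parsing class specified for the component
    parsed_schema: tuple  # schema of the component after parsing

@dataclass
class DataObject:
    name: str
    structured_components: list = field(default_factory=list)
    unstructured_components: list = field(default_factory=list)

class MetadataStore:
    """Hypothetical unified metadata store keyed by data object name."""

    def __init__(self):
        self._objects = {}

    def register(self, obj):
        self._objects[obj.name] = obj

    def parsing_mode(self, object_name, component_name):
        """Look up the parser instance and schema for one unstructured component."""
        for comp in self._objects[object_name].unstructured_components:
            if comp.name == component_name:
                return comp.parser_class(), comp.parsed_schema
        raise KeyError(component_name)

store = MetadataStore()
store.register(DataObject(
    name="orders",
    structured_components=["order_table"],
    unstructured_components=[
        UnstructuredComponent("notes", CsvLineParser, ("order_id", "comment")),
    ],
))
parser, schema = store.parsing_mode("orders", "notes")
row = parser.parse("42, fragile", schema)
```

Storing the parsing class alongside the post-parsing schema means that a query engine can resolve how to parse a component from metadata alone, without manual intervention at query time.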
In another embodiment, referring to Fig. 5, the device further includes:
An encapsulation module 305, configured to encapsulate the subquery request for the unstructured data component before the unstructured data parsing module 302 invokes the parsing mode corresponding to the unstructured data component, so that the unstructured data component is processed independently.
Referring to Fig. 5, preferably, the association query module 303 includes:
A migration unit 303a, configured to import the data with a schema into the same storage platform as the structured data;
An association query unit 303b, configured to perform, in the same storage platform as the structured data, an association query on the data with a schema and the structured data to obtain the result set of the association query.
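The two units 303a and 303b can be sketched with an in-memory SQLite database standing in for the shared storage platform. SQLite, the table names, and the sample rows are assumptions for illustration; the patent does not name a particular storage platform.

```python
import sqlite3

def import_and_join(parsed_rows):
    """Import parsed rows next to the structured table, then join them."""
    conn = sqlite3.connect(":memory:")

    # Structured data already living in the storage platform (hypothetical).
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])

    # Unit 303a (sketch): import the parsed, schema-bearing rows into the same platform.
    conn.execute("CREATE TABLE parsed_reviews (customer_id INTEGER, rating INTEGER)")
    conn.executemany("INSERT INTO parsed_reviews VALUES (?, ?)", parsed_rows)

    # Unit 303b (sketch): run the association query inside that platform.
    rows = conn.execute(
        "SELECT c.name, r.rating FROM customers AS c "
        "JOIN parsed_reviews AS r ON r.customer_id = c.id "
        "ORDER BY c.id, r.rating"
    ).fetchall()
    conn.close()
    return rows

result = import_and_join([(2, 5), (1, 3), (1, 4)])
```

Co-locating both data sets before joining lets the storage platform's own query engine perform the join, which is the apparent motivation for the migration step.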
The beneficial effects of this embodiment include the following. An association query request is obtained and decomposed into multiple subquery requests; when the multiple subquery requests include a query request for an unstructured data component, the parsing mode corresponding to the unstructured data component is invoked and the unstructured data component is parsed to obtain data with a schema; and an association query is performed on the data with a schema and the structured data to obtain the result set of the association query. Because the unstructured data is parsed separately to obtain data with a schema, no manual intervention is required: the unstructured data is parsed automatically, which enables association queries over unstructured data and structured data.
It should be noted that, when the data query device provided by the above embodiment performs a data query, the division into the functional modules described above is only an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
In addition, the data query device provided by the above embodiment and the data query method embodiment belong to the same concept. For its specific implementation process, refer to the method embodiment; it is not repeated here.
The sequence numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A data query method for performing an association query over structured data and unstructured data, characterized in that the method comprises:
obtaining an association query request, and decomposing the association query request into multiple subquery requests;
when the multiple subquery requests include a query request for an unstructured data component, invoking the parsing mode corresponding to the unstructured data component and parsing the unstructured data component to obtain data with a schema;
performing an association query on the data with a schema and structured data to obtain the result set of the association query;
wherein, before obtaining the association query request, the method further comprises:
initializing structured data and/or unstructured data, the initialization operation including: creating a data object and storing unified metadata, wherein the unified metadata includes metadata information related to a data object that unifies structured data components with unstructured data components, and the data object includes: a structured data component and/or an unstructured data component, and the parsing mode of the unstructured data component.
2. the method for claim 1, it is characterised in that the analysis mode of described unstructured data assembly includes: the pattern after the parsing class specified for unstructured data assembly and described unstructured data analyzing component.
3. the method for claim 1, it is characterised in that described in call the analysis mode that described unstructured data assembly is corresponding before, also include:
Ask to be packaged, so that described unstructured data assembly is carried out independent process to the subquery of described unstructured data assembly.
4. the method for claim 1, it is characterised in that described with structural data, the described data having a pattern are associated inquiry, obtains the result set of described correlation inquiry, including:
The described data having pattern are imported in the storage platform identical with structural data;
In the storage platform identical with described structural data, the described data having pattern and structural data are associated inquiry, obtain the result set of described correlation inquiry.
5. A data query device for performing an association query over structured data and unstructured data, characterized in that the device comprises:
a task decomposition module, configured to obtain an association query request and decompose the association query request into multiple subquery requests;
an unstructured data parsing module, configured to, when the multiple subquery requests include a query request for an unstructured data component, invoke the parsing mode corresponding to the unstructured data component and parse the unstructured data component to obtain data with a schema;
an association query module, configured to perform an association query on the data with a schema and structured data to obtain the result set of the association query;
wherein the device further comprises:
a data initialization module, configured to initialize structured data and/or unstructured data, the initialization operation including: creating a data object and storing unified metadata, wherein the unified metadata includes metadata information related to a data object that unifies structured data components with unstructured data components, and the data object includes: a structured data component and/or an unstructured data component, and the parsing mode of the unstructured data component.
6. The device of claim 5, characterized in that the parsing mode of the unstructured data component includes: a parsing class specified for the unstructured data component, and the schema of the unstructured data component after parsing.
7. The device of claim 5, characterized in that the device further comprises:
an encapsulation module, configured to encapsulate the subquery request for the unstructured data component before the unstructured data parsing module invokes the parsing mode corresponding to the unstructured data component, so that the unstructured data component is processed independently.
8. The device of claim 5, characterized in that the association query module comprises:
a migration unit, configured to import the data with a schema into the same storage platform as the structured data;
an association query unit, configured to perform, in the same storage platform as the structured data, an association query on the data with a schema and the structured data to obtain the result set of the association query.
CN201310362238.4A 2013-08-19 2013-08-19 The querying method of a kind of data and device Active CN103425780B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310362238.4A CN103425780B (en) 2013-08-19 2013-08-19 The querying method of a kind of data and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310362238.4A CN103425780B (en) 2013-08-19 2013-08-19 The querying method of a kind of data and device

Publications (2)

Publication Number Publication Date
CN103425780A CN103425780A (en) 2013-12-04
CN103425780B true CN103425780B (en) 2016-08-17

Family

ID=49650519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310362238.4A Active CN103425780B (en) 2013-08-19 2013-08-19 The querying method of a kind of data and device

Country Status (1)

Country Link
CN (1) CN103425780B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103714173B (en) * 2013-12-31 2017-08-01 深圳市华宝电子科技有限公司 A kind of search method of video source, device and monitor terminal
CN104008180B (en) * 2014-06-09 2017-04-12 北京奇虎科技有限公司 Association method of structural data with picture, association device thereof
CN105590066B (en) * 2015-12-02 2018-08-10 中国银联股份有限公司 The safe fusion method of big data of privacy is not revealed
CN108241627A (en) * 2016-12-23 2018-07-03 北京神州泰岳软件股份有限公司 A kind of isomeric data storage querying method and system
CN108268512B (en) * 2016-12-30 2020-07-31 中国移动通信集团上海有限公司 Label query method and device
CN106649863A (en) * 2016-12-30 2017-05-10 天津市测绘院 Non-structured data management method and apparatus
CN106909689A (en) * 2017-03-07 2017-06-30 山东浪潮云服务信息科技有限公司 A kind of data fusion method and device
CN107257511A (en) * 2017-06-06 2017-10-17 苏州小雨伞网络科技有限公司 A kind of striding equipment data query method, system
CN109087707A (en) * 2018-07-18 2018-12-25 上海理工大学 It is a kind of for establishing the method and apparatus of lung image database
CN110968615B (en) * 2018-09-30 2023-05-23 北京国双科技有限公司 Data query method and device
CN109408689B (en) * 2018-10-24 2020-11-24 北京金山云网络技术有限公司 Data acquisition method, device and system and electronic equipment
CN109710602A (en) * 2018-12-26 2019-05-03 中科曙光国际信息产业有限公司 Data model detection method and device
CN109829073B (en) * 2018-12-29 2020-11-24 深圳云天励飞技术有限公司 Image searching method and device
CN111831684B (en) * 2019-04-15 2024-04-05 北京沃东天骏信息技术有限公司 Data query method, device and computer readable storage medium
CN111897824A (en) * 2020-03-25 2020-11-06 上海云励科技有限公司 Data operation method, device, equipment and storage medium
CN117271562B (en) * 2023-11-21 2024-01-19 成都凌亚科技有限公司 Data acquisition processing method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101164039A (en) * 2005-03-02 2008-04-16 谷歌公司 Generating structured information
CN101477568A (en) * 2009-02-12 2009-07-08 清华大学 Integrated retrieval method for structured data and non-structured data
CN102096673A (en) * 2009-12-11 2011-06-15 西软软件股份有限公司 Full text retrieval method for structured data and unstructured data
CN103154996A (en) * 2010-10-25 2013-06-12 惠普发展公司,有限责任合伙企业 Providing information management

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080263006A1 (en) * 2007-04-20 2008-10-23 Sap Ag Concurrent searching of structured and unstructured data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
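Research and Implementation of Intelligent Query in Heterogeneous Data Integration Systems (异构数据集成系统中的智能查询研究及实现); Huang Hai (黄海); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》); 20040430; I138-225 *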

Also Published As

Publication number Publication date
CN103425780A (en) 2013-12-04

Similar Documents

Publication Publication Date Title
CN103425780B (en) The querying method of a kind of data and device
Singh et al. Data management for developing digital twin ontology model
Clifford et al. Tracking provenance in a virtual data grid
Morel et al. The REBOOT environment (software reuse)
Eichler et al. Modeling metadata in data lakes—A generic model
US9824128B1 (en) System for performing single query searches of heterogeneous and dispersed databases
US8639712B2 (en) Method and module for creating a relational database schema from an ontology
US10387401B2 (en) Version control of records in an electronic database
Yuan et al. A linked data approach for geospatial data provenance
Konstantinou et al. Exposing scholarly information as linked open data: RDFizing DSpace contents
Lee et al. An architecture for retaining and analyzing visual explorations of databases
US8271442B2 (en) Formats for database template files shared between client and server environments
Thalheim Component development and construction for database design
Munir et al. Provision of an integrated data analysis platform for computational neuroscience experiments
Martin et al. Semantic linking of research infrastructure metadata
Sevilmis et al. Knowledge sharing by information retrieval in the semantic web
Novak et al. Prototype of a Web ETL tool
Spaniol et al. ATLAS: A web-based software architecture for multimedia e-learning environments in virtual communities
Murphy et al. A web portal that enables collaborative use of advanced medical image processing and informatics tools through the Biomedical Informatics Research Network (BIRN)
Malaverri et al. A Tool based on Web Services to Query Biodiversity Information.
Zoghlami et al. Using a SKOS engine to create, share and transfer terminology data sets
Fosci et al. Soft Querying Features in GeoJSON Documents: The GeoSoft Proposal
CN115774767B (en) Geographic information metadata processing method and device
Aggarwal et al. Employing graph databases as a standardization model for addressing heterogeneity and integration
Díaz et al. Model-aware Wiki analysis tools: the case of HistoryFlow

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant