CN103970880B - Distributed Multi data pick-up method - Google Patents
- Publication number: CN103970880B
- Application number: CN201410208607.9A
- Authority: CN (China)
- Prior art keywords: data source, guid, data, source table, condition
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2282—Tablespace storage structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Abstract
The invention relates to a distributed multi-point data extraction method comprising the following steps. Step 101: address the external data source DB and its field structure; step 102: establish the data source table; step 103: establish the internal data source table; step 104: select the data fields to be imported; step 105: add the data table positioning field GUID; step 106: generate the internal data source table structure; step 107: GUID positioning code generator; step 108: generate the internal data source table with positioning codes; step 109: establish the positioning data table; step 110: constraint condition intelligent generator; step 111: the user enters screening conditions; step 112: cell positioning marks the screening conditions and color; step 113: identify the table name, field name, record condition, time and customer name by GUID; step 114: GUID condition; step 115: generate the SELECT statement; step 116: obtain the target data; step 117: cluster analysis judgment; step 118: analysis report table. The method lets the user obtain any number of required screening results.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a distributed multi-point data extraction method.
Background art
Data analysis usually works by screening data to obtain the data elements that satisfy given conditions. At present, data analysis and screening is carried out by writing program statements on data platforms such as SQL, Access and Oracle. The advantage is that, using the statements and functions these platforms provide, hand-written statements can produce a wide variety of screening results. However, these platforms do not let the user perform screening directly through an interface driven by mouse clicks or keyboard commands, and screening conditions cannot be constructed directly, bound to data elements and recorded. In Excel, screening conditions can be set and screening results obtained, but the user's screening conditions cannot be saved and cannot be bound to cells. Nor does the distributed multi-point data extraction technique to which the claims relate appear in published information on other existing applications or special-purpose software, in China or abroad.
Summary of the invention
The object of the present invention is to solve the above problems by providing a distributed multi-point data extraction method.
To achieve this object, the present invention provides a distributed multi-point data extraction method comprising the following steps:
Step 101: address the external data source DB and its field structure.
Step 102: establish the data source table, then determine whether an internal data source table is to be established. If so, proceed to step 103; if not, proceed to step 107.
Step 103: establish the internal data source table.
Step 104: select the data fields to be imported.
Step 105: add the internal data source table positioning field GUID.
Step 106: generate the internal data source table structure.
Step 107: GUID positioning code generator. The generator processes the generated internal data source table structure.
Step 108: generate the internal data source table with positioning codes.
Step 109: establish the positioning data table, then determine whether the constraint condition intelligent generator of step 110 is to be used. If so, proceed to step 110; if not, proceed to step 113.
Step 110: constraint condition intelligent generator.
Step 111: the user enters screening conditions. The constraint condition intelligent generator judges whether the entered screening conditions meet the constraints. If they do, proceed to step 112; if not, proceed to step 113.
Step 112: cell positioning marks the screening conditions and color.
Step 113: identify the table name, field name, record condition, time and customer name by GUID.
Step 114: obtain the GUID condition from step 113.
Step 115: generate the SELECT statement from the GUID condition.
Step 116: obtain the target data.
Step 117: perform cluster analysis judgment on the obtained target data.
Step 118: produce the analysis report table from the cluster analysis judgment.
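The branching in the steps above can be summarized as a small routine. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `step_sequence` and its three boolean parameters are hypothetical, and the transition from step 112 to step 113 is an assumption, since the text does not state where step 112 leads.

```python
def step_sequence(build_internal_table: bool,
                  use_constraint_generator: bool,
                  conditions_valid: bool = True) -> list[int]:
    """Return the step numbers visited for one pass through the flow."""
    steps = [101, 102]
    if build_internal_table:
        # Steps 103-106: create the internal table, pick fields, add the GUID field.
        steps += [103, 104, 105, 106]
    # Steps 107-109: generate positioning codes and the positioning data table.
    steps += [107, 108, 109]
    if use_constraint_generator:
        steps += [110, 111]
        if conditions_valid:
            # Assumption: after marking the cell (112) the flow continues at 113.
            steps.append(112)
    # Steps 113-118: resolve GUIDs, build and run the SELECT, analyse the result.
    steps += [113, 114, 115, 116, 117, 118]
    return steps
```

Calling `step_sequence(False, False)` gives the shortest path, which skips both the internal table and the constraint generator.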
The invention has the following beneficial effects: with the method of the invention, a user can, without writing program statements, set any number of data screening conditions in full, obtain any number of required screening results, and record any combination of screening conditions in the data table.
Brief description of the drawings
To explain the embodiments of the invention or the prior-art solutions more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the distributed multi-point data extraction method of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
Referring to Fig. 1, the present invention provides a distributed multi-point data extraction method comprising the following steps:
Step 101: address the external data source DB and its field structure.
Step 102: establish the data source table, then determine whether an internal data source table is to be established. If so, proceed to step 103; if not, proceed to step 107.
Step 103: establish the internal data source table.
Step 104: select the data fields to be imported.
Step 105: add the internal data source table positioning field GUID.
Step 106: generate the internal data source table structure.
Step 107: GUID positioning code generator. The generator processes the generated internal data source table structure.
Step 108: generate the internal data source table with positioning codes.
Step 109: establish the positioning data table, then determine whether the constraint condition intelligent generator of step 110 is to be used. If so, proceed to step 110; if not, proceed to step 113.
Step 110: constraint condition intelligent generator.
Step 111: the user enters screening conditions. The constraint condition intelligent generator judges whether the entered screening conditions meet the constraints. If they do, proceed to step 112; if not, proceed to step 113.
Step 112: cell positioning marks the screening conditions and color.
Step 113: identify the table name, field name, record condition, time and customer name by GUID.
Step 114: obtain the GUID condition from step 113.
Step 115: generate the SELECT statement from the GUID condition.
Step 116: obtain the target data.
Step 117: perform cluster analysis judgment on the obtained target data.
Step 118: produce the analysis report table from the cluster analysis judgment.
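Steps 105 to 109 can be illustrated with a short sketch. It assumes an in-memory SQLite database and version-4 UUIDs as positioning codes; the table names `internal_source` and `positioning`, their columns, and the function `build_internal_table` are hypothetical and are not taken from the patent.

```python
import sqlite3
import uuid

def build_internal_table(rows):
    """Create an in-memory internal source table whose rows carry a GUID."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the imported fields plus a GUID positioning field (step 105).
    conn.execute(
        "CREATE TABLE internal_source (guid TEXT PRIMARY KEY, name TEXT, amount REAL)")
    for name, amount in rows:
        # Step 107 analogue: generate a unique positioning code for each record.
        conn.execute("INSERT INTO internal_source VALUES (?, ?, ?)",
                     (str(uuid.uuid4()), name, amount))
    # Step 109 analogue: a positioning data table keyed by (row GUID, field name),
    # so that every cell has a place where a screening condition can be stored.
    conn.execute("""CREATE TABLE positioning (
                        guid TEXT, field_name TEXT, filter_condition TEXT,
                        PRIMARY KEY (guid, field_name))""")
    conn.commit()
    return conn

conn = build_internal_table([("alpha", 10.0), ("beta", 25.5)])
guids = [g for (g,) in conn.execute("SELECT guid FROM internal_source")]
```

The pair (GUID, field name) plays the role of the cell position here; any other unique row identifier would serve equally well in this sketch.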
In a two-dimensional data table, the position information of each cell serves as the binding point at which the user's data analysis and screening conditions are recorded. The screening conditions set in the cells of a row are linked by mathematical logic, and a data screening statement extracts the data sample that satisfies the combined condition. The sets of screening conditions formed by the cells and data rows associate the data the user needs to screen with each data cell in the form of statements. Because the statements are distributed across the data cells, this forms a distributed multi-point data extraction technique.
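How the conditions bound to the cells of one row might be combined into a screening statement can be sketched as follows. The sketch assumes that the conditions are joined with AND and that each condition is a simple comparison; the function `build_select`, the operator whitelist and the `orders` table are hypothetical.

```python
import sqlite3

# Hypothetical whitelist of operators a user may attach to a cell.
ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}

def build_select(table, columns, cell_conditions):
    """Combine the conditions bound to one row's cells into a SELECT.

    cell_conditions maps a column name to an (operator, value) pair.
    Conditions are joined with AND; other logical links are out of scope here.
    """
    clauses, params = [], []
    for column, (op, value) in cell_conditions.items():
        if column not in columns or op not in ALLOWED_OPS:
            raise ValueError(f"unsupported condition on {column!r}")
        # Column and operator are validated above; the value is passed as a parameter.
        clauses.append(f"{column} {op} ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT * FROM {table} WHERE {where}", params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("north", 120.0), ("north", 40.0), ("south", 300.0)])

sql, params = build_select("orders", {"region", "amount"},
                           {"region": ("=", "north"), "amount": (">", 100)})
result = conn.execute(sql, params).fetchall()
```

Passing values as query parameters rather than formatting them into the statement is a design choice of this sketch, made to keep user-entered conditions from altering the statement structure.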
For example, let a two-dimensional data set have column identifiers X and row identifiers Y:
Column set: $X = \{X_1, X_2, X_3, \dots, X_n\}$
Row set: $Y = \{Y_1, Y_2, Y_3, \dots, Y_m\}$
Row index: $i \in \{1, 2, 3, \dots, m\}$
Column index: $j \in \{1, 2, 3, \dots, n\}$
Column subset: $X_j = \{D_{1j}, D_{2j}, D_{3j}, \dots, D_{mj}\}$, $X_j \subseteq XY$, the full set of column $j$
Row subset: $Y_i = \{D_{i1}, D_{i2}, D_{i3}, \dots, D_{in}\}$, $Y_i \subseteq XY$, the full set of row $i$
Row-column subset: $D_{xy} = \{D_{ij}\}$
Data cell (element): $D_{ij}$
$D_{xj} \subseteq X_j$: $D_{xj}$ is a subset of the column-$j$ set;
$D_{yi} \subseteq Y_i$: $D_{yi}$ is a subset of the row-$i$ set.
First, set a condition and extract a column sample subset:
A sample extraction condition $P_{ij}$ is set in cell $D_{ij}$, and the subset of elements $D_{xj}$ that satisfy $P_{ij}$ is taken from the field column:
$D_{xj} = \{D_{kj} \in X_j \mid P_{ij}(D_{kj})\}$, where $k$ ranges over the rows and $P_{ij}$ is the condition for selecting elements of the column set $X_j$.
That is, $D_{xj}$ is the sample subset extracted from the column-$j$ set that satisfies $P_{ij}$.
Second, set the condition set $P_i$ for sample extraction in a row record:
$P_i = \{P_{ij}\}$
This is the combination of the conditions set in the columns ($X$) of row $i$. Linked by their associated logic, these conditions form a logical set that serves as the condition set for extracting samples from the full set $XY$.
Third, the condition set $P_i$ extracts a multi-row, multi-column element sample subset $D_{xy}$ from the full set $XY$:
$D_{xy} = \{XY \mid P_i\}$
Fourth, let $P_n$ be the set of the row condition sets $P_i$; it extracts several sample subsets $D_{xy}$. Taking $P_n$ as the full set of conditions in the whole two-dimensional data set:
$P_n = \{P_i\}$
The extracted sample set is $D_n$:
$D_n = \{D_{xy} \mid P_n\}$, $D_n \subseteq XY$
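A concrete instance may make the notation easier to follow. The sketch below assumes that a condition P_ij can be modelled as a Python predicate and that D_xy consists of the rows satisfying every condition in P_i; the table values and all function names are illustrative only.

```python
# A small 3x3 table XY; XY[i][j] is the cell in row i, column j (0-based here).
XY = [
    [5, 12, 7],
    [9, 3, 20],
    [14, 8, 1],
]

def column_subset(table, j, predicate):
    """D_xj: elements of column j that satisfy the condition P_ij."""
    return [row[j] for row in table if predicate(row[j])]

def rows_matching(table, row_conditions):
    """D_xy: rows that satisfy every column condition in the set P_i."""
    return [row for row in table
            if all(pred(row[j]) for j, pred in row_conditions.items())]

def extract_all(table, condition_sets):
    """D_n: one sample subset per condition set P_i in P_n."""
    return [rows_matching(table, p_i) for p_i in condition_sets]

# Column condition on column 0: value greater than 6.
d_x0 = column_subset(XY, 0, lambda v: v > 6)

# Two row condition sets, each mapping a column index to a predicate.
p_1 = {0: lambda v: v > 6, 1: lambda v: v < 10}
p_2 = {2: lambda v: v >= 7}
d_n = extract_all(XY, [p_1, p_2])
```

Each entry of `d_n` corresponds to one condition set, so adding further condition sets yields further sample subsets without changing the table itself.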
With the method of the invention, each row of a two-dimensional data table is given a unique row ID, which together with the fields of the specific table pins down the position information of each data cell. The application program generates a software interface in which the user sets data screening conditions and their mathematical-logic relationships according to actual needs. The program binds the position information of each data cell to those screening conditions, so that the user's screening and analysis conditions are effectively described in the corresponding table cells. Executing the screening on a given cell retrieves the corresponding data sample, so different cells yield several different groups of data samples.
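Binding a screening condition to a cell and executing it later can be sketched as below. The sketch assumes SQLite, a row GUID plus field name as the cell position, and a single numeric comparison per cell; the tables `sales` and `cell_filters` and the functions `bind_filter` and `run_filter` are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (guid TEXT PRIMARY KEY, region TEXT, amount REAL)")
conn.execute("CREATE TABLE cell_filters (guid TEXT, field_name TEXT, op TEXT, "
             "operand REAL, PRIMARY KEY (guid, field_name))")

row_ids = {}
for region, amount in [("north", 120.0), ("south", 300.0), ("east", 80.0)]:
    row_ids[region] = str(uuid.uuid4())
    conn.execute("INSERT INTO sales VALUES (?, ?, ?)",
                 (row_ids[region], region, amount))

def bind_filter(guid, field_name, op, operand):
    """Record a screening condition against the cell (guid, field_name)."""
    # Only whitelisted field names and operators are stored.
    if field_name not in {"amount"} or op not in {"<", "<=", ">", ">=", "="}:
        raise ValueError("unsupported filter")
    conn.execute("INSERT OR REPLACE INTO cell_filters VALUES (?, ?, ?, ?)",
                 (guid, field_name, op, operand))

def run_filter(guid, field_name):
    """Re-run the condition saved on a cell and return the matching regions."""
    op, operand = conn.execute(
        "SELECT op, operand FROM cell_filters WHERE guid = ? AND field_name = ?",
        (guid, field_name)).fetchone()
    query = f"SELECT region FROM sales WHERE {field_name} {op} ? ORDER BY region"
    return [r for (r,) in conn.execute(query, (operand,))]

bind_filter(row_ids["north"], "amount", ">=", 100)
bind_filter(row_ids["east"], "amount", "<", 100)
```

Because each condition is stored with its cell position, the two cells above return different samples when `run_filter` is called on them, which is the behaviour the paragraph describes.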
The invention applies information theory, set theory and computer technology. It responds to the information society's need to analyze data freely and to document analysis conditions and results by binding the screening conditions for multidimensional data to the corresponding data elements, thereby recording the screened data samples the user needs. Every data element in the multidimensional data provides a place to record different screening conditions, which yields the multi-point data extraction function. In a two-dimensional data table, for example, the screening conditions are recorded at the position associated with each data element; once the user has set the conditions, the expected screened data sample can be filtered out from the corresponding cell. In set-theoretic terms, the method is a condition-set algorithm: subset element samples are selected from the full set according to subset conditions, and a set whose members are the conditions themselves is built, so as to obtain diversified subsets. From the viewpoint of information theory, the method uses information fusion to obtain homogeneous (similar) information from the overall information according to information needs, supports any number of classification conditions, and obtains diversified cluster information samples.
The foregoing describes only preferred embodiments of the invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (1)
1. A distributed multi-point data extraction method, characterized by comprising the following steps:
Step 101: address the external data source DB and its field structure.
Step 102: establish the data source table, then determine whether an internal data source table is to be established. If so, proceed to step 103; if not, proceed to step 107.
Step 103: establish the internal data source table.
Step 104: select the data fields to be imported.
Step 105: add the internal data source table positioning field GUID.
Step 106: generate the internal data source table structure.
Step 107: GUID positioning code generator. The generator processes the generated internal data source table structure.
Step 108: generate the internal data source table with positioning codes.
Step 109: establish the positioning data table, then determine whether the constraint condition intelligent generator of step 110 is to be used. If so, proceed to step 110; if not, proceed to step 113.
Step 110: constraint condition intelligent generator.
Step 111: the user enters screening conditions. The constraint condition intelligent generator judges whether the entered screening conditions meet the constraints. If they do, proceed to step 112; if not, proceed to step 113.
Step 112: cell positioning marks the screening conditions and color.
Step 113: identify the table name, field name, record condition, time and customer name by GUID.
Step 114: obtain the GUID condition from step 113.
Step 115: generate the SELECT statement from the GUID condition.
Step 116: obtain the target data.
Step 117: perform cluster analysis judgment on the obtained target data.
Step 118: produce the analysis report table from the cluster analysis judgment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410208607.9A CN103970880B (en) | 2014-05-17 | 2014-05-17 | Distributed Multi data pick-up method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410208607.9A CN103970880B (en) | 2014-05-17 | 2014-05-17 | Distributed Multi data pick-up method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103970880A CN103970880A (en) | 2014-08-06 |
CN103970880B (en) | 2018-12-18 |
Family ID: 51240377
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410208607.9A Active CN103970880B (en) | 2014-05-17 | 2014-05-17 | Distributed Multi data pick-up method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103970880B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110909256B (en) * | 2019-11-20 | 2020-11-24 | 华育昌(肇庆)智能科技研究有限公司 | Artificial intelligence information filtering system for computer |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102339323A (en) * | 2011-11-11 | 2012-02-01 | 江苏鸿信系统集成有限公司 | Data extracting, scheduling and displaying method focused on DB2 data warehouse |
CN102902750A (en) * | 2012-09-20 | 2013-01-30 | 浪潮齐鲁软件产业有限公司 | Universal data extraction and conversion method |
CN103064659A (en) * | 2011-10-21 | 2013-04-24 | 镇江金软计算机科技有限责任公司 | Software as a service (SAAS) model based on metadata extraction user-defined worksheet system |
CN103235807A (en) * | 2013-04-19 | 2013-08-07 | 浪潮集团山东通用软件有限公司 | Data extracting and processing method supporting high-concurrency large-volume data |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9727628B2 (en) * | 2008-08-11 | 2017-08-08 | Innography, Inc. | System and method of applying globally unique identifiers to relate distributed data sources |
US8775476B2 (en) * | 2010-12-30 | 2014-07-08 | Skai, Inc. | System and method for creating, deploying, integrating, and distributing nodes in a grid of distributed graph databases |
Non-Patent Citations (1)
Title |
---|
Extraction of non-tagged tables from semi-structured documents; Song Qiang et al.; Computer Engineering (《计算机工程》); 2005-09-30; Vol. 31, No. 18; pp. 81-83, 171 * |
Also Published As
Publication number | Publication date |
---|---|
CN103970880A (en) | 2014-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106599230A (en) | Method and system for evaluating distributed data mining model | |
Taylor et al. | R package wgaim: QTL analysis in bi-parental populations using linear mixed models | |
CN101739454B (en) | Data processing system | |
Li et al. | Assembly processes of waterbird communities across subsidence wetlands in China: A functional and phylogenetic approach | |
CN109101519B (en) | Information acquisition system and heterogeneous information fusion system | |
Hankin | Introducing untb, an R package for simulating ecological drift under the unified neutral theory of biodiversity | |
CN108491228A (en) | A kind of binary vulnerability Code Clones detection method and system | |
CN110336838A (en) | Account method for detecting abnormality, device, terminal and storage medium | |
CN107092932A (en) | A kind of multi-tag Active Learning Method that tally set is relied on based on condition | |
CN108446720A (en) | Abnormal deviation data examination method and system | |
CN110019116A (en) | Data traceability method, apparatus, data processing equipment and computer storage medium | |
CN103970880B (en) | Distributed Multi data pick-up method | |
Burdick et al. | Table extraction and understanding for scientific and enterprise applications | |
Chalmandrier et al. | Comparing spatial diversification and meta-population models in the Indo-Australian Archipelago | |
CN105843605A (en) | Data mapping data and device | |
CN103227810B (en) | A kind of methods, devices and systems identifying remote desktop semanteme in network monitoring | |
CN109064036B (en) | Ecosystem service supply and demand index change detection method facing management field | |
CN112000389B (en) | Configuration recommendation method, system, device and computer storage medium | |
CN103092617A (en) | High reliability workflow development method based on backup services | |
CN110363198A (en) | A kind of neural network weight matrix fractionation and combined method | |
CN115995092A (en) | Drawing text information extraction method, device and equipment | |
Lu et al. | Bi-temporal Attention Transformer for Building Change Detection and Building Damage Assessment | |
CN105184168B (en) | The method for tracing that the association of android system source code loophole influences | |
CN114331226A (en) | Intelligent enterprise demand diagnosis method and system and storage medium | |
CN114328681A (en) | Data conversion method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||