CN105373535B - A kind of data extraction method of water quality benchmark - Google Patents
A kind of data extraction method of water quality benchmark
- Publication number
- CN105373535B
- Authority
- CN
- China
- Prior art keywords
- data
- user
- extraction
- record
- water quality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Abstract
The present invention discloses a data extraction method for water quality benchmarks, including the specific steps of data extraction. The method builds on the traditional SQL query mode to form a new extraction scheme that extracts target data conveniently and effectively, providing data support for application and standardization. The data extraction method can be customized visually, can be used in C/S and B/S architectures, and can also serve as a database service interface, providing technical support for water environment benchmark research.
Description
Technical field
The present invention relates to methods for extracting associated and shared data from a database, and in particular to a data extraction method for water quality benchmarks, a technique that provides data support for water quality benchmark research.
Background technology
Water quality benchmark research is a major scientific and technological need of China's current environmental management work and a hot topic in environmental science research. Developed countries have studied environmental criteria for nearly a century and have established fairly complete environmental benchmark systems. The United States in particular has issued numerous environmental criteria documents and technical manuals and built large benchmark-related databases, which provide an important reference for countries around the world in studying and establishing their own environmental criteria systems.
The data related to water environment benchmarks are structurally complex. They include basic environmental data, species distribution data, environmental exposure data, aquatic toxicity data, sediment toxicity data, aquatic ecology data, and health data. The various benchmark derivation methods based on different fitting functions, whether the USEPA-SSD method, the EU-SSD method, or RIVM-SSD, all carry out their derivation and analysis on the premise of a large amount of standardized data. We have therefore invented a data extraction method based on water quality benchmarks that can extract target data conveniently and effectively, providing data support for research, application, and standardization.
Summary of the invention
The present invention aims to provide a data extraction method for water quality benchmarks that uses data extraction to ensure the validity and accuracy of the data participating in calculation.
The main technical scheme of the present invention is as follows:
A data extraction method for water quality benchmarks, wherein:
(1) The data extraction method for water quality benchmarks includes the following steps:
(1-1) Set up the data extraction system framework. The framework consists of a database server, an application server, user terminals, a router, and cables. The database server stores data as a data warehouse; the application server deploys middleware and runs the application program that performs the data extraction operation; the user terminal lets users and administrators upload data, calculate data, download data, and so on; the router and cables connect the database server, the application server, and the user terminals;
(1-2) The data in the database server originate from transmissions by user terminals; a user terminal may belong to an administrator, a scientific research user, and so on. The user terminal submits data to the application server, which checks the data according to its verification logic and extracts the data into the data screening unit. The application program compares the data by data category, data format, representative data value, and data precision, and processes it into mode data that meets the specification;
(1-3) The application server imports the standardized data into the database server;
(1-4) The application server establishes association model tables according to the business relationship logic of aquatic organisms, sediment, toxicity data, health, and so on, and converts data storage to columnar storage to facilitate SQL query and extraction;
(1-5) The water environment data application is currently open to users at universities and research institutions. The administrator reviews users who are invited, users who apply on their own initiative, and users who are granted access proactively. The system scores each user automatically according to the user's classification and stores the score in the user classification factor table;
(1-6) The user determines the calculation purpose through the user terminal. For water quality benchmarks, data can be extracted by three methods: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction;
(1-7) In manual retrieval extraction, the user manually selects the data sources that participate in the calculation according to his or her own academic experience, research purpose, or other circumstances. From the user's selections the system determines four dimensions: the data the user commonly uses in calculation, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results. The user's dimensions are mapped into the classification hierarchy. The system adjusts and calculates weights according to the user dimensions and stores the results in the user data detail layer;
(1-8) In semi-automatic retrieval extraction, the user first manually selects part of the data sources that participate in the calculation. According to the purpose of this calculation, the system automatically verifies the manually selected record set and shields or rejects data sources whose type, purpose, source, or level does not match. The system then enables the extraction method, and the extracted data are combined with the user's data and calculated;
(1-9) In fully automatic retrieval extraction, the system enables the extraction method according to the purpose of this calculation, and the extracted data are calculated automatically;
(1-10) Extraction algorithm
First, query extraction is carried out in the traditional SQL mode according to the primary/foreign key associations, dynamic attribute associations, and level identifier associations defined in the water environment data.
Then, on the basis of table association extraction, user dimension analogy is added, and the user data detail layer is updated in real time.
Finally, from the records returned by the SQL query, those with a higher degree of dimension analogy are stored in a temporary table and assigned identical initial positive and negative coefficient values. Two rounds of calculation are carried out, one with the positive values and one with the negative values, and the difference between the two results gives the F value, by which the credibility of a record is judged. The initial trust parameter of each record is 0.85. Each record is then assigned a combined weight composed of the dimension analogy values R (user 1, user 2, ...). Every record has its own weight; superposition calculation produces a new weight adjustment, every record is updated, and a new round of the outer iteration is carried out, yielding new trust parameters for this calculation purpose. The trust parameters are sorted, and the data set with high credibility is extracted using a quantity function.
(1-11) Whenever a user performs a new extraction, update, or verification, these operations enter the user data detail queue as new records.
Description of the drawings
Fig. 1 is the overall structure diagram of the system extraction method.
Fig. 2 is the business relationship logic diagram of the system data, divided into Fig. 2(a), Fig. 2(b), Fig. 2(c), Fig. 2(d), and Fig. 2(e).
Specific embodiment
The specific embodiment of the present invention is further described below with reference to the accompanying drawings.
First, the present invention provides a data extraction method based on water quality benchmarks, which includes the following steps:
(1) Set up the data extraction system framework. The framework consists of a database server, an application server, user terminals, a router, and cables. The database server stores data as a data warehouse; the application server deploys middleware and runs the application program that performs the data extraction operation; the user terminal lets users and administrators upload data, calculate data, download data, and so on; the router and cables connect the database server, the application server, and the user terminals.
(2) The data in the database server originate from transmissions by user terminals; a user terminal may belong to an administrator or a scientific research user. The user terminal submits data to the application server, which checks the data according to its verification logic and extracts the data into the data screening unit. The application compares the data by data category, data format, representative data value, and data precision, and processes it into level data that meets the specification. Data categories include, for example, domestic, foreign, experimental, first-level, and second-level. Data grades are retrieved and associated, and performance optimization work is carried out.
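The screening in step (2) can be illustrated with a minimal Python sketch. The record fields, the set of allowed categories, the positive-value rule, and the three-decimal precision are all assumptions made for illustration; the text does not specify the actual verification logic.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical category labels, loosely following the examples in step (2).
ALLOWED_CATEGORIES = {"domestic", "foreign", "experimental"}


@dataclass
class ToxicityRecord:
    species: str
    category: str
    value_text: str  # raw value as submitted, e.g. " 12.3456 "
    unit: str


def screen_record(rec: ToxicityRecord, decimals: int = 3) -> Optional[dict]:
    """Return a normalized dict, or None if the record fails a check."""
    category = rec.category.strip().lower()
    if category not in ALLOWED_CATEGORIES:
        return None  # data category check
    try:
        value = float(rec.value_text)  # data format check
    except ValueError:
        return None
    if value <= 0:
        return None  # representative value check (assumed: must be positive)
    return {
        "species": rec.species.strip(),
        "category": category,
        "value": round(value, decimals),  # data precision normalization
        "unit": rec.unit.strip(),
    }
```

A record that passes every check comes back in one uniform shape, which is what allows the later import into the database server to assume standardized input.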
(3) The application server imports the standardized data into the database server.
(4) The application server establishes association model tables according to the business relationship logic of aquatic organisms, sediment, toxicity data, health, and so on, as shown in Fig. 2, and converts data storage to columnar storage to facilitate SQL query and extraction. With columnar storage, only the columns the application needs are read, whereas row storage reads every column of the current row. This reduces the amount of buffered data, makes effective use of the database service cache, and reduces network transmission. Because data of the same type are stored contiguously, serialization and compression can be used to reduce space usage.
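The difference between row storage and columnar storage described in step (4) can be sketched in a few lines of Python. The sample records and field names are hypothetical, every record is assumed to carry the same fields, and JSON with zlib stands in for whatever serialization and compression the real system uses.

```python
import json
import zlib

# Hypothetical row-oriented records; every row carries the same fields.
rows = [
    {"species": "Daphnia magna", "chemical": "Cu", "lc50": 0.05},
    {"species": "Danio rerio", "chemical": "Cu", "lc50": 0.21},
    {"species": "Daphnia magna", "chemical": "Zn", "lc50": 0.65},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for rec in records:
        for name, value in rec.items():
            columns.setdefault(name, []).append(value)
    return columns


def compress_column(values):
    """Serialize one column and compress it; same-typed values sit together."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


columns = to_columns(rows)
# A query that needs only lc50 touches one list, not every field of every row.
lc50_values = columns["lc50"]
packed = compress_column(columns["chemical"])
restored = json.loads(zlib.decompress(packed).decode("utf-8"))
```

On a sample this small the compressed form is not necessarily smaller than the input; the benefit the text describes applies to long columns of repetitive, same-typed values.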
(5) The water environment data application is currently open to users at universities and research institutions. The administrator reviews users who are invited, users who apply on their own initiative, and users who are granted access proactively. The system scores each user automatically according to the user's classification and stores the score in the user classification factor table. The user classification factor table is a dynamic data source and can be updated by a trigger according to the retrieval activity associated with the user.
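A trigger-driven update of the user classification factor table, as mentioned in step (5), might look like the following sketch, which uses Python's built-in sqlite3 module. The table names, the column names, and the use of a simple retrieval counter as the "factor" are assumptions; the text does not give the scoring rule or the database product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_class_factor (
    user_id INTEGER PRIMARY KEY,
    retrieved_count INTEGER NOT NULL DEFAULT 0  -- stand-in for the real score
);
CREATE TABLE retrieval_log (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL
);
-- Each logged retrieval bumps the matching user's factor row.
CREATE TRIGGER bump_factor AFTER INSERT ON retrieval_log
BEGIN
    UPDATE user_class_factor
    SET retrieved_count = retrieved_count + 1
    WHERE user_id = NEW.user_id;
END;
""")
conn.execute("INSERT INTO user_class_factor (user_id) VALUES (1)")
conn.executemany("INSERT INTO retrieval_log (user_id) VALUES (?)", [(1,), (1,)])
count = conn.execute(
    "SELECT retrieved_count FROM user_class_factor WHERE user_id = 1"
).fetchone()[0]
```

Putting the update in a trigger keeps the factor table current without the application having to remember to refresh it after every retrieval.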
(6) The user determines the calculation purpose through the user terminal. For water quality benchmarks, data can be extracted by three methods: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction.
(7) In manual retrieval extraction, the user manually selects the data sources that participate in the calculation according to his or her own academic experience, research purpose, or other circumstances. From the user's selections the system determines four dimensions: the data the user commonly uses in calculation, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results. The user's dimensions are mapped into the classification hierarchy. The system adjusts and calculates weights according to the user dimensions and stores the results in the user data detail layer.
(8) In semi-automatic retrieval extraction, the user first manually selects part of the data sources that participate in the calculation. According to the purpose of this calculation, the system automatically verifies the manually selected record set and shields or rejects data sources whose type, purpose, source, or level does not match. The system then enables the extraction method, and the extracted data are combined with the user's data and calculated.
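The automatic verification in step (8) amounts to checking each manually selected data source against the calculation purpose on four points: type, purpose, source, and level. A minimal sketch follows; the class names, field names, and the specific matching rules (exact match for type and purpose, an allowed set for source, a minimum for level) are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str
    data_type: str
    purpose: str
    source: str
    level: int


@dataclass(frozen=True)
class CalcPurpose:
    data_type: str
    purpose: str
    allowed_sources: frozenset
    min_level: int


def mismatches(src: DataSource, goal: CalcPurpose) -> list:
    """List the checks (type, purpose, source, level) that a source fails."""
    failed = []
    if src.data_type != goal.data_type:
        failed.append("type")
    if src.purpose != goal.purpose:
        failed.append("purpose")
    if src.source not in goal.allowed_sources:
        failed.append("source")
    if src.level < goal.min_level:
        failed.append("level")
    return failed


def verify_selection(selected, goal):
    """Split a manual selection into kept sources and rejected (name, reasons)."""
    kept, rejected = [], []
    for src in selected:
        failed = mismatches(src, goal)
        if failed:
            rejected.append((src.name, failed))
        else:
            kept.append(src)
    return kept, rejected
```

Returning the reasons alongside each rejected source lets the system either shield the source silently or report to the user why it was excluded.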
(9) In fully automatic retrieval extraction, the system enables the extraction method according to the purpose of this calculation, and the extracted data are calculated automatically.
(10) Extraction algorithm
First, query extraction is carried out in the traditional SQL mode according to the primary/foreign key associations, dynamic attribute associations, and level identifier associations defined in the water environment data. The traditional mode is optimized to support column pruning and attribute merging, which reduces reads of unnecessary attribute columns and reduces data transmission.
Then, on the basis of table association extraction, user dimension analogy is added, and the user data detail layer is updated in real time.
Dimension analogy is a calculation that compares the user's four dimensions. R is the dimension coefficient, n is the number of dimensions, x denotes the value of each dimension, c is a fine-tuning parameter, sm is the total number of participations in calculation, and sa is the total number of calculations on the current user's data.
The calculated dimension analogy value R is stored in the user dimension table.
Finally, from the records returned by the SQL query, those with a higher degree of dimension analogy are stored in a temporary table and assigned identical initial positive and negative coefficient values. Two rounds of calculation are carried out, one with the positive values and one with the negative values, and the difference between the two results gives the F value, by which the credibility of a record is judged. The initial trust parameter of each record is 0.85. Each record is then assigned a combined weight composed of the dimension analogy values R (user 1, user 2, ...). Every record has its own weight; superposition calculation produces a new weight adjustment, every record is updated, and a new round of the outer iteration is carried out, yielding new trust parameters for this calculation purpose. The trust parameters are sorted, and the data set with high credibility is extracted using a quantity function.
Report substitution code is as follows:
(11) Whenever a user performs a new extraction, update, or verification, these operations enter the user data detail queue as new records.
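The last part of step (10), in which each record starts from a trust parameter of 0.85, is reweighted over several iterations, and the most credible records are then extracted, can be sketched as follows. Only the initial value 0.85 comes from the text. The combination rule (mean of the per-user R values), the update rule, the number of rounds, and the reading of the "quantity function" as a top-n selection are all placeholders, and the positive/negative two-round calculation of the F value is not modeled here.

```python
import heapq

INITIAL_TRUST = 0.85  # initial trust parameter stated in the text


def combined_weight(r_values):
    """Assumed combination rule: the mean of the per-user R values."""
    return sum(r_values) / len(r_values)


def update_trust(trust, weight, rate=0.1):
    """Assumed update rule: move trust toward the weight, clamped to [0, 1]."""
    return min(1.0, max(0.0, trust + rate * (weight - trust)))


def extract_top(records, n, rounds=3):
    """records maps record id -> non-empty list of R values; returns the n most trusted ids."""
    trust = {rid: INITIAL_TRUST for rid in records}
    for _ in range(rounds):
        for rid, r_values in records.items():
            trust[rid] = update_trust(trust[rid], combined_weight(r_values))
    # Rank by trust parameter and keep the n highest.
    return heapq.nlargest(n, trust, key=trust.get)
```

For example, `extract_top({"r1": [0.9, 1.0], "r2": [0.2, 0.4], "r3": [0.8, 0.9]}, 2)` keeps `r1` and `r3`, because the low R values of `r2` pull its trust parameter below the others.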
The above is only a preferred implementation of the method of the present invention, and the invention is not limited to the above embodiment. It is understood that other improvements and changes that those skilled in the art directly derive or conceive of without departing from the spirit and concept of the present invention shall be considered to fall within the protection scope of the present invention.
Claims (1)
1. A data extraction method for water quality benchmarks, characterized in that:
(1) the data extraction method for water quality benchmarks includes the following steps:
(1-1) setting up the data extraction system framework, the framework consisting of a database server, an application server, user terminals, a router, and cables; the database server stores data as a data warehouse; the application server deploys middleware and runs the application program that performs the data extraction operation; the user terminal lets users and administrators upload data, calculate data, and download data; the router and cables connect the database server, the application server, and the user terminals;
(1-2) the data in the database server originate from transmissions by user terminals, where a user terminal may belong to an administrator or a scientific research user; the user terminal submits data to the application server, the application server checks the data according to its verification logic and extracts the data into the data screening unit, and the application program compares the data by data category, data format, representative data value, and data precision and processes it into mode data that meets the specification;
(1-3) the application server imports the standardized data into the database server;
(1-4) the application server establishes association model tables according to the business relationship logic of aquatic organisms, sediment, toxicity data, and health, and converts data storage to columnar storage to facilitate SQL query and extraction;
(1-5) the water environment data application is currently open to users at universities and research institutions; the administrator reviews users who are invited, users who apply on their own initiative, and users who are granted access proactively; the system scores each user automatically according to the user's classification and stores the score in the user classification factor table;
(1-6) the user determines the calculation purpose through the user terminal; for water quality benchmarks, data can be extracted by three methods: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction;
(1-7) in manual retrieval extraction, the user manually selects the data sources that participate in the calculation according to his or her own academic experience, research purpose, or other circumstances; from the user's selections the system determines four dimensions: the data the user commonly uses in calculation, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results; the user's dimensions are mapped into the classification hierarchy; the system adjusts and calculates weights according to the user dimensions and stores the results in the user data detail layer;
(1-8) in semi-automatic retrieval extraction, the user first manually selects part of the data sources that participate in the calculation; according to the purpose of this calculation, the system automatically verifies the manually selected record set and shields or rejects data sources whose type, purpose, source, or level does not match; the system enables the extraction method; the extracted data are combined with the user's data and calculated;
(1-9) in fully automatic retrieval extraction, the system enables the extraction method according to the purpose of this calculation, and the extracted data are calculated automatically;
(1-10) extraction algorithm:
first, query extraction is carried out in the traditional SQL mode according to the primary/foreign key associations, dynamic attribute associations, and level identifier associations defined in the water environment data;
then, on the basis of table association extraction, user dimension analogy is added, and the user data detail layer is updated in real time;
finally, from the records returned by the SQL query, those with a higher degree of dimension analogy are stored in a temporary table and assigned identical initial positive and negative coefficient values; two rounds of calculation are carried out, one with the positive values and one with the negative values, and the difference between the two results gives the F value, by which the credibility of a record is judged; the initial trust parameter of each record is 0.85; each record is then assigned a combined weight composed of the dimension analogy values R (user 1, user 2, ...); every record has its own weight, superposition calculation produces a new weight adjustment, every record is updated, and a new round of the outer iteration is carried out, yielding new trust parameters for this calculation purpose; the trust parameters are sorted, and the data set with high credibility is extracted using a quantity function;
(1-11) whenever a user performs a new extraction, update, or verification, these operations enter the user data detail queue as new records.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410401124.0A CN105373535B (en) | 2014-08-15 | 2014-08-15 | A kind of data extraction method of water quality benchmark |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410401124.0A CN105373535B (en) | 2014-08-15 | 2014-08-15 | A kind of data extraction method of water quality benchmark |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105373535A CN105373535A (en) | 2016-03-02 |
CN105373535B true CN105373535B (en) | 2018-05-25 |
Family
ID=55375742
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410401124.0A Active CN105373535B (en) | 2014-08-15 | 2014-08-15 | A kind of data extraction method of water quality benchmark |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105373535B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108268515B (en) * | 2016-12-30 | 2020-07-31 | 北京国双科技有限公司 | Selection method and device for dimension of aggregation table |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101533000A (en) * | 2009-03-05 | 2009-09-16 | 重庆大学 | Method for constructing water eutrophication risk analysis model |
CN102446302A (en) * | 2011-12-31 | 2012-05-09 | 浙江大学 | Data preprocessing method of water quality prediction system |
CN103335955A (en) * | 2013-06-19 | 2013-10-02 | 华南农业大学 | Water quality on-line monitoring method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI343533B (en) * | 2007-11-08 | 2011-06-11 | Inst Information Industry | Event detection method and system |
US20120124099A1 (en) * | 2010-11-12 | 2012-05-17 | Lisa Ellen Stewart | Expert system for subject pool selection |
Also Published As
Publication number | Publication date |
---|---|
CN105373535A (en) | 2016-03-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103514201B (en) | Method and device for querying data in non-relational database | |
CN100424704C (en) | Full text search system based on ciphertext | |
TW201430598A (en) | Method and server for searching and determining active areas | |
CN104298785B (en) | Searching method for public searching resources | |
CN107943952A (en) | A kind of implementation method that full-text search is carried out based on Spark frames | |
CN108805710A (en) | A kind of distribution type electric energy method of commerce based on block chain intelligence contract technology | |
CN107392568A (en) | A kind of project cost management system | |
CN106934596A (en) | Construction project data managing method and system based on similarity comparison | |
CN102446254A (en) | Similar loophole inquiry method based on text mining | |
CN101894129B (en) | Video topic finding method based on online video-sharing website structure and video description text information | |
WO2016197857A1 (en) | Position information providing method and device | |
CN108280366A (en) | A kind of batch linear query method based on difference privacy | |
Kuang et al. | A privacy protection model of data publication based on game theory | |
CN111367911A (en) | Site environment data analysis method and system | |
CN103853838A (en) | Data processing method and device | |
TWI254880B (en) | Method for classifying electronic document analysis | |
CN105373535B (en) | A kind of data extraction method of water quality benchmark | |
CN111126865A (en) | Technology maturity judging method and system based on scientific and technological big data | |
CN104765763B (en) | A kind of semantic matching method of the Heterogeneous Spatial Information classification of service based on concept lattice | |
CN110704698B (en) | Correlation and query method for unstructured massive network security data | |
CN105550809A (en) | Credit reporting system for assessment of enterprise credit | |
CN107220363A (en) | It is a kind of to support the global complicated cross-region querying method retrieved and system | |
CN101576981A (en) | Scene-type service system | |
CN116028467A (en) | Intelligent service big data modeling method, system, storage medium and computer equipment | |
CN104731851A (en) | Big data analysis method based on topological network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
DD01 | Delivery of document by public notice | Addressee: Li Jiang; Document name: payment instructions |