CN105373535B - A kind of data extraction method of water quality benchmark - Google Patents

A kind of data extraction method of water quality benchmark

Info

Publication number
CN105373535B
Authority
CN
China
Prior art keywords
data
user
extraction
record
water quality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410401124.0A
Other languages
Chinese (zh)
Other versions
CN105373535A (en
Inventor
李江
李青香
罗吴亮
周浩
刘征涛
杨绍贵
闫振广
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING GISC SOFTWARE Co Ltd
Original Assignee
NANJING GISC SOFTWARE Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NANJING GISC SOFTWARE Co Ltd
Priority to CN201410401124.0A
Publication of CN105373535A
Application granted
Publication of CN105373535B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00: Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152: Water filtration

Abstract

The present invention discloses a data extraction method for water quality benchmarks, including the specific data extraction steps. The method builds on the traditional SQL query mode to form a new extraction scheme that extracts target data conveniently and effectively and provides data support for application and standardization. The extraction method can be visually customized, can be used in C/S and B/S architectures, and can also be used in database service interfaces, providing technical support for water environment benchmark research.

Description

A kind of data extraction method of water quality benchmark
Technical field
The present invention relates to methods for extracting associated and shared data from databases, and in particular to a data extraction method for water quality benchmarks that provides data support for water quality benchmark research.
Background technology
Water quality benchmark research is a major scientific and technological need of China's current environmental management work and a hot topic in environmental science research. Environmental criteria research in developed countries has a history of about a century, and relatively complete environmental benchmark systems have been established. The United States in particular has issued numerous environmental criteria documents and technical manuals and built a huge benchmark-related database, providing an important reference for countries around the world that are studying and establishing their own environmental benchmark systems.
Data related to water environment benchmarks has a complex structure. It includes basic environmental data, species distribution data, environmental exposure data, aquatic toxicity data, sediment toxicity data, aquatic ecology data, and health data. The various benchmark derivation methods based on different fitting functions, whether USEPA-SSD, EU-SSD, or RIVM-SSD, all presuppose a large amount of standardized data for derivation and analysis. We have therefore invented a data extraction method based on water quality benchmarks that extracts target data conveniently and effectively and provides data support for research, application, and standardization.
Summary of the invention
The present invention aims to provide a data extraction method for water quality benchmarks that uses data extraction to ensure the validity and accuracy of the data participating in calculation.
The main technical scheme of the present invention is as follows:
A data extraction method for water quality benchmarks, including water quality benchmarks:
(1) The data extraction method for water quality benchmarks includes the following steps:
(1-1) Set up the data extraction system framework, which consists of a database server, an application server, user terminals, a router, and network cables. The database server stores data as a data warehouse; the application server hosts the middleware and runs the application that performs the data extraction operations; the user terminal lets users and administrators upload data, run calculations, download data, and so on; the router and cables connect the database server, application server, and user terminals;
(1-2) The data in the database server originates from user terminals, whose users may be administrators, research users, and so on. The user terminal submits data to the application server, which checks the data according to its verification logic and passes it to the data screening unit. The application compares the data by data category, data format, typical data values, and data precision, and processes it into mode data that meets the specification;
(1-3) The application server imports the standardized data into the database server;
(1-4) The application server builds association model tables according to the business relationship logic among aquatic organisms, sediments, toxicity data, health, and so on, and converts the data storage to columnar storage to facilitate SQL query and extraction;
(1-5) The water environment data application is currently open to users at universities and research institutions. The administrator vets users who are invited, who apply on their own initiative, or who are granted access on the administrator's initiative. Based on each user's classification, the system scores the user automatically and stores the score in the user classification factor table;
(1-6) The user specifies the calculation purpose at the user terminal. For water quality benchmarks, data can be extracted in three ways: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction;
(1-7) In manual retrieval extraction, the user manually selects the data sources that take part in the calculation, based on the user's own academic experience, research purpose, or other circumstances. From the user's selections the system determines four dimensions: the data the user commonly uses in calculations, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results. The system maps the user's dimensions onto the classification hierarchy, then adjusts and calculates weights according to the user's dimensions and stores the result in the user data detail layer;
(1-8) In semi-automatic retrieval extraction, the user first manually selects part of the data sources that take part in the calculation. The system then automatically verifies the manually selected record set against the purpose of this calculation, masking or rejecting data sources whose type, purpose, source, or level does not match. The system then invokes the extraction method, and the extracted data is combined with the user's data before calculation;
(1-9) In fully automatic retrieval extraction, the system invokes the extraction method according to the purpose of this calculation, and the extracted data is calculated automatically;
(1-10) Extraction algorithm
First, query and extract the water environment data in the traditional SQL manner, using the defined primary/foreign key associations, dynamic attribute associations, and level identification associations.
Then, on top of the table association extraction, add the user dimension analogy and update the user data detail layer in real time.
Finally, store the records returned by the SQL query that have a higher dimension analogy degree in a temporary table and assign each of them the same initial positive and negative coefficient values. Perform two rounds of calculation, one positive and one negative; the difference between the positive and negative results gives the F value, which is used to judge the credibility of the record. The initial trust parameter of a record is 0.85. Each record is then assigned a combined weight made up of the dimension analogy values R (user 1, user 2, ...). Every record has its own weight; superposition generates a new weight adjustment, and every record is updated. A new round of the outer iteration then begins, yielding new authentication parameters for this calculation purpose. The authentication parameters are sorted, and a quantity function extracts the data set with high credibility.
(1-11) Whenever a user performs a new extraction, update, or verification, the operation enters the user data detail queue as a new record.
Description of the drawings
Fig. 1 is the overall structure diagram of the system's extraction method.
Fig. 2 is the business relationship logic diagram of the system data, divided into Fig. 2(a), Fig. 2(b), Fig. 2(c), Fig. 2(d), and Fig. 2(e).
Specific embodiment
The specific embodiment of the present invention is described further below with reference to the accompanying drawings.
First, the present invention provides a data extraction method based on water quality benchmarks, which includes the following steps:
(1) Set up the data extraction system framework, which consists of a database server, an application server, user terminals, a router, and network cables. The database server stores data as a data warehouse; the application server hosts the middleware and runs the application that performs the data extraction operations; the user terminal lets users and administrators upload data, run calculations, download data, and so on; the router and cables connect the database server, application server, and user terminals.
(2) The data in the database server originates from user terminals, whose users may be administrators or research users. The user terminal submits data to the application server, which checks the data according to its verification logic and passes it to the data screening unit. The application compares the data by data category, data format, typical data values, and data precision, and processes it into level data that meets the specification. Data categories include, for example, domestic, foreign, experimental, class one, and level two; the data is graded to support retrieval, association, and performance work.
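The checking and screening in step (2) might be sketched as follows. This is a minimal illustration only: the field names (`category`, `value`, `unit`), the set of allowed categories, the non-negativity rule, and the rounding precision are all assumptions, since the text does not spell out the verification logic.

```python
from dataclasses import dataclass

# Hypothetical category labels; the text only gives examples such as
# domestic, foreign, and experimental.
ALLOWED_CATEGORIES = {"domestic", "foreign", "experimental"}


@dataclass
class Record:
    category: str
    value: float
    unit: str


def screen(raw_rows, precision=3):
    """Split raw rows into normalized records and rejected rows."""
    accepted, rejected = [], []
    for row in raw_rows:
        try:
            category = str(row["category"]).strip().lower()
            value = float(row["value"])  # data format check
            unit = str(row["unit"]).strip()
        except (KeyError, TypeError, ValueError):
            rejected.append(row)
            continue
        # Data category check plus an assumed plausibility check on the value.
        if category not in ALLOWED_CATEGORIES or value < 0:
            rejected.append(row)
            continue
        # Data precision: round to an assumed number of decimal places.
        accepted.append(Record(category, round(value, precision), unit))
    return accepted, rejected
```

Rows that fail any check are kept in a separate list rather than dropped silently, so that an administrator could review them before import.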
(3) The application server imports the standardized data into the database server.
(4) The application server builds association model tables according to the business relationship logic among aquatic organisms, sediments, toxicity data, health, and so on, as shown in Fig. 2, and converts the data storage to columnar storage to facilitate SQL query and extraction. With columnar storage, the application server only needs to read the columns the application requires, rather than reading every column of the current row as row storage does. This reduces the amount of buffered data, makes effective use of the database service cache, and reduces network transmission. Because data of the same type is stored contiguously, serialization and compression can also be used to reduce space usage.
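The benefit of columnar storage described in step (4) can be illustrated with a small in-memory sketch. The column names and values below are made-up placeholders, and run-length encoding merely stands in for the unspecified compression scheme; the patent does not name a storage engine.

```python
from itertools import groupby

# Row-oriented layout: each record carries every attribute.
# All values are illustrative placeholders, not real toxicity data.
rows = [
    {"species": "Daphnia magna", "endpoint": "LC50", "value": 1.2},
    {"species": "Danio rerio", "endpoint": "LC50", "value": 3.4},
    {"species": "Danio rerio", "endpoint": "NOEC", "value": 0.5},
]


def to_columns(rows):
    """Pivot a list of uniform row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, val in row.items():
            columns.setdefault(name, []).append(val)
    return columns


def run_length_encode(column):
    """Compress runs of equal adjacent values into (value, count) pairs."""
    return [(val, len(list(group))) for val, group in groupby(column)]


columns = to_columns(rows)
values_only = columns["value"]  # reads a single column, not whole rows
endpoint_runs = run_length_encode(columns["endpoint"])
```

Reading `columns["value"]` touches one list only, and the repeated values in a same-typed column compress well, which is the effect the step attributes to columnar storage.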
(5) The water environment data application is currently open to users at universities and research institutions. The administrator vets users who are invited, who apply on their own initiative, or who are granted access on the administrator's initiative. Based on each user's classification, the system scores the user automatically and stores the score in the user classification factor table. The user classification factor table is a dynamic data source that triggers can update according to users' retrieval activity.
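A trigger-maintained factor table of the kind step (5) describes could look like the sketch below, which uses Python's built-in `sqlite3` module. The table names, column names, and the trigger name are hypothetical, and counting retrievals is only an assumed stand-in for the scoring rule, which the text does not give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one factor row per user, one log row per retrieval.
CREATE TABLE user_factor (
    user_id    INTEGER PRIMARY KEY,
    retrievals INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE retrieval_log (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL
);
-- The trigger keeps the factor table current whenever a retrieval is logged.
CREATE TRIGGER bump_factor AFTER INSERT ON retrieval_log
BEGIN
    UPDATE user_factor
    SET retrievals = retrievals + 1
    WHERE user_id = NEW.user_id;
END;
""")

conn.execute("INSERT INTO user_factor (user_id) VALUES (1)")
conn.execute("INSERT INTO retrieval_log (user_id) VALUES (1)")
conn.execute("INSERT INTO retrieval_log (user_id) VALUES (1)")

count = conn.execute(
    "SELECT retrievals FROM user_factor WHERE user_id = 1"
).fetchone()[0]
```

Because the update lives in the database, the factor table stays consistent no matter which client (C/S or B/S) performs the retrieval.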
(6) The user specifies the calculation purpose at the user terminal. For water quality benchmarks, data can be extracted in three ways: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction.
(7) In manual retrieval extraction, the user manually selects the data sources that take part in the calculation, based on the user's own academic experience, research purpose, or other circumstances. From the user's selections the system determines four dimensions: the data the user commonly uses in calculations, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results. The system maps the user's dimensions onto the classification hierarchy, then adjusts and calculates weights according to the user's dimensions and stores the result in the user data detail layer.
(8) In semi-automatic retrieval extraction, the user first manually selects part of the data sources that take part in the calculation. The system then automatically verifies the manually selected record set against the purpose of this calculation, masking or rejecting data sources whose type, purpose, source, or level does not match. The system then invokes the extraction method, and the extracted data is combined with the user's data before calculation.
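The automatic verification in step (8) amounts to filtering the user's hand-picked records against the calculation purpose. The sketch below assumes each record is a dictionary with `type`, `purpose`, `source`, and `level` keys and that the calculation purpose is expressed as a set of allowed values per key; both representations are hypothetical.

```python
CHECKS = ("type", "purpose", "source", "level")


def verify_selection(records, allowed):
    """Split manually selected records into kept and masked lists.

    records: list of dicts describing the user's selected data sources.
    allowed: dict mapping each key in CHECKS to a set of acceptable values
             for the current calculation purpose (an assumed representation).
    """
    kept, masked = [], []
    for rec in records:
        if all(rec.get(key) in allowed[key] for key in CHECKS):
            kept.append(rec)
        else:
            masked.append(rec)  # masked, not deleted, so it can be reviewed
    return kept, masked
```

Returning the masked records alongside the kept ones mirrors the "mask or reject" wording: the caller can decide whether to hide them or discard them.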
(9) In fully automatic retrieval extraction, the system invokes the extraction method according to the purpose of this calculation, and the extracted data is calculated automatically.
(10) Extraction algorithm
First, query and extract the water environment data in the traditional SQL manner, using the defined primary/foreign key associations, dynamic attribute associations, and level identification associations. The traditional mode is optimized to support column pruning and attribute merging, which reduces the reading of unnecessary attribute columns and the amount of data transmitted.
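The first stage of the algorithm, a key-based join that reads only the columns it needs, can be sketched with `sqlite3`. The `species` and `toxicity` tables, their columns, and the sample rows are hypothetical placeholders chosen for illustration; they are not the patent's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    habitat TEXT
);
CREATE TABLE toxicity (
    id         INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species(id),
    endpoint   TEXT NOT NULL,
    value      REAL NOT NULL,
    notes      TEXT
);
-- Placeholder rows, not real measurements.
INSERT INTO species (id, name, habitat) VALUES
    (1, 'Daphnia magna', 'freshwater'),
    (2, 'Danio rerio', 'freshwater');
INSERT INTO toxicity (species_id, endpoint, value, notes) VALUES
    (1, 'LC50', 1.2, 'placeholder'),
    (2, 'LC50', 3.4, 'placeholder'),
    (2, 'NOEC', 0.5, 'placeholder');
""")


def extract(conn, endpoint):
    """Join on the foreign key and select only the columns needed."""
    sql = """
        SELECT s.name, t.value          -- column pruning: no habitat or notes
        FROM toxicity AS t
        JOIN species AS s ON s.id = t.species_id
        WHERE t.endpoint = ?
        ORDER BY t.value
    """
    return conn.execute(sql, (endpoint,)).fetchall()


lc50_rows = extract(conn, "LC50")
```

Naming the two required columns instead of using `SELECT *` is the simplest form of the column pruning the text mentions: unused attributes such as `habitat` and `notes` are never returned to the application.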
Then, on top of the table association extraction, add the user dimension analogy and update the user data detail layer in real time.
Dimension analogy is the calculation that compares a user's four dimensions. R is the dimension coefficient, n is the number of dimensions, x is the value of each dimension, c is a fine-tuning parameter, sm is the total number of calculations participated in, and sa is the total number of calculations on the current user's data.
The calculated dimension analogy value R is stored in the user dimension table.
Finally, store the records returned by the SQL query that have a higher dimension analogy degree in a temporary table and assign each of them the same initial positive and negative coefficient values. Perform two rounds of calculation, one positive and one negative; the difference between the positive and negative results gives the F value, which is used to judge the credibility of the record. The initial trust parameter of a record is 0.85. Each record is then assigned a combined weight made up of the dimension analogy values R (user 1, user 2, ...). Every record has its own weight; superposition generates a new weight adjustment, and every record is updated. A new round of the outer iteration then begins, yielding new authentication parameters for this calculation purpose. The authentication parameters are sorted, and a quantity function extracts the data set with high credibility.
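The overall shape of this final stage (a positive pass, a negative pass, an F value from their difference, an iterated per-record trust value starting at 0.85, then ranking and taking the top records) can be sketched as below. Only the 0.85 starting value comes from the text. The vote representation, the choice of F as positive minus negative, the additive update with a `step` size, the clamping to [0, 1], and the fixed number of rounds are all assumptions made so the sketch runs; this is not the patent's formula.

```python
INITIAL_TRUST = 0.85  # initial trust parameter stated in the text


def score_records(records, user_r, rounds=3, step=0.05):
    """Iteratively adjust per-record trust from user-weighted votes.

    records: {record_id: {"pos": [user, ...], "neg": [user, ...]}}
    user_r:  {user: dimension analogy value R for that user}
    The update rule is a placeholder, not the patent's formula.
    """
    trust = {rid: INITIAL_TRUST for rid in records}
    for _ in range(rounds):  # outer iteration
        for rid, votes in records.items():
            positive = sum(user_r[u] for u in votes["pos"])  # positive round
            negative = sum(user_r[u] for u in votes["neg"])  # negative round
            f_value = positive - negative  # assumed direction of subtraction
            adjusted = trust[rid] + step * f_value
            trust[rid] = min(1.0, max(0.0, adjusted))  # keep within [0, 1]
    return trust


def top_n(trust, n):
    """Stand-in for the 'quantity function': the n most trusted record ids."""
    return sorted(trust, key=trust.get, reverse=True)[:n]
```

For example, a record endorsed by users with high R values drifts toward 1.0 over the rounds, while a record those users reject drifts downward and falls out of the top-n set.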
Report substitution code is as follows:
(11) Whenever a user performs a new extraction, update, or verification, the operation enters the user data detail queue as a new record.
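The detail queue in step (11) could be modeled as a simple first-in, first-out log of user operations. The `Operation` fields and the three operation labels below are hypothetical names chosen for this sketch.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timezone

VALID_KINDS = {"extract", "update", "verify"}  # assumed labels


@dataclass
class Operation:
    user: str
    kind: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


detail_queue = deque()


def log_operation(user, kind):
    """Append a new operation record to the user data detail queue."""
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown operation kind: {kind!r}")
    op = Operation(user, kind)
    detail_queue.append(op)
    return op
```

A later process could pop records from the left of the queue with `detail_queue.popleft()` and use them to refresh the user dimensions.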
The above describes only preferred embodiments of the method of the present invention; the invention is not limited to these embodiments. It should be understood that other improvements and changes that those skilled in the art derive directly or conceive of without departing from the spirit and concept of the present invention are considered to fall within the scope of protection of the present invention.

Claims (1)

1. A data extraction method for water quality benchmarks, including water quality benchmarks, characterized in that:
(1) the data extraction method for water quality benchmarks includes the following steps:
(1-1) set up the data extraction system framework, which consists of a database server, an application server, user terminals, a router, and network cables; the database server stores data as a data warehouse; the application server hosts the middleware and runs the application that performs the data extraction operations; the user terminal lets users and administrators upload data, run calculations, and download data; the router and cables connect the database server, application server, and user terminals;
(1-2) the data in the database server originates from user terminals, whose users may be administrators or research users; the user terminal submits data to the application server, which checks the data according to its verification logic and passes it to the data screening unit; the application compares the data by data category, data format, typical data values, and data precision, and processes it into mode data that meets the specification;
(1-3) the application server imports the standardized data into the database server;
(1-4) the application server builds association model tables according to the business relationship logic among aquatic organisms, sediments, toxicity data, and health, and converts the data storage to columnar storage to facilitate SQL query and extraction;
(1-5) the water environment data application is currently open to users at universities and research institutions; the administrator vets users who are invited, who apply on their own initiative, or who are granted access on the administrator's initiative; based on each user's classification, the system scores the user automatically and stores the score in the user classification factor table;
(1-6) the user specifies the calculation purpose at the user terminal; for water quality benchmarks, data can be extracted in three ways: manual retrieval extraction, semi-automatic retrieval extraction, and fully automatic retrieval extraction;
(1-7) in manual retrieval extraction, the user manually selects the data sources that take part in the calculation, based on the user's own academic experience, research purpose, or other circumstances; from the user's selections the system determines four dimensions: the data the user commonly uses in calculations, the category of the user's calculation direction, the grade of the data the user selects, and the utilization rate of the user's calculation results; the system maps the user's dimensions onto the classification hierarchy, then adjusts and calculates weights according to the user's dimensions and stores the result in the user data detail layer;
(1-8) in semi-automatic retrieval extraction, the user first manually selects part of the data sources that take part in the calculation; the system then automatically verifies the manually selected record set against the purpose of this calculation, masking or rejecting data sources whose type, purpose, source, or level does not match; the system then invokes the extraction method; the extracted data is combined with the user's data before calculation;
(1-9) in fully automatic retrieval extraction, the system invokes the extraction method according to the purpose of this calculation, and the extracted data is calculated automatically;
(1-10) extraction algorithm:
first, query and extract the water environment data in the traditional SQL manner, using the defined primary/foreign key associations, dynamic attribute associations, and level identification associations;
then, on top of the table association extraction, add the user dimension analogy and update the user data detail layer in real time;
finally, store the records returned by the SQL query that have a higher dimension analogy degree in a temporary table and assign each of them the same initial positive and negative coefficient values; perform two rounds of calculation, one positive and one negative; the difference between the positive and negative results gives the F value, which is used to judge the credibility of the record; the initial trust parameter of a record is 0.85; each record is then assigned a combined weight made up of the dimension analogy values R (user 1, user 2, ...); every record has its own weight; superposition generates a new weight adjustment, and every record is updated; a new round of the outer iteration then begins, yielding new authentication parameters for this calculation purpose; the authentication parameters are sorted, and a quantity function extracts the data set with high credibility;
(1-11) whenever a user performs a new extraction, update, or verification, the operation enters the user data detail queue as a new record.
CN201410401124.0A 2014-08-15 2014-08-15 A kind of data extraction method of water quality benchmark Active CN105373535B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410401124.0A CN105373535B (en) 2014-08-15 2014-08-15 A kind of data extraction method of water quality benchmark

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410401124.0A CN105373535B (en) 2014-08-15 2014-08-15 A kind of data extraction method of water quality benchmark

Publications (2)

Publication Number Publication Date
CN105373535A CN105373535A (en) 2016-03-02
CN105373535B true CN105373535B (en) 2018-05-25

Family

ID=55375742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410401124.0A Active CN105373535B (en) 2014-08-15 2014-08-15 A kind of data extraction method of water quality benchmark

Country Status (1)

Country Link
CN (1) CN105373535B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268515B (en) * 2016-12-30 2020-07-31 北京国双科技有限公司 Selection method and device for dimension of aggregation table

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101533000A (en) * 2009-03-05 2009-09-16 重庆大学 Method for constructing water eutrophication risk analysis model
CN102446302A (en) * 2011-12-31 2012-05-09 浙江大学 Data preprocessing method of water quality prediction system
CN103335955A (en) * 2013-06-19 2013-10-02 华南农业大学 Water quality on-line monitoring method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI343533B (en) * 2007-11-08 2011-06-11 Inst Information Industry Event detection method and system
US20120124099A1 (en) * 2010-11-12 2012-05-17 Lisa Ellen Stewart Expert system for subject pool selection

Also Published As

Publication number Publication date
CN105373535A (en) 2016-03-02

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
DD01 Delivery of document by public notice

Addressee: Li Jiang

Document name: payment instructions