CN104268181B - Rapid verification method and device for marine biological survey data - Google Patents


Publication number
CN104268181B
Authority
CN
China
Prior art keywords
data
name
marine organism
verification
survey
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410471975.2A
Other languages
Chinese (zh)
Other versions
CN104268181A (en)
Inventor
路文海
杨翼
黄海燕
付瑞全
向先全
刘捷
陶以军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NATIONAL OCEANIC INFORMATION CENTER
Original Assignee
NATIONAL OCEANIC INFORMATION CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NATIONAL OCEANIC INFORMATION CENTER
Priority to CN201410471975.2A
Publication of CN104268181A
Application granted
Publication of CN104268181B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to the technical field of marine biodiversity information processing, and in particular to a rapid verification method and device for marine biological survey data. The method comprises: classifying the marine biological survey data; performing a completeness check, a standardization check, and an accuracy check on each class of classified data, and generating a verification result; and, according to the verification result, unifying the data format, standardizing species names, and correcting erroneous data, thereby improving data quality and providing high-quality data support for studying the state of the marine ecological environment and predicting its trends. The embodiments of the invention provide a unified, standardized verification workflow and set of verification items for marine biological survey data, and verify the data in terms of completeness, standardization, and accuracy. In particular, they achieve rapid verification and batch correction of marine species names, so that verification is more comprehensive, more accurate, and faster.

Description

Rapid verification method and device for marine biological survey data
Technical field
The invention relates to the technical field of marine biodiversity information processing, and in particular to a rapid verification method and device for marine biological survey data.
Background
To analyze the state of the marine environment and to study marine biodiversity, high-quality marine biological survey data are needed, so the survey data must be quality controlled. Quality control in the later, data-use stage consists of verifying the survey data that have been obtained.
Quality control of oceanographic survey (monitoring) data in China is mainly based on the Specification for Marine Monitoring, Part 2: Data Processing and Analysis Quality Control (GB17378.2-2007) and the "Marine Monitoring Quality Control (Assurance) Manual". The methods in these specifications concentrate on quality control during data generation. They say little about quality control during later data use, where they stop at proofreading-level data review, and in particular they lack a verification method for marine biological survey (monitoring) data at that stage. At present, marine biological survey data are verified mainly by experience: professional staff rely on their accumulated practical experience to review the survey data and to check Chinese and Latin names of organisms, without a unified, standardized verification workflow or set of verification items.
Because existing verification of marine biological survey data is experience-based, it depends closely on how much experience each technician has accumulated. There is no unified procedure: different technicians use different methods, different standards, and different steps, so no unified result can be obtained.
It follows that existing ways of verifying marine biological survey data cannot meet the practical needs of quality control in the later data-use stage.
Summary of the invention
The object of the invention is to provide a rapid verification method and device for marine biological survey data that solve the above problems.
An embodiment of the invention provides a rapid verification method for marine biological survey data, comprising: classifying the marine biological survey data according to the type of marine biological survey element; performing a completeness check, a standardization check, and an accuracy check on each class of classified data, achieving one-stop batch quality control of species names, and generating a verification result; and, according to the verification result, unifying the format of the survey data, standardizing species names, and correcting erroneous data, thereby improving data quality and providing high-quality data support for studying the state of the marine ecological environment and predicting its trends.
Preferably, classifying the marine biological survey data according to the type of survey element comprises dividing the data into: chlorophyll survey data, primary productivity survey data, microorganism survey data, phytoplankton survey data, zooplankton survey data, benthos survey data, intertidal organism survey data, ichthyoplankton survey data, nekton survey data, mangrove community survey data, seagrass bed community survey data, coral reef community survey data, and coral reef fish survey data.
Preferably, performing the completeness check on each class of classified data comprises: according to key parameters set for the type of survey data, retrieving whether the survey data corresponding to those key parameters are completely filled in, and thereby checking the completeness of the survey data. The key parameters set for each class of survey data comprise general parameters and special key parameters set for that class. The general parameters comprise: cruise number, monitoring unit, station number, longitude, latitude, and monitoring date. The special key parameters comprise one or more of the following: species Chinese name, species Latin name, density, biomass, net type, and the common names and aliases of the configured parameter fields.
Preferably, performing the standardization check on each class of classified data comprises the following. Parameter name standardization check: a title and aliases are set for each parameter name to be checked, the survey data are searched, and whenever data matching the title or any alias are found, the title is returned. Parameter data format check: the format of the parameter data in the survey data is compared with the corresponding preset standard format, and if they differ, the parameter data are converted to the standard format. Data fill-down standardization check: the user selects the names of the fields whose blank cells are to be filled; within the fill-down unit to which each field belongs, the empty cells of a column are filled with the value of the preceding cell, and if the preceding cell is empty, that column is not filled. Here the preceding cell is, in each column, the cell adjacent to the first empty cell of that column.
Preferably, performing the accuracy check on each class of classified data comprises the following. Fill-down check: taking each data record in the survey data as a unit, the station, longitude, and latitude of every record are compared, and records with the same station but different coordinates, or the same coordinates but different stations, are output. Station on-land check: taking each data record as a unit, the distinct combinations of station, longitude, and latitude are extracted to generate spatial data, and the spatial data are displayed as data points on a coastline map of mainland China, each data point being linked to the raw data. Data deduplication: deduplication fields are set according to the biological data type, and duplicates are detected on those fields. Species name check: the species names in the survey data are matched against the species names in a pre-generated marine species standard database and in a marine species name standardization table. The Latin name in the survey data is first matched against the standard Latin names in the standard database; if it matches, the standard Chinese name and standard Latin name are output and the species name check is complete. Otherwise the Chinese name in the survey data is matched against the standard Chinese names in the standard database; if it matches, the standard Chinese name and standard Latin name are output and the species name check is complete. Otherwise the name is matched according to the user's matching instruction, and the matching result is stored as a record in the marine species name standardization table, which is consulted after the marine species standard database in subsequent species name checks. The species name check is then carried out according to the matching result.
Preferably, carrying out the species name check according to the matching result comprises: according to the matching result, writing back to the original survey data, as new fields, the standard Chinese name and standard Latin name output by the check, together with a processing record; the processing record contains the original species name and the corresponding corrected standard species name.
An embodiment of the invention further provides a rapid verification device for marine biological survey data, comprising: a classification module for classifying the marine biological survey data according to the type of survey element; a verification module for performing the completeness check, standardization check, and accuracy check on each class of classified data, achieving one-stop batch quality control of species names, and generating a verification result; and an application module for unifying the format of the survey data, standardizing species names, and correcting erroneous data according to the verification result, thereby improving data quality and providing high-quality data support for studying the state of the marine ecological environment and predicting its trends.
In the rapid verification method and device provided by the embodiments of the invention, the data are first classified according to the type of marine biological survey (monitoring) data, and the classified data are then automatically checked for completeness, standardization, and accuracy. In particular, marine species names with non-standard spelling, inconsistent format, synonyms, missing scientific names, misspellings, and similar problems are verified rapidly and corrected in batches. Based on the verification result, the format of the biological data and the species names are standardized, which improves the quality of the biological survey data and provides high-quality data support for optimizing marine ecological survey stations, evaluating the state of the marine ecological environment, and predicting its trends. The embodiments provide a unified, standardized verification workflow and set of verification items, verify the data in terms of completeness, standardization, and accuracy, and in particular achieve rapid verification and batch correction of marine species names. Verification is therefore more comprehensive, more accurate, and faster, and the method and device better meet the practical needs of later use of marine biological survey data.
Brief description of the drawings
Fig. 1 shows a flowchart of the rapid verification method for marine biological survey data in an embodiment of the invention;
Fig. 2 shows another flowchart of the rapid verification method for marine biological survey data in an embodiment of the invention;
Fig. 3 shows a schematic structural diagram of the rapid verification device for marine biological survey data in an embodiment of the invention;
Fig. 4 shows the station map generated from the results of the station on-land check;
Fig. 5 shows the station map generated from the station on-land check after correction.
Detailed description of the embodiments
The invention is described in further detail below through specific embodiments and with reference to the drawings.
An embodiment of the invention provides a rapid verification method for marine biological survey data. As shown in Fig. 1, the main processing flow comprises:
Step S11: classifying the marine biological survey data according to the type of marine biological survey element;
Step S12: performing a completeness check, a standardization check, and an accuracy check on each class of classified data, achieving one-stop batch quality control of species names, and generating a verification result; in particular, rapidly verifying and batch-correcting marine species names with non-standard spelling, inconsistent format, synonyms, missing scientific names, misspellings, and similar problems;
Step S13: according to the verification result, unifying the format of the survey data, standardizing species names, and correcting erroneous data, thereby improving data quality and providing high-quality data support for studying the state of the marine ecological environment and predicting its trends.
In the rapid verification method and device provided by the embodiments of the invention, the data are first classified according to the type of marine biological survey (monitoring) data, and the classified data are then automatically checked for completeness, standardization, and accuracy. In particular, marine species names with non-standard spelling, inconsistent format, synonyms, missing scientific names, misspellings, and similar problems are verified rapidly and corrected in batches. Based on the verification result, the format of the biological data and the species names are standardized, which improves the quality of the biological survey data and provides high-quality data support for optimizing marine ecological survey stations, evaluating the state of the marine ecological environment, and predicting its trends. The embodiments provide a unified, standardized verification workflow and set of verification items, verify the data in terms of completeness, standardization, and accuracy, and in particular achieve rapid verification and batch correction of marine species names. Verification is therefore more comprehensive, more accurate, and faster, and the method and device better meet the practical needs of quality control in the later data-use stage.
Fig. 2 shows another flowchart of the rapid verification method for marine biological survey data.
The marine biological survey data obtained are of many types, and each type involves different parameters, so the data must first be classified. Specifically, the data can be divided into chlorophyll survey data, primary productivity survey data, microorganism survey data, phytoplankton survey data, zooplankton survey data, benthos survey data, intertidal organism survey data, ichthyoplankton survey data, nekton survey data, mangrove community survey data, seagrass bed community survey data, coral reef community survey data, and coral reef fish survey data.
After the marine biological survey data have been classified, the completeness check, standardization check, and accuracy check are performed automatically on each class of data according to the classification result.
Performing the completeness check on each class of classified data comprises: according to key parameters set for the type of survey data, retrieving whether the survey data corresponding to those key parameters are completely filled in, and thereby checking the completeness of the survey data. The key parameters set for each class of survey data comprise general parameters and special key parameters set for that class. The general parameters comprise: cruise number, monitoring unit, station number, longitude, latitude, and monitoring date. The special key parameters comprise one or more of the following: species Chinese name, species Latin name, density, biomass, net type, and the common names and aliases of the configured parameter fields.
In the completeness check, different key parameter fields are set for different biological data types, and the completeness of the survey data is verified quickly by checking whether any key parameter field is missing. The check of key fields must also cover the other common names and aliases of each key parameter. In biological survey data, besides the general key fields such as cruise number, monitoring unit, station number, longitude, latitude, and monitoring date, each biological data type has its own additional key parameter fields. For zooplankton, for example, the net type, species Chinese name, species Latin name, and density fields are required; the net type can be used to distinguish macrozooplankton data from microzooplankton data. Other common spellings of "species Chinese name" alone include "biological species Chinese name", "species Chinese scientific name", "Chinese name", "Chinese", and "Chinese scientific name".
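The completeness check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the field identifiers (`cruise`, `species_cn`, and so on), the alias sets, and the function names are hypothetical, and only the zooplankton key fields named in the text are configured.

```python
# Hypothetical alias table: canonical field -> other accepted column headers.
ALIASES = {
    "species_cn": {"species Chinese name", "Chinese name", "Chinese scientific name"},
    "species_latin": {"species Latin name", "Latin name"},
}
# General key fields shared by every data type (names are illustrative).
GENERAL_FIELDS = ["cruise", "monitoring_unit", "station", "longitude", "latitude", "date"]
# Special key fields per data type; only zooplankton is filled in here.
SPECIAL_FIELDS = {
    "zooplankton": ["net_type", "species_cn", "species_latin", "density"],
}


def find_column(header, field):
    """Return the header cell matching a field or one of its aliases, else None."""
    accepted = {field} | ALIASES.get(field, set())
    for name in header:
        if name.strip() in accepted:
            return name
    return None


def completeness_report(data_type, header, rows):
    """Return (missing key fields, [(row number, field)] for empty key cells)."""
    required = GENERAL_FIELDS + SPECIAL_FIELDS.get(data_type, [])
    missing_fields, empty_cells = [], []
    for field in required:
        column = find_column(header, field)
        if column is None:
            missing_fields.append(field)
            continue
        idx = header.index(column)
        for row_no, row in enumerate(rows, start=1):
            if idx >= len(row) or str(row[idx]).strip() == "":
                empty_cells.append((row_no, field))
    return missing_fields, empty_cells
```

Calling `completeness_report("zooplankton", header, rows)` on a table without a density column reports `density` as a missing field and lists every empty key cell by row number, which is the kind of output a reviewer would need to complete the table.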
The standardization check is performed automatically on the classified data. It not only finds the problems in the survey data but also normalizes them in the same pass. For the fill-down problem in particular, the user can select the fields to be filled and fill them with one click, which frees people from repetitive work, saves time, and prevents the errors introduced by manual fill-down.
As shown in Fig. 2, the automatic standardization check of the classified data comprises: a parameter name standardization check, a parameter data format check, and a data fill-down standardization check.
Parameter name standardization check: a title and aliases are set for each parameter name to be checked, the survey data are searched, and whenever data matching the title or any alias are found, the title is returned.
As noted above, the single parameter "species Chinese name" has six spellings, so when there are many data tables, standardizing parameter names is essential. Each parameter name is given one title and several aliases; whether the title or an alias is retrieved, the title is returned, so parameter names are verified quickly and standardized at the same time.
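A minimal sketch of the title-and-alias lookup follows. The alias list mirrors the spellings quoted in the description; the table layout, the case-insensitive comparison, and the function names are assumptions made for illustration.

```python
# Hypothetical title -> aliases table; spellings follow the description.
PARAMETER_ALIASES = {
    "species Chinese name": [
        "biological species Chinese name",
        "species Chinese scientific name",
        "Chinese name",
        "Chinese",
        "Chinese scientific name",
    ],
}


def build_lookup(alias_table):
    """Map every title and alias (trimmed, lower-cased) to its title."""
    lookup = {}
    for title, aliases in alias_table.items():
        for name in [title, *aliases]:
            lookup[name.strip().lower()] = title
    return lookup


def normalize_header(header, lookup):
    """Replace known aliases by their title; leave unknown names unchanged."""
    return [lookup.get(name.strip().lower(), name) for name in header]


LOOKUP = build_lookup(PARAMETER_ALIASES)
```

For example, `normalize_header(["Chinese name", "density"], LOOKUP)` returns the title for the first column and leaves `density` untouched, because no alias entry exists for it.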
Parameter data format check: the format of the parameter data in the survey data is compared with the corresponding preset standard format, and if they differ, the parameter data are converted to the standard format.
In the parameter data format check, longitude and latitude, for example, occur in "degrees" format, "degrees and minutes" format, and "degrees, minutes, and seconds" format. A standard format is set; when data in a non-standard format are encountered, they are converted to the standard format and then output.
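The longitude and latitude conversion can be illustrated with a short sketch. It assumes that decimal degrees is the chosen standard format (the text does not say which format is standard) and that values are unsigned, as for east longitudes and north latitudes; the function name is hypothetical.

```python
import re


def to_decimal_degrees(text):
    """Convert 'D', 'D M' or 'D M S' coordinate text to decimal degrees.

    Any non-numeric characters (degree signs, primes, spaces, commas)
    are treated as separators between the numeric parts.
    """
    parts = [float(p) for p in re.findall(r"\d+(?:\.\d+)?", text)]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    # Pad missing minutes/seconds with zero.
    degrees, minutes, seconds = (parts + [0.0, 0.0])[:3]
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes/seconds out of range: {text!r}")
    return round(degrees + minutes / 60 + seconds / 3600, 6)
```

With this helper, `"120.5"` and `"120°30′"` both yield 120.5, so values recorded in different formats become directly comparable in the later accuracy checks.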
Data fill-down standardization check: the user selects the names of the fields whose blank cells are to be filled; within the fill-down unit to which each field belongs, the empty cells of a column are filled with the value of the preceding cell, and if the preceding cell is empty, that column is not filled. Here the preceding cell is, in each column, the cell adjacent to the first empty cell of that column.
Compared with data from other disciplines, biological survey data have a distinctive layout: the different species, densities, and similar values recorded at one station are displayed on multiple rows, and the corresponding station, coordinates, collection date, and similar information must be repeated on each of those rows. In spreadsheet tables such as Excel, the values that need to appear on multiple rows have to be filled manually by drag-copying, and stations sampled at several depth layers require even more fill-down operations. In the data fill-down standardization check, the user selects the names of the fields to be filled. Filling is carried out with one monitoring station as a unit: within the unit, the other empty cells of a column are filled with the value of the preceding cell, and if the preceding cell is empty, the column is not filled. Unfilled data are thus detected and filled in the same pass.
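One way to read the fill-down ("polishing") rule is sketched below. It assumes that a non-empty station cell opens a new fill-down unit, which matches the spreadsheet layout described but is an interpretation; the function names and the dict-per-row representation are hypothetical.

```python
def _has_value(cell):
    """True when a cell holds something other than None or blank text."""
    return cell is not None and str(cell).strip() != ""


def fill_down(rows, station_key, fields):
    """Fill empty cells from the cell above, one station unit at a time.

    rows: list of dicts (one per record); the input is not modified.
    Returns (filled rows, [(row number, field)] for every filled cell).
    """
    filled = [dict(row) for row in rows]
    changes = []
    carry = {}
    for row_no, row in enumerate(filled, start=1):
        if _has_value(row.get(station_key)):
            carry = {}  # a non-empty station cell opens a new unit
        for field in [station_key, *fields]:
            if _has_value(row.get(field)):
                carry[field] = row[field]
            elif field in carry:
                row[field] = carry[field]
                changes.append((row_no, field))
            # else: no preceding value in this unit, so the cell stays empty
    return filled, changes
```

The returned list of changed cells doubles as the check result: it shows which cells had not been filled before the pass, and cells left empty signal a unit whose first value is itself missing.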
As shown in Fig. 2, the automatic accuracy check of the classified data in the embodiment of the invention comprises: a fill-down check, a station on-land check, data deduplication, and a species name check.
Fill-down check: taking each data record in the survey data as a unit, the station, longitude, and latitude of every record are compared, and records with the same station but different coordinates, or the same coordinates but different stations, are output.
Filling down values in a data table is characteristic of biological survey data, and the most common way to do it is drag-copying in Excel. During dragging, however, the data are easily filled as a series instead of being copied, which produces records with the same station but different coordinates, or the same coordinates but different stations. Data entry errors cause the same symptom. In this method, each data record is taken as a unit, the station, longitude, and latitude of every record are compared, and records with the same station but different coordinates, or the same coordinates but different stations, are output, which meets the practical needs of the fill-down check.
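The comparison of station against coordinates can be sketched as a two-way grouping. The record keys (`station`, `longitude`, `latitude`) and the function name are assumptions; coordinates are assumed to have been converted to one format beforehand.

```python
from collections import defaultdict


def station_coordinate_conflicts(records):
    """Return records whose station/coordinate pairing is inconsistent.

    A record is flagged when its station occurs with more than one
    coordinate pair, or its coordinate pair occurs with more than one station.
    """
    coords_by_station = defaultdict(set)
    stations_by_coord = defaultdict(set)
    for rec in records:
        coord = (rec["longitude"], rec["latitude"])
        coords_by_station[rec["station"]].add(coord)
        stations_by_coord[coord].add(rec["station"])
    bad_stations = {s for s, c in coords_by_station.items() if len(c) > 1}
    bad_coords = {c for c, s in stations_by_coord.items() if len(s) > 1}
    return [
        rec for rec in records
        if rec["station"] in bad_stations
        or (rec["longitude"], rec["latitude"]) in bad_coords
    ]
```

A dragged series such as longitudes 120.5, 120.6, 120.7 under one station name shows up immediately, because that station then maps to several coordinate pairs.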
Station on-land check: taking each data record as a unit, the distinct combinations of station, longitude, and latitude are extracted to generate spatial data, and the spatial data are displayed as data points on a coastline map of mainland China, each data point being linked to the raw data.
Although biological data contain a great many records, the number of stations is smaller than in data from other disciplines. In the station on-land check, each data record is taken as a unit, the distinct combinations of station, longitude, and latitude are extracted to generate spatial data, and these data are displayed as data points on a coastline map of mainland China; clicking a point leads to the associated raw data. Except for special data such as intertidal organism data and aquaculture-zone organism data, a data point that falls on the landward side of the coastline means the coordinates are on land. If the data points are concentrated along a very regular gradient, this may also be an error introduced by manual fill-down and needs further review.
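The text describes a visual check on a coastline map. As a rough automated counterpart, distinct stations can be tested against a land polygon with a standard ray-casting point-in-polygon test. This is a swapped-in technique for illustration, not the map display of the patent; the square polygon in the usage note is a toy stand-in for a real coastline layer, and all names are hypothetical.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon: list of (lon, lat) vertices; the last vertex connects to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge crosses the horizontal ray
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def stations_on_land(records, land_polygon):
    """Map each distinct (station, lon, lat) on land to its source row numbers."""
    seen = {}
    for row_no, rec in enumerate(records, start=1):
        key = (rec["station"], rec["longitude"], rec["latitude"])
        seen.setdefault(key, []).append(row_no)
    return {
        key: rows for key, rows in seen.items()
        if point_in_polygon(key[1], key[2], land_polygon)
    }
```

With a toy land polygon such as `[(118, 28), (121, 28), (121, 32), (118, 32)]`, a station at (120.0, 30.0) is reported together with the row numbers it came from, which preserves the link back to the raw data. Intertidal and aquaculture-zone data would need to be excluded before such a test, as the description notes.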
Data deduplication: deduplication fields are set according to the biological data type, and duplicates are detected on those fields. Deduplication of biological survey data has its own particularities. Checking for repeats using only the general fields such as station, longitude, latitude, and monitoring date as key fields is far from sufficient; other special fields must be set for each biological data type before duplicates can be detected. For phytoplankton data, for example, in addition to station, longitude, latitude, and monitoring date, fields such as sample type, layer, species Chinese name, species Latin name, and density must also be set, and none of them can be omitted.
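Type-specific deduplication can be sketched as a comparison of key tuples. The key list for phytoplankton follows the fields named in the text; the field identifiers and the function name are hypothetical.

```python
# Hypothetical deduplication keys per data type (phytoplankton per the description).
DEDUP_KEYS = {
    "phytoplankton": [
        "station", "longitude", "latitude", "date",
        "sample_type", "layer", "species_cn", "species_latin", "density",
    ],
}


def find_duplicates(data_type, records):
    """Return [(duplicate row number, first row number)] for repeated records."""
    keys = DEDUP_KEYS[data_type]
    first_seen = {}
    duplicates = []
    for row_no, rec in enumerate(records, start=1):
        signature = tuple(rec.get(k) for k in keys)
        if signature in first_seen:
            duplicates.append((row_no, first_seen[signature]))
        else:
            first_seen[signature] = row_no
    return duplicates
```

Two records from the same station and date that differ only in species or layer are not reported, which is why the general fields alone would produce false duplicates for this kind of data.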
Species name check: the species names in the survey data are matched against the species names in a pre-generated marine species standard database and in a marine species name standardization table. The Latin name in the survey data is first matched against the standard Latin names in the standard database; if it matches, the standard Chinese name and standard Latin name are output and the species name check is complete. Otherwise the Chinese name in the survey data is matched against the standard Chinese names in the standard database; if it matches, the standard Chinese name and standard Latin name are output and the species name check is complete. Otherwise the name is matched according to the user's matching instruction, and the matching result is stored as a record in the marine species name standardization table, which is consulted after the marine species standard database in subsequent species name checks. The species name check is then carried out according to the matching result.
Specifically, the marine species standard database is generated from sources such as "Classification Codes for Marine Organisms (GB/T17826-1999)", "Marine Species Diversity of China", and "Checklist of Marine Organisms of China".
Carrying out the species name check according to the matching result comprises: according to the matching result, writing back to the original survey data, as new fields, the standard Chinese name and standard Latin name output by the check, together with a processing record; the processing record contains the original species name and the corresponding corrected standard species name.
Rapid verification of species names is a difficult point in survey data. Biological survey data contain a very large number of species name records, with problems such as non-standard spelling, inconsistent format, synonyms, missing scientific names, and misspellings. Even if professional staff spend a great deal of time comparing the names one by one with authoritative sources, the later work of correcting and modifying the data is still heavy, and wrong checks, missed checks, and results that are hard to trace occur easily. This method uses sources such as "Classification Codes for Marine Organisms (GB/T17826-1999)", "Marine Species Diversity of China", and "Checklist of Marine Organisms of China" to build a standard database of marine organisms of China down to the family, genus, and species levels. Standard Latin names in the database are written as the genus name plus the specific epithet (plus, for ranks below species, the subspecific, varietal, or forma epithet); only the first letter of the genus name is capitalized, all other letters are lower case, and the genus name and specific epithet are written in full. The Latin name of a species is the internationally used biological name, and each organism has only one valid, scientific, correct name, so the Latin name in the survey data is matched against the standard Latin names in the standard database first. The biological names currently used around the world consist of two Latin words, the genus name and the epithet (traditionally called the species name); only the first letter of the genus name is capitalized and all other letters are lower case. The full scientific name also gives, after the epithet, the surname or abbreviated surname of the naming author and the year of naming. Some organisms also have taxonomic groups below the species level, such as subspecies, variety, and form; in that case the name is written as genus name + specific epithet + epithet of the rank below species. In biological survey data, Latin names are written in many non-standard ways: extra spaces between words, incorrect capitalization, the author and year abbreviated or omitted, genus names abbreviated in some records and not in others, and inconsistent spelling of rank terms. If the Latin name matches, the standard Chinese name and standard Latin name in the standard database are returned and the species name check is complete.
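The spacing and capitalization rules for standard Latin names can be illustrated with a small normalizer. It is a sketch only: it assumes the input carries no author citation or year, and it does not expand abbreviated genus names, both of which the description lists as separate problems. The function name is hypothetical.

```python
def normalize_latin_name(name):
    """Collapse extra spaces; capitalize only the first letter of the genus.

    Assumes the name has no author citation or year, since those words
    would be lower-cased along with the epithets.
    """
    words = name.split()
    if not words:
        return ""
    genus, *epithets = words
    return " ".join([genus.capitalize(), *(w.lower() for w in epithets)])
```

Applying this before matching makes `"  calanus   SINICUS "` comparable with the stored form `"Calanus sinicus"`, so records that differ only in spacing or letter case do not fall through to the slower Chinese-name or manual steps.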
If the matching result of latin name is not identical, then the Chinese name in sea life enquiry data is mated with the standard Chinese name in standard database.Compared to the uniqueness of latin name, same species may have different Chinese names (synonym), and same Chinese name also may refer to different species (homonym).In biological survey data, Chinese name also exists before and after character and intercharacter has the Problems such as space, variant Chinese character, rarely used word, the nearly word of sound, nearly word form.If therefore the matching result of Chinese name is identical, returns the standard Chinese name in standard database and standard latin name, complete biological name and check.If not identical, then need manually to mate, thus realize biological name and check.
Artificial coupling will there is latin name abbreviation in species name, capital and small letter, space between word, exist before and after character in Chinese name and intercharacter has space, variant Chinese character, rarely used word, the nearly word of sound, the biological name that nearly word form etc. can not be matched in artificial discernible biological name and standard database is matched, return the standard Chinese name in standard database and standard latin name, and those can not be identified kind, the species name that can only identify genus and section carries out standardization processing, (table comprises original Chinese name in the lump the record of these pair record and standardization processing to be generated biological name standard criterion table, original latin name, standard Chinese name, the fields such as standard latin name), turning back to biological name checks in flow process, check for biological name.Matching result returns standard Chinese name in standard database or biological name standard criterion table and standard latin name.
Checking of biological name, its matching result will return and increase biological standard Chinese name, standard latin name and process record newly in primary investigation data, primeval life name and amended standard biological name is recorded in process record, not only complete checking of biological name fast, and batch is revised, doubt is had for correction result, can quickly through checking that process record is traced to the source.
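The matching order described above (Latin name first, then Chinese name, then the manually built table) can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the whitespace and case normalization rules, the process record wording, and the layout of `standard_db` and `manual_table` are all hypothetical.

```python
import re


def norm_latin(name):
    """Collapse runs of whitespace and ignore case (assumed normalization)."""
    return re.sub(r"\s+", " ", name).strip().lower()


def norm_chinese(name):
    """Drop all whitespace, including spaces between characters (assumed)."""
    return re.sub(r"\s+", "", name)


def check_name(chinese, latin, standard_db, manual_table):
    """Return the standard names and a process record, or None if unmatched.

    standard_db:  list of (standard_chinese, standard_latin) pairs.
    manual_table: dict keyed by normalized (chinese, latin) originals, holding
                  the pair chosen by a person in an earlier manual match.
    """
    # Note: with homonyms, a plain dict keeps only the last entry per key.
    by_latin = {norm_latin(l): (c, l) for c, l in standard_db}
    by_chinese = {norm_chinese(c): (c, l) for c, l in standard_db}
    key = (norm_chinese(chinese), norm_latin(latin))

    # Cascade: Latin name, then Chinese name, then the manual table.
    match = by_latin.get(key[1]) or by_chinese.get(key[0]) or manual_table.get(key)
    if match is None:
        return None  # left for manual matching

    std_chinese, std_latin = match
    changes = []
    if chinese != std_chinese:
        changes.append(f"Chinese name [{chinese}-->{std_chinese}]")
    if latin != std_latin:
        changes.append(f"Latin name [{latin}-->{std_latin}]")
    return {
        "standard_chinese": std_chinese,
        "standard_latin": std_latin,
        "process_record": "; ".join(changes),
    }
```

A record that returns `None` would be handed to a person, and the pair they choose would be added to `manual_table` so that later runs resolve it automatically. Misspellings are not caught by this normalization; they fall through to the Chinese name or to manual matching.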
The marine biological survey data checking method provided here has a unified, standardized checking workflow and checking content, and enables fast checking of marine biological survey data. It not only checks and batch-corrects marine species names quickly, producing standardized biological survey data and improving data quality, but also improves the accuracy of species names and Latin names, providing true and reliable biological survey data for marine biodiversity analysis and marine ecological environment quality assessment.
An embodiment of the present invention also provides a device for fast checking of marine biological survey data which, as shown in Figure 3, mainly comprises:
Classification module 21, for classifying the marine biological survey data according to the type of marine biological survey element;
Checking module 22, for performing a completeness check, a standardization check, and an accuracy check on each class of the classified marine biological survey data, including fast checking and batch correction of marine species names that are written non-standardly, formatted inconsistently, synonymous, missing a scientific name, or misspelled, and for generating a check result;
Application module 23, for unifying the format of the marine biological survey data, standardizing species names, and correcting erroneous data according to the check result, so as to improve data quality and provide high-quality data support for research on the state of the marine ecological environment and prediction of its trends.
Following the fast checking method provided by the embodiment of the present invention, a concrete example of fast checking of marine biological survey data is further given, as follows:
Obtaining the data source to be checked: summer phytoplankton survey data from North China Sea waters, collected under the national marine environmental monitoring biodiversity task, were chosen as sample data (see Table 1), comprising 14 files, 221 stations, and 3615 records.
Table 1. Sample data file names
Parameter settings: as shown in Table 2, the standard names, standard units of measurement, standard formats, keyword parameters, deduplication fields, and aliases of the phytoplankton parameters are set.
Table 2. Fast-check parameter settings for phytoplankton
In Table 2, "..." denotes other phytoplankton survey parameters that are not listed.
Check results:
Completeness check: a completeness check was run using the keyword parameters and aliases set in Table 2. The result shows that the phytoplankton data are complete.
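A completeness check of this kind can be sketched as follows. This is only an illustration under assumptions: the function name, the record layout (one dict per row), and the rule that a value counts as missing when it is absent or blank are hypothetical, and the real keyword parameters are those of Table 2, which is not reproduced here.

```python
def check_completeness(records, key_params):
    """List (row_index, standard_name) for every keyword value that is missing.

    records:    list of dicts, one per data record.
    key_params: dict mapping each standard parameter name to its aliases.
    """
    problems = []
    for i, rec in enumerate(records):
        for std_name, aliases in key_params.items():
            # A column may appear under its standard name or under any alias.
            candidates = [std_name, *aliases]
            value = next((rec[n] for n in candidates if n in rec), None)
            if value is None or str(value).strip() == "":
                problems.append((i, std_name))
    return problems
```

An empty result list corresponds to the "data are complete" outcome reported for the sample files.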
Standardization check: (1) Parameter name standardization check: run using the standard parameter names and aliases set in Table 2. The result shows that files 1-14 all contain parameter names that do not conform to Table 2; the standard parameter names were returned and parameter name standardization was completed at the same time. (2) Parameter format standardization check: run using the standard units of measurement and standard formats set in Table 2. Suppose the phytoplankton density unit in file 2 of Table 1 is "cells/L", which does not conform to the standard setting in Table 2; the conversion "1 cell/L = 1000 cells/m³" is set and the data are converted to the standard unit. (3) Data fill-down standardization check: the result shows that 7 files (files 1, 5, 6, 8, 9, 12, and 14) contain data that have not been filled down; "drainage amount" was chosen as the fill-down stop position and the blank cells were filled in with one click.
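The unit conversion in (2) and the filling of blank cells in (3) can be sketched as below. The factor 1000 follows from the stated relation 1 cell/L = 1000 cells/m³ (1 m³ = 1000 L). Everything else is an assumption for illustration: the function names, the dict-per-row layout, and the choice to leave leading blanks untouched, which is one reading of the rule that a column is not filled when the cell before the first blank is itself empty.

```python
CELLS_PER_L_TO_PER_M3 = 1000  # 1 cell/L = 1000 cells/m^3, since 1 m^3 = 1000 L


def to_cells_per_m3(density_cells_per_l):
    """Convert a density in cells/L to cells/m^3."""
    return density_cells_per_l * CELLS_PER_L_TO_PER_M3


def fill_down(rows, field):
    """Fill blank cells of one column with the nearest non-blank value above.

    Returns new row dicts; the input rows are not modified. Blanks that have
    no value above them stay blank (assumed behaviour).
    """
    last = None
    filled = []
    for row in rows:
        row = dict(row)
        value = row.get(field)
        if value is None or str(value).strip() == "":
            if last is not None:
                row[field] = last
        else:
            last = value
        filled.append(row)
    return filled
```

In a spreadsheet where a value such as station number is typed only on the first row of each group, `fill_down` copies it to the following rows so that every record stands on its own.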
Accuracy check: (1) Fill-down check: the result shows that the longitude and latitude of stations A1B37YQ021, A1B37YQ029, and A1B37YQ033 in file 10 were partly or entirely filled in as an incrementing sequence. (2) Station on-land check: the check generates a station map, shown in Figure 4. The data points inside the circle increase regularly and depart from the coastline, so the station coordinates are suspect. Inspection of the original data linked to the station points confirmed that the longitude and latitude had been filled in as a sequence, consistent with the fill-down check result. The corrected stations are shown in Figure 5. (3) Data deduplication: the check found that stations A1B13YQ010 and A1B13YQ011 in file 5 have identical longitude and latitude, as do stations A1H00JQ004 and A1H00JQ003 in file 14; the coordinates were corrected with reference to the implementation plan. In file 5, station A1B13YQ005 has two records for Chaetoceros with different densities; in file 11, station A1B37YQ075 has two records for narrow thin Chaetoceros with different densities. (4) Biological name check: the check added three columns to the original data, "standard Chinese name", "standard Latin name", and "process record", and standardized 1303 species name records in total. A process record reads, for example: "1. Standardized species Chinese name [Thalassiosira nordenskioldi Cleve-->诺登海链藻], 2. Standardized species Latin name [Thalassiosiranordenskioldi-->Thalassiosira nordenskioldii]".
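The station and coordinate comparison used in (1) and (3), which flags one station with several coordinates and several stations sharing one coordinate, can be sketched as follows. The function name and the field names `station`, `longitude`, and `latitude` are hypothetical, and coordinates are compared for exact equality, which is an assumption; a real check might round or apply a tolerance.

```python
from collections import defaultdict


def station_conflicts(records):
    """Find stations with several coordinates and coordinates shared by stations.

    records: list of dicts with "station", "longitude", and "latitude" keys.
    Returns (same_station, same_coords):
      same_station maps a station to its sorted list of differing coordinates;
      same_coords maps a coordinate to the sorted list of stations using it.
    """
    coords_by_station = defaultdict(set)
    stations_by_coords = defaultdict(set)
    for rec in records:
        coord = (rec["longitude"], rec["latitude"])
        coords_by_station[rec["station"]].add(coord)
        stations_by_coords[coord].add(rec["station"])

    same_station = {s: sorted(c) for s, c in coords_by_station.items() if len(c) > 1}
    same_coords = {c: sorted(s) for c, s in stations_by_coords.items() if len(s) > 1}
    return same_station, same_coords
```

Coordinates that were accidentally filled in as an incrementing sequence show up in `same_station`, because the same station then carries a different longitude or latitude on each row.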
The above are merely preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. A method for fast checking of marine biological survey data, characterized by comprising:
classifying the marine biological survey data according to the type of marine biological survey element;
performing a completeness check, a standardization check, and an accuracy check on each class of the classified marine biological survey data, thereby achieving one-stop batch quality control of species names, and generating a check result;
according to the check result, unifying the format of the marine biological survey data, standardizing species names, and correcting erroneous data, so as to improve data quality and provide high-quality data support for research on the state of the marine ecological environment and prediction of its trends;
wherein the completeness check comprises: according to keyword parameters set for the type of marine biological survey data, checking the completeness of the marine biological survey data by retrieving whether the data corresponding to the keyword parameters are fully filled in;
the standardization check comprises: a parameter name standardization check, a parameter data format check, and a data fill-down standardization check;
wherein the accuracy check performed on each class of the classified marine biological survey data comprises: a fill-down check, in which each data record in the marine biological survey data is taken as a unit, the station, longitude, and latitude of every record are compared, and records with the same station but different longitude and latitude, or with the same longitude and latitude but different stations, are output;
a station on-land check, in which, taking each data record as a unit, the distinct station, longitude, and latitude data are extracted to generate spatial data, and the spatial data are displayed as data points on a coastline map of mainland China, each data point being linked to the original data;
data deduplication, in which deduplication fields are set according to the biological data type and deduplication is performed;
a biological name check, in which species names in the marine biological survey data are matched against species names in a pre-generated marine biological standard database and a marine biological name standardization table, comprising: matching the Latin name in the marine biological survey data against the standard Latin names in the marine biological standard database and, if identical, outputting the standard Chinese name and standard Latin name to complete the biological name check; otherwise, matching the Chinese name in the marine biological survey data against the standard Chinese names in the marine biological standard database and, if identical, outputting the standard Chinese name and standard Latin name to complete the biological name check; otherwise, completing the biological name check according to a matching instruction from the user, and storing the matching result as a record in the marine biological name standardization table, which is used for species name checking with lower priority than the marine biological standard database; and completing the biological name check according to the matching result.
2. The method according to claim 1, characterized in that classifying the marine biological survey data according to the type of marine biological survey element comprises:
according to the type of marine biological survey element, dividing the marine biological survey data into: chlorophyll survey data, primary productivity survey data, microbial survey data, phytoplankton survey data, zooplankton survey data, benthos survey data, intertidal organism survey data, ichthyoplankton survey data, nekton survey data, mangrove community survey data, seagrass bed community survey data, coral reef community survey data, and coral reef fish survey data.
3. The method according to claim 2, characterized in that
the keyword parameters set for each class of marine biological survey data comprise general parameters and special keyword parameters set according to each class of marine biological survey data;
wherein the general parameters comprise: cruise number, monitoring unit, station number, longitude, latitude, and monitoring date;
the special keyword parameters comprise one or more of the following: the parameter fields species Chinese name, species Latin name, density, biomass, and net type, together with the common names and aliases set for them.
4. The method according to claim 2, characterized in that, in the standardization check performed on each class of the classified marine biological survey data:
the parameter name standardization check comprises: setting a name and aliases for each parameter name to be checked, searching the marine biological survey data, and, whenever data matching the set name or an alias are found, returning the set name;
the parameter data format check comprises: comparing whether the parameter data format in the marine biological survey data is consistent with the corresponding preset standard format and, if not, converting the parameter data to a format consistent with the standard format;
the data fill-down standardization check comprises: selecting, according to a user instruction, the field name at which fill-down stops; within the fill-down unit to which the field name belongs, filling the empty cells of a column with the value of the preceding cell; if the preceding cell is empty, the column containing it is not filled; wherein the preceding cell is the cell adjacent to the first empty cell of the column in each column of data.
5. The method according to claim 2, characterized in that completing the biological name check according to the matching result comprises:
according to the matching result, adding to the original marine biological survey data the standard Chinese name, the standard Latin name, and the process record output by the check, the process record comprising the original biological name and the corresponding corrected standard biological name.
6. A device for fast checking of marine biological survey data, characterized by comprising:
a classification module, for classifying the marine biological survey data according to the type of marine biological survey element;
a checking module, for performing a completeness check, a standardization check, and an accuracy check on each class of the classified marine biological survey data, thereby achieving one-stop batch quality control of species names, and for generating a check result;
an application module, for unifying the format of the marine biological survey data, standardizing species names, and correcting erroneous data according to the check result, so as to improve data quality and provide high-quality data support for research on the state of the marine ecological environment and prediction of its trends;
wherein the completeness check comprises: according to keyword parameters set for the type of marine biological survey data, checking the completeness of the marine biological survey data by retrieving whether the data corresponding to the keyword parameters are fully filled in;
the standardization check comprises: a parameter name standardization check, a parameter data format check, and a data fill-down standardization check;
wherein the accuracy check performed by the checking module on each class of the classified marine biological survey data comprises: a fill-down check, in which each data record in the marine biological survey data is taken as a unit, the station, longitude, and latitude of every record are compared, and records with the same station but different longitude and latitude, or with the same longitude and latitude but different stations, are output;
a station on-land check, in which, taking each data record as a unit, the distinct station, longitude, and latitude data are extracted to generate spatial data, and the spatial data are displayed as data points on a coastline map of mainland China, each data point being linked to the original data;
data deduplication, in which deduplication fields are set according to the biological data type and deduplication is performed;
a biological name check, in which species names in the marine biological survey data are matched against species names in a pre-generated marine biological standard database and a marine biological name standardization table, comprising: matching the Latin name in the marine biological survey data against the standard Latin names in the marine biological standard database and, if identical, outputting the standard Chinese name and standard Latin name to complete the biological name check; otherwise, matching the Chinese name in the marine biological survey data against the standard Chinese names in the marine biological standard database and, if identical, outputting the standard Chinese name and standard Latin name to complete the biological name check; otherwise, completing the biological name check according to a matching instruction from the user, and storing the matching result as a record in the marine biological name standardization table, which is used for species name checking with lower priority than the marine biological standard database; and completing the biological name check according to the matching result.
CN201410471975.2A 2014-09-16 2014-09-16 The quick check method of sea life enquiry data and device Expired - Fee Related CN104268181B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410471975.2A CN104268181B (en) 2014-09-16 2014-09-16 The quick check method of sea life enquiry data and device

Publications (2)

Publication Number Publication Date
CN104268181A CN104268181A (en) 2015-01-07
CN104268181B true CN104268181B (en) 2016-03-02

Family

ID=52159703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410471975.2A Expired - Fee Related CN104268181B (en) 2014-09-16 2014-09-16 The quick check method of sea life enquiry data and device

Country Status (1)

Country Link
CN (1) CN104268181B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557577A (en) * 2016-11-27 2017-04-05 威海蓝印海洋生物科技有限公司 The quick check method of marine organisms survey data and device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194142B (en) * 2017-03-31 2020-08-28 苏州艾隆信息技术有限公司 Drug information element compensation method and system
CN108036961B (en) * 2017-12-07 2020-10-30 国家海洋局南海环境监测中心 Marine environment monitoring multi-task data acquisition method, storage medium and electronic equipment
CN108520348B (en) * 2018-04-02 2021-10-26 重庆大学 Ecological index prediction method based on mangrove forest ecological big data
CN108983994B (en) * 2018-06-11 2021-10-29 山东省海洋资源与环境研究院 Intelligent input and standardized output system for marine organism identification in LIMS (LiMS)
CN108875056B (en) * 2018-06-28 2021-08-13 中国建设银行股份有限公司 Data checking method and device, electronic equipment and readable storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102651093A (en) * 2012-03-31 2012-08-29 上海海洋大学 Marine information management system based on time series outlier detection technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《建立中国海洋生物物种多样性信息库技术方案》;石雅君等;《海洋信息技术》;20040331(第3期);全文 *
《海洋生物分类数据录入质量控制的一种方法》;陈虹勋等;《生物多样性与人类未来——第二届全国生物多样性保护与持续利用研讨会论文集》;19961231;全文 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160302

Termination date: 20200916