CN116450632B - Geographic sample data quality evaluation method, device and storage medium - Google Patents

Geographic sample data quality evaluation method, device and storage medium

Info

Publication number
CN116450632B
Authority
CN
China
Prior art keywords
sample data
quality
index
geographic
artificial intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310421521.3A
Other languages
Chinese (zh)
Other versions
CN116450632A (en)
Inventor
上官博屹
贺广均
冯鹏铭
金世超
符晗
陈千千
常江
梁颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Satellite Information Engineering
Original Assignee
Beijing Institute of Satellite Information Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Satellite Information Engineering
Priority to CN202310421521.3A
Publication of CN116450632A
Application granted
Publication of CN116450632B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The invention relates to a geographic sample data quality evaluation method, device and storage medium. The method comprises the following steps: analyzing the quality characteristics of geographic artificial intelligence sample data at multiple application levels and establishing a sample data quality index system for those levels; determining the characteristics and quality specifications of the geographic artificial intelligence sample data set to be evaluated; determining the quality evaluation specification for that data set; performing the quality evaluation of the geographic artificial intelligence sample data to obtain a quality evaluation result; and generating a geographic artificial intelligence sample data quality evaluation report based on the quality evaluation result. The invention can meet the quality evaluation requirements of pixel-level, object-level and scene-level geographic artificial intelligence sample data and provides a systematic reference for evaluating such data, thereby helping to improve the reliability of the sample data.

Description

Geographic sample data quality evaluation method, device and storage medium
Technical Field
The present invention relates to the field of geographic artificial intelligence, and in particular, to a method, an apparatus, and a storage medium for evaluating quality of geographic sample data.
Background
Currently, mainstream geographic artificial intelligence algorithms are mostly data-driven, and their key ingredient is training data, also called sample data. Because the feature learning process of artificial intelligence algorithms is generally robust, useful models can be built from sample data that contains some noise or errors. However, quality problems in the sample data, such as uneven data distribution and sample label errors, still reduce the performance of the artificial intelligence model to some extent and thereby affect the quality of the final output product. When making data-driven artificial intelligence decisions, it is important to obtain sample data that is verifiable and subject to good quality control. Sample data quality information helps potential data users decide whether and how to use the sample data, and supports error analysis of the predictions built on it. Providing sample data quality information may also improve the reliability of the sample data and so increase the chance that it is reused. Quality assessment and quality description therefore deserve special attention when geographic artificial intelligence sample data is shared.
However, existing geographic artificial intelligence sample data quality evaluation systems suffer from low completeness, poor applicability and weak scalability. A more reliable geographic artificial intelligence sample quality evaluation method is therefore needed, one that addresses the sample data quality requirements of different application levels such as the pixel level, the object level and the scene level.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention aims to provide a geographic sample data quality evaluation method, equipment and a storage medium, which can meet the quality evaluation requirement of multi-application-level geographic artificial intelligence sample data and provide systematic reference for the quality evaluation of the geographic artificial intelligence sample data, thereby helping to improve the reliability of the sample data.
In order to achieve the above object, the present invention provides a geographic sample data quality evaluation method, comprising the steps of:
S1, analyzing quality characteristics of geographic artificial intelligence sample data of a plurality of application levels, and establishing a sample data quality index system of the plurality of application levels;
S2, determining the characteristics and the quality specifications of a geographic artificial intelligence sample data set for quality evaluation;
S3, determining a quality evaluation specification of the geographic artificial intelligence sample data set for quality evaluation;
S4, performing quality evaluation of the geographic artificial intelligence sample data to obtain a quality evaluation result;
S5, generating a geographic artificial intelligence sample data quality evaluation report based on the quality evaluation result obtained in the step S4;
wherein the multi-application hierarchy includes at least a scene level, an object level, and a pixel level.
According to one aspect of the present invention, the method further comprises:
step S6, monitoring the quality evaluation process from the step S2 to the step S5 to generate feedback information;
step S7, using the feedback information to improve the algorithms, models, tools and protocols of the quality evaluation and the evaluation priority of each quality index.
According to one aspect of the present invention, in the step S1, the quality index system includes six quality dimensions of integrity, logical consistency, topic accuracy, location accuracy, time quality, and availability.
According to one aspect of the invention, the geographic artificial intelligence sample data quality evaluation report comprises at least: geographic artificial intelligence sample data set description information, quality evaluation metadata information, quality evaluation method information, quality evaluation result information, and information on processes and methods that help in understanding and using the quality information.
According to one aspect of the invention, in said step S1,
the scene-level geographic artificial intelligence sample data quality index system comprises:
a scene category labeling omission index and a scene category labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index and an image band information consistency index of the logical consistency dimension;
a scene labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index of the location accuracy dimension;
a sample time accuracy index and a sample time validity index of the time quality dimension;
a scene category balance index of the availability dimension;
the object-level geographic artificial intelligence sample data quality index system comprises:
an object category labeling omission index, an object category labeling redundancy index, an object position labeling omission index and an object position labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index, an image band information consistency index, an object position labeling format consistency index and an object position labeling topology consistency index of the logical consistency dimension;
an object labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index, an object labeling position offset index and an object labeling position overlap index of the location accuracy dimension;
a sample time accuracy index and a sample time validity index of the time quality dimension;
an object category balance index of the availability dimension;
the pixel-level geographic artificial intelligence sample data quality index system comprises:
a pixel category labeling omission index and a pixel category labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index, an image band information consistency index, an image-pair size information consistency index, an image-pair spatial information consistency index, a labeling image size consistency index and a labeling image format consistency index of the logical consistency dimension;
a pixel labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index of the location accuracy dimension;
a sample time accuracy index, a sample time validity index and an image time consistency index of the time quality dimension;
a pixel category balance index of the availability dimension.
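As an illustration only, an index system of this kind can be held in a nested lookup table keyed by application level and quality dimension. The Python sketch below covers the scene and object levels; the pixel level would follow the same pattern. The class name `Dimension`, the table `INDEX_SYSTEM`, the helper `indices_for` and all snake_case index keys are assumptions introduced for this sketch and are not defined by the invention.

```python
from enum import Enum


class Dimension(Enum):
    """The six quality dimensions of the quality index system."""
    INTEGRITY = "integrity"
    LOGICAL_CONSISTENCY = "logical consistency"
    TOPIC_ACCURACY = "topic accuracy"
    LOCATION_ACCURACY = "location accuracy"
    TIME_QUALITY = "time quality"
    AVAILABILITY = "availability"


# Indices shared by both levels shown here (key names are illustrative).
IMAGE_CONSISTENCY = [
    "image_size_info_consistency",
    "image_format_info_consistency",
    "image_band_info_consistency",
]
SAMPLE_TIME = ["sample_time_accuracy", "sample_time_validity"]

INDEX_SYSTEM = {
    "scene": {
        # "missing" = a label that should exist but is absent;
        # "redundant" = a label that should not be there.
        Dimension.INTEGRITY: [
            "scene_category_label_missing",
            "scene_category_label_redundant",
        ],
        Dimension.LOGICAL_CONSISTENCY: list(IMAGE_CONSISTENCY),
        Dimension.TOPIC_ACCURACY: [
            "scene_label_category_accuracy",
            "sample_attribute_accuracy",
        ],
        Dimension.LOCATION_ACCURACY: ["sample_spatial_position_accuracy"],
        Dimension.TIME_QUALITY: list(SAMPLE_TIME),
        Dimension.AVAILABILITY: ["scene_category_balance"],
    },
    "object": {
        Dimension.INTEGRITY: [
            "object_category_label_missing",
            "object_category_label_redundant",
            "object_position_label_missing",
            "object_position_label_redundant",
        ],
        Dimension.LOGICAL_CONSISTENCY: IMAGE_CONSISTENCY + [
            "object_position_label_format_consistency",
            "object_position_label_topology_consistency",
        ],
        Dimension.TOPIC_ACCURACY: [
            "object_label_category_accuracy",
            "sample_attribute_accuracy",
        ],
        Dimension.LOCATION_ACCURACY: [
            "sample_spatial_position_accuracy",
            "object_label_position_offset",
            "object_label_position_overlap",
        ],
        Dimension.TIME_QUALITY: list(SAMPLE_TIME),
        Dimension.AVAILABILITY: ["object_category_balance"],
    },
}


def indices_for(level, dimensions=None):
    """List the quality indices of a level, optionally for selected dimensions."""
    system = INDEX_SYSTEM[level]
    dims = list(Dimension) if dimensions is None else dimensions
    return [name for dim in dims for name in system[dim]]
```

Calling `indices_for("object", [Dimension.LOCATION_ACCURACY])` returns the three location accuracy indices of the object level, which is the kind of lookup a later selection step would need.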
According to an aspect of the present invention, the step S2 specifically includes:
step S21, detecting whether the metadata of the geographic artificial intelligence sample data set is complete and whether the sample data set can be successfully found and accessed through the metadata;
step S22, analyzing the specific application task type served by the sample data set through the metadata of the geographic artificial intelligence sample data set;
step S23, based on the specific application task type of the sample data set, judging whether the application level of the sample data set is the scene level, the object level or the pixel level, and judging whether the labels in the sample data set are scene labels, object labels or pixel labels;
step S24, determining and recording the quality evaluation purpose of the geographic artificial intelligence sample data set;
step S25, based on the quality evaluation purpose, determining the quality dimensions and the corresponding quality indexes of the sample data set to be evaluated, and prioritizing the quality indexes to be evaluated.
According to an aspect of the present invention, the step S3 specifically includes:
step S31, determining a quality evaluation unit based on the application level of the geographic artificial intelligence sample data set,
wherein the quality evaluation unit is one of: a region, an object or a pixel;
step S32, analyzing the geographic artificial intelligence sample data set by quality evaluation unit, and selecting a corresponding algorithm or model to evaluate the quality indexes corresponding to the sample data set;
step S33, describing and recording the attributes and quality dimensions of the quality indexes to be evaluated in the geographic artificial intelligence sample data set, the quality evaluation priority settings, and the selected evaluation methods or models together with the reasons for selecting them.
According to an aspect of the present invention, in the step S32, a full detection method is used for a sample data set having a smaller data amount, and a sampling detection method is used for a sample data set having a larger data amount.
According to an aspect of the present invention, the step S4 specifically includes:
step S41, collecting the sampled sample data units, or all sample data units, into a quality evaluation sample data subset;
step S42, determining the true value or reference data corresponding to each sampled sample data unit or to all sample data units, using the following sources: in-office manual interpretation, existing spatial data products, and data collected in the field;
step S43, if the sampling detection method is adopted in the step S32, estimating the relevant quality evaluation result of the overall sample data set based on the sampled sample data units and the reference data; if the full detection method is adopted in the step S32, obtaining the relevant quality evaluation result of the overall sample data set based on all the sample data units and the reference data.
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform a method for quality assessment of geographical sample data according to any one of the above claims.
According to an aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement a method for evaluating quality of geographical sample data according to any one of the above-mentioned aspects.
The invention provides a geographic sample data quality evaluation method, device and storage medium. First, the quality information requirements of geographic artificial intelligence sample data at different application levels are analyzed, and a geographic artificial intelligence sample data quality index system is established for pixel-level, object-level and scene-level data. Second, a standardized and scientific geographic artificial intelligence sample data quality evaluation flow is provided to support the sharing and interoperability of geographic artificial intelligence sample data. The invention can meet the quality evaluation requirements of pixel-level, object-level and scene-level geographic artificial intelligence sample data and provides a systematic reference for evaluating such data, thereby helping to improve the reliability of the sample data.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 schematically illustrates a flow chart of a method for geo-sample data quality assessment provided in accordance with one embodiment of the present invention;
FIG. 2 schematically illustrates an overall process diagram for geographic artificial intelligence sample data quality assessment in accordance with one embodiment of the invention;
FIG. 3 schematically illustrates a scene-level geographic artificial intelligence sample data quality index system in accordance with an embodiment of the invention;
FIG. 4 schematically illustrates an object-level geographic artificial intelligence sample data quality index system in accordance with one embodiment of the invention;
FIG. 5 schematically illustrates a pixel-level geographic artificial intelligence sample data quality index system in accordance with one embodiment of the invention;
FIG. 6 schematically illustrates a geographic artificial intelligence sample data quality assessment flow diagram in accordance with one embodiment of the invention.
Detailed Description
The description of the embodiments in this specification should be read together with the accompanying drawings, which form part of the complete description. In the drawings, shapes or thicknesses may be enlarged and shown in simplified form for convenience. Portions of the structures in the drawings are described separately, and elements not shown or described in the drawings take forms known to those of ordinary skill in the art.
Any references to directions and orientations in the description of the embodiments herein are for convenience only and should not be construed as limiting the scope of the invention in any way. The following description of the preferred embodiments will refer to combinations of features, which may be present alone or in combination, and the invention is not particularly limited to the preferred embodiments. The scope of the invention is defined by the claims.
As shown in fig. 1, 2 and 6, a geographic sample data quality evaluation method of the present invention includes the following steps:
S1, analyzing quality characteristics of geographic artificial intelligence sample data of a plurality of application levels, and establishing a sample data quality index system of the plurality of application levels;
S2, determining the characteristics and the quality specifications of a geographic artificial intelligence sample data set for quality evaluation;
S3, determining a quality evaluation specification of the geographic artificial intelligence sample data set for quality evaluation;
S4, performing quality evaluation of the geographic artificial intelligence sample data to obtain a quality evaluation result;
S5, generating a geographic artificial intelligence sample data quality evaluation report based on the quality evaluation result obtained in the step S4;
wherein the multi-application hierarchy includes at least a scene level, an object level, and a pixel level.
In this embodiment, quality evaluation is carried out on geographic artificial intelligence sample data of different application levels, including the pixel level, the object level and the scene level. The quality information requirements of the sample data at each application level are analyzed and a geographic artificial intelligence sample data quality index system is established. The method is oriented to the sharing and interoperability of geographic artificial intelligence sample data, improves the transparency and credibility of quality evaluation results, and provides a standardized and scientific quality evaluation flow. It can meet the quality evaluation requirements of geographic artificial intelligence sample data at multiple application levels and provides a systematic reference for evaluating such data, thereby helping to improve the reliability of the sample data.
Specifically, sample data quality index systems for the different application levels are first established by analyzing the quality characteristics of geographic artificial intelligence sample data at those levels. Next, the characteristics and quality specifications of the geographic artificial intelligence sample data set to be evaluated are determined according to the requirements; the purpose of this step is to determine, and rank, the quality dimensions and corresponding quality indexes that need to be evaluated. The quality evaluation specification for the data set is then determined; the purpose of this step is to select the specific algorithm or model used to evaluate each of those quality indexes. Finally, based on the step S2 and the step S3, the quality evaluation of the geographic artificial intelligence sample data is executed, a quality evaluation result is obtained and the corresponding quality evaluation report is generated, which completes the quality evaluation of multi-application-level geographic artificial intelligence sample data.
As shown in fig. 1, 2 and 6, according to one embodiment of the present invention, preferably, the method further includes:
step S6, monitoring the quality evaluation process from the step S2 to the step S5 to generate feedback information;
step S7, using the feedback information to improve the algorithms, models, tools and protocols of the quality evaluation and the evaluation priority of each quality index.
In this embodiment, the whole process is monitored while steps S2 to S5 are performed, feedback information is generated from the monitored process, and the steps are corrected and improved using that feedback so that the evaluation process is gradually refined. Modifying the algorithms, models, tools and protocols is an improvement to step S3, and adjusting the evaluation priority of each quality index is an improvement to step S2.
According to one aspect of the invention, in step S1, the quality index system includes six quality dimensions: integrity, logical consistency, topic accuracy, location accuracy, time quality, and availability.
In this embodiment, a quality index system with six quality dimensions (integrity, logical consistency, topic accuracy, location accuracy, time quality, and availability) is proposed for the multiple application levels, and the individual sample data quality indexes are then defined under these six dimensions so that the reliability of the sample data can be evaluated.
In one embodiment according to the present invention, preferably, the geographic artificial intelligence sample data quality evaluation report comprises at least: geographic artificial intelligence sample data set description information, quality evaluation metadata information, quality evaluation method information, quality evaluation result information, and information on processes and methods that help in understanding and using the quality information.
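A report with these five parts could, for example, be carried as a small record and serialized for sharing. The Python sketch below is a hedged illustration: the class name `QualityReport`, its field names, the JSON layout and every value in the usage example are assumptions made for this sketch; the invention does not prescribe a report format, and the numbers are placeholders rather than results.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class QualityReport:
    """One field per kind of information the report contains at least."""
    dataset_description: dict   # sample data set description information
    assessment_metadata: dict   # quality evaluation metadata information
    assessment_method: dict     # quality evaluation method information
    assessment_results: dict    # quality evaluation result information
    usage_guidance: str = ""    # help for understanding and using the quality information

    def to_json(self) -> str:
        """Serialize the report so it can be shared with the data set."""
        return json.dumps(asdict(self), indent=2, ensure_ascii=False)


# Hypothetical usage; all values are made-up placeholders.
report = QualityReport(
    dataset_description={"name": "example-scene-dataset", "application_level": "scene"},
    assessment_metadata={"evaluation_unit": "region"},
    assessment_method={"detection": "sampling", "sample_size": 200},
    assessment_results={"scene_label_category_accuracy": 0.97},
    usage_guidance="Accuracy was estimated from a random sample of regions.",
)
print(report.to_json())
```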
In one embodiment according to the present invention, preferably, in step S1,
as shown in fig. 3, the scene-level geographic artificial intelligence sample data quality index system includes:
a scene category labeling omission index and a scene category labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index and an image band information consistency index of the logical consistency dimension;
a scene labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index of the location accuracy dimension;
a sample time accuracy index and a sample time validity index of the time quality dimension;
a scene category balance index of the availability dimension;
as shown in fig. 4, the object-level geographic artificial intelligence sample data quality index system includes:
an object category labeling omission index, an object category labeling redundancy index, an object position labeling omission index and an object position labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index, an image band information consistency index, an object position labeling format consistency index and an object position labeling topology consistency index of the logical consistency dimension;
an object labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index, an object labeling position offset index and an object labeling position overlap index of the location accuracy dimension;
a sample time accuracy index and a sample time validity index of the time quality dimension;
an object category balance index of the availability dimension;
as shown in fig. 5, the pixel-level geographic artificial intelligence sample data quality index system comprises:
a pixel category labeling omission index and a pixel category labeling redundancy index of the integrity dimension;
an image size information consistency index, an image format information consistency index, an image band information consistency index, an image-pair size information consistency index, an image-pair spatial information consistency index, a labeling image size consistency index and a labeling image format consistency index of the logical consistency dimension;
a pixel labeling category accuracy index and a sample attribute accuracy index of the topic accuracy dimension;
a sample spatial position accuracy index of the location accuracy dimension;
a sample time accuracy index, a sample time validity index and an image time consistency index of the time quality dimension;
a pixel category balance index of the availability dimension.
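The text names the quality indexes but does not give formulas for them. To make one of them concrete, the sketch below computes a category balance score as normalized Shannon entropy, where 1.0 means all categories are equally represented and 0.0 means a single category holds every sample. This formula is an assumption chosen for illustration, not the definition used by the invention, and the function name `class_balance` and its parameters are likewise hypothetical.

```python
import math
from collections import Counter


def class_balance(labels, classes):
    """Normalized Shannon entropy of the label distribution, in [0, 1].

    labels:  the category label of every sample (scene, object or pixel).
    classes: the full label schema, so that categories with no samples
             still lower the score.
    """
    if len(classes) < 2:
        raise ValueError("balance needs at least two categories")
    if not labels:
        raise ValueError("no labels to evaluate")
    counts = Counter(labels)
    unknown = set(counts) - set(classes)
    if unknown:
        raise ValueError(f"labels outside the schema: {sorted(unknown)}")
    total = len(labels)
    probs = [counts[c] / total for c in classes]
    # Categories with zero samples contribute nothing to the entropy.
    entropy = sum(p * math.log(1 / p) for p in probs if p > 0)
    return entropy / math.log(len(classes))
```

For example, `class_balance(["water", "forest", "water", "forest"], ["water", "forest"])` gives the maximum score because both categories have the same number of samples, while a data set containing only `"water"` labels scores zero under the same schema.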
As shown in fig. 6, in one embodiment of the present invention, preferably, in step S2, the method specifically includes:
step S21, detecting whether the metadata of the geographic artificial intelligence sample data set is complete and whether the sample data set can be successfully found and accessed through the metadata;
step S22, analyzing the specific application task type served by the sample data set through the metadata of the geographic artificial intelligence sample data set;
step S23, based on the specific application task type of the sample data set, judging whether the application level of the sample data set is the scene level, the object level or the pixel level, and judging whether the labels in the sample data set are scene labels, object labels or pixel labels;
step S24, determining and recording the quality evaluation purpose of the geographic artificial intelligence sample data set;
step S25, based on the quality evaluation purpose, determining the quality dimensions and the corresponding quality indexes of the sample data set to be evaluated, and prioritizing the quality indexes to be evaluated.
In this embodiment, the completeness of the sample data set's metadata is checked first, together with whether the sample data set can be successfully found and accessed through the metadata; if the metadata is incomplete or the data set cannot be accessed, the metadata information needs to be supplemented and published. The specific application task type served by the sample data set is then analyzed through the metadata. The task types include scene classification (corresponding to the scene level), target detection (corresponding to the object level) and semantic segmentation (corresponding to the pixel level). The application level of the sample data set is judged from the task type, which in turn determines whether the labels in the sample data set are scene labels, object labels or pixel labels. Finally, based on the quality evaluation purpose, the quality dimensions and corresponding quality indexes to be evaluated are determined and the quality indexes are prioritized.
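Steps S21 to S25 can be sketched as a few small functions. In the Python sketch below, the task-to-level mapping follows the paragraph above, while the required metadata fields, the `task_type` key, the function names and the numeric priority scheme are all assumptions introduced for illustration.

```python
# Assumed minimal metadata schema; the text does not list required fields.
REQUIRED_FIELDS = ("name", "task_type", "access_url")

TASK_TO_LEVEL = {
    "scene classification": "scene",
    "target detection": "object",
    "semantic segmentation": "pixel",
}
LEVEL_TO_LABEL_TYPE = {
    "scene": "scene label",
    "object": "object label",
    "pixel": "pixel label",
}


def missing_fields(metadata):
    """S21: report metadata fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not metadata.get(f)]


def characterize(metadata):
    """S22 and S23: derive the application level and label type from the task type."""
    task = metadata.get("task_type")
    if task not in TASK_TO_LEVEL:
        raise ValueError(f"unsupported or missing task type: {task!r}")
    level = TASK_TO_LEVEL[task]
    return level, LEVEL_TO_LABEL_TYPE[level]


def prioritize(indices, priority):
    """S25: order indices by priority (1 = first); unranked indices go last."""
    return sorted(indices, key=lambda name: priority.get(name, float("inf")))
```

Because `sorted` is stable, indices without an explicit priority keep their original relative order at the end of the list.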
As shown in fig. 6, in one embodiment of the present invention, preferably, in step S3, the method specifically includes:
step S31, determining a quality evaluation unit based on the application level of the geographic artificial intelligence sample data set,
wherein the quality evaluation unit is one of: a region (corresponding to the scene level), an object (corresponding to the object level) or a pixel (corresponding to the pixel level);
step S32, analyzing the geographic artificial intelligence sample data set by quality evaluation unit, and selecting a corresponding algorithm or model to evaluate the quality indexes corresponding to the sample data set;
step S33, describing and recording the attributes and quality dimensions of the quality indexes to be evaluated in the geographic artificial intelligence sample data set, the quality evaluation priority settings, and the selected evaluation methods or models together with the reasons for selecting them, so as to ensure that the quality evaluation flow of the sample data set is reproducible.
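One way to picture steps S31 to S33 is as the construction of a written evaluation specification that someone else could rerun. The Python sketch below assumes a simple record per quality index; the names `IndexSpec`, `LEVEL_TO_UNIT` and `build_spec`, and the shape of the returned dictionary, are hypothetical and only illustrate what would need to be recorded.

```python
from dataclasses import dataclass

# S31: the evaluation unit follows from the application level.
LEVEL_TO_UNIT = {"scene": "region", "object": "object", "pixel": "pixel"}


@dataclass(frozen=True)
class IndexSpec:
    """S33: what is recorded for each quality index to be evaluated."""
    index_name: str
    dimension: str
    priority: int    # 1 = evaluated first
    method: str      # S32: the algorithm or model chosen for this index
    rationale: str   # why that method was chosen


def build_spec(level, entries):
    """Assemble the evaluation specification for one sample data set."""
    if level not in LEVEL_TO_UNIT:
        raise ValueError(f"unknown application level: {level!r}")
    specs = sorted((IndexSpec(**e) for e in entries), key=lambda s: s.priority)
    return {"evaluation_unit": LEVEL_TO_UNIT[level], "indices": specs}
```

Keeping the rationale next to the method is the part that serves reproducibility: a later evaluator can see both what was run and why it was considered appropriate.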
In one embodiment of the present invention, in step S32, it is preferable to use a full detection method for a sample data set with a smaller data amount and to use a sampling detection method for a sample data set with a larger data amount.
In this embodiment, a data volume threshold is typically defined for the data set before the artificial intelligence sample data evaluation is performed. When the data volume of the data set is greater than the threshold, the data volume is considered large and a sampling detection method is used for the sample data set, including but not limited to probabilistic and non-probabilistic sampling methods. When the data volume is smaller than or equal to the threshold, the data volume is considered small and the full detection method is used for the sample data set. This keeps the sample data quality evaluation efficient.
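The threshold rule can be sketched in a few lines of Python. Simple random sampling without replacement is used here as one example of a probabilistic sampling method; the function name `select_units`, its parameters and the fixed random seed are assumptions for this sketch, and the text does not specify threshold values or sample sizes.

```python
import random


def select_units(units, threshold, sample_size, seed=0):
    """Choose full detection or sampling detection from the data volume.

    Returns the method name and the units to be evaluated.
    """
    units = list(units)
    if len(units) <= threshold:
        # Small data set: evaluate every unit.
        return "full", units
    # Large data set: simple random sampling without replacement.
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    return "sampling", rng.sample(units, min(sample_size, len(units)))
```

Stratified or systematic sampling could be substituted inside the second branch without changing the threshold logic.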
As shown in fig. 6, in one embodiment of the present invention, preferably, step S4 specifically includes:
step S41, acquiring the sampled sample data units, or all sample data units, to form a quality evaluation sample data subset;
step S42, determining the true value or reference data corresponding to each sampled sample data unit, or to all sample data units, through approaches including: indoor manual interpretation, existing spatial data products, and data collected in the field;
step S43, if the sampling detection method is adopted in step S32, estimating the relevant quality evaluation result of the overall sample data set based on the sampled sample data units and the reference data; if the full detection method is adopted in step S32, obtaining the relevant quality evaluation result of the overall sample data set based on all the sample data units and the reference data.
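As an illustration of step S43, the sketch below evaluates one quality index, the share of labels that agree with the reference data. The function name, the dictionary inputs, and the normal-approximation margin with a finite population correction are assumptions chosen for the example; the text does not specify how the overall result is estimated from a sample.

```python
import math


def assess_label_accuracy(labels, reference, population_size=None, z=1.96):
    """Compare labels with reference data for the inspected units.

    labels, reference -- dicts mapping unit id -> category for the inspected units
    population_size   -- size of the overall data set; None means all units were inspected
    z                 -- normal quantile for the margin (1.96 is roughly 95 %)
    """
    if not labels:
        raise ValueError("no inspected units")
    if labels.keys() != reference.keys():
        raise ValueError("labels and reference must cover the same units")
    n = len(labels)
    correct = sum(labels[u] == reference[u] for u in labels)
    p = correct / n
    if population_size is None or population_size == n:
        # Full detection: the result is measured directly, not estimated.
        return {"method": "full", "accuracy": p, "margin": 0.0}
    if population_size < n:
        raise ValueError("population_size is smaller than the number of inspected units")
    # Sampling detection: estimate the overall accuracy with an approximate margin.
    fpc = (population_size - n) / (population_size - 1)
    standard_error = math.sqrt(p * (1 - p) / n * fpc)
    return {"method": "sampling", "accuracy": p, "margin": z * standard_error}
```

The normal approximation is rough for small samples or accuracies near 0 or 1; an exact or Wilson interval could be substituted without changing the overall flow.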
According to an aspect of the present invention, there is provided an electronic apparatus including: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform a method for quality assessment of geographical sample data according to any one of the above solutions.
According to an aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement a method for evaluating quality of geographical sample data according to any one of the above technical solutions.
The invention discloses a geographic sample data quality evaluation method, device and storage medium. The geographic sample data quality evaluation method comprises the following steps:
step S1, analyzing quality characteristics of geographic artificial intelligence sample data of a plurality of application levels, and establishing a sample data quality index system for the plurality of application levels;
step S2, determining the characteristics and the quality specifications of a geographic artificial intelligence sample data set for quality evaluation;
step S3, determining a quality evaluation specification of the geographic artificial intelligence sample data set for quality evaluation;
step S4, performing quality evaluation of the geographic artificial intelligence sample data to obtain a quality evaluation result;
step S5, generating a geographic artificial intelligence sample data quality evaluation report based on the quality evaluation result obtained in step S4.
First, the invention analyzes the data quality information requirements of geographic artificial intelligence samples at different application levels and establishes a geographic artificial intelligence sample data quality index system for pixel-level, object-level and scene-level data. Second, it provides a standardized and scientific quality evaluation flow for geographic artificial intelligence sample data, which supports the sharing and interoperation of such data. The flow can meet the quality evaluation requirements of sample data at the pixel, object and scene application levels and provides a systematic reference for quality evaluation of geographic artificial intelligence sample data, thereby improving the reliability of the sample data.
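How the outputs of steps S1 to S5 feed one another can be shown with a small Python sketch. It covers only one quality dimension, and the names `INDEX_SYSTEM`, `category_balance` and `run_assessment`, the input dictionary layout, and the balance measure itself (smallest class count divided by largest) are hypothetical; the text names the balance indexes but does not give a formula for them.

```python
from collections import Counter

# S1: excerpt of a quality index system, keyed by application level and quality dimension.
INDEX_SYSTEM = {
    "scene": {"availability": ["scene category balance index"]},
    "object": {"availability": ["object class balance index"]},
    "pixel": {"availability": ["pixel class balance index"]},
}


def category_balance(labels):
    """Hypothetical balance measure: smallest class count / largest class count.

    A value of 1.0 means all classes are equally represented.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels to evaluate")
    return min(counts.values()) / max(counts.values())


def run_assessment(dataset):
    """Run a reduced S2-S5 flow on a dict with 'name', 'application_level' and 'labels'."""
    level = dataset["application_level"]             # S2: characterize the data set
    indexes = INDEX_SYSTEM[level]["availability"]    # S3: select the indexes to evaluate
    results = {name: category_balance(dataset["labels"]) for name in indexes}  # S4
    return {                                         # S5: assemble the report
        "dataset": dataset["name"],
        "application_level": level,
        "results": results,
    }
```

A scene-level data set with labels `["forest", "forest", "water", "urban"]` would receive a balance value of 0.5 under this assumed measure, because the rarest class appears half as often as the most common one.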
Furthermore, it should be noted that the present invention can be provided as a method, an apparatus, or a computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the invention may take the form of a computer program product on one or more computer-usable storage media having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device comprising the element.
Finally, it should be noted that the above describes preferred embodiments of the invention. Although preferred embodiments have been described, those skilled in the art, once they understand the basic inventive concept, can make several modifications and adaptations without departing from the principles of the invention, and these modifications and adaptations are intended to fall within the scope of the invention. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.

Claims (9)

1. A method for evaluating the quality of geographic sample data, comprising the steps of:
step S1, analyzing quality characteristics of geographic artificial intelligence sample data of a plurality of application levels, and establishing a sample data quality index system for the plurality of application levels;
step S2, determining the characteristics and the quality specifications of a geographic artificial intelligence sample data set for quality evaluation;
step S3, determining a quality evaluation specification of the geographic artificial intelligence sample data set for quality evaluation;
step S4, performing quality evaluation of the geographic artificial intelligence sample data to obtain a quality evaluation result;
step S5, generating a geographic artificial intelligence sample data quality evaluation report based on the quality evaluation result obtained in step S4;
wherein the plurality of application levels comprises at least a scene level, an object level, and a pixel level;
the scene-level geographic artificial intelligence sample data quality index system comprises:
scene category labeling deletion indexes and scene category labeling redundancy indexes of the integrity dimension;
image size information consistency index, image format information consistency index and image band information consistency index of the logic consistency dimension;
scene annotation category precision index and sample attribute precision index of the topic accuracy dimension;
a sample spatial position accuracy index of a position accuracy dimension;
sample time accuracy index and sample time validity index of time quality dimension;
a scene category balance index for the availability dimension;
the object-level geographic artificial intelligence sample data quality index system comprises:
an object category labeling deletion index, an object category labeling redundancy index, an object position labeling deletion index and an object position labeling redundancy index of the integrity dimension;
image size information consistency index, image format information consistency index, image band information consistency index, object position labeling format consistency index and object position labeling topology consistency index of the logical consistency dimension;
object labeling category precision index and sample attribute precision index of the topic accuracy dimension;
sample spatial position accuracy index, object labeling position offset index and object labeling position overlapping index of the position accuracy dimension;
sample time accuracy index and sample time validity index of time quality dimension;
object class balance index of availability dimension;
the pixel-level geographic artificial intelligence sample data quality index system comprises:
pixel class labeling deletion indexes and pixel class labeling redundancy indexes of the integrity dimension;
image size information consistency index, image format information consistency index, image band information consistency index, image-pair size information consistency index, image-pair spatial information consistency index, labeling image size consistency index and labeling image format consistency index of the logical consistency dimension;
pixel labeling category precision index and sample attribute precision index of the topic accuracy dimension;
sample spatial position accuracy index of the position accuracy dimension;
sample time accuracy index, sample time validity index and image time consistency index of the time quality dimension;
pixel class balance index of the availability dimension.
2. The geographical sample data quality assessment method of claim 1, further comprising:
step S6, monitoring the quality evaluation process from the step S2 to the step S5 to generate feedback information;
and step S7, using the feedback information to improve the algorithms, models, tools and protocols of the quality evaluation and the evaluation priority of the various quality indexes.
3. The method according to claim 1, wherein in the step S1, the quality index system includes six quality dimensions of integrity, logical consistency, topic accuracy, location accuracy, time quality, and availability.
4. The geographical sample data quality assessment method of claim 1, wherein the geographic artificial intelligence sample data quality assessment report comprises at least: geographic artificial intelligence sample data set description information, quality assessment metadata information, quality assessment method information, quality assessment result information, and information on processes and methods that aids in understanding and using the quality information.
5. The geographical sample data quality assessment method according to claim 3, wherein step S2 specifically comprises:
step S21, detecting whether metadata of the geographic artificial intelligence sample data set is complete and whether the sample data set can be successfully found and accessed through the metadata;
step S22, analyzing the specific application task type served by the sample data set through the metadata of the geographic artificial intelligence sample data set;
step S23, based on the specific application task type of the sample data set, judging whether the application level of the sample data set is the scene level, the object level or the pixel level, and judging whether the type of label in the sample data set is a scene label, an object label or a pixel label;
step S24, determining and recording the quality evaluation purpose of the geographic artificial intelligence sample data set;
step S25, based on the quality evaluation purpose, determining the quality dimension and the corresponding quality index of the sample data set to be evaluated, and prioritizing the quality index to be evaluated.
6. The method for evaluating the quality of geographic sample data according to claim 5, wherein step S3 specifically comprises:
step S31, determining a quality assessment unit based on an application level of the geographical artificial intelligence sample data set,
wherein a region, an object or a pixel is taken as the quality assessment unit;
step S32, analyzing the geographic artificial intelligence sample data set at the level of the quality assessment unit, and selecting a corresponding algorithm or model to evaluate each quality index corresponding to the sample data set;
step S33, describing and recording the attribute and quality dimension of the quality index to be evaluated in the geographic artificial intelligence sample dataset, the setting of the quality evaluation priority, the selected evaluation method or model and the corresponding reasons;
in the step S32, a full detection method is used for a sample data set having a small data amount, and a sampling detection method is used for a sample data set having a large data amount.
7. The method for evaluating the quality of geographic sample data according to claim 6, wherein step S4 specifically comprises:
step S41, acquiring the sampled sample data units, or all sample data units, to form a quality evaluation sample data subset;
step S42, determining the true value or reference data corresponding to each sampled sample data unit, or to all sample data units, through approaches including: indoor manual interpretation, existing spatial data products, and data collected in the field;
step S43, if the sampling detection method is adopted in step S32, estimating the relevant quality evaluation result of the overall sample data set based on the sampled sample data units and the reference data; if the full detection method is adopted in step S32, obtaining the relevant quality evaluation result of the overall sample data set based on all the sample data units and the reference data.
8. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein the processor is connected to the memory, the one or more computer programs being stored in the memory, which processor, when the electronic device is running, executes the one or more computer programs stored in the memory to cause the electronic device to perform the geographical sample data quality assessment method of any one of claims 1-7.
9. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the geographical sample data quality assessment method of any one of claims 1 to 7.
CN202310421521.3A 2023-04-18 2023-04-18 Geographic sample data quality evaluation method, device and storage medium Active CN116450632B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310421521.3A CN116450632B (en) 2023-04-18 2023-04-18 Geographic sample data quality evaluation method, device and storage medium

Publications (2)

Publication Number Publication Date
CN116450632A CN116450632A (en) 2023-07-18
CN116450632B true CN116450632B (en) 2023-12-19

Family

ID=87119927

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310421521.3A Active CN116450632B (en) 2023-04-18 2023-04-18 Geographic sample data quality evaluation method, device and storage medium

Country Status (1)

Country Link
CN (1) CN116450632B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2014201362A1 (en) * 2013-12-24 2015-07-09 Qfire Software Pty Ltd A Data Management System
CN107730115A (en) * 2017-10-17 2018-02-23 云南大学 A kind of method for evaluating quality of the multi-source location track data based on AHP
CN107766471A (en) * 2017-09-27 2018-03-06 中国农业大学 The organization and management method and device of a kind of multi-source data
CN111339215A (en) * 2019-05-31 2020-06-26 北京东方融信达软件技术有限公司 Structured data set quality evaluation model generation method, evaluation method and device
CN113485988A (en) * 2021-06-30 2021-10-08 东莞市小精灵教育软件有限公司 Data quality monitoring method and device and computer readable storage medium
CN114064618A (en) * 2020-07-31 2022-02-18 中国电信股份有限公司 Data quality evaluation method and system
CN114625820A (en) * 2022-02-16 2022-06-14 武汉大学 Sample library system and organization method for artificial intelligence remote sensing image interpretation
CN114926008A (en) * 2022-05-16 2022-08-19 天元大数据信用管理有限公司 Data quality evaluation method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11816077B2 (en) * 2021-03-02 2023-11-14 Saudi Arabian Oil Company Measuring data quality in a structured database through SQL

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
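Research on the design of GIS geospatial metadata for Dajinhu National Geopark; Shi Beiqi et al.; Geomatics & Spatial Information Technology; Vol. 27 (No. 04); 28-31, 48 *
Research on the evaluation framework, indicators and methods for open government data; Zheng Lei et al.; Library and Information Service; Vol. 60 (No. 18); 43-55 *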

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant