CN109299062A - A quality evaluation method and system for document-category digital resource metadata - Google Patents
- Publication number
- CN109299062A CN201810707861.1A CN201810707861A
- Authority
- CN
- China
- Prior art keywords
- metadata
- evaluation index
- digital resource
- data
- field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a quality evaluation method and system for document-category digital resource metadata. The method comprises: S1, constructing a quality evaluation index system for the metadata in a target document-category digital resource according to the self attributes of that resource; S2, performing each verification on each piece of metadata according to each evaluation index in the quality evaluation index system; S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index. The present invention thereby realizes quality evaluation of document-category digital resource metadata, with high evaluation precision.
Description
Technical field
The invention belongs to the technical field of library science, and more particularly relates to a quality evaluation method and system for document-category digital resource metadata.
Background technique
As science and technology continue to advance and global informatization continues to progress, the number and volume of document-category data resources are growing at an unprecedented rate. Metadata is the important data that describes these data resources, so how to check and evaluate the quality of resource metadata comprehensively and systematically bears directly on the subsequent use of the data.
At present, the quality evaluation of document-category digital resource metadata lacks a relatively complete, comprehensive, flexible and practicable quality evaluation method. Most existing evaluation methods are expounded only at the theoretical level and introduced only in terms of evaluation dimensions. They provide no specific evaluation rules for file, record and field attributes, and therefore offer little substantive guidance for putting the quality evaluation of document-category digital resource metadata into practice.
Summary of the invention
To overcome the problem that existing quality evaluation methods for document-category digital resource metadata are expounded only in theory and cannot be put into practice, or to at least partially solve that problem, the present invention provides a quality evaluation method and system for document-category digital resource metadata.
According to a first aspect of the invention, a quality evaluation method for document-category digital resource metadata is provided, comprising:
S1, constructing a quality evaluation index system for the metadata in a target document-category digital resource according to the self attributes of the target document-category digital resource;
S2, performing each verification on each piece of metadata according to each evaluation index in the quality evaluation index system;
S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index.
Specifically, the quality evaluation index system includes one or more of the following evaluation indices: integrity, correctness, consistency, uniqueness and timeliness;
Correspondingly, step S2 specifically includes:
According to the integrity evaluation index, verifying one or more of: whether a data entity in the metadata is missing, whether a data file is missing, whether a data record is missing, whether the data structure is missing, and whether field content in a record is missing;
According to the correctness evaluation index, verifying one or more of: the legitimacy of the metadata, its validity, whether garbled text is present, and whether unified-value substitution is present;
According to the consistency evaluation index, verifying the data logic consistency and/or the content format consistency of the metadata;
According to the uniqueness evaluation index, verifying the data record uniqueness and/or the key attribute value uniqueness of the metadata;
According to the timeliness evaluation index, verifying the data content novelty and/or the link address validity of the metadata.
Specifically, between steps S1 and S3 the method further includes:
Classifying the metadata corresponding to the self attributes according to the level to which the self attributes belong;
Correspondingly, step S2 further includes:
Verifying each class of metadata according to the evaluation indices corresponding to that class;
Wherein each class of metadata is stored in advance in association with its evaluation indices.
Specifically, the step of classifying the metadata corresponding to the self attributes according to the level to which the self attributes belong specifically includes:
According to the level to which the self attributes belong, dividing the metadata corresponding to the self attributes into one or more of file-level metadata, record-level metadata and field-level metadata.
Specifically, the step of verifying each class of metadata according to the evaluation indices corresponding to that class specifically includes:
According to the correctness, integrity and timeliness evaluation indices, verifying one or more of the following for the file-level metadata: file directory legitimacy, file naming legitimacy, file quantity integrity and file arrival timeliness;
According to the correctness, integrity and timeliness evaluation indices, verifying one or more of the following for the record-level metadata: document classification legitimacy, file location legitimacy, file naming legitimacy, file quantity integrity and file generation timeliness;
According to the integrity, correctness, consistency, uniqueness and timeliness evaluation indices, verifying one or more of the following for the field-level metadata: record integrity, field integrity, field type legitimacy, field length legitimacy, field format legitimacy, field business legitimacy, field timeliness, field accuracy, whether a field contains garbled text, whether a field shows unified-value substitution, data logic consistency, content format consistency, record uniqueness, key attribute value uniqueness, data content novelty and link address validity.
Specifically, step S3 further includes:
Calculating the score of each class of metadata according to the score corresponding to each verification result and the weight corresponding to each evaluation index.
Specifically, step S3 specifically includes:
For any verification result, multiplying the score corresponding to that verification result by the weight of the evaluation index corresponding to that verification result;
Adding together the products obtained for all verification results to obtain the total score of the metadata.
According to a second aspect of the invention, a quality evaluation system for document-category digital resource metadata is provided, comprising:
A construction module, configured to construct a quality evaluation index system for the metadata in a target document-category digital resource according to the self attributes of the target document-category digital resource;
A verification module, configured to perform each verification on each piece of metadata according to each evaluation index in the quality evaluation index system;
A calculation module, configured to calculate the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index.
According to a third aspect of the invention, a quality evaluation device for document-category digital resource metadata is provided, comprising:
At least one processor, at least one memory, and a bus; wherein
The processor and the memory communicate with each other through the bus;
The memory stores program instructions executable by the processor, and the processor calls the program instructions to perform the method described above.
According to a fourth aspect of the invention, a non-transitory computer-readable storage medium is provided for storing a computer program of the method described above.
The present invention provides a quality evaluation method and system for document-category digital resource metadata. The method constructs a quality evaluation index system according to the self attributes of the target document-category digital resource, performs each verification on each piece of metadata according to each evaluation index in that index system, and calculates the total score of the metadata from the verification results, thereby realizing quality evaluation of document-category digital resource metadata with high evaluation precision.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall flow of the quality evaluation method for document-category digital resource metadata provided in an embodiment of the present invention;
Fig. 2 is a schematic diagram of the overall structure of the quality evaluation system for document-category digital resource metadata provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of the overall structure of the quality evaluation device for document-category digital resource metadata provided in an embodiment of the present invention.
Specific embodiment
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following embodiments illustrate the present invention and are not intended to limit its scope.
One embodiment of the present invention provides a quality evaluation method for document-category digital resource metadata. Fig. 1 is a schematic diagram of the overall flow of the quality evaluation method for document-category digital resource metadata provided in an embodiment of the present invention. The method comprises: S1, constructing a quality evaluation index system for the metadata in a target document-category digital resource according to the self attributes of the target document-category digital resource;
Here, the target document-category digital resource is the document-category digital resource whose quality is to be evaluated. The self attributes of the target document-category digital resource include its directory, naming, quantity, classification, location and so on. Metadata, also called intermediary data or relay data, is data that describes data. It mainly carries information about data attributes and supports functions such as indicating storage location, historical data, resource lookup and file records. The quality evaluation index system includes multiple evaluation indices used for quality evaluation.
S2, performing each verification on each piece of metadata according to each evaluation index in the quality evaluation index system;
Here, verification means judging whether each piece of metadata in the target document-category digital resource meets each evaluation index. Verification relies mainly on computer checks, supplemented by manual checks, to obtain the inspection results and output an inspection report. A computer program verifies most of the metadata against the evaluation indices to obtain initial results; a small number of checks are performed manually, and the verification results are then summarized. Each verification is performed on each piece of metadata according to each evaluation index in the quality evaluation index system. The inspection results can serve as a reference for improving the quality of the target document-category digital resource; for example, problems in the metadata of the target document-category digital resource can be traced back from the inspection results, so that the various problems in the metadata are found more quickly.
S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index.
Here, the score corresponding to each verification result is stored in advance in association with that result, and the weight of the evaluation index corresponding to each verification result is likewise stored in advance in association with it. The total score of the metadata is calculated from the score corresponding to each verification result and the weight of each evaluation index. From the total score, a user can quickly tell good-quality metadata from poor-quality metadata.
This embodiment constructs a quality evaluation index system according to the self attributes of the target document-category digital resource, performs each verification on each piece of metadata according to each evaluation index in that index system, and calculates the total score of the metadata from the verification results, thereby realizing quality evaluation of document-category digital resource metadata with high evaluation precision.
On the basis of the above embodiment, the quality evaluation index system in this embodiment includes one or more of the following evaluation indices: integrity, correctness, consistency, uniqueness and timeliness. Correspondingly, step S2 specifically includes: according to the integrity evaluation index, verifying one or more of whether a data entity in the metadata is missing, whether a data file is missing, whether a data record is missing, whether the data structure is missing, and whether field content in a record is missing; according to the correctness evaluation index, verifying one or more of the legitimacy of the metadata, its validity, whether garbled text is present, and whether unified-value substitution is present; according to the consistency evaluation index, verifying the data logic consistency and/or the content format consistency of the metadata; according to the uniqueness evaluation index, verifying the data record uniqueness and/or the key attribute value uniqueness of the metadata; according to the timeliness evaluation index, verifying the data content novelty and/or the link address validity of the metadata.
Specifically, the quality evaluation index system for document-category data resource metadata includes one or more of the following evaluation indices: integrity, correctness, consistency, uniqueness and timeliness. The integrity evaluation index is the most basic guarantee of metadata quality and covers one or more of the following: whether a data entity is missing, for example the existing data contains no data of a given type; whether a data file is missing, for example a file cannot be obtained or is damaged; whether a data record is missing, for example the number of data records is lower than expected; whether the data structure is missing, for example a field attribute of the data records is absent; and whether field content in a record is missing, for example the field content is null.
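Three of the integrity checks listed above (data structure missing, field content missing, data records missing) can be sketched as small functions. This is a minimal sketch assuming each record is a flat Python dict; the function names and the idea of a caller-supplied list of required fields are illustrative assumptions, not part of the source.

```python
from typing import Iterable, List


def missing_structure(record: dict, required_fields: Iterable[str]) -> List[str]:
    """Required fields that are absent from the record (data structure missing)."""
    return [f for f in required_fields if f not in record]


def empty_content(record: dict) -> List[str]:
    """Fields that are present but whose content is null or blank (field content missing)."""
    return [k for k, v in record.items() if v is None or str(v).strip() == ""]


def records_missing(actual_count: int, expected_count: int) -> bool:
    """True when the number of records is lower than expected (data records missing)."""
    return actual_count < expected_count
```

Checks for a missing data entity or a missing or damaged data file depend on how the resource is stored, so they are left out of this sketch.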
The correctness evaluation index describes the degree to which the data agrees with the real object, that is, whether the field content in a record is abnormal or wrong. The correctness evaluation index covers one or more of the legitimacy of the metadata, its validity, whether garbled text is present, and whether unified-value substitution is present. The legitimacy of the metadata describes whether field content meets data type requirements, length requirements, format requirements or other business requirements, for example whether a field type is legal, whether a date format is legal, or whether a field length is legal. For example, a Chinese postal code must be a six-digit integer and cannot contain letters, and an e-mail address must contain the character "@". The validity of the metadata describes whether the data lies within a reasonable value range or satisfies user-defined conditions. For example, an age is generally an integer between 1 and 120; a time value is generally later than 1900 CE and must respect the logical order of related events; and the content of an enumeration-type field cannot take a value outside the enumerated range of that field. Whether garbled text is present describes whether the content of character-type fields contains garbled characters. Whether unified-value substitution is present describes whether field content has been batch-filled by the system with one identical value; this mainly concerns character-type field content, for example the "author" field of every record is filled in as "*", or the "author affiliation" field of every record is filled in as "None". Other kinds of metadata error are also covered. When the correctness evaluation index of the metadata is verified, erroneous values that are not obviously abnormal are difficult to find, so machine checking generally needs to be combined with manual verification.
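The examples in the paragraph above translate directly into small checks. The following sketch uses only the rules the text states (six-digit postal code, "@" in an e-mail address, age between 1 and 120, one identical value in every record). The function names are hypothetical, the "@" test is only a necessary condition rather than full e-mail validation, and looking for the Unicode replacement character is just one assumed heuristic for spotting garbled text.

```python
import re
from typing import List, Optional

# Six ASCII digits and nothing else.
POSTCODE_RE = re.compile(r"[0-9]{6}")


def postcode_is_legal(value: str) -> bool:
    """Legitimacy: a Chinese postal code is exactly six digits."""
    return POSTCODE_RE.fullmatch(value) is not None


def email_is_legal(value: str) -> bool:
    """Legitimacy: an e-mail address must contain "@" (necessary, not sufficient)."""
    return "@" in value


def age_is_valid(value: int) -> bool:
    """Validity: an age is generally an integer between 1 and 120."""
    return 1 <= value <= 120


def has_replacement_char(text: str) -> bool:
    """Heuristic for garbled text: U+FFFD often appears after a failed decode."""
    return "\ufffd" in text


def unified_value(values: List[str], min_records: int = 2) -> Optional[str]:
    """Return the shared value if every record holds the same one, else None."""
    if len(values) >= min_records and len(set(values)) == 1:
        return values[0]
    return None
```

A field flagged by `unified_value` is only a candidate for manual review, since a column can legitimately hold one value, which matches the text's point that machine checks are combined with manual verification.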
The consistency evaluation index of the metadata covers data logic consistency and/or content format consistency. Data logic consistency describes whether attributes with the same business meaning hold the same value in different systems or data sets. The simplest way to check data logic consistency is to examine the associations between data values across data sets and confirm whether the data attributes they share have identical values. The verification result is expressed as the number of entities that fail to maintain data consistency. Content format consistency describes whether the content of the same field uses one format across different records; it mainly applies to character-type fields that have a fixed format. For example, for date of birth, some records are filled in as 1980-12 and others as 1978/12. The result of a content format consistency evaluation is either consistent or inconsistent, that is, only two outcomes are possible: 100 points or 0 points.
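The content format consistency check, with its all-or-nothing score of 100 or 0, might look like the sketch below. The two date patterns are taken from the example in the text (1980-12 and 1978/12); the format catalogue, the function names, and the choice to score a field as inconsistent when a value matches no known format are assumptions made for illustration.

```python
import re
from typing import Iterable, Set

# Hypothetical catalogue of formats a date field may use.
DATE_FORMATS = {
    "YYYY-MM": re.compile(r"[0-9]{4}-[0-9]{2}"),
    "YYYY/MM": re.compile(r"[0-9]{4}/[0-9]{2}"),
}


def format_of(value: str) -> str:
    """Name of the first catalogue format the value matches, or "unknown"."""
    for name, pattern in DATE_FORMATS.items():
        if pattern.fullmatch(value):
            return name
    return "unknown"


def format_consistency_score(values: Iterable[str]) -> int:
    """100 if all values share one known format, otherwise 0."""
    formats: Set[str] = {format_of(v) for v in values}
    return 100 if len(formats) == 1 and "unknown" not in formats else 0
```

With the values from the text, `format_consistency_score(["1980-12", "1978/12"])` yields 0 because two formats are in use.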
The uniqueness evaluation index of the metadata describes the property that data records and key data values are not defined or used repeatedly; it covers data record uniqueness and/or key attribute value uniqueness. Data record uniqueness describes whether records with completely identical content occur more than once. Key attribute value uniqueness describes whether a key attribute value that must not repeat occurs more than once, for example two employees having the same ID card number.
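Both uniqueness checks reduce to counting. The sketch below assumes flat dict records with string keys and hashable values; the function names and the `id_number` field in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List


def duplicate_records(records: List[dict]) -> int:
    """Number of records that exactly repeat an earlier record (record uniqueness)."""
    counts = Counter(tuple(sorted(r.items())) for r in records)
    return sum(n - 1 for n in counts.values())


def duplicate_key_values(records: List[dict], key: str) -> Dict[Hashable, int]:
    """Key attribute values that occur more than once, with their counts."""
    counts = Counter(r[key] for r in records if key in r)
    return {value: n for value, n in counts.items() if n > 1}
```

For instance, `duplicate_key_values(employees, "id_number")` would list every ID card number shared by two or more employee records.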
The timeliness evaluation index of the metadata describes how current the information in the data records is and whether link addresses are valid; it covers data content novelty and/or link address validity. Data content novelty describes whether the uploaded or updated metadata was generated, or its content changed, within the current update cycle. For example, for journal articles newly added at the end of 2017 with data updated at yearly granularity, the check is whether the "publication year" field contains records from 2016 or earlier (2016 included). Link address validity describes, for content that contains external or internal link addresses, whether the related business information can be obtained through the link address. This embodiment verifies the metadata based on one or more of the integrity, correctness, consistency, uniqueness and timeliness evaluation indices in the quality evaluation index system.
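The publication-year example above can be sketched as a filter over records. The field name `publication_year` and both function names are assumptions. The link check shown is only a syntactic pre-check using `urllib.parse.urlparse`; it does not fetch the address, so it cannot by itself confirm that the related business information is obtainable, which is what the text actually requires.

```python
from typing import List
from urllib.parse import urlparse


def stale_records(records: List[dict], cycle_year: int,
                  field: str = "publication_year") -> List[dict]:
    """Records whose year field predates the current update cycle."""
    return [r for r in records if int(r[field]) < cycle_year]


def link_looks_wellformed(address: str) -> bool:
    """Syntactic pre-check only: an http(s) scheme and a non-empty host."""
    parts = urlparse(address)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

With `cycle_year=2017`, every record dated 2016 or earlier is returned, matching the example in the text.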
On the basis of the above embodiment, this embodiment further includes, between steps S1 and S3: classifying the metadata corresponding to the self attributes according to the level to which the self attributes belong. Correspondingly, step S2 further includes: verifying each class of metadata according to the evaluation indices corresponding to that class, wherein each class of metadata is stored in advance in association with its evaluation indices.
Specifically, the level to which the self attributes of a document-category digital resource belong includes one or more of the file level, the record level and the field level. The metadata corresponding to the self attributes is classified according to the level to which the self attributes belong. The levels can be added or removed according to actual needs, which makes the verification of the metadata flexible.
On the basis of the above embodiment, the step in this embodiment of classifying the metadata corresponding to the self attributes according to the level to which the self attributes belong specifically includes: according to the level to which the self attributes belong, dividing the metadata corresponding to the self attributes into one or more of file-level metadata, record-level metadata and field-level metadata.
On the basis of the above embodiment, the step in this embodiment of verifying each class of metadata according to the evaluation indices corresponding to that class specifically includes: according to the legitimacy, integrity and timeliness evaluation indices, verifying one or more of the file directory legitimacy, file naming legitimacy, file quantity integrity and file arrival timeliness of the file-level metadata; according to the correctness, integrity and timeliness evaluation indices, verifying one or more of the document classification legitimacy, file location legitimacy, file naming legitimacy, file quantity integrity and file generation timeliness of the record-level metadata; according to the integrity, correctness, consistency, uniqueness and timeliness evaluation indices, verifying one or more of the following for the field-level metadata: record integrity, field integrity, field type legitimacy, field length legitimacy, field format legitimacy, field business legitimacy, field timeliness, field accuracy, whether a field contains garbled text, whether a field shows unified-value substitution, data logic consistency, content format consistency, record uniqueness, key attribute value uniqueness, data content novelty and link address validity.
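The pre-stored association between each metadata class and its evaluation indices can be pictured as a lookup table. The sketch below follows the index lists given for each level, reading the file-level list as the correctness index as in the summary of the invention. The level keys, the table name and the `indices_for` helper are hypothetical.

```python
from typing import Dict, Tuple

# Assumed in-memory form of the "stored in advance in association" mapping.
LEVEL_INDICES: Dict[str, Tuple[str, ...]] = {
    "file": ("correctness", "integrity", "timeliness"),
    "record": ("correctness", "integrity", "timeliness"),
    "field": ("integrity", "correctness", "consistency", "uniqueness", "timeliness"),
}


def indices_for(level: str) -> Tuple[str, ...]:
    """Evaluation indices to apply to metadata of the given level."""
    try:
        return LEVEL_INDICES[level]
    except KeyError:
        raise ValueError(f"unknown metadata level: {level!r}") from None
```

Because levels can be added or removed as needed, adding a new level in this sketch means adding one entry to the table.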
Specifically, the content verified for file-level metadata is shown in Table 1, the content verified for record-level metadata is shown in Table 2, and the content verified for field-level metadata is shown in Table 3. Verifying by class makes the verification finer-grained and therefore makes the quality evaluation more accurate.
Table 1. Details of the verification content for file-level metadata
Table 2. Details of the verification content for record-level metadata
On the basis of the above embodiment, step S3 in this embodiment further includes: calculating the score of each class of metadata according to the score corresponding to each verification result and the weight corresponding to each evaluation index.
Specifically, for any class of metadata, the score corresponding to each verification result of that class is multiplied by the weight of the evaluation index corresponding to that verification result, and the products are added together to obtain the score of that class of metadata.
On the basis of the above embodiments, step S3 in this embodiment specifically includes: for any verification result, multiplying the score corresponding to that verification result by the weight of the evaluation index corresponding to that verification result; and adding together the products obtained for all verification results to obtain the total score of the metadata.
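Written as a formula, the calculation described above is a weighted sum. The notation below is introduced only for illustration and does not appear in the source: $s_i$ is the score of verification result $i$, $w_i$ is the weight of the evaluation index that result $i$ belongs to, $n$ is the number of verification results, and $I_c$ is the set of verification results for metadata class $c$.

```latex
\[
  T \;=\; \sum_{i=1}^{n} w_i \, s_i ,
  \qquad
  T_c \;=\; \sum_{i \in I_c} w_i \, s_i
\]
% Worked example with assumed values:
% scores  s = (100, 80, 0), weights w = (0.5, 0.3, 0.2)
% T = 0.5 * 100 + 0.3 * 80 + 0.2 * 0 = 50 + 24 + 0 = 74
```

Here $T$ is the total score of the metadata and $T_c$ is the score of one class. The weights in the worked example sum to 1 for convenience; the source does not state whether weights are normalized.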
Table 3. Details of the verification content for field-level metadata
Another embodiment of the present invention provides a quality evaluation system for document-category digital resource metadata. Fig. 2 is a schematic diagram of the overall structure of the quality evaluation system for document-category digital resource metadata provided in an embodiment of the present invention. The system includes a construction module 1, a verification module 2 and a calculation module 3, wherein:
The construction module 1 is configured to construct a quality evaluation index system for the metadata in a target document-category digital resource according to the self attributes of the target document-category digital resource;
Here, the target document-category digital resource is the document-category digital resource whose quality is to be evaluated. The self attributes of the target document-category digital resource include its directory, naming, quantity, classification, location and so on. Metadata, also called intermediary data or relay data, is data that describes data. It mainly carries information about data attributes and supports functions such as indicating storage location, historical data, resource lookup and file records. The quality evaluation index system includes multiple evaluation indices used for quality evaluation. The construction module 1 constructs the quality evaluation index system for the metadata in the target document-category digital resource according to the self attributes of that resource.
The verification module 2 is configured to perform each verification on each piece of metadata according to each evaluation index in the quality evaluation index system;
Here, verification means judging whether each piece of metadata in the target document-category digital resource meets each evaluation index. Verification relies mainly on computer checks, supplemented by manual checks, to obtain the inspection results and output an inspection report. A computer program verifies most of the metadata against the evaluation indices to obtain initial results; a small number of checks are performed manually, and the verification results are then summarized. The verification module 2 performs each verification on each piece of metadata according to each evaluation index in the quality evaluation index system. The inspection results can serve as a reference for improving the quality of the target document-category digital resource; for example, problems in the metadata of the target document-category digital resource can be traced back from the inspection results, so that the various problems in the metadata are found more quickly.
The calculation module 3 is configured to calculate the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index.
Here, the score corresponding to each verification result is stored in advance in association with that result, and the weight of the evaluation index corresponding to each verification result is likewise stored in advance in association with it. The calculation module 3 calculates the total score of the metadata from the score corresponding to each verification result and the weight of each evaluation index. From the total score, a user can quickly tell good-quality metadata from poor-quality metadata.
This embodiment constructs a quality evaluation index system according to the self attributes of the target document-category digital resource, performs each verification on each piece of metadata according to each evaluation index in that index system, and calculates the total score of the metadata from the verification results, thereby realizing quality evaluation of document-category digital resource metadata with high evaluation precision.
On the basis of the above embodiment, the quality evaluation index system in this embodiment includes one or more of the following evaluation indices: integrity, correctness, consistency, uniqueness and timeliness. Correspondingly, the verification module is specifically configured to: according to the integrity evaluation index, verify one or more of whether a data entity in the metadata is missing, whether a data file is missing, whether a data record is missing, whether the data structure is missing, and whether field content in a record is missing; according to the correctness evaluation index, verify one or more of the legitimacy of the metadata, its validity, whether garbled text is present, and whether unified-value substitution is present; according to the consistency evaluation index, verify the data logic consistency and/or the content format consistency of the metadata; according to the uniqueness evaluation index, verify the data record uniqueness and/or the key attribute value uniqueness of the metadata; according to the timeliness evaluation index, verify the data content novelty and/or the link address validity of the metadata.
On the basis of the above embodiment, the system in this embodiment further includes a classification module configured to: classify the metadata corresponding to the self attributes according to the level to which the self attributes belong. Correspondingly, the verification module is further configured to: verify each class of metadata according to the evaluation indices corresponding to that class, wherein each class of metadata is stored in advance in association with its evaluation indices.
On the basis of the above embodiment, the classification module in this embodiment is specifically configured to: according to the level to which the self attributes belong, divide the metadata corresponding to the self attributes into one or more of file-level metadata, record-level metadata and field-level metadata.
On the basis of the above embodiment, the verification module in this embodiment is further specifically configured to: according to one or more of the legitimacy, integrity and timeliness evaluation indices, verify one or more of the file directory legitimacy, file naming legitimacy, file quantity integrity and file arrival timeliness of the file-level metadata; according to one or more of the correctness, integrity and timeliness evaluation indices, verify one or more of the document classification legitimacy, file location legitimacy, file naming legitimacy, file quantity integrity and file generation timeliness of the record-level metadata; according to one or more of the integrity, correctness, consistency, uniqueness and timeliness evaluation indices, verify one or more of the following for the field-level metadata: record integrity, field integrity, field type legitimacy, field length legitimacy, field format legitimacy, field business legitimacy, field timeliness, field accuracy, whether a field contains garbled text, whether a field shows unified-value substitution, data logic consistency, content format consistency, record uniqueness, key attribute value uniqueness, data content novelty and link address validity.
On the basis of the above embodiment, the calculation module in this embodiment is further configured to: calculate the score of each class of metadata according to the score corresponding to each verification result and the weight corresponding to each evaluation index.
On the basis of the above embodiment, the calculation module in this embodiment is specifically configured to: for any verification result, multiply the score corresponding to that verification result by the weight of the evaluation index corresponding to that verification result; and add together the products obtained for all verification results to obtain the total score of the metadata.
This embodiment provides a quality evaluation device for document-type digital resource metadata. Fig. 3 is a schematic diagram of the overall structure of the quality evaluation device for document-type digital resource metadata provided by an embodiment of the present invention. The device includes at least one processor 31, at least one memory 32 and a bus 33, wherein:
The processor 31 and the memory 32 communicate with each other through the bus 33;
The memory 32 stores program instructions executable by the processor 31, and the processor calls the program instructions to execute the methods provided by the above method embodiments, for example including: S1, constructing a quality evaluation index system for the metadata in a target document-type digital resource according to the own attributes of the target document-type digital resource; S2, performing each verification on each metadata item according to each evaluation index in the quality evaluation index system; S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of the evaluation index.
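The three steps executed by the processor can be sketched end to end as follows. This is only an illustration under loud assumptions: the `EvaluationIndex` type, the rule that derives the indexes from the resource attributes (`required_fields`, `year_field`), the 0/100 scoring and the equal weights are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvaluationIndex:
    name: str
    weight: float
    check: Callable[[dict], float]  # maps one metadata item to a score


def build_index_system(attributes: dict) -> List[EvaluationIndex]:
    """S1: build the index system from the resource's own attributes.

    Hypothetical rule: the attributes name the required fields and the
    field that must hold a four-digit year.
    """
    required = attributes["required_fields"]
    year_field = attributes["year_field"]

    def integrity(md: dict) -> float:
        # Full score only if every required field has non-empty content.
        return 100.0 if all(md.get(f) for f in required) else 0.0

    def correctness(md: dict) -> float:
        # Field format legality, reduced here to "four digits".
        year = str(md.get(year_field, ""))
        return 100.0 if len(year) == 4 and year.isdigit() else 0.0

    return [
        EvaluationIndex("integrity", 0.5, integrity),
        EvaluationIndex("correctness", 0.5, correctness),
    ]


def verify(md: dict, system: List[EvaluationIndex]) -> Dict[str, float]:
    """S2: run the check of every evaluation index on one metadata item."""
    return {idx.name: idx.check(md) for idx in system}


def total_score(results: Dict[str, float], system: List[EvaluationIndex]) -> float:
    """S3: weight each verification score by its index and add the products."""
    return sum(results[idx.name] * idx.weight for idx in system)


system = build_index_system(
    {"required_fields": ["title", "author"], "year_field": "year"}
)
results = verify({"title": "Metadata quality", "author": "", "year": "2018"}, system)
print(results)                       # {'integrity': 0.0, 'correctness': 100.0}
print(total_score(results, system))  # 50.0
```

In the example the empty `author` field fails the integrity check while the year passes the format check, so the item scores 0 x 0.5 + 100 x 0.5 = 50.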
This embodiment provides a non-transitory computer-readable storage medium. The non-transitory computer-readable storage medium stores computer instructions that cause a computer to execute the methods provided by the above method embodiments, for example including: S1, constructing a quality evaluation index system for the metadata in a target document-type digital resource according to the own attributes of the target document-type digital resource; S2, performing each verification on each metadata item according to each evaluation index in the quality evaluation index system; S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of the evaluation index.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks.
The embodiment of the quality evaluation device for document-type digital resource metadata described above is merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution, in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, the methods of the present application are only preferred embodiments and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A quality evaluation method for document-type digital resource metadata, characterized by comprising:
S1, constructing a quality evaluation index system for the metadata in a target document-type digital resource according to the own attributes of the target document-type digital resource;
S2, performing each verification on each metadata item according to each evaluation index in the quality evaluation index system;
S3, calculating the total score of the metadata according to the score corresponding to each verification result and the weight of each evaluation index.
2. The method according to claim 1, characterized in that the quality evaluation index system includes one or more of the following evaluation indexes: integrity, correctness, consistency, uniqueness and timeliness;
Correspondingly, step S2 specifically includes:
According to the integrity evaluation index, verifying one or more of: whether a data entity in the metadata is missing, whether a data file is missing, whether a data record is missing, whether a data structure is missing, and whether field content in a record is missing;
According to the correctness evaluation index, verifying one or more of: the legality of the metadata, its validity, whether it contains garbled characters, and whether uniform-value substitution is present;
According to the consistency evaluation index, verifying the data logic consistency and/or the content format consistency of the metadata;
According to the uniqueness evaluation index, verifying the data record uniqueness and/or the key attribute value uniqueness of the metadata;
According to the timeliness evaluation index, verifying the data content novelty and/or the link address validity of the metadata.
3. The method according to claim 2, characterized by further comprising, between steps S1 and S3:
Classifying the metadata corresponding to the own attributes according to the level to which the own attributes belong;
Correspondingly, step S2 further includes:
Verifying each class of metadata according to the evaluation indexes corresponding to that class of metadata;
Wherein each class of metadata is stored in association with its evaluation indexes in advance.
4. The method according to claim 3, characterized in that the step of classifying the metadata corresponding to the own attributes according to the level to which the own attributes belong specifically includes:
According to the level to which the own attributes belong, dividing the metadata corresponding to the own attributes into one or more of file-level metadata, record-level metadata and field-level metadata.
5. The method according to claim 4, characterized in that the step of verifying each class of metadata according to the evaluation indexes corresponding to that class of metadata specifically includes:
According to the correctness, integrity and timeliness evaluation indexes, verifying one or more of the file directory legality, file naming legality, file quantity integrity and file arrival timeliness of the file-level metadata;
According to the correctness, integrity and timeliness evaluation indexes, verifying one or more of the document classification legality, file location legality, file naming legality, file quantity integrity and file generation timeliness of the record-level metadata;
According to the integrity, correctness, consistency, uniqueness and timeliness evaluation indexes, verifying one or more of the following for the field-level metadata: record integrity, field integrity, field type legality, field length legality, field format legality, field business legality, field timeliness, field accuracy, whether a field contains garbled characters, whether a field has been replaced by a uniform value, data logic consistency, content format consistency, record uniqueness, key attribute value uniqueness, data content novelty and link address validity.
6. The method according to claim 3, characterized in that step S3 further includes:
Calculating the score of each class of metadata according to the scores corresponding to the verification results and the weight corresponding to each evaluation index.
7. The method according to any one of claims 1-6, characterized in that step S3 specifically includes:
For any verification result, multiplying the score corresponding to that verification result by the weight of the evaluation index corresponding to that verification result;
Adding the products corresponding to all verification results to obtain the total score of the metadata.
8. A quality evaluation system for document-type digital resource metadata, characterized by comprising:
A construction module, configured to construct a quality evaluation index system for the metadata in a target document-type digital resource according to the own attributes of the target document-type digital resource;
A verification module, configured to verify each metadata item according to each evaluation index in the quality evaluation index system;
A computing module, configured to calculate the total score of the metadata according to the scores corresponding to the verification results and the weight of each evaluation index.
9. A quality evaluation device for document-type digital resource metadata, characterized by comprising:
At least one processor, at least one memory and a bus; wherein,
The processor and the memory communicate with each other through the bus;
The memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the method according to any one of claims 1 to 7.
10. A non-transitory computer-readable storage medium, characterized in that the non-transitory computer-readable storage medium stores computer instructions that cause a computer to execute the method according to any one of claims 1 to 7.
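Some of the correctness and uniqueness verifications named in claim 2 can be illustrated with simple record-set checks. The Python sketch below is an assumption-laden example and not the claimed implementation: the function names are hypothetical, garbled text is detected only by the Unicode replacement character U+FFFD, and uniform-value substitution is approximated as a field holding the same value in every record.

```python
from collections import Counter


def garbled_fields(record):
    """Fields whose text contains U+FFFD, a common trace of a decoding error."""
    return [k for k, v in record.items() if "\ufffd" in str(v)]


def uniformly_substituted_fields(records, fields):
    """Fields that carry one identical value in every record (heuristic)."""
    flagged = []
    for f in fields:
        values = {r.get(f) for r in records}
        if len(records) > 1 and len(values) == 1:
            flagged.append(f)
    return flagged


def duplicated_key_values(records, key):
    """Key attribute values that occur in more than one record."""
    counts = Counter(r.get(key) for r in records)
    return [value for value, n in counts.items() if n > 1]


# Hypothetical field-level metadata records.
records = [
    {"id": "A1", "title": "Data quality", "lang": "N/A"},
    {"id": "A2", "title": "Meta\ufffddata", "lang": "N/A"},
    {"id": "A1", "title": "Index systems", "lang": "N/A"},
]

print(garbled_fields(records[1]))                                 # ['title']
print(uniformly_substituted_fields(records, ["title", "lang"]))  # ['lang']
print(duplicated_key_values(records, "id"))                       # ['A1']
```

Each function returns the offending fields or values rather than a score, so that a separate scoring rule can decide how a finding maps to the score of the corresponding evaluation index.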
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810707861.1A CN109299062A (en) | 2018-07-02 | 2018-07-02 | A kind of quality evaluating method and system towards document category digital resource metadata |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109299062A true CN109299062A (en) | 2019-02-01 |
Family
ID=65167844
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810707861.1A Pending CN109299062A (en) | 2018-07-02 | 2018-07-02 | A kind of quality evaluating method and system towards document category digital resource metadata |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109299062A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103530347A (en) * | 2013-10-09 | 2014-01-22 | 北京东方网信科技股份有限公司 | Internet resource quality assessment method and system based on big data mining |
CN104484448A (en) * | 2014-12-26 | 2015-04-01 | 浙江协同数据系统有限公司 | Assessment method for relational data quality |
CN107368957A (en) * | 2017-07-04 | 2017-11-21 | 广西电网有限责任公司电力科学研究院 | A kind of construction method of equipment condition monitoring quality of data evaluation and test system |
CN107748775A (en) * | 2017-10-17 | 2018-03-02 | 上海计算机软件技术开发中心 | A kind of data governing system based on the quality of data |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110298564A (en) * | 2019-06-17 | 2019-10-01 | 迪普佰奥生物科技(上海)有限公司 | Biomedical product evaluation method, device, medium, terminal based on artificial intelligence |
CN111026742A (en) * | 2019-12-05 | 2020-04-17 | 东莞中国科学院云计算产业技术创新与育成中心 | Data quality evaluation method and device, computer equipment and storage medium |
CN113127482A (en) * | 2019-12-31 | 2021-07-16 | 奇安信科技集团股份有限公司 | Data quality analysis method and device, computer equipment and storage medium |
CN113127482B (en) * | 2019-12-31 | 2024-03-26 | 奇安信科技集团股份有限公司 | Data quality analysis method, device, computer equipment and storage medium |
CN111897803A (en) * | 2020-08-17 | 2020-11-06 | 国网辽宁省电力有限公司信息通信分公司 | Database integrity evaluation method for power industry business system |
CN111897803B (en) * | 2020-08-17 | 2023-10-20 | 国网辽宁省电力有限公司信息通信分公司 | Database integrity evaluation method for power industry service system |
CN112559510A (en) * | 2021-01-18 | 2021-03-26 | 山东省齐鲁大数据研究院 | Method and system for evaluating open data quality |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190201 |