CN114090720A - Detection data sharing model, method and data processing method thereof - Google Patents

Detection data sharing model, method and data processing method thereof

Info

Publication number
CN114090720A
Authority
CN
China
Prior art keywords
data
structured
unstructured
redundancy
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111445132.1A
Other languages
Chinese (zh)
Inventor
钱基业
陈咏涛
李小平
向菲
朱珠
王谦
赵宇琪
李永福
龙英凯
陈伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
Original Assignee
Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd
State Grid Corp of China SGCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-11-30
Filing date
2021-11-30
Publication date
2022-02-25
Application filed by Electric Power Research Institute of State Grid Chongqing Electric Power Co Ltd and State Grid Corp of China SGCC
Priority to CN202111445132.1A
Publication of CN114090720A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/328 Management therefor

Abstract

The invention provides a detection data sharing model, a detection data sharing method and a data processing method thereof. The sharing model comprises unstructured data, structured data, quantitative evaluation indexes and a data processing scheme. Experimental detection data is stored as unstructured data and/or structured data; the quantitative evaluation indexes serve as the basis for converting experimental detection data; and the data processing scheme carries out that conversion. The quantitative evaluation indexes comprise a structuring rate and a data redundancy; the data processing scheme is carried out, and the data sharing model is optimized, by reducing the data redundancy. On the input side, the model provides wide compatibility, so that equipment can readily feed in detection data of various kinds; on the output side, it provides sufficient usability, so that users can conveniently obtain detection data of various types.

Description

Detection data sharing model, method and data processing method thereof
Technical Field
The invention relates to the technical field of experimental detection, in particular to a detection data sharing model, a detection data sharing method and a data processing method.
Background
Laboratories increasingly need to share detection data, and supporting technologies such as sensors and the Internet of Things are developing rapidly, which accelerates the digitalization of laboratories. Building a networked test and detection platform that manages and controls multiple test and detection devices requires that large amounts of detection data in various formats be brought onto a unified platform and that users be given a convenient way to acquire the data. The way experimental detection data is stored and shared is a key technology driving the operation of such a platform. Besides structured data that can be accessed directly, experimental detection data includes unstructured data such as images, videos and waveforms that require further processing. The experimental detection platform therefore needs a well-designed data model that is widely compatible with such data, improves the storage of unstructured data by technical means, promotes the extraction of structured data from unstructured data, and gives users a convenient way to obtain shared data.
Disclosure of Invention
In view of the above, the present invention provides a detection data sharing model, a detection data sharing method and a data processing method thereof, which achieve wide data compatibility through the unstructured data portion, ensure sufficient availability of data through the structured data portion, use quantitative indexes to promote storage optimization of unstructured data and the extraction of structured data from it, and thereby optimize the data sharing model.
To facilitate understanding, the present invention provides an embodiment of a detection data sharing model, comprising: unstructured data, structured data, quantitative evaluation indexes and a data processing scheme;
the experimental detection data is stored as unstructured data and/or structured data; the quantitative evaluation indexes serve as the basis for converting experimental detection data; the data processing scheme carries out the conversion of experimental detection data; the quantitative evaluation indexes comprise a structuring rate and a data redundancy, and the data processing scheme is carried out and the data sharing model is optimized by reducing the data redundancy.
Preferably, the unstructured data and the structured data are both stored in a table of a database, and the unstructured data is stored as one field of the structured data.
Preferably, the structured data comprises data information, detection equipment information, tested article information, detection personnel information and unstructured data information;
the unstructured data provides the wide compatibility of the data sharing model and stores experimental detection data in the form of image, video and waveform files;
experimental detection data is expressed and stored as unstructured data and/or structured data; the unstructured data is mainly obtained directly from detection equipment, and the structured data is obtained directly from detection equipment and/or extracted from the unstructured data;
in addition to the experimental detection data, the sharing model stores the detection equipment information, the tested article information and the detection personnel information in unstructured or structured form;
in implementation, depending on the application scenario, the structured data can be regarded as a special kind of unstructured data, and the unstructured data can serve as a field of the structured data.
Preferably, the structuring rate is calculated as follows:
given experimental detection data, if it were represented entirely by structured data, the total number of fields required is $N$, of which the number of fields already represented as structured data is $N_\Delta$ and the number of fields not yet represented as structured data is $N_\nabla$, so that $N = N_\Delta + N_\nabla$; $N_\Delta$ and $N_\nabla$ are both non-negative integers and are not 0 simultaneously; the structuring rate is:
$$S = \frac{N_\Delta}{N} = \frac{N_\Delta}{N_\Delta + N_\nabla}$$
the data redundancy is calculated as follows:
for experimental detection data, $V_\Delta \ge 0$ represents the size of the structured data and $V_\nabla \ge 0$ represents the size of the unstructured data, $V_\Delta$ and $V_\nabla$ are not 0 simultaneously, and the sum of the sizes of the structured data and the unstructured data is $V = V_\Delta + V_\nabla$; the data redundancy $R$ is given by:
[formula image BDA0003383784680000038 defining $R$; not recoverable from the extracted text]
wherein $\alpha > 0$ is a scaling factor that keeps the value of
[expression image BDA0003383784680000039; not recoverable from the extracted text]
within a reasonable interval, so that the constraint
[expression image BDA00033837846800000310; not recoverable from the extracted text]
is always satisfied.
The data redundancy $R$ takes values in the range $0 \le R \le 1$; a smaller $R$ indicates less redundancy in the data and a higher degree of structuring; a larger $R$ indicates more redundancy in the data and a higher degree of non-structuring.
Preferably, when the experimental detection data are all structured data, there is no unstructured data ($N_\nabla = 0$, $V_\nabla = 0$), so $S = 1$ and $R = 0$; when the experimental detection data are all unstructured data, there is no structured data ($N_\Delta = 0$, $V_\Delta = 0$), so $S = 0$ and $R = 1$.
Preferably, the data processing scheme mainly reduces the data redundancy by technical means, assists storage optimization, and extracts structured data from the unstructured data; the data management and control platform automatically monitors the data redundancy, and when the data redundancy exceeds a set threshold, a data processing scheme is selected as required.
Preferably, the data processing scheme comprises:
scheme one, reduction of unstructured data: reducing the unstructured data directly decreases its size and thereby the data redundancy;
scheme two, extraction of structured fields from unstructured data: extracting structured fields from the unstructured data increases the number of structured fields $N_\Delta$ and hence the structuring rate $S$, and increases the size of the structured data $V_\Delta$, thereby reducing the data redundancy $R$;
scheme three: scheme one and scheme two used in combination.
The invention also provides a data processing method for the detection data sharing model, comprising the following steps:
S1, storing the experimental detection data as unstructured data or structured data;
S2, calculating the data redundancy; if the data redundancy is greater than a set threshold, executing step S3; otherwise, ending the process;
S3, selecting and executing a data processing scheme, updating the unstructured data or the structured data, and returning to step S2.
The invention also provides a method of using the detection data sharing model, comprising the following steps:
calculating the structuring rate and the data redundancy from the space occupied by the database table fields and the number of fields in the database table;
determining the relevant information of the data processing scheme and storing it in a data processing rule field;
extracting structured fields of the experimental detection data from the unstructured data and storing them in the unstructured data information area;
if all the required structured fields have been extracted from the unstructured data, the unstructured data field may be deleted from the table.
By adopting the above technical solution, the invention has the following advantages: on the input side, the model provides wide compatibility, so that equipment can readily feed in detection data of various kinds; on the output side, it provides sufficient usability, so that users can conveniently obtain detection data of various types. Wide data compatibility is achieved through the unstructured data portion, sufficient availability of data is ensured through the structured data portion, and the quantitative indexes promote storage optimization of unstructured data and the extraction of structured data from it, thereby optimizing the data sharing model.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them.
FIG. 1 is a diagram illustrating a detection data sharing model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a data processing method for the detection data sharing model according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method of using the detection data sharing model according to an embodiment of the present invention.
Detailed Description
The present invention is further described below with reference to the accompanying drawings and embodiments. The embodiments described are only some of the possible embodiments and are not intended to limit the invention. All other embodiments obtainable by those of ordinary skill in the art fall within the scope of the embodiments of the present invention.
For ease of understanding, referring to FIG. 1, the present invention provides an embodiment of a detection data sharing model, comprising: unstructured data, structured data, quantitative evaluation indexes and a data processing scheme;
the experimental detection data is stored as unstructured data and/or structured data; the quantitative evaluation indexes serve as the basis for converting experimental detection data; the data processing scheme carries out the conversion of experimental detection data; the quantitative evaluation indexes comprise a structuring rate and a data redundancy, and the data processing scheme is carried out and the data sharing model is optimized by reducing the data redundancy.
In this embodiment, both the unstructured data and the structured data are stored in a table of the database, and the unstructured data is stored as one field of the structured data.
In this embodiment, the structured data includes data information, detection equipment information, tested article information, detection personnel information and unstructured data information;
the unstructured data provides the wide compatibility of the data sharing model and stores experimental detection data in the form of image, video and waveform files;
experimental detection data is expressed and stored as unstructured data and/or structured data; the unstructured data is mainly obtained directly from detection equipment, and the structured data is obtained directly from detection equipment and/or extracted from the unstructured data;
in addition to the experimental detection data, the sharing model stores the detection equipment information, the tested article information and the detection personnel information in unstructured or structured form;
in implementation, depending on the application scenario, the structured data can be regarded as a special kind of unstructured data, and the unstructured data can serve as a field of the structured data.
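The storage layout described above, with the unstructured data held as one field of a structured record, can be sketched as a Python data class. This is a minimal sketch only: the class name `DetectionRecord`, every field name, and the sample values are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DetectionRecord:
    """One row of a hypothetical detection-data table (all names are illustrative)."""

    # Structured information kept alongside the experimental detection data.
    data_info: str
    equipment_info: str
    article_info: str
    personnel_info: str
    # Structured values already extracted from the raw payload
    # (standing in for the "unstructured data information" area).
    extracted_fields: Dict[str, str] = field(default_factory=dict)
    # Unstructured payload (image, video or waveform file) stored as one field.
    raw_payload: Optional[bytes] = None

    def has_unstructured(self) -> bool:
        """Return True while a non-empty raw payload is still attached."""
        return bool(self.raw_payload)


record = DetectionRecord(
    data_info="sample run",
    equipment_info="sample device",
    article_info="sample article",
    personnel_info="sample operator",
    raw_payload=b"\x00" * 16,  # placeholder bytes standing in for a file
)
```

Keeping the payload optional mirrors the statement that experimental detection data may be stored as unstructured data, structured data, or both.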
In this embodiment, the structuring rate is calculated as follows:
given experimental detection data, if it were represented entirely by structured data, the total number of fields required is $N$, of which the number of fields already represented as structured data is $N_\Delta$ and the number of fields not yet represented as structured data is $N_\nabla$, so that $N = N_\Delta + N_\nabla$; $N_\Delta$ and $N_\nabla$ are both non-negative integers and are not 0 simultaneously; the structuring rate is:
$$S = \frac{N_\Delta}{N} = \frac{N_\Delta}{N_\Delta + N_\nabla}$$
the data redundancy is calculated as follows:
for experimental detection data, $V_\Delta \ge 0$ represents the size of the structured data and $V_\nabla \ge 0$ represents the size of the unstructured data, $V_\Delta$ and $V_\nabla$ are not 0 simultaneously, and the sum of the sizes of the structured data and the unstructured data is $V = V_\Delta + V_\nabla$; the data redundancy $R$ is given by:
[formula image BDA0003383784680000074 defining $R$; not recoverable from the extracted text]
wherein $\alpha > 0$ is a scaling factor that keeps the value of
[expression image BDA0003383784680000075; not recoverable from the extracted text]
within a reasonable interval, so that the constraint
[expression image BDA0003383784680000076; not recoverable from the extracted text]
is always satisfied.
The data redundancy $R$ takes values in the range $0 \le R \le 1$; a smaller $R$ indicates less redundancy in the data and a higher degree of structuring; a larger $R$ indicates more redundancy in the data and a higher degree of non-structuring.
In this embodiment, when the experimental detection data are all structured data, there is no unstructured data ($N_\nabla = 0$, $V_\nabla = 0$), so $S = 1$ and $R = 0$; when the experimental detection data are all unstructured data, there is no structured data ($N_\Delta = 0$, $V_\Delta = 0$), so $S = 0$ and $R = 1$.
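The two indexes can be sketched in Python under stated assumptions. The structuring rate is taken to be the ratio of structured fields to all required fields, which agrees with the boundary cases stated above (1 for fully structured data, 0 for fully unstructured data). The redundancy function is only a stand-in: the name `redundancy_stand_in` and the form `alpha * v_unstructured / (v_structured + alpha * v_unstructured)` are assumptions chosen to satisfy the stated properties (values between 0 and 1, 0 for fully structured data, 1 for fully unstructured data, a positive scaling factor), and should not be read as the formula of the invention.

```python
def structuring_rate(n_structured: int, n_unstructured: int) -> float:
    """Fraction of required fields that already have a structured representation."""
    if n_structured < 0 or n_unstructured < 0:
        raise ValueError("field counts must be non-negative")
    total = n_structured + n_unstructured
    if total == 0:
        raise ValueError("the two field counts must not both be 0")
    return n_structured / total


def redundancy_stand_in(
    v_structured: float, v_unstructured: float, alpha: float = 1.0
) -> float:
    """Hypothetical stand-in for the data redundancy (NOT the source formula).

    It only reproduces the stated boundary behaviour: 0 when there is no
    unstructured data, 1 when there is no structured data.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if v_structured < 0 or v_unstructured < 0:
        raise ValueError("sizes must be non-negative")
    if v_structured == 0 and v_unstructured == 0:
        raise ValueError("the two sizes must not both be 0")
    weighted = alpha * v_unstructured  # alpha rescales the unstructured size
    return weighted / (v_structured + weighted)


all_structured = (structuring_rate(5, 0), redundancy_stand_in(10.0, 0.0))
all_unstructured = (structuring_rate(0, 5), redundancy_stand_in(0.0, 10.0))
```

With this stand-in, shrinking the unstructured size or growing the structured size both lower the value, which matches the direction of change described for the data processing schemes.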
In this embodiment, the data processing scheme mainly reduces the data redundancy by technical means, assists storage optimization, and extracts structured data from the unstructured data; the data management and control platform automatically monitors the data redundancy, and when the data redundancy exceeds a set threshold, a data processing scheme is selected as required.
In this embodiment, the data processing scheme includes:
Scheme one, reduction of unstructured data: reducing the unstructured data directly decreases its size and thereby the data redundancy;
for example, when the meter readings of experimental detection data are stored as unstructured data in the form of video, extracting from the video only the key frames that show the meter reading can greatly reduce the size of the unstructured data, thereby reducing the data redundancy.
Scheme two, extraction of structured fields from unstructured data: extracting structured fields from the unstructured data increases the number of structured fields $N_\Delta$ and hence the structuring rate $S$, and increases the size of the structured data $V_\Delta$, thereby reducing the data redundancy $R$;
for example, when the meter reading of an experimental result is stored as unstructured data in the form of a video key frame or an image, the meter reading is extracted manually or by OCR and filled into the corresponding structured field; increasing the number of structured fields and the size of the structured data reduces the data redundancy, and the user can then retrieve the meter reading field directly through the data sharing model.
Scheme three: scheme one and scheme two used in combination,
that is, reducing the unstructured data and also extracting structured fields from it.
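As a worked illustration of scheme two, assume that the structuring rate is the fraction of required fields that are already structured, an assumption consistent with the boundary cases S = 1 and S = 0 stated earlier. The field counts below are hypothetical.

```latex
% Hypothetical record: 10 required fields, 6 of them already structured.
S_{\text{before}} = \frac{6}{10} = 0.6
% Scheme two extracts 2 more fields from the unstructured data.
S_{\text{after}} = \frac{6 + 2}{10} = 0.8
```

The data redundancy would fall at the same time because the structured size grows; its exact value depends on the redundancy formula and on the scaling factor.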
For ease of understanding, referring to FIG. 2, the present invention further provides an embodiment of a data processing method for the detection data sharing model, the data processing method including:
S101: storing the experimental detection data as unstructured data or structured data;
S102: calculating the data redundancy; if the data redundancy is greater than a set threshold, executing step S103; otherwise, ending the process;
S103: selecting and executing a data processing scheme, updating the unstructured data or the structured data, and returning to step S102.
For ease of understanding, referring to FIG. 3, the present invention also provides an embodiment of a method of using the detection data sharing model, the method of use comprising:
S201: calculating the structuring rate and the data redundancy from the space occupied by the database table fields and the number of fields in the database table;
S202: determining the relevant information of the data processing scheme and storing it in a data processing rule field;
S203: extracting structured fields of the experimental detection data from the unstructured data and storing them in the unstructured data information area;
S204: if all the required structured fields have been extracted from the unstructured data, the unstructured data field may be deleted from the table.
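The measure-and-process loop of the data processing method can be sketched as follows. The state dictionary, the `redundancy` and `apply_scheme` callables, the simple size ratio used as a redundancy measure, and the `max_rounds` guard are all hypothetical; the guard is added only so that the sketch terminates if a scheme stops making progress.

```python
from typing import Callable, Dict

State = Dict[str, float]


def process_until_below_threshold(
    state: State,
    redundancy: Callable[[State], float],
    apply_scheme: Callable[[State], State],
    threshold: float,
    max_rounds: int = 10,
) -> State:
    """Repeat the redundancy check and the processing step until the check passes."""
    for _ in range(max_rounds):
        if redundancy(state) <= threshold:  # redundancy check against the threshold
            break
        state = apply_scheme(state)  # processing step, then re-check
    return state


def size_ratio(state: State) -> float:
    """Illustrative redundancy measure: unstructured share of the total size."""
    total = state["v_structured"] + state["v_unstructured"]
    return state["v_unstructured"] / total


def halve_unstructured(state: State) -> State:
    """Illustrative reduction scheme: keep half of the unstructured data."""
    return {**state, "v_unstructured": state["v_unstructured"] / 2}


# The data has already been stored; only its sizes matter for this sketch.
stored = {"v_structured": 10.0, "v_unstructured": 90.0}
result = process_until_below_threshold(
    stored, size_ratio, halve_unstructured, threshold=0.5
)
```

Passing the measure and the scheme as callables keeps the loop independent of which scheme the platform selects in a given round.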
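The method of use can be sketched against an in-memory SQLite table. Everything specific here is an assumption made for illustration: the table name `detection`, the column names, the rule text, and the reading `12.5` are hypothetical, the structuring rate is taken as filled structured fields divided by required structured fields, and clearing the payload of one row stands in for deleting the unstructured data field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detection ("
    "id INTEGER PRIMARY KEY, "
    "meter_reading REAL, "    # structured field to be filled by extraction
    "processing_rule TEXT, "  # hypothetical data processing rule field
    "raw_payload BLOB)"       # unstructured data stored as one field
)
conn.execute(
    "INSERT INTO detection (id, raw_payload) VALUES (1, ?)", (b"\x00" * 1024,)
)


def metrics(connection: sqlite3.Connection, row_id: int):
    """Return (structuring rate, unstructured size in bytes) for one row."""
    reading, payload_size = connection.execute(
        "SELECT meter_reading, length(raw_payload) FROM detection WHERE id = ?",
        (row_id,),
    ).fetchone()
    required = 1  # this toy table requires a single structured field
    filled = 0 if reading is None else 1
    return filled / required, payload_size or 0


# First step: compute the indexes from field counts and occupied space.
rate_before, size_before = metrics(conn, 1)
# Second step: record the chosen processing rule.
conn.execute(
    "UPDATE detection SET processing_rule = ? WHERE id = 1",
    ("extract meter reading",),
)
# Third step: store the extracted structured value (12.5 is a made-up reading).
conn.execute("UPDATE detection SET meter_reading = ? WHERE id = 1", (12.5,))
rate_after, _ = metrics(conn, 1)
# Fourth step: once every required field is extracted, drop the raw payload.
if rate_after == 1.0:
    conn.execute("UPDATE detection SET raw_payload = NULL WHERE id = 1")
_, size_after = metrics(conn, 1)
```

In a real table the extraction step would be fed by manual entry or OCR, and the required field list would come from the data model rather than a constant.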
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the above technical solutions, or the portions of them that contribute over the prior art, may be embodied in the form of a software product stored on a computer-readable storage medium, which includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer).
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (9)

1. A detection data sharing model is characterized by comprising unstructured data, structured data, quantitative evaluation indexes and a data processing scheme;
the experimental detection data is stored through unstructured data and/or structured data; the quantitative evaluation index is used as a basis for experimental detection data conversion; the data processing scheme is used for realizing the conversion of experimental detection data; the quantitative evaluation index comprises a structuring rate and data redundancy, and a data processing scheme and a data sharing model are optimized by reducing the data redundancy.
2. The sharing model of claim 1, wherein the unstructured data and the structured data are both stored in tables of a database, the unstructured data being stored as a field of the structured data.
3. The sharing model of claim 1, wherein the structured data comprises data information, detection equipment information, tested article information, detection personnel information and unstructured data information;
the unstructured data provides the wide compatibility of the data sharing model and stores experimental detection data in the form of image, video and waveform files;
experimental detection data is expressed and stored as unstructured data and/or structured data; the unstructured data is mainly obtained directly from detection equipment, and the structured data is obtained directly from detection equipment and/or extracted from the unstructured data;
in addition to the experimental detection data, the sharing model stores the detection equipment information, the tested article information and the detection personnel information in unstructured or structured form;
in implementation, depending on the application scenario, the structured data can be regarded as a special kind of unstructured data, and the unstructured data can serve as a field of the structured data.
4. The sharing model of claim 1, wherein the structuring rate is calculated as follows:
given experimental detection data, if it were represented entirely by structured data, the total number of fields required is $N$, of which the number of fields already represented as structured data is $N_\Delta$ and the number of fields not yet represented as structured data is $N_\nabla$, so that $N = N_\Delta + N_\nabla$; $N_\Delta$ and $N_\nabla$ are both non-negative integers and are not 0 simultaneously; the structuring rate is:
$$S = \frac{N_\Delta}{N} = \frac{N_\Delta}{N_\Delta + N_\nabla}$$
the data redundancy is calculated as follows:
for experimental detection data, $V_\Delta \ge 0$ represents the size of the structured data and $V_\nabla \ge 0$ represents the size of the unstructured data, $V_\Delta$ and $V_\nabla$ are not 0 simultaneously, and the sum of the sizes of the structured data and the unstructured data is $V = V_\Delta + V_\nabla$; the data redundancy $R$ is given by:
[formula image FDA0003383784670000028 defining $R$; not recoverable from the extracted text]
wherein $\alpha > 0$ is a scaling factor that keeps the value of
[expression image FDA0003383784670000029; not recoverable from the extracted text]
within a reasonable interval, so that the constraint
[expression image FDA00033837846700000210; not recoverable from the extracted text]
is always satisfied.
The data redundancy $R$ takes values in the range $0 \le R \le 1$; a smaller $R$ indicates less redundancy in the data and a higher degree of structuring; a larger $R$ indicates more redundancy in the data and a higher degree of non-structuring.
5. The sharing model of claim 4, wherein when the experimental detection data are all structured data, there is no unstructured data ($N_\nabla = 0$, $V_\nabla = 0$), so $S = 1$ and $R = 0$; when the experimental detection data are all unstructured data, there is no structured data ($N_\Delta = 0$, $V_\Delta = 0$), so $S = 0$ and $R = 1$.
6. The sharing model of claim 1, wherein the data processing scheme mainly reduces the data redundancy by technical means, assists storage optimization, and extracts structured data from the unstructured data; the data management and control platform automatically monitors the data redundancy, and when the data redundancy exceeds a set threshold, a data processing scheme is selected as required.
7. The sharing model of claim 1, wherein the data processing scheme comprises:
scheme one, reduction of unstructured data: reducing the unstructured data directly decreases its size and thereby the data redundancy;
scheme two, extraction of structured fields from unstructured data: extracting structured fields from the unstructured data increases the number of structured fields $N_\Delta$ and hence the structuring rate $S$, and increases the size of the structured data $V_\Delta$, thereby reducing the data redundancy $R$;
scheme three: scheme one and scheme two used in combination.
8. A data processing method for the detection data sharing model according to any one of claims 1 to 7, wherein the processing method comprises:
S1, storing the experimental detection data as unstructured data or structured data;
S2, calculating the data redundancy; if the data redundancy is greater than a set threshold, executing step S3; otherwise, ending the process;
S3, selecting and executing a data processing scheme, updating the unstructured data or the structured data, and returning to step S2.
9. A method of using the detection data sharing model according to any one of claims 1 to 7, wherein the method of use comprises:
calculating the structuring rate and the data redundancy from the space occupied by the database table fields and the number of fields in the database table;
determining the relevant information of the data processing scheme and storing it in a data processing rule field;
extracting structured fields of the experimental detection data from the unstructured data and storing them in the unstructured data information area;
if all the required structured fields have been extracted from the unstructured data, the unstructured data field may be deleted from the table.
CN202111445132.1A 2021-11-30 2021-11-30 Detection data sharing model, method and data processing method thereof Pending CN114090720A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111445132.1A CN114090720A (en) 2021-11-30 2021-11-30 Detection data sharing model, method and data processing method thereof

Publications (1)

Publication Number Publication Date
CN114090720A true CN114090720A (en) 2022-02-25

Family

ID=80305997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111445132.1A Pending CN114090720A (en) 2021-11-30 2021-11-30 Detection data sharing model, method and data processing method thereof

Country Status (1)

Country Link
CN (1) CN114090720A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination