CN113609848A - Industrial product quality safety supervision method and device - Google Patents

Info

Publication number
CN113609848A
Authority
CN
China
Prior art keywords
data
information
target
structured
industrial product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110969469.6A
Other languages
Chinese (zh)
Inventor
张君维
丰苏
马志远
李静
王欢
王庆春
于大东
田文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Center Of State Administration Of Market Supervision
Original Assignee
Information Center Of State Administration Of Market Supervision
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Center Of State Administration Of Market Supervision
Priority to CN202110969469.6A
Publication of CN113609848A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and a device for supervising the quality of industrial products. Acquired industrial product associated data is preprocessed to obtain structured data and unstructured data; data extraction is performed on the structured data to obtain structured data entities and the relations among the entities; the structured entities and the relations among them are stored in triple form to obtain first information; entity recognition is performed on the unstructured data based on a target model to obtain enterprise knowledge triples with quality problems, which are stored as second information; and a target knowledge graph is generated from the first information and the second information. By performing entity extraction and relation construction on the data involved in industrial quality safety supervision and visualizing the resulting knowledge graph, the invention displays the relations among the data intuitively and vividly, resolves the problem of data silos, and facilitates full-chain supervision of industrial product quality.

Description

Industrial product quality safety supervision method and device
Technical Field
The invention relates to the technical field of knowledge graphs, in particular to a method and a device for supervising the quality safety of industrial products.
Background
With the rapid improvement of industrial production, more and more industrial product enterprises are emerging. As enterprises multiply and product types increase, industrial products play an ever larger part in daily life, and quality safety supervision of industrial products is key to ensuring people's safety.
Current industrial product quality safety supervision comprises multiple processes such as risk monitoring, public opinion analysis, supervisory spot checks and post-processing. Information communication and sharing among these processes is relatively lacking, so information silos exist, which reduces the efficiency and accuracy of tracing and locating industrial product quality problems.
Disclosure of Invention
In view of the above problems, the invention provides a method and a device for supervising the quality safety of industrial products, which resolve the problem of information silos in the process of industrial product quality safety supervision.
In order to achieve the purpose, the invention provides the following technical scheme:
a method of industrial product quality supervision, comprising:
preprocessing the acquired industrial product associated data to acquire structured data and unstructured data;
performing data extraction on the structured data to obtain structured data entities and relations among the entities;
storing the structured entities and the relation between the entities in a triple form to obtain first information;
entity recognition is carried out on the unstructured data based on a target model, an enterprise knowledge triple with a quality problem is obtained, and the enterprise knowledge triple with the quality problem is stored as second information;
and generating a target knowledge graph according to the first information and the second information.
Optionally, the preprocessing the obtained industrial product related data to obtain structured data and unstructured data includes:
collecting industrial product associated data, wherein the industrial product associated data at least comprises enterprise information from an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and website, spot check public notice information, and public opinion report data from news websites;
screening the industrial product associated data to obtain initial structured and unstructured data;
and preprocessing the initial structured data and the initial unstructured data to obtain structured data and unstructured data.
Optionally, the method further comprises:
acquiring a target data set, wherein the target data set is obtained by extracting and labeling text from public opinion report data and spot check public notice data, and comprises a training set, a test set and a verification set;
and performing neural network training on the training set of the target data set to obtain a target model, wherein the neural network structure comprises a forward long short-term memory (LSTM) network and a backward LSTM network.
Optionally, the entity recognition of the unstructured data based on the target model includes:
carrying out clause formatting processing on the unstructured data to obtain a clause result;
and inputting the sentence dividing result into the target model for entity recognition.
Optionally, the generating a target knowledge-graph according to the first information and the second information includes:
performing data format conversion on the triples in the first information and the second information to obtain target data, wherein the target data comprises a node name, a node label, an initial node, a termination node and a relationship;
and carrying out visualization processing on the target data to obtain a target knowledge graph.
An industrial product quality supervision apparatus comprising:
the preprocessing unit is used for preprocessing the acquired industrial product associated data to acquire structured data and unstructured data;
the extraction unit is used for carrying out data extraction on the structured data to obtain the structured data entities and the relation among the entities;
the first storage unit is used for storing the structured entities and the relations among the entities in a triple form to obtain first information;
the identification unit is used for carrying out entity identification on the unstructured data based on a target model, obtaining enterprise knowledge triples with quality problems and storing the enterprise knowledge triples with the quality problems as second information;
and the generating unit is used for generating a target knowledge graph according to the first information and the second information.
Optionally, the pre-processing unit comprises:
the acquisition subunit is used for collecting industrial product associated data, wherein the industrial product associated data at least comprises enterprise information from an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and website, spot check public notice information, and public opinion report data from news websites;
the screening subunit is used for screening the industrial product associated data to obtain initial structured and unstructured data;
and the preprocessing subunit is used for preprocessing the initial structured data and the initial unstructured data to obtain structured data and unstructured data.
Optionally, the apparatus further comprises:
the data acquisition unit is used for acquiring a target data set, wherein the target data set is obtained by extracting and labeling text from public opinion report data and spot check public notice data, and comprises a training set, a test set and a verification set;
and the training unit is used for performing neural network training on the training set of the target data set to obtain a target model, wherein the neural network structure comprises a forward long short-term memory (LSTM) network and a backward LSTM network.
Optionally, the identification unit includes:
the format processing subunit is used for performing clause formatting processing on the unstructured data to obtain a clause result;
and the recognition subunit is used for inputting the sentence dividing result into the target model for entity recognition.
Optionally, the generating unit includes:
the conversion subunit is configured to perform data format conversion on the triples in the first information and the second information to obtain target data, where the target data includes a node name, a node label, a start node, a stop node, and a relationship;
and the visualization processing subunit is used for performing visualization processing on the target data to obtain a target knowledge graph.
Compared with the prior art, the invention provides an industrial product quality supervision method and device in which acquired industrial product associated data is preprocessed to obtain structured data and unstructured data; data extraction is performed on the structured data to obtain structured data entities and the relations among the entities; the structured entities and the relations among them are stored in triple form to obtain first information; entity recognition is performed on the unstructured data based on a target model to obtain enterprise knowledge triples with quality problems, which are stored as second information; and a target knowledge graph is generated from the first information and the second information. By performing entity extraction and relation construction on the data involved in industrial quality safety supervision and visualizing the resulting knowledge graph, the invention displays the relations among the data intuitively and vividly, resolves the problem of data silos, and facilitates full-chain supervision of industrial product quality.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a schematic flow chart of a method for supervising the quality of an industrial product according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a knowledge-graph-based industrial product quality safety full-chain supervision architecture according to an embodiment of the present invention;
Fig. 3 is a flow chart of knowledge extraction based on the BiLSTM and CRF models according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an industrial product quality supervision apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first" and "second," and the like in the description and claims of the present invention and the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements but may include steps or elements not listed.
The embodiment of the invention provides an industrial product quality supervision method, specifically a knowledge-graph-based method for full-chain supervision of industrial product quality safety. The method addresses information silos in the supervision process: it efficiently extracts enterprise information from structured and unstructured data, labels the data and extracts relations, constructs a knowledge graph for full-chain industrial product quality safety supervision, visualizes the knowledge graph in a Neo4j environment (Neo4j is a high-performance NoSQL graph database that stores structured data as a network, that is a graph, rather than in tables), and displays the supervision results visually.
Referring to fig. 1, which shows a schematic flow chart of an industrial product quality supervision method provided by an embodiment of the present invention, the method may include the following steps:
s101, preprocessing the acquired industrial product related data to acquire structured data and unstructured data.
The industrial product associated data at least comprises enterprise information from an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and website, spot check public notice information, and public opinion report data from news websites. The preprocessing comprises data screening, associated data acquisition, formatting and the like. Structured data is highly organized, well-formatted data of the kind that fits into tables and spreadsheets. Unstructured data is anything else: it does not conform to a predefined model and is typically stored in a non-relational database. Dividing the data this way lets each kind be processed appropriately afterwards; for example, key information can be extracted directly from structured data, whereas unstructured data must be processed by a trained model.
S102, performing data extraction on the structured data to obtain the structured data entities and the relation among the entities.
S103, storing the structured entities and the relationship among the entities in a triple form to obtain first information.
In the embodiment of the invention, the structured data can be enterprise information, enterprise product information, industrial product spot check information and the like, the information is subjected to data extraction to obtain corresponding entities and relationships among the entities, and then a triple is formed and stored as first information, wherein the first information is part of data of a finally generated knowledge graph.
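As a minimal sketch of how table rows might be turned into triples, the following assumes each row has one key column naming the head entity and that every other non-empty column becomes a (head, column name, value) triple. The function name, the column names and the sample values are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def rows_to_triples(rows: List[Dict[str, str]], key_field: str) -> List[Triple]:
    """Turn each table row into (key entity, column name, cell value) triples."""
    triples: List[Triple] = []
    seen = set()
    for row in rows:
        head = (row.get(key_field) or "").strip()
        if not head:
            continue  # a row without its key entity cannot anchor a triple
        for field, value in row.items():
            value = (value or "").strip()
            if field == key_field or not value:
                continue  # skip the key column itself and missing items
            triple = (head, field, value)
            if triple not in seen:  # drop exact duplicates across rows
                seen.add(triple)
                triples.append(triple)
    return triples


# Hypothetical sample rows; real data would come from the enterprise tables.
enterprise_table = [
    {"enterprise_name": "Example Co", "unified_social_credit_code": "HYPOTHETICAL-0001", "product_name": "Widget"},
    {"enterprise_name": "Example Co", "unified_social_credit_code": "HYPOTHETICAL-0001", "product_name": "Gadget"},
]
first_information = rows_to_triples(enterprise_table, key_field="enterprise_name")
```

Using the column name as the relation mirrors the example given later in the description, where the relation between an enterprise name and its unified social credit code is itself called "unified social credit code".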
And S104, carrying out entity identification on the unstructured data based on the target model, obtaining enterprise knowledge triples with quality problems, and storing the enterprise knowledge triples with quality problems as second information.
In the embodiment of the present invention, the unstructured data may include public opinion data and quality safety supervision public notice data. The data is first formatted and then recognized by a pre-generated target model to obtain enterprise knowledge triples with quality problems as the second information; that is, the second information is also part of the data of the finally generated knowledge graph.
And S105, generating a target knowledge graph according to the first information and the second information.
After the triples obtained from the structured data are taken as the first information and the triples obtained from the unstructured data as the second information, the target knowledge graph is generated. The target knowledge graph can be displayed visually, and the full set of information on a product with quality problems can be searched accurately; production information, quality data, news and public notices, and other risk information on industrial products are structured and unified, which facilitates supervision.
The embodiment of the invention provides an industrial product quality supervision method in which acquired industrial product associated data is preprocessed to obtain structured data and unstructured data; data extraction is performed on the structured data to obtain structured data entities and the relations among the entities; the structured entities and the relations among them are stored in triple form to obtain first information; entity recognition is performed on the unstructured data based on a target model to obtain enterprise knowledge triples with quality problems, which are stored as second information; and a target knowledge graph is generated from the first information and the second information. By performing entity extraction and relation construction on the data involved in industrial quality safety supervision and visualizing the resulting knowledge graph, the method displays the relations among the data intuitively and vividly, resolves the problem of data silos, and facilitates full-chain supervision of industrial product quality.
In the embodiment of the invention, enterprise, product and quality safety supervision data are surveyed and collected from an enterprise information platform, an industrial product quality safety supervision platform and website, and news websites, yielding basic enterprise information, local results of industrial product quality safety supervision, and related text data such as news reports; these data serve as the industrial product associated data. The industrial product associated data is then screened to obtain initial structured and unstructured data, and the initial structured data and initial unstructured data are preprocessed to obtain the structured data and unstructured data.
Specifically, basic information tables such as an enterprise information table, a product information table, an industrial product production license information table, an enterprise product spot check information table and an enterprise production license table are extracted from the industrial product associated data as structured data. For the unstructured data extracted from the industrial product associated data, the text can be divided into sentences using the SentenceSplitter.split() method of the pyltp library for the Language Technology Platform (LTP); the input is a passage of text containing several sentences and the output is a list of sentences.
Because the relevant information is extracted from the unstructured data by the target model, the embodiment of the invention also provides a method for generating the target model, which comprises the following steps:
acquiring a target data set, wherein the target data set is obtained by extracting and labeling text from public opinion report data and spot check public notice data, and comprises a training set, a test set and a verification set;
and performing neural network training on the training set of the target data set to obtain a target model, wherein the neural network structure comprises a forward long short-term memory (LSTM) network and a backward LSTM network.
When the target data set is generated, part of the public opinion and public notice data in the industrial product associated data is randomly selected for text extraction and labeling, and the enterprise entities mentioned in the text are labeled to form the data set. The data set is divided into a training set, a test set and a verification set in a certain proportion (such as 6: 2), and a BiLSTM-CRF model is trained in combination with the MSRA corpus so that it can extract the names of enterprises with quality problems. The trained target model is then used as the recognition model to process the formatted public opinion and news data uniformly, obtaining enterprise knowledge triples related to quality problems, and the extracted event information is stored in the knowledge graph for industrial product quality safety supervision.
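The division into training, test and verification sets can be sketched as below. The default ratios of 0.6, 0.2 and 0.2 are an assumption made only for illustration, since the proportion stated in the text is incomplete; the function name and the fixed seed are likewise hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    samples: Sequence,
    ratios: Tuple[float, float, float] = (0.6, 0.2, 0.2),  # assumed, not from the source
    seed: int = 0,
) -> Tuple[List, List, List]:
    """Shuffle the labeled samples and cut them into train, test and verification sets."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * ratios[0])
    n_test = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    verification = shuffled[n_train + n_test:]  # whatever remains
    return train, test, verification


labeled_sentences = [f"sentence-{i}" for i in range(10)]  # placeholder samples
train_set, test_set, verification_set = split_dataset(labeled_sentences)
```

Giving the verification set the remainder guarantees that every sample lands in exactly one subset even when the sizes do not divide evenly.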
Specifically, entity labeling can be performed on the randomly extracted text data with the open source tool YEDDA. Using the BIO labeling scheme, the words of an enterprise entity are labeled B-ORG (beginning) and I-ORG (inside), and all other, irrelevant components are labeled O. The neural network structure selected in the embodiment of the invention is LSTM (Long Short-Term Memory), comprising a forward LSTM network and a backward LSTM network. The BiLSTM consists of a forward LSTM and a backward LSTM; text word vectors are input in forward and reverse order, context information is extracted from the sentences, and the type of entity each piece of information corresponds to is determined.
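The BIO scheme can be illustrated with a small sketch that encodes entity spans as B-ORG, I-ORG and O tags and decodes a tag sequence back into entity strings. Whitespace-separated English tokens are used here only for readability; the helper names and the sample sentence are hypothetical.

```python
from typing import List, Tuple


def bio_encode(tokens: List[str], org_spans: List[Tuple[int, int]]) -> List[str]:
    """Label tokens with B-ORG / I-ORG / O; spans are [start, end) token indices."""
    tags = ["O"] * len(tokens)
    for start, end in org_spans:
        tags[start] = "B-ORG"  # first token of the entity
        for i in range(start + 1, end):
            tags[i] = "I-ORG"  # remaining tokens of the same entity
    return tags


def bio_decode(tokens: List[str], tags: List[str]) -> List[str]:
    """Recover entity strings from a BIO tag sequence."""
    entities: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B-ORG":
            if current:
                entities.append(" ".join(current))
            current = [token]
        elif tag == "I-ORG" and current:
            current.append(token)
        else:
            if current:
                entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities


tokens = ["Spot", "check", "found", "Example", "Toy", "Company", "products", "unqualified"]
tags = bio_encode(tokens, org_spans=[(3, 6)])
entities = bio_decode(tokens, tags)
```

In the pipeline described above, the encoding side corresponds to manual annotation and the decoding side to reading enterprise names out of the model's predicted tag sequence.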
Wherein, the output sequence of the hidden layer of the forward neural network is as follows:
Figure BDA0003225108870000081
the output sequence of the backward hidden layer, which processes the input in reverse order, is:
Figure BDA0003225108870000082
Using the two output sequences as the output of the hidden layer yields the feature matrix $P_k$, $P_k \in \mathbb{R}^{n \times m}$, where k is the number of labels.
For any sequence X, CRF obtains the optimal predicted sequence Y through the relation of adjacent labels, and obtains a score function:
Figure BDA0003225108870000083
where A denotes the transition score matrix and $A_{ij}$ denotes the score for transitioning from label i to label j.
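The equation images referenced in this passage are not reproduced in the text. For orientation, BiLSTM-CRF taggers are commonly written in the following form; whether this matches the patent's figures term by term is an assumption, as are the index conventions (a sentence of length n, hidden states $h_t$, predicted tags $y_i$, with $y_0$ and $y_{n+1}$ standing for start and end tags).

```latex
\overrightarrow{h} = (\overrightarrow{h}_1, \overrightarrow{h}_2, \dots, \overrightarrow{h}_n),
\qquad
\overleftarrow{h} = (\overleftarrow{h}_1, \overleftarrow{h}_2, \dots, \overleftarrow{h}_n)

s(X, Y) = \sum_{i=0}^{n} A_{y_i,\, y_{i+1}} + \sum_{i=1}^{n} P_{i,\, y_i}
```

Here $P_{i, y_i}$ is the score the BiLSTM output assigns to tag $y_i$ at position i, and $A_{y_i, y_{i+1}}$ is the transition score between adjacent tags; the predicted sequence is the Y that maximizes $s(X, Y)$.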
After training and tuning, the model can identify and extract all enterprise entities mentioned in data sources such as news articles and result reports regardless of genre or structure, and the extracted enterprise entities are basically consistent with the results of manual labeling.
In an embodiment of the present invention, the entity recognition of the unstructured data based on the target model includes:
carrying out clause formatting processing on the unstructured data to obtain a clause result;
and inputting the sentence dividing result into the target model for entity recognition.
The article corpus to be recognized is split into sentences and formatted, the resulting sentences are input into the model for recognition, duplicates are removed from the recognition results, and non-enterprise entities are eliminated, yielding structured data.
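The recognition flow just described (split into sentences, run the model, de-duplicate, discard non-enterprise entities) can be sketched as follows. A regular expression stands in for the pyltp sentence splitter, the trained model is passed in as a plain callable, and the suffix-based enterprise filter is a hypothetical heuristic; none of these stand-ins is the patent's implementation.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Regex stand-in for a sentence splitter: cut after common terminators."""
    parts = re.findall(r"[^。！？!?.]+[。！？!?.]?", text)
    return [p.strip() for p in parts if p.strip()]


def extract_enterprises(
    text: str,
    recognize: Callable[[str], List[str]],  # stand-in for the trained target model
    is_enterprise: Callable[[str], bool],   # hypothetical post-filter
) -> List[str]:
    """Run recognition sentence by sentence, then de-duplicate and filter."""
    found: List[str] = []
    for sentence in split_sentences(text):
        for name in recognize(sentence):
            if name not in found and is_enterprise(name):
                found.append(name)  # keeps first-seen order
    return found


def toy_recognizer(sentence: str) -> List[str]:
    # Toy model: returns runs of two or more capitalised words.
    return re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", sentence)


def ends_with_company_suffix(name: str) -> bool:
    return name.endswith(("Company", "Ltd"))


article = (
    "Example Toy Company failed the spot check. "
    "Inspectors in Springfield City fined Example Toy Company!"
)
enterprises = extract_enterprises(article, toy_recognizer, ends_with_company_suffix)
```

Passing the model and the filter in as callables keeps the flow independent of any particular trained model.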
When the extracted and identified triples are stored, the relations between entities such as enterprise name, unified social credit code, product name, product category, nominal trademark, production batch number, production license number, spot check result and unqualified items are extracted by a Bootstrap method; for example, the relation between an enterprise name and its unified social credit code is "unified social credit code".
In an embodiment of the present invention, the generating a target knowledge-graph according to the first information and the second information includes:
performing data format conversion on the triples in the first information and the second information to obtain target data, wherein the target data comprises a node name, a node label, an initial node, a termination node and a relationship;
and carrying out visualization processing on the target data to obtain a target knowledge graph.
Specifically, when a visual target knowledge graph is generated according to the obtained triples, the data format of the obtained triples is converted into the following form:
ID-Name-LABLE and start-Name-end-Name-relation. In a JDE and Neo4j environment, the data are imported through the LOAD statement of the Cypher syntax, realizing visualization of the knowledge graph in the Neo4j environment; node data can then be searched with the MATCH statement to obtain enterprise and product information and enterprise product quality safety supervision data.
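One way the format conversion and import could look is sketched below. The CSV header spellings, the file names nodes.csv and relations.csv, the generic Entity label, the RELATED relationship type and the example values are all assumptions; the Cypher strings use the standard LOAD CSV and MATCH clauses but are not taken from the patent.

```python
import csv
import io
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (start node name, relation, end node name)


def triples_to_csv(triples: List[Triple], labels: Dict[str, str]) -> Tuple[str, str]:
    """Build a node table (ID, Name, LABLE) and a relation table from triples."""
    node_ids: Dict[str, int] = {}
    for head, _, tail in triples:
        for name in (head, tail):
            node_ids.setdefault(name, len(node_ids) + 1)  # assign IDs in first-seen order

    nodes = io.StringIO()
    writer = csv.writer(nodes, lineterminator="\n")
    writer.writerow(["ID", "Name", "LABLE"])
    for name, node_id in node_ids.items():
        writer.writerow([node_id, name, labels.get(name, "Entity")])

    relations = io.StringIO()
    writer = csv.writer(relations, lineterminator="\n")
    writer.writerow(["start_Name", "end_Name", "relation"])
    for head, relation, tail in triples:
        writer.writerow([head, tail, relation])
    return nodes.getvalue(), relations.getvalue()


# Cypher statements to run in Neo4j once the two files are in its import directory.
# The label is stored as a property here for simplicity.
LOAD_NODES = (
    "LOAD CSV WITH HEADERS FROM 'file:///nodes.csv' AS row "
    "MERGE (n:Entity {id: row.ID}) SET n.name = row.Name, n.label = row.LABLE"
)
LOAD_RELATIONS = (
    "LOAD CSV WITH HEADERS FROM 'file:///relations.csv' AS row "
    "MATCH (a:Entity {name: row.start_Name}), (b:Entity {name: row.end_Name}) "
    "MERGE (a)-[:RELATED {type: row.relation}]->(b)"
)
FIND_NEIGHBOURS = "MATCH (e:Entity {name: $name})-[r]->(x) RETURN e, r, x"

nodes_csv, relations_csv = triples_to_csv(
    [("Example Co", "product_name", "Widget")],
    labels={"Example Co": "Enterprise", "Widget": "Product"},
)
```

The last query shows the kind of MATCH lookup that returns an enterprise together with its directly related products and supervision records.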
In the embodiment of the invention, knowledge graph technology is applied to full-chain supervision of industrial product quality safety: entity extraction and relation construction are performed on the supervision data, and visualization is realized in the Neo4j environment, so that the relations among the data are displayed intuitively and vividly, the problem of data silos is resolved, and full-chain supervision of industrial product quality is facilitated. Compared with existing industrial product quality safety supervision methods, the method is efficient, covers the full chain and is visual. By introducing knowledge graph technology and using BiLSTM and CRF models based on statistical machine learning to extract knowledge from complex and redundant unstructured data, it achieves high accuracy and good results in a short time and saves substantial labor cost.
A practical application example follows. Referring to fig. 2, which shows a schematic diagram of a knowledge-graph-based industrial product quality safety full-chain supervision architecture according to an embodiment of the present invention, the processing flow of the method first systematically screens and collects enterprise information, industrial product production information, license information, enterprise product spot check information and public opinion data. Relevant knowledge is then extracted. Referring to fig. 3, which shows a flow chart of knowledge extraction based on the BiLSTM and CRF models: in the figure, "CRF layer" denotes the CRF layer, "LSTM's output layer" the output layer of the long short-term memory network, "Backward LSTM" the backward LSTM, "Forward LSTM" the forward LSTM, "Look-up layer" the look-up (search) layer, and "One hot vector" the one-hot encoding; see the description of the subsequent embodiments for details. Fig. 3 combines Bi-directional Long Short-Term Memory (BiLSTM) with CRF (Conditional Random Fields) to perform entity extraction on semi-structured and unstructured data with high accuracy and time efficiency. Structured data is obtained and stored in the form of triples. In the Neo4j and JDE environment, the data are stored as csv files and the knowledge graph is visualized through LOAD statements. Effective information queries on industrial products are realized through the association relations among the data. The method mainly comprises data acquisition, knowledge extraction from unstructured data, relation extraction, and knowledge-graph-based visualization of industrial product supervision data.
Collecting a data set:
Detailed enterprise information, industrial product information, spot-check information, historical public opinion on enterprise products, and other data are collected from an enterprise information query platform, the State Administration for Market Regulation, the China Quality News Network, and the Weibo platform, yielding structured and unstructured data. The structured data, including enterprise information, business license information, production license information, and spot-check result information, is deduplicated and processed for missing items, and is used as part of the knowledge graph data. The public opinion data and the quality and safety supervision announcements obtained from the China Quality News Network and the Weibo platform are formatted.
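The deduplication and missing-item step can be sketched as follows. This is only an illustration: the field names (`enterprise_name`, `credit_code`), the sample rows, and the policy of dropping incomplete rows are assumptions, since the description does not state how missing items are handled.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def clean_records(records: List[Record], key: str, required: List[str]) -> List[Record]:
    """Drop rows with empty required fields, then keep the first row per key."""
    seen = set()
    cleaned: List[Record] = []
    for rec in records:
        # Missing-item handling (assumed policy): skip incomplete rows.
        if any(not (rec.get(field) or "").strip() for field in required):
            continue
        k = (rec.get(key) or "").strip()
        # Deduplication: a key that was already seen marks a duplicate row.
        if k in seen:
            continue
        seen.add(k)
        cleaned.append({f: v.strip() if isinstance(v, str) else v for f, v in rec.items()})
    return cleaned


# Hypothetical sample rows, not taken from the source.
rows = [
    {"enterprise_name": "Company A", "credit_code": "CODE-001"},
    {"enterprise_name": "Company A ", "credit_code": "CODE-001"},  # duplicate
    {"enterprise_name": "Company B", "credit_code": ""},           # missing item
    {"enterprise_name": "Company C", "credit_code": "CODE-003"},
]
cleaned = clean_records(rows, key="credit_code", required=["enterprise_name", "credit_code"])
```

Filling missing values from another source instead of dropping the row would be an equally plausible policy.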
Knowledge extraction of unstructured data:
Unstructured data is extracted from the data obtained from the China Quality News Network and the Weibo platform, and the sentence splitter of the pyltp library (SentenceSplitter) is used to split the text into sentences. The input is a passage of text containing several sentences, and the output is a list of sentences.
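The input and output of the sentence-splitting step can be illustrated with a small stand-in written with the Python standard library. It is not the pyltp implementation; the punctuation rule and the function name `split_sentences` are assumptions made for this sketch.

```python
import re
from typing import List

# Assumed rule: a sentence ends at Chinese or English terminal punctuation,
# or at a full stop followed by whitespace or the end of the text.
_SENTENCE_END = re.compile(r"[。！？!?]+|\.(?=\s|$)")


def split_sentences(text: str) -> List[str]:
    """Split a passage into a list of sentences, keeping the end punctuation."""
    sentences: List[str] = []
    start = 0
    for match in _SENTENCE_END.finditer(text):
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    tail = text[start:].strip()  # trailing text without end punctuation
    if tail:
        sentences.append(tail)
    return sentences


doc = "某公司产品抽查不合格。监管部门已责令整改！The notice was published online. Details follow"
sentences = split_sentences(doc)
```

A rule this crude also splits at abbreviations such as "Co. ", which is one reason to prefer a dedicated splitter in practice.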
Entity labeling is performed on randomly sampled text data using the open-source tool YEDDA. With the BIO labeling scheme, the tokens of an enterprise (organization) entity are labeled B-ORG and I-ORG, and all other, irrelevant tokens are labeled O. The BiLSTM consists of a forward LSTM and a backward LSTM; it takes the word vectors of the text in forward and reverse order as input and extracts context information from the sentence to determine which kind of entity each piece of information corresponds to. The output sequence of the hidden layer of the forward network is as follows:
Figure BDA0003225108870000111
The output sequence of the backward hidden layer, which reads the input in reverse order, is:
Figure BDA0003225108870000112
The two output sequences are combined as the output of the hidden layer to obtain the feature matrix P_k, where P_k ∈ R^(n×m) and k is the number of labels.
For any input sequence X, the CRF obtains the optimal predicted label sequence Y from the relationships between adjacent labels, using the score function:
Figure BDA0003225108870000113
where A denotes the transition score matrix and A_ij denotes the score of a transition from label i to label j.
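As an illustration of how such a score can be evaluated and maximized, the sketch below uses the usual BiLSTM-CRF form, in which the score of a label sequence is the sum of per-position feature scores and adjacent-label transition scores. This form is an assumption, because the score function itself appears only as an image in the source; start and end transitions are omitted, and the matrices `P` and `A` hold made-up numbers.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def sequence_score(P: Matrix, A: Matrix, y: Sequence[int]) -> float:
    """Score of label sequence y: emission scores P[i][y_i] plus transitions A[y_i][y_{i+1}]."""
    emission = sum(P[i][tag] for i, tag in enumerate(y))
    transition = sum(A[y[i]][y[i + 1]] for i in range(len(y) - 1))
    return emission + transition


def viterbi(P: Matrix, A: Matrix) -> List[int]:
    """Return the label sequence with the highest score by dynamic programming."""
    n, k = len(P), len(P[0])
    best = list(P[0])            # best score of any path ending in each label
    back: List[List[int]] = []   # back-pointers for positions 1..n-1
    for i in range(1, n):
        prev, best, pointers = best, [], []
        for j in range(k):
            cand = [prev[p] + A[p][j] for p in range(k)]
            p_star = max(range(k), key=cand.__getitem__)
            best.append(cand[p_star] + P[i][j])
            pointers.append(p_star)
        back.append(pointers)
    path = [max(range(k), key=best.__getitem__)]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


LABELS = ["O", "B-ORG", "I-ORG"]
# Illustrative scores for a three-token sentence (rows: tokens, columns: labels).
P = [[0.1, 2.0, 0.0], [0.2, 0.3, 1.5], [1.8, 0.1, 0.4]]
# Illustrative transition scores A[i][j] from label i to label j; O -> I-ORG is penalized.
A = [[0.5, 0.2, -5.0], [0.0, -1.0, 1.0], [0.3, -0.5, 0.6]]
best_path = viterbi(P, A)
best_tags = [LABELS[t] for t in best_path]  # B-ORG, I-ORG, O for these numbers
```

The strongly negative O to I-ORG entry shows how the transition matrix lets the CRF rule out label sequences that are invalid under the BIO scheme.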
After training and tuning, the model can identify and extract all the enterprise entities mentioned in data sources of any genre and structure, such as news articles and result reports, and the extracted enterprise entities are largely consistent with the results of manual labeling.
The article corpus to be recognized is split into sentences, and the sentence-splitting result is fed to the model for recognition. The recognized results are deduplicated and non-enterprise entities are removed, yielding structured data, namely the enterprise information involved in public opinion.
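The BIO labeling scheme and the post-processing of recognized labels can be sketched as follows. The helper names, the English sample sentence, and the company name "Acme Welding Co." are hypothetical; the model itself is not reproduced, so the predicted labels are simply taken to equal the gold labels.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int]], label: str = "ORG") -> List[str]:
    """Label tokens with BIO tags; spans are [start, end) token index pairs."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def extract_entities(tokens: List[str], tags: List[str], label: str = "ORG", joiner: str = " ") -> List[str]:
    """Collect the entity strings marked by B-/I- tags and deduplicate them in order."""
    entities: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == f"B-{label}":
            if current:
                entities.append(joiner.join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            current.append(token)
        else:
            if current:
                entities.append(joiner.join(current))
            current = []
    if current:
        entities.append(joiner.join(current))
    return list(dict.fromkeys(entities))  # order-preserving deduplication


tokens = ["Acme", "Welding", "Co.", "failed", "the", "spot", "check", ";",
          "Acme", "Welding", "Co.", "responded"]
tags = bio_tags(tokens, spans=[(0, 3), (8, 11)])
entities = extract_entities(tokens, tags)
```

For Chinese text, where tokens are typically characters or segmented words, `joiner=""` would be the natural choice.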
Extracting a relation based on a Bootstrap method:
Relation extraction is performed by the Bootstrap method on entities such as enterprise names, unified social credit codes, legal representatives, product names, product categories, nominal product trademarks, production batch numbers, production license numbers, spot-check results, and unqualified items. First, some seed dictionaries are given; for example, the relation between an enterprise name and a unified social credit code is "unified social credit code is". New rules are then generated from the text: the given text is traversed, sentences containing an entity pair are found, and the corresponding patterns are summarized from the text content, for example, "the unified social credit code of a certain limited company in a certain region is 3754825489292648X". Iteration continues with the newly obtained and the existing patterns to obtain further entity relation data.
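A minimal sketch of the iteration described above is given below, assuming that a pattern is simply the text found between the two entities of a known pair. The sample sentences, the company names, the codes, and the regular expressions used to recognize candidate entities are all hypothetical.

```python
import re
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def learn_patterns(sentences: List[str], pairs: Iterable[Pair]) -> Set[str]:
    """Collect the text between head and tail for every sentence containing a known pair."""
    patterns: Set[str] = set()
    for head, tail in pairs:
        for s in sentences:
            i, j = s.find(head), s.find(tail)
            if i != -1 and j != -1 and i + len(head) <= j:
                middle = s[i + len(head):j]
                if middle.strip():
                    patterns.add(middle)
    return patterns


def apply_patterns(sentences: List[str], patterns: Iterable[str]) -> Set[Pair]:
    """Find new (name, code) pairs joined by a known pattern."""
    found: Set[Pair] = set()
    for p in patterns:
        # Assumed entity shapes: a name ending in "Co." and a 15-20 character code.
        regex = re.compile(r"([A-Z][\w ]*? Co\.)" + re.escape(p) + r"([0-9A-Z]{15,20})")
        for s in sentences:
            for m in regex.finditer(s):
                found.add((m.group(1), m.group(2)))
    return found


def bootstrap(sentences: List[str], seeds: Iterable[Pair], rounds: int = 5):
    """Alternate between learning patterns and extracting pairs until nothing new appears."""
    pairs, patterns = set(seeds), set()
    for _ in range(rounds):
        new_patterns = learn_patterns(sentences, pairs)
        new_pairs = apply_patterns(sentences, new_patterns | patterns)
        if new_patterns <= patterns and new_pairs <= pairs:
            break
        patterns |= new_patterns
        pairs |= new_pairs
    return pairs, patterns


corpus = [
    "Alpha Co. has unified social credit code 911100000000000001.",
    "Beta Co. has unified social credit code 911100000000000002.",
    "Beta Co., whose unified social credit code is 911100000000000002, was re-inspected.",
    "Gamma Co., whose unified social credit code is 911100000000000003, failed the spot check.",
]
pairs, patterns = bootstrap(corpus, seeds=[("Alpha Co.", "911100000000000001")])
```

Here the seed pair yields a first pattern, that pattern finds the second company, and the second company's other mention yields a second pattern that reaches the third. In practice, learned patterns would need scoring or filtering so that noisy patterns do not accumulate across iterations.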
Realizing the visualization of industrial product supervision data based on the knowledge graph:
The data tables in the relational database are converted into triple form. Nodes are stored in three columns, ID, Name, and LABEL, for example, ID: g0001, Name: a certain welding equipment limited company, LABEL: enterprise name. Relationships are stored in three columns, start_name (start node), end_name (end node), and relation, for example, start_name: a certain welding equipment limited company, end_name: 51810482711864131X, relation: unified social credit code is.
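The conversion from relational rows to node and relationship tables might look like the sketch below, which uses the three-column layouts described above. The input field names, the company name, and the way IDs are generated are assumptions made for illustration.

```python
import csv
import io
from typing import Dict, List, Tuple


def build_tables(rows: List[Dict[str, str]]):
    """Turn relational rows into a node table and a relationship table."""
    node_ids: Dict[Tuple[str, str], str] = {}
    relations: List[Tuple[str, str, str]] = []

    def register(name: str, label: str) -> None:
        # Assign IDs such as g0001 in order of first appearance (assumed scheme).
        if (name, label) not in node_ids:
            node_ids[(name, label)] = f"g{len(node_ids) + 1:04d}"

    for row in rows:
        register(row["enterprise_name"], "enterprise name")
        register(row["credit_code"], "unified social credit code")
        relations.append((row["enterprise_name"], row["credit_code"],
                          "unified social credit code is"))
    nodes = [(nid, name, label) for (name, label), nid in node_ids.items()]
    return nodes, relations


def to_csv(header: List[str], rows: List[Tuple[str, ...]]) -> str:
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input row; the code value is the example from the description.
table = [{"enterprise_name": "Example Welding Equipment Ltd", "credit_code": "51810482711864131X"}]
nodes, relations = build_tables(table)
nodes_csv = to_csv(["ID", "Name", "LABEL"], nodes)
relations_csv = to_csv(["start_name", "end_name", "relation"], relations)
```

Writing the two strings to files gives the inputs for the CSV import step.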
The data is stored as CSV files and imported through the LOAD CSV clause of the Cypher syntax in the JDE and Neo4j environment. LOAD CSV can load both local and remote CSV files. For example, a company name node is created with "LOAD CSV FROM 'CSV file' AS line CREATE (a:`company name` {name: line[1]});", a unified social credit code node is created with "LOAD CSV FROM 'CSV file' AS line CREATE (a:`company ID` {name: line[1]});", and the relationship from the company name node to the unified social credit code node is created with "LOAD CSV WITH HEADERS FROM "file:///file.csv" AS line MATCH (from:`company name` {name: line.start_name}), (to:`company ID` {name: line.end_name}) MERGE (from)-[r:rel {relation: line.relation}]->(to);". This realizes the visualization of the knowledge graph of the full-chain supervision data for industrial product quality and safety.
In the Neo4j environment, node data is queried with the MATCH clause, so that enterprise information, product-related information, and enterprise product quality and safety supervision data can be obtained quickly, and content and events that are linked, or potentially linked, to the current node can be discovered.
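A query of this kind can be sketched as a small Python helper that builds the Cypher text. The helper name is hypothetical, and the label and the `name` and `relation` properties are taken from the import example above rather than from a confirmed schema. Executing the query against a Neo4j instance is outside this sketch.

```python
def neighbours_query(label: str) -> str:
    """Build a parameterized Cypher query returning a node's direct neighbours."""
    if "`" in label:
        # Keep the sketch simple: refuse labels that would need escaping.
        raise ValueError("label must not contain a backtick")
    return (
        f"MATCH (n:`{label}` {{name: $name}})-[r]-(m) "
        "RETURN n.name AS source, r.relation AS relation, m.name AS target"
    )


query = neighbours_query("company name")
# The enterprise name is passed as a query parameter rather than spliced into the text.
parameters = {"name": "Example Welding Equipment Ltd"}
```

Passing the enterprise name as the `$name` parameter keeps user input out of the query text.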
The embodiment of the invention addresses the relative lack of information exchange and sharing among the data of full-chain industrial product quality and safety supervision, as well as the data silo problem, and associates data such as company establishment information, product declarations, production license issuance information, industrial product public opinion information, and product quality spot-check information. It can also improve the efficiency of industrial product quality and safety supervision, strengthen full-chain quality and safety supervision of industrial products, and make tracing and locating product quality problems faster and more accurate, so that the full-chain supervision data is used as fully as possible.
Based on the foregoing embodiments, an embodiment of the present invention further provides an industrial product quality supervision apparatus which, referring to fig. 4, includes:
the preprocessing unit 10 is configured to preprocess the acquired industrial product related data to obtain structured data and unstructured data;
an extraction unit 20, configured to perform data extraction on the structured data to obtain structured data entities and relationships between the entities;
the first storage unit 30 is configured to store the structured entities and the relationships between the entities in a triple form, so as to obtain first information;
the identification unit 40 is configured to perform entity identification on the unstructured data based on a target model, obtain an enterprise knowledge triple with a quality problem, and store the enterprise knowledge triple with the quality problem as second information;
a generating unit 50, configured to generate a target knowledge graph according to the first information and the second information.
Further, the preprocessing unit includes:
an acquisition subunit, configured to collect industrial product associated data, wherein the industrial product associated data at least comprises enterprise information of an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and a website, spot check public information and public opinion report data of a news website;
the screening subunit is used for screening the industrial product associated data to obtain initial structured and unstructured data;
and the preprocessing subunit is used for preprocessing the initial structured data and the initial unstructured data to obtain structured data and unstructured data.
Further, the apparatus further comprises:
the data acquisition unit is used for acquiring a target data set, wherein the target data set is a data set obtained by performing text extraction and labeling on public opinion report data and spot check public notice data, and the target data set comprises a training set, a test set and a verification set;
and the training unit is used for carrying out neural network training on the training set of the target data set to obtain a target model, and the neural network structure comprises a forward long short-term memory network and a backward long short-term memory network.
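Dividing the labeled data into a training set, a test set, and a verification set can be sketched as follows. The 80/10/10 proportions, the fixed random seed, and the sample names are assumptions, since the description does not give them.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train: float = 0.8, test: float = 0.1,
                  seed: int = 42) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the samples and cut them into training, test, and verification sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    n_train = int(len(items) * train)
    n_test = int(len(items) * test)
    training_set = items[:n_train]
    test_set = items[n_train:n_train + n_test]
    verification_set = items[n_train + n_test:]  # whatever remains
    return training_set, test_set, verification_set


labeled_sentences = [f"sentence-{i}" for i in range(10)]  # hypothetical labeled samples
training_set, test_set, verification_set = split_dataset(labeled_sentences)
```

The three sets are disjoint and together cover every labeled sample.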
Further, the identification unit includes:
the format processing subunit is used for performing sentence-splitting formatting on the unstructured data to obtain a sentence-splitting result;
and the recognition subunit is used for inputting the sentence-splitting result into the target model for entity recognition.
Optionally, the generating unit includes:
the conversion subunit is configured to perform data format conversion on the triples in the first information and the second information to obtain target data, where the target data includes a node name, a node label, a start node, a stop node, and a relationship;
and the visualization processing subunit is used for performing visualization processing on the target data to obtain a target knowledge graph.
The embodiment of the invention provides an industrial product quality supervision apparatus. The preprocessing unit preprocesses the acquired industrial product associated data to obtain structured data and unstructured data; the extraction unit performs data extraction on the structured data to obtain the entities of the structured data and the relationships among the entities; the first storage unit stores the structured entities and the relationships among them in triple form to obtain first information; the identification unit performs entity recognition on the unstructured data based on the target model to obtain enterprise knowledge triples with quality problems, and stores them as second information; and the generating unit generates the target knowledge graph according to the first information and the second information. The invention thus performs entity extraction and relationship construction on the data in industrial product quality and safety supervision, visualizes the knowledge graph so that the relationships among the data are displayed intuitively, addresses the data silo problem, and facilitates full-chain supervision of industrial product quality.
Based on the foregoing embodiments, embodiments of the invention provide a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the industrial product quality supervision method according to any one of the above.
Embodiments of the present invention further provide an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the program to implement the steps of the industrial product quality supervision method according to any one of the above.
For details, refer to the description of the foregoing embodiments, which is not repeated here.
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for supervising the quality of an industrial product, comprising:
preprocessing the acquired industrial product associated data to acquire structured data and unstructured data;
performing data extraction on the structured data to obtain structured data entities and relations among the entities;
storing the structured entities and the relation between the entities in a triple form to obtain first information;
performing entity recognition on the unstructured data based on a target model to obtain an enterprise knowledge triple with a quality problem, and storing the enterprise knowledge triple with the quality problem as second information;
and generating a target knowledge graph according to the first information and the second information.
2. The method according to claim 1, wherein the preprocessing the acquired industrial product related data to obtain structured data and unstructured data comprises:
collecting industrial product associated data, wherein the industrial product associated data at least comprises enterprise information of an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and a website, spot check public information and public opinion report data of a news website;
screening the industrial product associated data to obtain initial structured and unstructured data;
and preprocessing the initial structured data and the initial unstructured data to obtain structured data and unstructured data.
3. The method of claim 1, further comprising:
acquiring a target data set, wherein the target data set is a data set obtained by extracting and labeling texts of public opinion report data and spot check public data, and the target data set comprises a training set, a test set and a verification set;
and carrying out neural network training on the training set of the target data set to obtain a target model, wherein the neural network structure comprises a forward long short-term memory network and a backward long short-term memory network.
4. The method of claim 1, wherein performing entity recognition on the unstructured data based on the target model comprises:
performing sentence-splitting formatting on the unstructured data to obtain a sentence-splitting result;
and inputting the sentence-splitting result into the target model for entity recognition.
5. The method of claim 1, wherein generating a target knowledge-graph from the first information and the second information comprises:
performing data format conversion on the triples in the first information and the second information to obtain target data, wherein the target data comprises a node name, a node label, an initial node, a termination node and a relationship;
and carrying out visualization processing on the target data to obtain a target knowledge graph.
6. An industrial product quality supervision apparatus, comprising:
the preprocessing unit is used for preprocessing the acquired industrial product associated data to acquire structured data and unstructured data;
the extraction unit is used for carrying out data extraction on the structured data to obtain the structured data entities and the relation among the entities;
the first storage unit is used for storing the structured entities and the relations among the entities in a triple form to obtain first information;
the identification unit is used for carrying out entity identification on the unstructured data based on a target model, obtaining enterprise knowledge triples with quality problems and storing the enterprise knowledge triples with the quality problems as second information;
and the generating unit is used for generating a target knowledge graph according to the first information and the second information.
7. The apparatus of claim 6, wherein the pre-processing unit comprises:
an acquisition subunit, configured to collect industrial product associated data, wherein the industrial product associated data at least comprises enterprise information of an enterprise information platform, product supervision information published by an industrial product quality safety supervision platform and a website, spot check public information and public opinion report data of a news website;
the screening subunit is used for screening the industrial product associated data to obtain initial structured and unstructured data;
and the preprocessing subunit is used for preprocessing the initial structured data and the initial unstructured data to obtain structured data and unstructured data.
8. The apparatus of claim 6, further comprising:
the data acquisition unit is used for acquiring a target data set, wherein the target data set is a data set obtained by performing text extraction and labeling on public opinion report data and spot check public notice data, and the target data set comprises a training set, a test set and a verification set;
and the training unit is used for carrying out neural network training on the training set of the target data set to obtain a target model, and the neural network structure comprises a forward long short-term memory network and a backward long short-term memory network.
9. The apparatus of claim 6, wherein the identification unit comprises:
the format processing subunit is used for performing sentence-splitting formatting on the unstructured data to obtain a sentence-splitting result;
and the recognition subunit is used for inputting the sentence-splitting result into the target model for entity recognition.
10. The apparatus of claim 6, wherein the generating unit comprises:
the conversion subunit is configured to perform data format conversion on the triples in the first information and the second information to obtain target data, where the target data includes a node name, a node label, a start node, a stop node, and a relationship;
and the visualization processing subunit is used for performing visualization processing on the target data to obtain a target knowledge graph.
CN202110969469.6A 2021-08-23 2021-08-23 Industrial product quality safety supervision method and device Pending CN113609848A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110969469.6A CN113609848A (en) 2021-08-23 2021-08-23 Industrial product quality safety supervision method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110969469.6A CN113609848A (en) 2021-08-23 2021-08-23 Industrial product quality safety supervision method and device

Publications (1)

Publication Number Publication Date
CN113609848A true CN113609848A (en) 2021-11-05

Family

ID=78309200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110969469.6A Pending CN113609848A (en) 2021-08-23 2021-08-23 Industrial product quality safety supervision method and device

Country Status (1)

Country Link
CN (1) CN113609848A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110489560A (en) * 2019-06-19 2019-11-22 民生科技有限责任公司 Small and micro enterprise portrait generation method and device based on knowledge graph technology
CN111428054A (en) * 2020-04-14 2020-07-17 中国电子科技网络信息安全有限公司 Construction and storage method of knowledge graph in network space security field
CN111488465A (en) * 2020-04-14 2020-08-04 税友软件集团股份有限公司 Knowledge graph construction method and related device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
姜宇星 等: "基于大数据的市场监管知识图谱研究", 《江苏科技信息》, no. 18, 30 June 2020 (2020-06-30), pages 10 - 14 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252201A (en) * 2023-11-17 2023-12-19 山东山大华天软件有限公司 Knowledge-graph-oriented discrete manufacturing industry process data extraction method and system
CN117252201B (en) * 2023-11-17 2024-02-27 山东山大华天软件有限公司 Knowledge-graph-oriented discrete manufacturing industry process data extraction method and system

Similar Documents

Publication Publication Date Title
CN110968699B (en) Logic map construction and early warning method and device based on fact recommendation
CN107491531B (en) Chinese network comment sensibility classification method based on integrated study frame
CN110910243B (en) Property right transaction method based on reconfigurable big data knowledge map technology
CN111709235A (en) Text data statistical analysis system and method based on natural language processing
CN112632989B (en) Method, device and equipment for prompting risk information in contract text
CN105117426B (en) A kind of intellectual coded searching method of customs
CN113505242A (en) Method and system for automatically embedding knowledge graph
CN113656805B (en) Event map automatic construction method and system for multi-source vulnerability information
CN113051365A (en) Industrial chain map construction method and related equipment
CN112163424A (en) Data labeling method, device, equipment and medium
CN105740353A (en) Calculation method and system for relevance degree of individual share and article
CN110674164A (en) Method for natural language query and intelligent report generation facing main data
CN112182145A (en) Text similarity determination method, device, equipment and storage medium
CN115545671A (en) Method and system for structured processing of laws and regulations
US11295078B2 (en) Portfolio-based text analytics tool
CN116542800A (en) Intelligent financial statement analysis system based on cloud AI technology
CN113379432B (en) Sales system customer matching method based on machine learning
CN117435777B (en) Automatic construction method and system for industrial chain map
CN113609848A (en) Industrial product quality safety supervision method and device
CN117271557A (en) SQL generation interpretation method, device, equipment and medium based on business rule
CN117077668A (en) Risk image display method, apparatus, computer device, and readable storage medium
CN111104422A (en) Training method, device, equipment and storage medium of data recommendation model
CN113254623B (en) Data processing method, device, server, medium and product
Hu et al. A classification model of power operation inspection defect texts based on graph convolutional network
CN109033133A (en) Event detection and tracking based on Feature item weighting growth trend

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination