CN112256840A - Device for carrying out industrial internet discovery and extracting information by improving transfer learning model

Info

Publication number
CN112256840A
Authority
CN
China
Prior art keywords
model
industrial internet
classification
sentence
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011256306.5A
Other languages
Chinese (zh)
Inventor
林飞
汪致伦
王丹
易永波
古元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Act Technology Development Co ltd
Original Assignee
Beijing Act Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Act Technology Development Co ltd
Priority to CN202011256306.5A
Publication of CN112256840A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Disclosed is a device that discovers industrial internet platforms and extracts information using an improved transfer learning model, relating to the technical field of information. The device consists of a web crawler, a text cleaning module, a content classification execution module, an improved transfer learning model and an entity identification module. First, the invention does not require massive amounts of labeled text for training, which saves substantial labor cost; second, it is not affected by word segmentation and can obtain more relevant text features for classifying websites and extracting key service information from industrial internet platform websites.

Description

Device for carrying out industrial internet discovery and extracting information by improving transfer learning model
Technical Field
The invention relates to the technical field of information, in particular to the technical field of information security.
Background
As manufacturing moves ever faster from the digitalization stage to the networking stage, industrial internet platforms in China are emerging rapidly, and discovering and managing platform information in a timely manner has become an urgent problem. The internet contains many types of websites; the first problem is how to automatically find industrial internet platform websites among a large number of websites, and the second is how to extract key platform information from the content of those websites.
At present, industrial internet platform information is mainly collected manually, which wastes manpower and time, so a method for automatically discovering and extracting platform information is urgently needed.
In recent years, the rapid development of artificial intelligence technology has brought considerable progress in the field of natural language processing, where text classification is used to classify texts with different characteristics, and named entity recognition is mainly used for information extraction and for structuring text data.
Current website classification methods are mainly based on traditional machine learning algorithms or on deep learning. A traditional machine learning approach, such as invention patent CN106168968A, determines the website category by calculating the weight of the data matched against a dictionary. Because the dictionary is difficult to construct and the types of websites are numerous, such algorithms struggle to classify websites accurately from a dictionary. A deep learning approach, such as invention patent CN110442823A, requires a large number of training samples to train the parameters of the neural network; collecting that many samples takes a long time and consumes a large amount of human resources.
In the prior art, named entity recognition methods are likewise mainly based on traditional machine learning or on deep learning. A method based on traditional machine learning, such as invention patent CN111274804A, learns a statistical model from labeled data; the data to be predicted is fed to the model, which computes the most probable entities using the Viterbi algorithm. Its biggest defect is that it cannot understand semantics and therefore cannot handle complex entity recognition tasks. A method based on deep learning, such as invention patent CN111126068A, builds a neural network model to learn semantic features and can learn more complex semantics, but it needs a large amount of labeled data, and data labeling is very time-consuming and labor-intensive.
In view of the high complexity, high implementation cost and heavy labor consumption of the prior art, the device of the invention improves the transfer learning model: by sharing the per-layer computation parameters of the transfer learning model, it improves the model's computational efficiency, so that classified industrial internet sample data can be modeled quickly to obtain an industrial internet classification model. Real-time data is then obtained through network information capture and data cleaning and is input into the industrial internet classification model to obtain its industrial internet classification. Key information is then captured from the real-time data to obtain updated industrial internet sample data, which is merged into the classified industrial internet sample data. The invention thus completes industrial internet classification and information capture automatically throughout the whole process, and gradually corrects and enriches the classified industrial internet sample data, so that the industrial internet classification model continuously evolves and improves. The invention is characterized by high efficiency and real-time performance.
Description of general techniques used
Transfer learning model: the transfer learning model used in this patent application is StructBERT, an NLP pre-training model proposed by the Alibaba DAMO Academy that improves on the original BERT. Its authors consider that BERT's pre-training tasks ignore language structure information, so StructBERT adds two language-structure-based training objectives to BERT's original masked language model (MaskLM) objective: a word order task and a sentence order task.
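The two structural objectives can be illustrated by how their training samples might be built. The following is a minimal sketch, not the applicant's implementation: it uses the 5% trigram shuffling and the 1/3-1/3-1/3 sentence sampling stated later in this application, while the function names, the label numbering, the use of non-overlapping trigram slots and the assumption that every input token is unmasked are illustrative choices.

```python
import random

def shuffle_trigrams(tokens, fraction=0.05, rng=None):
    """Shuffle a fraction of 3-token spans; return (new_tokens, span_starts).

    Assumption: all tokens passed in are unmasked, and candidate spans are
    taken as non-overlapping slots so that shuffles never interfere.
    """
    rng = rng or random.Random()
    tokens = list(tokens)
    starts = list(range(0, len(tokens) - 2, 3))   # candidate trigram starts
    if not starts:
        return tokens, []
    k = max(1, round(len(starts) * fraction))     # about 5% of the trigrams
    chosen = sorted(rng.sample(starts, k))
    for s in chosen:
        span = tokens[s:s + 3]
        rng.shuffle(span)                         # the model must restore the order
        tokens[s:s + 3] = span
    return tokens, chosen

def sample_sentence_pair(doc, index, other_doc, rng=None):
    """Return (s1, s2, label) for the sentence structure task.

    Hypothetical labels: 0 = s2 follows s1, 1 = s2 precedes s1,
    2 = s2 comes from another document. Each case is drawn with
    probability 1/3; at a document boundary this sketch falls back to
    the "other document" case, which slightly skews the ratio.
    """
    rng = rng or random.Random()
    s1 = doc[index]
    choice = rng.randrange(3)
    if choice == 0 and index + 1 < len(doc):
        return s1, doc[index + 1], 0
    if choice == 1 and index > 0:
        return s1, doc[index - 1], 1
    return s1, rng.choice(other_doc), 2
```

In a real pre-training run the shuffled spans and the pair labels would become prediction targets alongside the masked language model targets.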
Named entity recognition: named entity recognition refers to recognizing specific objects in text whose semantic categories are usually predefined before recognition, such as people, addresses and organizations. Named entity recognition is not only an independent information extraction task; it also plays a key role in many large NLP applications such as information retrieval, automatic text summarization, question answering systems, machine translation and knowledge base construction.
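To make the output of named entity recognition concrete, the sketch below groups per-token labels into entity spans. It assumes the common BIO tagging convention, which this application does not specify; the tag names `ORG` and `SERVICE` and the example sentence are hypothetical.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity still open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent tag closes the open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities

tokens = ["Acme", "Corp", "offers", "an", "industrial", "internet", "platform"]
tags = ["B-ORG", "I-ORG", "O", "O", "B-SERVICE", "I-SERVICE", "I-SERVICE"]
print(bio_to_entities(tokens, tags))
```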
Disclosure of Invention
In view of the defects of the prior art, the device for discovering the industrial internet and extracting the information by the improved transfer learning model provided by the invention consists of a web crawler, a text cleaning module, a content classification execution module, the improved transfer learning model and an entity identification module;
the web crawler is responsible for crawling web page content and sending the web page content and the web page address to the text cleaning module;
the text cleaning module is responsible for removing noise characters in a text formed by the webpage content and the webpage address to generate clean webpage information, and the text cleaning module sends the clean webpage information to the content classification execution module; the noise characters include: html tags, stop words, forwarding symbols, urls and marking information;
the content classification execution module comprises an industrial internet classification model, and the industrial internet classification model is obtained by performing language training on classified internet sample data through an improved transfer learning model; the industrial internet classification model consists of classification labels of classified internet sample data and the probability that the content of the classified internet sample data belongs to each classification label;
the algorithm of the improved transfer learning model is as follows: 1) StructBERT is used to represent each word of each sentence in the text, and a bidirectional Transformer then learns from the represented text; the Transformer is a standard component of StructBERT; in a conventional Transformer the parameters of each layer are independent, so the number of parameters grows markedly as the number of layers increases, whereas the improved model shares parameters across all layers and therefore learns only one layer's worth of parameters; 2) each word in the improved StructBERT is represented jointly by a word vector, a segment vector and a position vector; the word vector of the first token is used for the subsequent classification task, the segment vector distinguishes the two sentences, and the position vector encodes word position information; 3) semantic features are learned through four training tasks: i) a masked language model task, ii) a next sentence prediction task, iii) a word order task, and iv) a sentence structure task; in the masked language model task, 15% of the words are randomly selected during training for the model to predict, of which 80% are replaced by a mask symbol, 10% are left unchanged and 10% are replaced by other words, and through this task the model learns the semantic information of the text; the next sentence prediction task lets the model learn the relationship between sentences: the training input is a pair of sentences S1 and S2, where S2 is the actual next sentence of S1 with probability one half, and the model predicts whether S2 is the next sentence of S1; the word order task selects 5% of the length-3 subsequences of the unmasked sequence, shuffles the words within each selected subsequence and has the model reconstruct the original order, so that the model learns word order relations within sentences; in the sentence structure task, given a sentence pair (S1, S2), the model judges whether S2 follows S1, precedes S1, or is unrelated to S1; when sampling, for a sentence S, with probability 1/3 the next sentence of S is sampled to form the pair, with probability 1/3 the previous sentence of S is sampled, and with probability 1/3 a sentence is randomly sampled from another document;
the content classification execution module compares the clean webpage information with the industrial internet classification model, discards clean webpage information that does not belong to the industrial internet classification, and sends clean webpage information that belongs to the industrial internet classification to the entity identification module;
the entity identification module comprises an entity category model, the entity category model is obtained by performing language training on classified industrial internet sample data with entity category labels through an improved transfer learning model, and the entity category model is composed of the classification labels of the classified industrial internet sample data with the entity category labels and the probability that the content of the classified industrial internet sample data with the entity category labels belongs to each classification label;
the entity identification module compares the clean webpage information with the entity category model, outputs the content in the clean webpage information together with its corresponding entity category label, and generates updated classified industrial internet data with entity category labels;
the entity identification module merges the updated classified industrial internet data with entity category labels into the classified industrial internet sample data with entity category labels.
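Two points of the algorithm above lend themselves to a small numeric illustration: the effect of sharing parameters across layers, and the 15% selection with its 80/10/10 replacement rule in the masked language model task. The sketch below is illustrative only; the layer count, the per-layer parameter figure, the `[MASK]` symbol and the function names are assumptions, not values taken from this application.

```python
import random

def encoder_param_count(num_layers, params_per_layer, share_layers):
    """Parameters held by the stacked encoder layers.

    With cross-layer sharing a single set of layer parameters is reused
    at every depth, so the count no longer grows with num_layers.
    """
    return params_per_layer if share_layers else num_layers * params_per_layer

def mask_tokens(tokens, vocab, mask_token="[MASK]", select_prob=0.15, rng=None):
    """Build masked-language-model inputs and prediction targets.

    Each token is selected with probability 15%. A selected token is
    replaced by the mask symbol 80% of the time, kept unchanged 10% of
    the time, and replaced by a random vocabulary word 10% of the time.
    """
    rng = rng or random.Random()
    inputs = list(tokens)
    labels = [None] * len(tokens)           # None = position is not predicted
    for i, tok in enumerate(tokens):
        if rng.random() >= select_prob:
            continue
        labels[i] = tok                     # the model must recover the original
        r = rng.random()
        if r < 0.8:
            inputs[i] = mask_token
        elif r < 0.9:
            pass                            # keep the original token
        else:
            inputs[i] = rng.choice(vocab)   # random replacement
    return inputs, labels

# Hypothetical sizes: 12 layers with 7,000,000 parameters each.
print(encoder_param_count(12, 7_000_000, share_layers=False))
print(encoder_param_count(12, 7_000_000, share_layers=True))
```

Sharing reduces the number of stored and trained layer parameters, not the number of layer applications at inference time, since the same layer is still applied once per depth.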
Advantageous effects
Compared with the traditional text classification and information extraction technology, the method does not need massive texts with labels for training, and saves a large amount of labor cost; and secondly, the method is not influenced by word segmentation, and more relevant text features can be obtained for website classification and key service information extraction of the industrial internet platform website.
Drawings
FIG. 1 is a system block diagram of the present invention.
Detailed Description
With reference to FIG. 1, the device provided by the invention for industrial internet discovery and information extraction using an improved transfer learning model is composed of a web crawler 1, a text cleaning module 2, a content classification execution module 3, an improved transfer learning model 4 and an entity identification module 5;
the web crawler 1 is responsible for crawling web page contents and sending the web page contents and the web page addresses 10 to the text cleaning module 2;
the text cleaning module 2 is responsible for removing noise characters in the text formed by the webpage content and the webpage address 10 to generate clean webpage information, and the text cleaning module 2 sends the clean webpage information to the content classification execution module 3; the noise characters include: html tags, stop words, forwarding symbols, urls and marking information;
the content classification execution module 3 comprises an industrial internet classification model 41, and the industrial internet classification model 41 is obtained by performing language training on classified internet sample data 40 through an improved transfer learning model 4; the industrial internet classification model 41 is composed of classification labels of the classified internet sample data 40 and probabilities that the contents of the classified internet sample data 40 belong to each classification label;
the algorithm of the improved transfer learning model 4 is as follows: 1) StructBERT is used to represent each word of each sentence in the text, and a bidirectional Transformer then learns from the represented text; the Transformer is a standard component of StructBERT; in a conventional Transformer the parameters of each layer are independent, so the number of parameters grows markedly as the number of layers increases, whereas the improved model shares parameters across all layers and therefore learns only one layer's worth of parameters; 2) each word in the improved StructBERT is represented jointly by a word vector, a segment vector and a position vector; the word vector of the first token is used for the subsequent classification task, the segment vector distinguishes the two sentences, and the position vector encodes word position information; 3) semantic features are learned through four training tasks: i) a masked language model task, ii) a next sentence prediction task, iii) a word order task, and iv) a sentence structure task; in the masked language model task, 15% of the words are randomly selected during training for the model to predict, of which 80% are replaced by a mask symbol, 10% are left unchanged and 10% are replaced by other words, and through this task the model learns the semantic information of the text; the next sentence prediction task lets the model learn the relationship between sentences: the training input is a pair of sentences S1 and S2, where S2 is the actual next sentence of S1 with probability one half, and the model predicts whether S2 is the next sentence of S1; the word order task selects 5% of the length-3 subsequences of the unmasked sequence, shuffles the words within each selected subsequence and has the model reconstruct the original order, so that the model learns word order relations within sentences; in the sentence structure task, given a sentence pair (S1, S2), the model judges whether S2 follows S1, precedes S1, or is unrelated to S1; when sampling, for a sentence S, with probability 1/3 the next sentence of S is sampled to form the pair, with probability 1/3 the previous sentence of S is sampled, and with probability 1/3 a sentence is randomly sampled from another document;
the content classification execution module 3 compares the clean webpage information with the industrial internet classification model 41, discards clean webpage information that does not belong to the industrial internet classification, and sends clean webpage information that belongs to the industrial internet classification to the entity identification module 5;
the entity identification module 5 comprises an entity category model 51, the entity category model 51 is obtained by language training of the classified industrial internet sample data 50 with the entity category label through the improved transfer learning model 4, and the entity category model 51 is composed of the classification label of the classified industrial internet sample data 50 with the entity category label and the probability that the content of the classified industrial internet sample data 50 with the entity category label belongs to each classification label;
the entity identification module 5 compares the clean webpage information with the entity category model 51, outputs the content in the clean webpage information together with its corresponding entity category label, and generates updated classified industrial internet data 52 with entity category labels;
the entity identification module 5 merges the updated classified industrial internet data 52 with entity category labels into the classified industrial internet sample data 50 with entity category labels.
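The data flow between the modules described above can be sketched end to end. This is a simplified illustration under stated assumptions rather than the claimed device: the regular expressions, the stop-word list and the label `industrial_internet` are hypothetical, and the callables `classify` and `recognize` merely stand in for the trained industrial internet classification model 41 and entity category model 51.

```python
import re

STOP_WORDS = {"the", "a", "an", "of", "and"}   # hypothetical stop-word list

def clean_text(page_text):
    """Text cleaning module: strip HTML tags, URLs and stop words."""
    text = re.sub(r"<[^>]+>", " ", page_text)      # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)      # remove URLs
    words = [w for w in text.split() if w.lower() not in STOP_WORDS]
    return " ".join(words)

def run_pipeline(pages, classify, recognize, sample_data,
                 target_label="industrial_internet"):
    """Crawl output -> cleaning -> classification -> entity recognition.

    pages: iterable of (url, html) pairs produced by the web crawler.
    classify / recognize: stand-ins for the trained models.
    sample_data: list that accumulates labeled samples (the feedback loop).
    """
    for url, html in pages:
        clean = clean_text(f"{html} {url}")
        if classify(clean) != target_label:
            continue                               # discard non-matching pages
        entities = recognize(clean)
        sample_data.append((clean, entities))      # enrich the sample data
    return sample_data

# Toy stand-ins for the two trained models.
classify = lambda t: "industrial_internet" if "industrial" in t else "other"
recognize = lambda t: [("platform", "SERVICE")] if "platform" in t else []

pages = [("http://a.example", "<p>industrial platform</p>"),
         ("http://b.example", "<p>cooking blog</p>")]
print(run_pipeline(pages, classify, recognize, []))
```

Appending the recognized entities back into `sample_data` mirrors the last step of the description, where updated data 52 is merged into sample data 50 so that later retraining sees a larger labeled set.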

Claims (1)

1. The device for carrying out industrial internet discovery and extracting information by improving the transfer learning model is characterized by consisting of a web crawler, a text cleaning module, a content classification execution module, an improved transfer learning model and an entity identification module;
the web crawler is responsible for crawling web page content and sending the web page content and the web page address to the text cleaning module;
the text cleaning module is responsible for removing noise characters in a text formed by the webpage content and the webpage address to generate clean webpage information, and the text cleaning module sends the clean webpage information to the content classification execution module; the noise characters include: html tags, stop words, forwarding symbols, urls and marking information;
the content classification execution module comprises an industrial internet classification model, and the industrial internet classification model is obtained by performing language training on classified internet sample data through an improved transfer learning model; the industrial internet classification model consists of classification labels of classified internet sample data and the probability that the content of the classified internet sample data belongs to each classification label;
the algorithm of the improved transfer learning model is as follows: 1) StructBERT is used to represent each word of each sentence in the text, and a bidirectional Transformer then learns from the represented text; the Transformer is a standard component of StructBERT; in a conventional Transformer the parameters of each layer are independent, so the number of parameters grows markedly as the number of layers increases, whereas the improved model shares parameters across all layers and therefore learns only one layer's worth of parameters; 2) each word in the improved StructBERT is represented jointly by a word vector, a segment vector and a position vector; the word vector of the first token is used for the subsequent classification task, the segment vector distinguishes the two sentences, and the position vector encodes word position information; 3) semantic features are learned through four training tasks: i) a masked language model task, ii) a next sentence prediction task, iii) a word order task, and iv) a sentence structure task; in the masked language model task, 15% of the words are randomly selected during training for the model to predict, of which 80% are replaced by a mask symbol, 10% are left unchanged and 10% are replaced by other words, and through this task the model learns the semantic information of the text; the next sentence prediction task lets the model learn the relationship between sentences: the training input is a pair of sentences S1 and S2, where S2 is the actual next sentence of S1 with probability one half, and the model predicts whether S2 is the next sentence of S1; the word order task selects 5% of the length-3 subsequences of the unmasked sequence, shuffles the words within each selected subsequence and has the model reconstruct the original order, so that the model learns word order relations within sentences; in the sentence structure task, given a sentence pair (S1, S2), the model judges whether S2 follows S1, precedes S1, or is unrelated to S1; when sampling, for a sentence S, with probability 1/3 the next sentence of S is sampled to form the pair, with probability 1/3 the previous sentence of S is sampled, and with probability 1/3 a sentence is randomly sampled from another document;
the content classification execution module compares the clean webpage information with the industrial internet classification model, discards clean webpage information that does not belong to the industrial internet classification, and sends clean webpage information that belongs to the industrial internet classification to the entity identification module;
the entity identification module comprises an entity category model, the entity category model is obtained by performing language training on classified industrial internet sample data with entity category labels through an improved transfer learning model, and the entity category model is composed of the classification labels of the classified industrial internet sample data with the entity category labels and the probability that the content of the classified industrial internet sample data with the entity category labels belongs to each classification label;
the entity identification module compares the clean webpage information with the entity category model, outputs the content in the clean webpage information together with its corresponding entity category label, and generates updated classified industrial internet data with entity category labels;
the entity identification module merges the updated classified industrial internet data with entity category labels into the classified industrial internet sample data with entity category labels.
CN202011256306.5A 2020-11-12 2020-11-12 Device for carrying out industrial internet discovery and extracting information by improving transfer learning model Pending CN112256840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011256306.5A CN112256840A (en) 2020-11-12 2020-11-12 Device for carrying out industrial internet discovery and extracting information by improving transfer learning model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011256306.5A CN112256840A (en) 2020-11-12 2020-11-12 Device for carrying out industrial internet discovery and extracting information by improving transfer learning model

Publications (1)

Publication Number Publication Date
CN112256840A (en) 2021-01-22

Family

ID=74265439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011256306.5A Pending CN112256840A (en) 2020-11-12 2020-11-12 Device for carrying out industrial internet discovery and extracting information by improving transfer learning model

Country Status (1)

Country Link
CN (1) CN112256840A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107451433A (en) * 2017-06-27 2017-12-08 中国科学院信息工程研究所 A kind of information source identification method and apparatus based on content of text
CN111078978A (en) * 2019-11-29 2020-04-28 上海观安信息技术股份有限公司 Web credit website entity identification method and system based on website text content
CN111428981A (en) * 2020-03-18 2020-07-17 国电南瑞科技股份有限公司 Deep learning-based power grid fault plan information extraction method and system
CN111739520A (en) * 2020-08-10 2020-10-02 腾讯科技(深圳)有限公司 Speech recognition model training method, speech recognition method and device
CN111767732A (en) * 2020-06-09 2020-10-13 上海交通大学 Document content understanding method and system based on graph attention model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DRUGAI: "ICLR 2020 | StructBERT: a BERT model incorporating language structure", pages 2, Retrieved from the Internet <URL:https://blog.csdn.net/u012325865/article/details/106464621?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522170659530716800213024812%2522%252C%2522scm%2522%253A%252220140713.130102334.pc%255Fall.%2522%257D&request_id=170659530716800213024812&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~first_rank_ecpm_v1~rank_v31_ecpm-11-106464621-null-null.142^v99^pc_search_result_base6&utm_term=structbert&spm=1018.2226.3001.4187> *

Similar Documents

Publication Publication Date Title
CN110110054B (en) Method for acquiring question-answer pairs from unstructured text based on deep learning
CN108875051B (en) Automatic knowledge graph construction method and system for massive unstructured texts
CN111159395B (en) Chart neural network-based rumor standpoint detection method and device and electronic equipment
CN112989841B (en) Semi-supervised learning method for emergency news identification and classification
CN106886580B (en) Image emotion polarity analysis method based on deep learning
CN110009430B (en) Cheating user detection method, electronic device and computer readable storage medium
CN108984775B (en) Public opinion monitoring method and system based on commodity comments
CN111767725A (en) Data processing method and device based on emotion polarity analysis model
CN110633366A (en) Short text classification method, device and storage medium
CN112199606B (en) Social media-oriented rumor detection system based on hierarchical user representation
CN113434688B (en) Data processing method and device for public opinion classification model training
CN113806547B (en) Deep learning multi-label text classification method based on graph model
CN115292568B (en) Civil news event extraction method based on joint model
CN116150509B (en) Threat information identification method, system, equipment and medium for social media network
CN111651566A (en) Multi-task small sample learning-based referee document dispute focus extraction method
CN112257444A (en) Financial information negative entity discovery method and device, electronic equipment and storage medium
CN113378024B (en) Deep learning-oriented public inspection method field-based related event identification method
CN112579730A (en) High-expansibility multi-label text classification method and device
CN111400617B (en) Social robot detection data set extension method and system based on active learning
CN117520561A (en) Entity relation extraction method and system for knowledge graph construction in helicopter assembly field
CN116775880A (en) Multi-label text classification method and system based on label semantics and transfer learning
CN112256840A (en) Device for carrying out industrial internet discovery and extracting information by improving transfer learning model
CN115878800A (en) Double-graph neural network fusing co-occurrence graph and dependency graph and construction method thereof
CN113806538B (en) Label extraction model training method, device, equipment and storage medium
CN115129875A (en) Building accident report classification system and method based on graph neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination