CN112651513A - Information extraction method and system based on zero sample learning - Google Patents

Information extraction method and system based on zero sample learning

Info

Publication number
CN112651513A
Authority
CN
China
Prior art keywords
information extraction
server
information
training
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011527869.3A
Other languages
Chinese (zh)
Inventor
洪万福
钱智毅
黄海龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Yuanting Information Technology Co ltd
Original Assignee
Xiamen Yuanting Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Yuanting Information Technology Co ltd filed Critical Xiamen Yuanting Information Technology Co ltd
Priority to CN202011527869.3A priority Critical patent/CN112651513A/en
Publication of CN112651513A publication Critical patent/CN112651513A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 - Ontology
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/044 - Recurrent networks, e.g. Hopfield networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to an information extraction method and system based on zero sample learning, wherein the method comprises the following steps: S1: a user packages information extraction related resources through a client to generate an information extraction request, and sends the information extraction request to a server; S2: after receiving the information extraction request, the server performs simulated training according to the information extraction related resources in the request, and returns state information for each stage of the training process; S3: the server evaluates the model obtained from the simulated training; S4: the user sends a state query request to the server through the client, and the server queries the training state according to the machine learning unique ID in the received state query request; S5: the user sends an automatic stop request to the server through the client, and the server stops the corresponding simulated training according to the machine learning unique ID. With the invention, a user does not need to manually label new types of information, and zero-sample information extraction can be performed simply by importing existing information resources.

Description

Information extraction method and system based on zero sample learning
Technical Field
The invention relates to the field of information extraction, in particular to an information extraction method and system based on zero sample learning.
Background
With the arrival of a new wave of artificial intelligence in recent years, machine learning and deep learning technologies have been applied across industries and fields. Because data grows rapidly and comes in diverse forms, the problem of information overload is becoming increasingly serious, so quickly and accurately acquiring the required key information has become a central challenge. Information extraction technology extracts the important information contained in a text by extracting specified types of factual information, such as entities, relationships and events, from natural language text. Current information extraction methods are mainly divided into supervised and unsupervised learning methods; although both can complete the information extraction task, their greatest drawback is that they require a large amount of manually labeled training data and therefore consume considerable labor cost. Learning from only a few examples, or even from zero samples, so that the features of an object are derived solely from its semantic description, remains a key challenge for information extraction. Despite recent advances in important areas such as vision and language, standard supervised deep learning does not provide a satisfactory solution for rapidly learning new concepts from zero or a small number of samples.
To address these drawbacks, the industry has developed technologies that reduce the number of training samples and improve training efficiency, such as machine reasoning and pattern learning, but such technologies still require a certain number of training samples to train class-specific information into a model before classification, prediction and extraction can be performed on test samples.
Disclosure of Invention
In order to solve the above problems, the present invention provides an information extraction method and system based on zero sample learning.
The specific scheme is as follows:
An information extraction method based on zero sample learning comprises the following steps:
S1: a user packages information extraction related resources through a client to generate an information extraction request, and sends the information extraction request to a server;
S2: after receiving the information extraction request sent by the client, the server performs simulated training according to the information extraction related resources in the request, and returns state information for each stage of the training process;
S3: the server evaluates the model obtained from the simulated training;
S4: the user sends a state query request to the server through the client, and the server queries the training state according to the machine learning unique ID in the received state query request; if training is finished, the trained model is returned; if training is not finished, the current training status is returned;
S5: the user sends an automatic stop request to the server through the client, and the server stops the simulated training corresponding to the machine learning unique ID in the received automatic stop request.
Further, the information extraction related resources comprise, on the one hand, the data set, model, algorithm and parameters used in the machine learning process and, on the other hand, meta-information of the information extraction, the meta-information including the machine learning unique ID.
Further, the process of simulated training comprises the following steps:
S201: preprocessing the data set in the information extraction related resources;
S202: performing vectorized encoding on the preprocessed data set;
S203: performing zero sample learning on the vectorized data set through a learning engine, specifically: when no test data set is specified in the data set, the learning engine automatically generates a corresponding test data set by splitting the data in the data set; and a suitable algorithm model is selected from a plurality of algorithm models built into the learning engine for simulated training according to three factors: the size of the test data set, the distribution characteristics of the test data set, and the load balance of the server.
Further, the evaluation compares two types of indicators: algorithm indicators, including accuracy, recall, F1 value, AUC and confusion matrix; and performance indicators, including total time, training time, CPU usage, GPU usage, memory consumption, hard disk IO and network IO.
Further, the results and intermediate information generated during simulated training can be retrieved and sent to a display interface for display.
An information extraction system based on zero sample learning comprises a client and a server, wherein the client and the server each comprise a processor, a memory and a computer program stored in the memory and runnable on the processor, and the processor implements the steps of the method of the embodiments of the invention when executing the computer program.
By adopting the above technical scheme, the invention realizes a general information extraction scheme: zero-sample information extraction can be achieved simply by importing existing information resources, without manually labeling new types of information.
Drawings
Fig. 1 is a flowchart illustrating a first embodiment of the present invention.
Detailed Description
To further illustrate the various embodiments, the invention provides the accompanying drawings. The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the embodiments. Those skilled in the art will appreciate still other possible embodiments and advantages of the present invention with reference to these figures.
The invention will now be further described with reference to the accompanying drawings and detailed description.
The first embodiment is as follows:
The embodiment of the invention provides an information extraction method based on zero sample learning for extracting information in a specific field. As shown in fig. 1, the method comprises the following steps:
S1: the user packages the information extraction related resources through the client to generate an information extraction request, and sends the information extraction request to the server.
The information extraction related resources are the resources related to the machine learning process within the information extraction process and include data, models, algorithms, parameters and the like. In this embodiment they specifically comprise two parts: the first is the meta-information of the information extraction, which includes the machine learning unique ID, name, description, creator, creation time, permissions and the like; the second is the information extraction resources themselves, including the data set, an evaluator (optional), parameter options (optional) and the like. In this embodiment the data set is text data from the user's field of interest.
An example of the data format of the information extraction request is as follows:
(The request data format example appears as images in the original publication: Figure BDA0002851180980000041 and Figure BDA0002851180980000051.)
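The images are not reproduced here. As a rough, hedged illustration only, the following Python sketch shows how a client might package the meta-information and resources described above into a request; every field name (task_id, dataset, parameter_options and so on) is an assumption for illustration, not the format used in the patent.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical request payload; the field names are illustrative assumptions,
# not the actual format shown in the original publication.
extraction_request = {
    "meta": {
        "task_id": str(uuid.uuid4()),            # machine learning unique ID
        "name": "contract-entity-extraction",
        "description": "Extract entities from contract texts",
        "creator": "analyst01",
        "created_at": datetime.now(timezone.utc).isoformat(),
        "permissions": ["read", "train"],
    },
    "resources": {
        "dataset": {"type": "text", "uri": "datasets/contracts.jsonl"},
        "evaluator": None,                        # optional
        "parameter_options": {"max_epochs": 20},  # optional
    },
}

# The client serializes the packaged resources and sends them to the server.
payload = json.dumps(extraction_request, ensure_ascii=False)
print(payload[:120], "...")
```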
s2: after receiving an information extraction request sent by a client, the server extracts related resources according to the information in the information extraction request to perform simulated training, and returns state information of each stage in the training process. The state information includes the machine learning unique ID and whether the simulated training process was successfully started (in this embodiment, whether the learning engine was successfully started).
The process of simulated training comprises the following steps:
S201: preprocess the data set in the information extraction related resources.
The preprocessing in this embodiment includes data cleaning, data integration, data reduction, data transformation, and the like.
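As a minimal sketch of what such preprocessing could look like for a text data set, assuming a pandas DataFrame with a hypothetical text column (the patent does not prescribe concrete operations):

```python
import pandas as pd

def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning, reduction and transformation steps (integration omitted)."""
    df = df.copy()
    # Data cleaning: strip whitespace, drop empty and duplicated samples.
    df["text"] = df["text"].astype(str).str.strip()
    df = df[df["text"] != ""].drop_duplicates(subset="text")
    # Data reduction: keep only the columns needed for training.
    df = df[["text", "label"]] if "label" in df.columns else df[["text"]]
    # Data transformation: normalize internal whitespace.
    df["text"] = df["text"].str.replace(r"\s+", " ", regex=True)
    return df.reset_index(drop=True)

sample = pd.DataFrame({"text": ["  Acme Corp  signed a deal. ", "", "Acme Corp  signed a deal."]})
print(preprocess(sample))
```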
S202: vectorizing coding is carried out on the preprocessed data set.
Vectorization coding comprises word vector coding, semantic coding, entity coding and the like on text information, feature coding and the like on picture video audio data.
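A minimal sketch of word-level vectorized encoding, using a plain TF-IDF representation as a stand-in; the patent does not name a specific encoder, so this is only one possible choice:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "Acme Corp signed a supply agreement with Beta Ltd",
    "Beta Ltd appointed a new chief executive",
]

# Word-vector style encoding of the preprocessed texts; semantic or entity
# encoding (e.g. contextual embeddings) could replace this step.
vectorizer = TfidfVectorizer()
encoded = vectorizer.fit_transform(texts)   # sparse matrix, one row per text
print(encoded.shape, len(vectorizer.get_feature_names_out()))
```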
S203: and performing zero sample learning on the data set subjected to the vector quantization coding through a learning engine.
The zero sample learning process includes that when a test data set is not determined in the data set, a learning engine automatically generates a corresponding test data set in a mode of splitting data in the data set; selecting a proper algorithm model from a plurality of algorithm models built in the learning engine to perform simulation training according to three factors of the size of the test data set, the distribution characteristics of the test data set and the load balance of the server; and storing the trained model after training.
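For the first part of this process, automatically generating a test data set by splitting, a minimal scikit-learn sketch follows; the split ratio, stratification and random seed are assumptions, since the patent does not specify them:

```python
from sklearn.model_selection import train_test_split

def ensure_test_split(samples, labels, test_ratio=0.2, seed=42):
    """If no test data set was supplied, carve one out of the provided data."""
    return train_test_split(
        samples, labels, test_size=test_ratio, random_state=seed, stratify=labels
    )

X_train, X_test, y_train, y_test = ensure_test_split(
    samples=["doc a", "doc b", "doc c", "doc d", "doc e", "doc f"],
    labels=["ORG", "PER", "ORG", "PER", "ORG", "PER"],
)
print(len(X_train), len(X_test))
```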
The learning engine is based on Adaptive Learning theory and has a plurality of built-in algorithm models, including Bi-LSTM (Bidirectional Long Short-Term Memory), IDCNN (Iterated Dilated Convolutional Neural Networks), BERT-LSTM (a BERT, Bidirectional Encoder Representations from Transformers, language model connected to a Long Short-Term Memory network) and the like. Bi-LSTM is suited to large data volumes with balanced class labels and places low demands on the server; IDCNN, compared with Bi-LSTM, is suited to ordinary data volumes; BERT-LSTM can handle data sets with all kinds of label distributions even at extremely low data volumes, but places high demands on the server.
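The patent states only the three selection factors, so the following sketch is a purely hypothetical illustration of how the engine's choice among the built-in models might be expressed; the thresholds, the balance rule and the load scale are all invented for this example.

```python
from collections import Counter

def select_model(test_set_size, labels, server_load, small=500, large=50_000):
    """Heuristic sketch: pick a built-in model from data size, label
    distribution and server load. All thresholds are illustrative assumptions."""
    counts = Counter(labels)
    balanced = max(counts.values()) <= 2 * min(counts.values())
    if test_set_size >= large and balanced:
        return "Bi-LSTM"    # large, balanced data; low demands on the server
    if test_set_size <= small and server_load < 0.5:
        return "BERT-LSTM"  # very little data; needs a lightly loaded server
    return "IDCNN"          # ordinary data volumes

print(select_model(200, ["ORG"] * 120 + ["PER"] * 80, server_load=0.3))  # BERT-LSTM
```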
The results and intermediate information generated during simulated training can be retrieved and sent to a display interface for display.
S3: and the server evaluates the model obtained after the simulation training.
The evaluation in this embodiment includes evaluation of the effectiveness and performance of the application of the model. Specifically, a comparative analysis mode is adopted, and comparison is mainly performed based on two types of indexes: the first is algorithm indexes including accuracy, recall, F1 value, AUC, confusion matrix and the like; the second is performance index, which includes total time consumption, training time consumption, CPU utilization, GPU utilization, memory consumption, hard disk IO, network IO, and the like. The cross comparison results of the indexes can be displayed in real time through a visual interface.
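A minimal sketch of how the two kinds of indicators could be computed, using scikit-learn for the algorithm indicators and the standard library plus psutil for a few of the performance figures; GPU usage and IO counters are omitted, and the toy labels below are not from the patent:

```python
import time
import psutil
from sklearn.metrics import (accuracy_score, recall_score, f1_score,
                             roc_auc_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 1]                 # toy ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1]                 # toy model predictions
y_score = [0.9, 0.2, 0.8, 0.4, 0.3, 0.7]    # toy predicted probabilities

start = time.perf_counter()
# ... training would run here ...
training_seconds = time.perf_counter() - start

algorithm_indicators = {
    "accuracy": accuracy_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
    "auc": roc_auc_score(y_true, y_score),
    "confusion_matrix": confusion_matrix(y_true, y_pred).tolist(),
}
performance_indicators = {
    "training_seconds": training_seconds,
    "cpu_percent": psutil.cpu_percent(interval=0.1),
    "memory_percent": psutil.virtual_memory().percent,
}
print(algorithm_indicators, performance_indicators)
```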
S4: a user sends a state query request to a server through a client, the server queries a training state according to a machine learning uniqueness ID in the received state query request, and if the training state is finished, a trained model is returned; if the training status is not complete (obtained by training progress bar percentage), then the training situation is returned, such as the training progress percentage.
An example data format for the status query request is as follows:
(The status query data format example appears as an image in the original publication: Figure BDA0002851180980000071.)
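The image is not reproduced here; as an assumed illustration only, a status query keyed on the machine learning unique ID and the two possible server replies might look like the following (all field names are hypothetical):

```python
import json

# Hypothetical status query keyed on the machine learning unique ID.
status_query = {"task_id": "9f1c2c4e-0000-0000-0000-000000000000",
                "action": "query_status"}

# Two possible server replies: training still running, or finished with a model handle.
in_progress_reply = {"task_id": status_query["task_id"],
                     "finished": False, "progress_percent": 64}
finished_reply = {"task_id": status_query["task_id"],
                  "finished": True, "model_uri": "models/9f1c2c4e.bin"}

print(json.dumps(in_progress_reply))
```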
s5: the user sends an automatic stop request to the server through the client, and the server stops the simulation training corresponding to the machine learning unique ID according to the machine learning unique ID in the received automatic stop request.
An example of the data format of the autostop request is as follows:
(The automatic stop request data format example appears as an image in the original publication: Figure BDA0002851180980000072.)
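Again as an assumed illustration only, an automatic stop request carrying the same hypothetical unique ID might look like this:

```python
import json

# Hypothetical stop request; the server matches the ID to the running
# simulated-training job and shuts it down.
stop_request = {"task_id": "9f1c2c4e-0000-0000-0000-000000000000",
                "action": "stop_training"}
print(json.dumps(stop_request))
```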
Compared with the prior art, this embodiment has the following advantages:
First, optimization across machine learning platforms can be realized, giving a wider range of application;
Second, adaptive learning iteration can be carried out rapidly when a user uploads information data resources from a new industry field;
Third, the method has high availability and high extensibility: for large-scale application, only information resources from the relevant industry fields need to be uploaded, and the client side does not need to be adjusted;
Fourth, state-of-the-art information extraction algorithms are built in, so the method can be put directly into production use.
Embodiment two:
The invention also provides an information extraction system based on zero sample learning, comprising a client and a server. The client and the server each comprise a memory, a processor and a computer program stored in the memory and runnable on the processor, and the steps of the method of embodiment one are implemented when the processor executes the computer program.
Further, as an executable scheme, the client may be a mobile phone, a desktop computer, a notebook computer or another computing device, and the server may be a desktop computer, a notebook computer, a cloud server or another computing device. The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the client and the server, connecting the various parts of the entire client or server through various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the client and the server by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required by at least one function, and the data storage area may store data created during use of the client or server, and the like. In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. An information extraction method based on zero sample learning is characterized by comprising the following steps:
S1: a user packages information extraction related resources through a client to generate an information extraction request, and sends the information extraction request to a server;
S2: after receiving the information extraction request sent by the client, the server performs simulated training according to the information extraction related resources in the request, and returns state information for each stage of the training process;
S3: the server evaluates the model obtained from the simulated training;
S4: the user sends a state query request to the server through the client, and the server queries the training state according to the machine learning unique ID in the received state query request; if training is finished, the trained model is returned; if training is not finished, the current training status is returned;
S5: the user sends an automatic stop request to the server through the client, and the server stops the simulated training corresponding to the machine learning unique ID in the received automatic stop request.
2. The information extraction method based on zero sample learning according to claim 1, characterized in that: the information extraction related resources comprise, on the one hand, the data set, model, algorithm and parameters used in the machine learning process and, on the other hand, meta-information of the information extraction, the meta-information including the machine learning unique ID.
3. The information extraction method based on zero sample learning according to claim 1, characterized in that: the process of simulated training comprises the following steps:
S201: preprocessing the data set in the information extraction related resources;
S202: performing vectorized encoding on the preprocessed data set;
S203: performing zero sample learning on the vectorized data set through a learning engine, specifically: when no test data set is specified in the data set, the learning engine automatically generates a corresponding test data set by splitting the data in the data set; and a suitable algorithm model is selected from a plurality of algorithm models built into the learning engine for simulated training according to three factors: the size of the test data set, the distribution characteristics of the test data set, and the load balance of the server.
4. The information extraction method based on zero sample learning according to claim 1, characterized in that: the evaluation compares two types of indicators: algorithm indicators, including accuracy, recall, F1 value, AUC and confusion matrix; and performance indicators, including total time, training time, CPU utilization, GPU utilization, memory consumption, hard disk IO and network IO.
5. The information extraction method based on zero sample learning according to claim 1, characterized in that: the results and intermediate information generated during simulated training can be retrieved and sent to a display interface for display.
6. An information extraction system based on zero sample learning is characterized in that: comprising a client and a server, each comprising a processor, a memory and a computer program stored in said memory and running on said processor, said processor implementing the steps of the method according to any one of claims 1 to 5 when executing said computer program.
CN202011527869.3A 2020-12-22 2020-12-22 Information extraction method and system based on zero sample learning Pending CN112651513A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011527869.3A CN112651513A (en) 2020-12-22 2020-12-22 Information extraction method and system based on zero sample learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011527869.3A CN112651513A (en) 2020-12-22 2020-12-22 Information extraction method and system based on zero sample learning

Publications (1)

Publication Number Publication Date
CN112651513A true CN112651513A (en) 2021-04-13

Family

ID=75359175

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011527869.3A Pending CN112651513A (en) 2020-12-22 2020-12-22 Information extraction method and system based on zero sample learning

Country Status (1)

Country Link
CN (1) CN112651513A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109447277A (en) * 2018-10-19 2019-03-08 厦门渊亭信息科技有限公司 A kind of general machine learning is super to join black box optimization method and system
CN111913563A (en) * 2019-05-07 2020-11-10 广东小天才科技有限公司 Man-machine interaction method and device based on semi-supervised learning
CN111274814A (en) * 2019-12-26 2020-06-12 浙江大学 Novel semi-supervised text entity information extraction method
WO2020191282A2 (en) * 2020-03-20 2020-09-24 Futurewei Technologies, Inc. System and method for multi-task lifelong learning on personal device with improved user experience

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张鲁宁 等: "零样本学习研究进展" *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114840644A (en) * 2022-05-17 2022-08-02 抖音视界(北京)有限公司 Method, apparatus, device and medium for managing machine learning system

Similar Documents

Publication Publication Date Title
CN109564575B (en) Classifying images using machine learning models
CN113610239B (en) Feature processing method and feature processing system for machine learning
CN109492772B (en) Method and device for generating information
CN108140143A (en) Regularization machine learning model
EP3743827A1 (en) Training image and text embedding models
WO2018086401A1 (en) Cluster processing method and device for questions in automatic question and answering system
CN111667022A (en) User data processing method and device, computer equipment and storage medium
CN105144164A (en) Scoring concept terms using a deep network
WO2022048363A1 (en) Website classification method and apparatus, computer device, and storage medium
CN111061881A (en) Text classification method, equipment and storage medium
US11373117B1 (en) Artificial intelligence service for scalable classification using features of unlabeled data and class descriptors
CN107818491A (en) Electronic installation, Products Show method and storage medium based on user's Internet data
CN114329029B (en) Object retrieval method, device, equipment and computer storage medium
CN113934851A (en) Data enhancement method and device for text classification and electronic equipment
CN110069558A (en) Data analysing method and terminal device based on deep learning
CN112651513A (en) Information extraction method and system based on zero sample learning
CN111259975B (en) Method and device for generating classifier and method and device for classifying text
CN117251761A (en) Data object classification method and device, storage medium and electronic device
CN116484105A (en) Service processing method, device, computer equipment, storage medium and program product
CN115391589A (en) Training method and device for content recall model, electronic equipment and storage medium
CN114417944B (en) Recognition model training method and device, and user abnormal behavior recognition method and device
CN111091198A (en) Data processing method and device
CN114298118B (en) Data processing method based on deep learning, related equipment and storage medium
CN117093717B (en) Similar text aggregation method, device, equipment and storage medium thereof
CN113886547B (en) Client real-time dialogue switching method and device based on artificial intelligence and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination