WO2023101364A1

WO2023101364A1 - Neural network training method

Info

Publication number: WO2023101364A1
Application number: PCT/KR2022/019082
Authority: WO
Inventors: 정원경; 문현석; 박찬준; 손준영; 이설화; 이정우; 임희석
Original assignee: 엘지이노텍 주식회사
Priority date: 2021-11-30
Filing date: 2022-11-29
Publication date: 2023-06-08
Also published as: KR20230081294A

Abstract

A neural network training method according to an embodiment comprises the steps of: training a first neural network on the basis of first input data; training a second neural network on the basis of second input data by applying a parameter value of the first neural network according to a training result of the first neural network; and training a third neural network on the basis of third input data by applying the parameter value of the first neural network, wherein the first input data includes a patent domain document, the second input data includes multiple patent documents and first label data obtained by labeling key sentences corresponding to the respective multiple patent documents, and the third input data includes multiple key sentences and second label data obtained by labeling key noun phrases respectively corresponding to the multiple key sentences.

Description

Neural network training method

Embodiments relate to a learning method of a neural network, an artificial intelligence device, and a computer program stored in a computer-readable storage medium.

A patent is a means of protecting intellectual property (IP) rights and is a system applied in most disciplines, such as business management, economics, computer engineering, and mechanical engineering. It gives inventors a monopoly on the economic value of their inventions by granting them the right to prevent others from infringing their intellectual property, thereby motivating them to disclose new technologies and ideas. With the development of technology, many patents have recently been registered in various fields.

Here, patent registration can be granted only if it is judged that the stated claims have a novel and creative subject matter compared to the prior art. Accordingly, it is essential to conduct a superficial prior patent analysis in order to draw up the claims at the initial stage of drafting a patent application.

Prior patent analysis is a process of analyzing whether a claim specifying the scope of an invention for which an inventor seeks protection infringes on the scope of other previously registered patents. However, such prior patent analysis is a task that requires a large amount of human labor and specialized knowledge. That is, the prior patent analysis process includes data collection, information search, and technology understanding, and for this purpose, knowledge of complex fields is required.

In addition, the patent documents to be analyzed contain many technical and legal patent domain-specific terms. Accordingly, a high level of human resources, time and cost are generally required to analyze patent documents. This shows the need for an assistant tool that can be utilized in the patent processing and analysis process. Specifically, it shows the urgency and importance of developing a technology capable of automating the analysis of patent documents that can effectively alleviate the difficulties and limitations in terms of time and cost.

The embodiment provides a neural network learning method capable of automatically analyzing patent documents, an artificial intelligence device, and a computer program stored in a computer readable storage medium.

In addition, the embodiment provides a neural network learning method capable of increasing the analysis accuracy of patent documents, an artificial intelligence device, and a computer program stored in a computer readable storage medium.

In addition, the embodiment provides a neural network learning method, an artificial intelligence device, and a computer program stored in a computer readable storage medium capable of increasing patent analysis performance by using a neural network model specialized for analyzing patent documents.

In addition, the embodiment provides a neural network learning method, an artificial intelligence device, and a computer program stored in a computer readable storage medium capable of automatically analyzing the detailed description section of a patent document, which includes the technical details, rather than its abstract section or claims section.

In addition, the embodiment provides a neural network learning method, an artificial intelligence device, and a computer program stored in a computer readable storage medium that can increase the learning efficiency and inference efficiency of the neural network model by summarizing patent documents using a summary algorithm.

In addition, in the embodiment, using a neural network model specialized in patent document analysis, key sentences and key noun phrases included in patent documents can be automatically provided.

In addition, the embodiment provides a neural network learning method, an artificial intelligence device, and a computer program stored in a computer readable storage medium capable of providing key noun phrases that implicitly express the key sentences in a patent document, rather than words merely included in the patent document.

The technical problems to be achieved in the present invention are not limited to the above-mentioned technical problems, and other technical problems not mentioned will be clearly understood by those skilled in the art from the description below.

A neural network learning method according to an embodiment includes: learning a first neural network based on first input data; learning a second neural network based on second input data by applying a parameter value of the first neural network according to a learning result of the first neural network; and learning a third neural network based on third input data by applying the parameter value of the first neural network, wherein the first input data includes a document in a patent domain, the second input data includes a plurality of patent documents and first label data labeling key sentences corresponding to each of the plurality of patent documents, and the third input data includes a plurality of key sentences and second label data labeling key noun phrases respectively corresponding to the plurality of key sentences.

In addition, the step of training the first neural network includes acquiring a plurality of corpora by using the document of the patent domain, and includes learning the first neural network by using the plurality of corpora.

In addition, the step of learning the second neural network includes applying the parameter values of the first neural network as initial values of the parameter values of the second neural network, and updating the parameter values of the second neural network by proceeding with learning based on the second input data.

In addition, the step of learning the second neural network includes upscaling the second input data, and the upscaling includes: copying the first label data that labels the key sentence corresponding to each of the plurality of patent documents; and adding the copied label data as first label data of each patent document.
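As a minimal sketch of this upscaling step, the copy-and-add operation on the label data could look as follows. The record layout (`document`, `key_sentences`), the function name `upscale_labels`, and the `copies` parameter are assumptions made for illustration; the embodiment does not state how many copies are added.

```python
def upscale_labels(records, copies=1):
    """Return new records in which each key-sentence label is duplicated.

    `records` is a hypothetical layout: a list of dicts with a patent
    document and the key sentences labeled for that document.
    """
    upscaled = []
    for record in records:
        labels = list(record["key_sentences"])
        # Copy the existing label data `copies` times ...
        copied = [label for _ in range(copies) for label in labels]
        # ... and add the copies back as label data of the same document.
        upscaled.append(
            {"document": record["document"], "key_sentences": labels + copied}
        )
    return upscaled


records = [
    {"document": "doc A text", "key_sentences": ["A key sentence."]},
    {"document": "doc B text", "key_sentences": ["B key sentence."]},
]
upscaled = upscale_labels(records, copies=2)
```

The original records are left untouched, so the same source data can be upscaled with different factors.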

In addition, the step of learning the third neural network includes applying a parameter value of the first neural network as an initial value of a parameter value of the third neural network, and updating the parameter value of the third neural network by proceeding with learning based on the third input data.
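The initialization of the second and third neural networks from the first can be illustrated with a minimal sketch. It assumes that the networks share one model structure, so parameters can be copied by name. The dictionary layout, the parameter names, the gradient values, and the helpers `init_from_pretrained` and `sgd_step` are hypothetical and are not taken from the embodiment.

```python
import copy


def init_from_pretrained(pretrained_params):
    """Use the trained parameters of the first network as initial values."""
    # A deep copy keeps later updates from modifying the first network.
    return copy.deepcopy(pretrained_params)


def sgd_step(params, grads, lr=0.1):
    """One plain gradient-descent update (stand-in for real training)."""
    return {
        name: [w - lr * g for w, g in zip(params[name], grads[name])]
        for name in params
    }


# Hypothetical parameters of the first network after patent-domain training.
first_params = {"encoder.w": [0.5, -0.2], "decoder.w": [0.1]}

# Both downstream networks start from the same values.
second_params = init_from_pretrained(first_params)
third_params = init_from_pretrained(first_params)

# Training on the second input data then updates only the second network.
second_params = sgd_step(
    second_params, {"encoder.w": [1.0, 1.0], "decoder.w": [0.0]}
)
```

Because each network receives its own copy, updating the second network leaves the first and third unchanged.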

In addition, the step of training the second neural network includes obtaining summary data summarizing each of a plurality of patent documents included in the second input data.

In addition, the step of training the first neural network may include changing a parameter value of the neural network learned in a document of a general domain to a parameter value corresponding to a document of the patent domain.

In addition, the first to third neural networks include a deep neural network (DNN).

In addition, the model structures of the first to third neural networks are identical to each other.

In addition, the model structure of the first to third neural networks includes a T5 model.

On the other hand, the artificial intelligence device according to the embodiment includes a first neural network for extracting key sentences included in a patent document; and a second neural network for extracting a key noun phrase from the key sentence extracted through the first neural network, wherein the parameter value of the first neural network is learned and updated based on first input data in a state where a parameter value of a learning neural network trained to process documents in the patent domain is applied as an initial value, the parameter value of the second neural network is learned and updated based on second input data in a state where the parameter value of the learning neural network is applied as an initial value, the first input data includes a plurality of patent documents and first label data labeling key sentences corresponding to each of the plurality of patent documents, and the second input data includes a plurality of key sentences and second label data labeling key noun phrases respectively corresponding to the plurality of key sentences.

In addition, the first neural network extracts the key sentence from summary data summarizing the patent document.

In an embodiment, the first neural network learned from documents in the general domain is updated to a model specialized in documents in the patent domain. This is because documents in the patent domain include technical terms that are not included in documents in the general domain, so that learning accuracy or inference accuracy may be lowered when a patent document is analyzed using a first neural network learned only from documents in the general domain. Accordingly, in the embodiment, the parameter value of the neural network is updated according to the document of the patent domain. Through this, in the embodiment, the learning accuracy for patent documents and the inference accuracy thereof may be increased, and thus user satisfaction may be improved.

In addition, in the embodiment, learning of the second neural network used to extract key sentences from the patent document is performed. At this time, in the embodiment, the second neural network 420 is not trained on the entire range of each patent document included in the second input data; instead, the second neural network 420 is trained using summary data summarizing the patent document. In the embodiment, it was found through an experiment that training the second neural network 420 with the summary data effectively reduces the number of sentences included in the patent document, thereby increasing learning efficiency. Furthermore, in the embodiment, content that adversely affects the learning of the second neural network 420 is primarily filtered out from the entire content of the patent document included in the second input data, thereby enabling additional performance improvement of the second neural network 420.

And, in the embodiment, unnecessary data in the document can be effectively reduced through the sentence summary as described above. In addition, because the learning of the second neural network 420 proceeds after a minimum ratio is set so that as many important sentences as possible are included, the learning time of the second neural network 420 can be effectively reduced.

In addition, in the embodiment, the parameter value obtained from the first neural network 410 is applied as an initial value of the parameter value of the second neural network 420. Preferably, in the embodiment, for learning of the second neural network 420, the final updated parameter value of the first neural network 410 is extracted and applied as the initial value of the parameter value of the second neural network 420. And, in the embodiment, by applying the parameter values of the first neural network 410 as initial values of the parameter values of the second neural network 420, learning that specializes the second neural network 420 in analyzing documents in the patent domain can proceed more efficiently, so that learning accuracy can be improved and learning time can be drastically reduced.

In addition, in the embodiment, learning of the third neural network used to extract key noun phrases from key sentences of patent documents is performed.

In this case, in the embodiment, a generation method rather than an extraction method is applied to the third neural network 430. For example, in the embodiment, some words of a key sentence included in the third input data are not included in the key noun phrase serving as its label data, while the key noun phrase includes new words not included in the key sentence. Through this, in the embodiment, the key noun phrase of a key sentence can be recognized more flexibly, because key noun phrases are newly generated based on the content of the entire sentence rather than found within the sentence. Accordingly, in the embodiment, the key noun phrase is generated so as to be semantically similar to the reference key noun phrase, and the error rate can be lowered compared to the extraction method. In this way, compared to the extraction method of extracting some words included in a key sentence as a key noun phrase, the embodiment can generate a key noun phrase that is semantically richer and implicitly expresses the entire document content, thereby improving user satisfaction.

FIG. 1 is a diagram illustrating an artificial intelligence system according to an embodiment.

FIG. 2 is a block diagram of an example of the artificial intelligence device of FIG. 1.

FIG. 3 is a diagram showing the structure of a neural network according to an embodiment.

FIG. 4 is a diagram for explaining a learning process of a first neural network according to an embodiment.

FIG. 5 is a diagram for explaining a learning process of a second neural network according to an embodiment.

FIG. 6 is a diagram for explaining a learning process of a third neural network according to an embodiment.

FIG. 7 is a flowchart for explaining step by step an inference method according to an embodiment.

Hereinafter, the embodiments disclosed in this specification will be described in detail with reference to the accompanying drawings, but the same or similar components are given the same reference numerals regardless of the drawing numbers, and redundant descriptions thereof will be omitted. The suffixes "module" and "unit" for components used in the following description are given or used interchangeably only in consideration of ease of writing the specification, and do not by themselves have meanings or roles that are distinct from each other. In addition, in describing the embodiments disclosed in this specification, if it is determined that a detailed description of a related known technology may obscure the gist of the embodiments disclosed in this specification, the detailed description thereof will be omitted. In addition, the accompanying drawings are only for easy understanding of the embodiments disclosed in this specification; the technical idea disclosed in this specification is not limited by the accompanying drawings and should be understood to include all changes, equivalents, and substitutes included in the spirit and technical scope of the present invention.

Terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by the terms. These terms are only used for the purpose of distinguishing one component from another.

It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, but other elements may exist in between. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that no other element exists in between.

Singular expressions include plural expressions unless the context clearly dictates otherwise.

In this application, terms such as "comprise" or "have" are intended to designate that there is a feature, number, step, operation, component, part, or combination thereof described in the specification, but it should be understood that the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof is not precluded.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram illustrating an artificial intelligence system according to an embodiment, and FIG. 2 is a block diagram of an example of the artificial intelligence device of FIG. 1. Here, the artificial intelligence system may also be referred to as a patent document analysis system that performs learning through deep learning in order to analyze patent documents, and analyzes input data (e.g., patent documents to be analyzed) based on the learning results. Furthermore, the artificial intelligence system performs learning specific to the patent domain. The artificial intelligence system may also be referred to as a core information extraction system for patent documents that extracts a key sentence from a patent document by reflecting the learning result and extracts a key noun phrase from the extracted key sentence.

Hereinafter, with reference to FIG. 1, an artificial intelligence system according to an embodiment will be briefly described.

The artificial intelligence system of the embodiment includes an electronic device 100 , a database 200 and an artificial intelligence device 300 .

The electronic device 100 may be referred to as a user terminal. The electronic device 100 may access an application or website providing a patent document analysis function according to an embodiment. The electronic device 100 may input information about a target to be analyzed in the accessed application or website.

In this case, the information to be analyzed may be a patent document number (eg, at least one of an application number, a registration number, and a publication number). Alternatively, the analysis target information may be a search keyword for searching patent documents in a specific technical field. Alternatively, the user may input a specific patent document to be analyzed through the electronic device 100 as the analysis target information.

The electronic device 100 may include all terminals capable of accessing applications or web sites through a network.

For example, the electronic device 100 may include at least one of a smart phone, a tablet PC, a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a PDA, a portable multimedia player (PMP), an MP3 player, a medical device, a camera, or a wearable device. The wearable device may include at least one of an accessory type (e.g., a watch, ring, bracelet, anklet, necklace, eyeglasses, contact lens, or head-mounted device (HMD)), a textile- or clothing-integrated type (e.g., electronic clothing), a body-attached type (e.g., a skin pad or tattoo), or a bio-implantable circuit.

For example, the electronic device 100 may include at least one of a television, a digital video disk (DVD) player, an audio device, a refrigerator, an air conditioner, a vacuum cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set top box, a home automation control panel, a security control panel, a media box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console (e.g., Xbox™, PlayStation™), an electronic dictionary, an electronic key, a camcorder, or an electronic photo frame.

For example, the electronic device 100 may include at least one of various medical devices (e.g., various portable medical measuring devices (a blood glucose meter, heart rate monitor, blood pressure monitor, body temperature monitor, etc.), magnetic resonance angiography (MRA), magnetic resonance imaging (MRI), computed tomography (CT), a camera, or an ultrasonic device), a navigation device, a global navigation satellite system (GNSS), an event data recorder (EDR), a flight data recorder (FDR), an automobile infotainment device, marine electronic equipment (e.g., a marine navigation system, a gyrocompass, etc.), avionics, a security device, a vehicle head unit, an industrial or domestic robot, a drone, an ATM of a financial institution, a point of sales (POS) terminal of a store, or an IoT device (e.g., a light bulb, various sensors, a sprinkler device, a fire alarm, a thermostat, a street light, a toaster, exercise equipment, a hot water tank, a heater, a boiler, etc.).

The database 200 may be connected to the electronic device 100 or the artificial intelligence device 300 through a network.

The database 200 may receive the analysis target information and deliver a patent document corresponding to the analysis target information to the artificial intelligence device 300 or the electronic device 100 .

For example, the database 200 may receive the analysis target information from the electronic device 100 . To this end, analysis target information input into the electronic device 100 may be provided to the database 200 through a network. In addition, the database 200 may search for a patent document corresponding to the received analysis target information and deliver the searched patent document to either the electronic device 100 or the artificial intelligence device 300 .

As another example, the database 200 may receive analysis target information from the artificial intelligence device 300 . To this end, analysis target information input to the electronic device 100 may be provided to the artificial intelligence device 300 through a network. In addition, the artificial intelligence device 300 may access the database 200 through a network and receive a patent document corresponding to the analysis target information.

The database 200 includes at least one database, and can store a plurality of patent documents filed accordingly.

Preferably, the database 200 may include a plurality of databases. For example, the database 200 may include first through Nth databases.

The first database may be a database operated by the Korean Intellectual Property Office (KIPO) or by a party commissioned by the Korean Intellectual Property Office. The first database may store patent documents filed with the Korean Intellectual Property Office as the receiving office.

The second database may be a database operated by the United States Patent and Trademark Office (USPTO) or by a party commissioned by the United States Patent and Trademark Office. The second database may store patent documents filed with the United States Patent and Trademark Office as the receiving office.

The third database may be a database operated by the European Patent Office (EPO) or by a party commissioned by the European Patent Office. The third database may store patent documents filed with the European Patent Office as the receiving office.

The fourth database may be a database operated by the Japan Patent Office (JPO) or by a party commissioned by the Japan Patent Office. The fourth database may store patent documents filed with the Japan Patent Office as the receiving office.

In addition to this, the database 200 may further include databases for storing patent documents filed with the patent offices of China, Taiwan, Germany, England, France, India, Canada, Australia, Singapore, Mexico, and other countries as receiving offices.

Here, it will be obvious to those skilled in the art to which the present invention belongs that the scope of rights of the present invention is not limited by the number or type of databases constituting the database 200.

The artificial intelligence device 300 may perform an operation of receiving a patent document from the database 200 or the electronic device 100 and extracting core features by analyzing the provided patent document.

For example, the artificial intelligence device 300 may refer to a device that performs an operation of learning an artificial neural network using a machine learning algorithm or analyzing a patent document using the learned artificial neural network.

The artificial intelligence device 300 stores a plurality of artificial neural networks. For example, the artificial intelligence device 300 may store first to third artificial neural networks. Also, the artificial intelligence device 300 may perform deep learning on the first to third artificial neural networks based on input data. In addition, the artificial intelligence device 300 may perform an inference operation using the deep-learned first to third artificial neural networks to extract key sentences and key noun phrases included in the provided patent document.

Accordingly, the artificial intelligence device 300 may also be referred to as a computing device that includes the plurality of artificial neural networks, executes the plurality of artificial neural networks, trains each of the artificial neural networks according to input data, and performs a series of operations for extracting key sentences and key noun phrases included in a patent document using the patent document and the trained artificial neural networks.

The artificial intelligence device 300 may include a communication unit 310, a memory 320 and a processor 330.

The communication unit 310 may receive input data by communicating with at least one of the electronic device 100 and the database 200. In this case, the input data may be input data for learning or, alternatively, input data for inference. The learning operation and inference operation of the artificial intelligence device 300 will be described in more detail below.

The communication unit 310 may communicate with the electronic device 100 or the database 200 using a wireless data communication method. As the wireless data communication method, technical standards or communication methods for mobile communication (e.g., GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), CDMA2000 (Code Division Multiple Access 2000), EV-DO (Evolution-Data Optimized or Evolution-Data Only), WCDMA (Wideband CDMA), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), etc.) can be used.

Also, the communication unit 310 may communicate with any one of the electronic device 100 and the database 200 using wireless Internet technology. For example, as wireless Internet technologies, WLAN (Wireless LAN), Wi-Fi (Wireless Fidelity), Wi-Fi Direct, DLNA (Digital Living Network Alliance), WiBro (Wireless Broadband), WiMAX (Worldwide Interoperability for Microwave Access), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), and the like may be used.

In addition, the communication unit 310 can communicate with any one of the electronic device 100 and the database 200 using short-range communication. For example, short-range communication may be supported using at least one of Bluetooth™, RFID (Radio Frequency Identification), IrDA (Infrared Data Association), UWB (Ultra Wideband), ZigBee, NFC (Near Field Communication), Wi-Fi (Wireless Fidelity), Wi-Fi Direct, and Wireless USB (Wireless Universal Serial Bus) technologies.

The memory 320 may include a model storage unit 321. For example, the memory 320 may store a model (or an artificial neural network 321a) that is currently being trained or has been trained by the processor 330. The memory 320 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (e.g., SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, and an optical disk. In addition, the artificial intelligence device 300 may operate in relation to a web storage performing the storage function of the memory 320 on the Internet.

The processor 330 may control the overall operation of the artificial intelligence device 300. The processor 330 may provide specific information to the electronic device 100, or provide a specific function to be executed by the electronic device 100, by processing signals, data, and information input or output through each of the components described above or by running an application program stored in the memory 320.

The processor 330 may be composed of one or more cores and may include a processor for data analysis and deep learning, such as a central processing unit (CPU), a general purpose graphics processing unit (GPGPU), or a tensor processing unit (TPU) of a computing device. The processor 330 may read the computer program stored in the memory 320 and extract key sentences and key noun phrases included in a specific patent document according to an embodiment of the present disclosure.

In an embodiment, the processor 330 may perform learning on the first to third artificial neural networks through a separate process. In this case, the processor 330 may use a summary algorithm for summarizing data input to the first to third artificial neural networks. The summary algorithm may include a TextRank algorithm, but is not limited thereto. After constructing a word graph or sentence graph, the TextRank algorithm can summarize patent documents using PageRank, a graph ranking algorithm.
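The sentence-graph variant of TextRank described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the sentence splitter, the word-overlap similarity, the damping factor of 0.85, the iteration count, and the `ratio` parameter (standing in for the minimum ratio of sentences to keep) are all assumptions.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (an assumption for this sketch)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Word overlap normalized by the log lengths of both sentences."""
    wa = set(re.findall(r"[a-z]+", a.lower()))
    wb = set(re.findall(r"[a-z]+", b.lower()))
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    return len(wa & wb) / denom if denom > 0 else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Power iteration of weighted PageRank over a dense weight matrix."""
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


def textrank_summary(text, ratio=0.5):
    """Keep the top-ranked sentences, returned in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return []
    # Sentence graph: nodes are sentences, edge weights are similarities.
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    keep = max(1, math.ceil(n * ratio))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


text = (
    "The neural network extracts key sentences from the patent document. "
    "The neural network is trained on patent documents. "
    "Key sentences are extracted by the trained neural network. "
    "The weather was pleasant yesterday."
)
summary = textrank_summary(text, ratio=0.5)
```

Sentences that share little vocabulary with the rest of the document receive low scores and are dropped first, which is how the summary reduces the number of sentences passed on for training.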

The processor 330 may perform calculations for learning of the first to third artificial neural networks. For example, the processor 330 may perform a series of operations for learning, such as processing input data for learning in the first to third artificial neural networks and calculating a loss function through deep learning to update the weights or parameters of the neural networks.

In addition, the processor 330 may perform an inference operation using a neural network in which the weights/parameters have been updated or learned. For example, the processor 330 may use a patent document corresponding to analysis target information as input data and execute the trained artificial neural network stored in the memory 320 accordingly. Through this, the processor 330 extracts key sentences and key noun phrases included in the patent document. In addition, when the key sentences and key noun phrases have been extracted, the processor 330 may provide information thereon to the electronic device 100 through the communication unit 310.
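The order of operations in this inference flow can be sketched as a simple pipeline. The function `analyze_patent`, the `AnalysisResult` type, and the three callables are hypothetical placeholders; in particular, the lambdas below are trivial stand-ins and not trained neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AnalysisResult:
    key_sentences: List[str]
    key_noun_phrases: List[str]


def analyze_patent(
    document: str,
    summarize: Callable[[str], str],
    extract_key_sentences: Callable[[str], List[str]],
    generate_noun_phrase: Callable[[str], str],
) -> AnalysisResult:
    """Summarize, extract key sentences, then derive one phrase per sentence."""
    summary = summarize(document)
    key_sentences = extract_key_sentences(summary)
    phrases = [generate_noun_phrase(sentence) for sentence in key_sentences]
    return AnalysisResult(key_sentences, phrases)


# Trivial stand-ins so the sketch runs; real models would replace them.
result = analyze_patent(
    "The lens module includes a housing. The weather is nice.",
    summarize=lambda doc: doc,
    extract_key_sentences=lambda summary: [summary.split(". ")[0] + "."],
    generate_noun_phrase=lambda sentence: sentence.rstrip(".").lower(),
)
```

Passing the stages in as callables keeps the pipeline independent of how each network is implemented or loaded.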

Hereinafter, the artificial intelligence device 300 will be described in more detail.

Prior to the description of the artificial intelligence device 300 of the present application, artificial intelligence will be described as follows. Artificial intelligence refers to the field of studying artificial intelligence or the methodology for creating it, and machine learning refers to the field of defining various problems dealt with in the field of artificial intelligence and studying methodologies to solve them. Machine learning is also defined as an algorithm that improves the performance of a certain task through consistent experience. An artificial neural network (ANN) is a model used in machine learning, and may refer to an overall model that has problem-solving capabilities and is composed of artificial neurons (nodes) that form a network by synaptic coupling. An artificial neural network can be defined by a connection pattern between neurons in different layers, a learning process for updating model parameters, and an activation function for generating output values.

An artificial neural network may include an input layer, an output layer, and optionally one or more hidden layers. Each layer may include one or more neurons, and the artificial neural network may include neurons and synapses connecting the neurons. In an artificial neural network, each neuron may output a function value of an activation function for input signals, weights, and biases input through a synapse.
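As a minimal sketch of the neuron computation described above, the following Python snippet computes one neuron's output as an activation function applied to the weighted sum of its inputs plus a bias. The sigmoid activation and the names `sigmoid` and `neuron_output` are illustrative assumptions; the embodiment does not fix a particular activation function.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation, chosen here only for illustration."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Apply the activation to the weighted sum of the inputs plus the bias."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


# Weighted sum is 0.5*2.0 + (-1.0)*1.0 + 0.0 = 0.0, and sigmoid(0.0) is 0.5.
print(neuron_output([0.5, -1.0], [2.0, 1.0], 0.0))
```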

Model parameters refer to parameters determined through learning, and include weights of synaptic connections and biases of neurons. In addition, hyperparameters mean parameters that must be set before learning in a machine learning algorithm, and include a learning rate, number of iterations, mini-batch size, initialization function, and the like.

The purpose of learning an artificial neural network can be seen as determining model parameters that minimize a loss function. For example, the artificial intelligence device 300 in the embodiment may perform a learning operation for determining a parameter that minimizes a loss function of each of the first to third artificial neural networks. In addition, the loss function may be used as an index for determining optimal model parameters in the learning process of the artificial neural network.

Machine learning can be classified into supervised learning, unsupervised learning, and reinforcement learning according to learning methods.

Supervised learning refers to a method of training an artificial neural network in a state in which a label for the training data is given, and a label may mean the correct answer (or result value) that the artificial neural network must infer when the training data is input to it. Unsupervised learning may refer to a method of training an artificial neural network in a state in which a label for the training data is not given. Reinforcement learning may refer to a learning method in which an agent defined in an environment learns to select an action or action sequence that maximizes a cumulative reward in each state.

Among artificial neural networks, machine learning implemented as a deep neural network (DNN) including a plurality of hidden layers is also called deep learning, and deep learning is a part of machine learning. And, in an embodiment, a neural network may be trained through deep learning, and inference may be performed using the learning result.

Meanwhile, a deep neural network (DNN) in an embodiment may refer to a neural network including a plurality of hidden layers in addition to an input layer and an output layer. Deep neural networks can reveal latent structures in data. The deep neural network used in the embodiment may include a convolutional neural network (CNN), a recurrent neural network (RNN), an auto encoder, a generative adversarial network (GAN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a Q network, a U network, a Siamese network, and the like.

Meanwhile, throughout this specification, a computation model, a neural network, and a network function may be used with the same meaning. The data structure may include a neural network, and the data structure including the neural network may be stored in a computer readable medium. The data structure including the neural network may also include data input to the neural network, weights of the neural network, hyperparameters of the neural network, data acquired from the neural network, an activation function associated with each node or layer of the neural network, and a loss function for training the neural network. A data structure including a neural network may include any of the components described above. In other words, the data structure including the neural network may be configured to include any combination of the data input to the neural network, the weights of the neural network, the hyperparameters of the neural network, the data obtained from the neural network, the activation function associated with each node or layer of the neural network, and the loss function for training the neural network. In addition to the foregoing configurations, the data structure including the neural network may include any other information that determines the characteristics of the neural network. In addition, the data structure may include all types of data used or generated in the computational process of the neural network, and is not limited to the above. A computer readable medium may include a computer readable recording medium and/or a computer readable transmission medium. A neural network may consist of a set of interconnected computational units, which may generally be referred to as nodes. These nodes may also be referred to as neurons. A neural network includes one or more nodes.

Meanwhile, the data structure may include data input to the neural network. A data structure including data input to the neural network may be stored in a computer readable medium. Data input to the neural network may include training data input during a neural network learning process and/or input data input to a neural network that has been trained. Data input to the neural network may include pre-processed data and/or data subject to pre-processing. Pre-processing may include a data processing process for inputting data to a neural network. Accordingly, the data structure may include data subject to pre-processing and data generated by pre-processing. The aforementioned data structure is only an example, and the present disclosure is not limited thereto. The data structure may include the weights of the neural network. A weight of the neural network refers to a parameter of the neural network. The data structure including the weights of the neural network may be stored in a computer readable medium. A neural network may include a plurality of weights. Weights can be variable, and can be changed through learning so that the neural network performs a desired function. For example, when one or more input nodes are each connected by a link to one output node, the value of the output node may be determined based on the values input to the input nodes connected to the output node and the parameter set on the link corresponding to each input node.

In addition, the weights may include weights that vary during the neural network learning process and/or weights for which neural network learning has been completed. The weights that vary during the learning process may include the weights at the start of a learning cycle and/or the weights that vary during the learning cycle. The weights for which neural network learning has been completed may include the weights at the completion of the learning cycles. Accordingly, the data structure including the weights of the neural network may include a data structure including weights that vary during the neural network learning process and/or weights for which neural network learning has been completed. Therefore, the above-described weights and/or combinations of those weights are considered to be included in the data structure including the weights of the neural network.

The data structure including the weights of the neural network may be stored in a computer readable storage medium (e.g., a memory or a hard disk) after going through a serialization process. Serialization can be the process of converting a data structure into a form that can be stored on the same or another computing device and later reconstructed and used. A computing device may serialize data structures to transmit and receive data over a network. The serialized data structure including the weights of the neural network may be reconstructed on the same computing device or another computing device through deserialization. The data structure including the weights of the neural network is not limited to serialization. Furthermore, the data structure including the weights of the neural network may include a data structure for increasing operational efficiency while minimizing the use of computing device resources (for example, a B-Tree, Trie, m-way search tree, AVL tree, or Red-Black Tree).
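The serialization and deserialization round trip described above can be illustrated with Python's standard `pickle` module. This is only a sketch: the `weights` dictionary, its layer names, and its values are hypothetical placeholders, and the embodiment does not name a serialization format.

```python
import pickle

# Hypothetical weight container; names and values are placeholders.
weights = {
    "layer1.weight": [[0.1, -0.2], [0.3, 0.4]],
    "layer1.bias": [0.0, 0.5],
}

# Serialize to bytes that could be written to storage or sent over a network.
serialized = pickle.dumps(weights)

# Deserialize on the same or another computing device to rebuild the structure.
restored = pickle.loads(serialized)

print(restored == weights)
```

Because unpickling can execute arbitrary code, a byte stream of this kind should only be loaded from a trusted source.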

Hereinafter, learning and reasoning operations of a neural network according to an embodiment will be described in detail.

Referring to FIG. 3, the neural network 400 in the embodiment may include first to third neural networks 410, 420, and 430. In this case, the first to third neural networks 410, 420, and 430 may be trained through deep learning according to input data. Also, at least one of the finally trained first to third neural networks 410, 420, and 430 may be used to infer a result value for input data for analysis of a patent document.

As described above, the first to third neural networks 410, 420, and 430 in the embodiment may be deep learning neural networks. Specifically, the first to third neural networks 410, 420, and 430 may include an artificial intelligence module for natural language processing to analyze patent documents. For example, the first to third neural networks 410, 420, and 430 may be a T5 model. However, in the embodiment, the type of model constituting the first to third neural networks 410, 420, and 430 is not limited to 'T5', and other models among artificial intelligence models for natural language processing may be used.

Accordingly, the learning process in the embodiment may include a first process of training the first neural network 410, a second process of training the second neural network 420, and a third process of training the third neural network 430. In this case, the first to third processes may be performed sequentially with a time difference. However, the embodiment is not limited thereto, and at least two of the first to third processes may be performed simultaneously.

Preferably, in an embodiment, a first process of training the first neural network 410 may be preferentially performed. In an embodiment, the second process of learning the second neural network 420 and the third process of learning the third neural network 430 may be simultaneously performed using the learning result value of the first process.

In this case, the first process may refer to a process of conducting learning specific to a patent document with respect to the first neural network 410 . For example, the first process may refer to a learning process for updating the parameters or weights of the neural network to a state specialized for patent document analysis.

The second process may refer to a learning process of extracting key sentences from a patent document for the second neural network 420 . For example, the second process may refer to a process of learning or deep learning the second neural network 420 to extract or infer a key sentence that is the core of the contents included in the entire scope of the patent document.

The third process may refer to a learning process of extracting or inferring a key noun phrase from a key sentence included in a patent document with respect to the third neural network 430 . For example, the third process may refer to a process of learning or deep learning the third neural network 430 to extract or infer a key noun phrase from a key sentence included in a patent document.

Hereinafter, a process of learning or deep learning the first neural network 410, the second neural network 420, and the third neural network 430 through the first to third processes will be described in detail.

Referring to FIG. 4 , in the embodiment, a neural network of a model for general natural language processing is first used, and deep learning or a learning process is performed to update the general neural network to a state specialized for processing patent documents.

In this case, the first process may be referred to as a preprocessing process for learning or deep learning the second neural network 420 and the third neural network 430 .

The artificial intelligence model of the first neural network 410 is the same as the artificial intelligence model of the second neural network 420 and the artificial intelligence model of the third neural network 430. In addition, the first process of training the first neural network 410 may increase the learning accuracy of the second neural network 420 and the third neural network 430, and may also increase inference accuracy in a later inference process.

At this time, the second neural network 420 in the embodiment is required to determine and extract a key sentence, and the third neural network 430 is required to generate new natural language expressing the core of the key sentence. Specifically, the second neural network 420 and the third neural network 430 in the embodiment should be able to perform learning and inference for natural language understanding and natural language generation.

To this end, the baseline model structure of the second neural network 420 and the third neural network 430 in the embodiment may be a T5 model structure, which has high performance in both natural language understanding and natural language generation. However, the embodiment is not limited thereto, and the second neural network 420 and the third neural network 430 may utilize model structures other than the T5 model structure that have high performance in natural language generation and natural language understanding.

Meanwhile, as described above, the model structure of the first neural network 410 may be the same as that of the second neural network 420 and the third neural network 430 .

However, in the learning process, the input data and the learning method of the first neural network 410 may differ from the input data and the learning methods of the second neural network 420 and the third neural network 430, respectively.

At this time, in the embodiment, the process of learning or deep learning the second neural network 420 and the third neural network 430 may be performed directly without the first process of learning or deep learning the first neural network 410. That is, the learning process of the first neural network 410 can be said to be a process for increasing learning accuracy and inference accuracy of the second neural network 420 and the third neural network 430 .

In the first process, the first neural network 410 may be trained on the input data in a state in which a model trained on the general domain is set as the initial value (e.g., parameter values or weight values) of the network function.

To this end, as the starting step of the first process, first input data is input to the first neural network 410 (S110). The first input data may be provided from the electronic device 100 or, alternatively, from the database 200. However, the embodiment is not limited thereto, and data stored inside the artificial intelligence device 300 may be used as the first input data.

The first input data may be specific data of a patent domain. For example, the first input data may be a specific patent document for learning by the first neural network 410 .

At this time, in the embodiment, the first neural network 410 is trained not on a partial document covering a specific range of the first input data, but on the contents of the entire range of the first input data.

For example, the first input data is a patent document of a patent domain, and a patent domain document is divided into a plurality of identification items. For example, a patent domain document is divided into a background section, a detailed description section, a summary section, and a claims section. Because a patent domain document is long and contains much unnecessary noise, a general patent document analysis system performs its analysis with a focus on the claims section or the summary section among the identification items. However, a patent domain document generally contains most of its important information in the detailed description section. Accordingly, if the analysis of a patent domain document is not centered on the detailed description section, it is difficult to obtain an accurate analysis result for the document.

Accordingly, in the embodiment, the first neural network 410 is trained with reference to the contents of the entire range of the patent domain document (in particular, the contents of the detailed description section), rather than only the contents of a specific range.

At this time, the document of the patent domain includes a number of technical terms not included in the document of the general domain. Accordingly, when the document in the patent domain is immediately analyzed by applying the neural network of the model trained using the document in the general domain, the learning rate and the inference rate may be relatively low.

Accordingly, in the embodiment, the parameter values or weights of the first neural network 410 are updated through the first process to a state specialized for analyzing patent domain documents.

Meanwhile, when the first input data is input during the learning process of the first neural network 410, a process of processing the first input data may be performed in the embodiment. Preferably, a process of generating or extracting, from the first input data, a corpus used in documents of the patent domain may be performed (S120). Here, the corpus may refer to language data in which texts are collected in a computer-readable form for application in natural language processing technology. The corpus may also be described as a collection of words or a collection of writings. In the embodiment, rather than training the first neural network 410 directly on the first input data, a corpus corresponding to documents in the patent domain is obtained from the first input data, and the first neural network 410 is trained using the corpus. As the amount of the corpus increases, the accuracy of the natural language that the first neural network 410 can recognize or understand increases, and accordingly learning accuracy and inference accuracy can be increased.

Next, in the embodiment, learning of the first neural network 410 is performed using the first input data (preferably, a corpus obtained from a document of a patent domain) (S130).

In this case, the first neural network 410 may be trained through an unsupervised learning method. Preferably, in the embodiment, the first neural network 410 is trained using the first input data in a state where label data (e.g., correct answer data) is not included in the first input data.

A brief description of the learning method of the first neural network 410 is as follows. A portion of the first input data is masked, and the first neural network 410 may be trained based on a loss function that reflects whether the content of the masked portion is accurately generated.
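The masking step can be sketched as follows. This is an illustrative assumption rather than the embodiment's preprocessing code: the function name `corrupt_spans`, the sample sentence, and the sentinel naming `<extra_id_k>` (a convention commonly used with T5 tokenizers) are all introduced here for illustration, and the spans are chosen by hand rather than sampled.

```python
def corrupt_spans(tokens, spans):
    """Replace each (start, end) span with a sentinel token and build the target.

    `spans` must be sorted, non-overlapping, half-open index ranges.
    Returns (corrupted_input, target) as lists of tokens.
    """
    corrupted, target = [], []
    cursor = 0
    for k, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{k}>"
        corrupted.extend(tokens[cursor:start])  # keep the unmasked text
        corrupted.append(sentinel)              # one sentinel per masked span
        target.append(sentinel)                 # target repeats the sentinel...
        target.extend(tokens[start:end])        # ...followed by the hidden words
        cursor = end
    corrupted.extend(tokens[cursor:])
    return corrupted, target


tokens = "the sensor unit detects the position of the lens".split()
corrupted, target = corrupt_spans(tokens, [(1, 3), (5, 6)])
print(" ".join(corrupted))  # the <extra_id_0> detects the <extra_id_1> of the lens
print(" ".join(target))     # <extra_id_0> sensor unit <extra_id_1> position
```

The model would then be trained to produce the target sequence from the corrupted input, so the loss reflects how accurately the masked content is regenerated.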

That is, looking at the learning process of the first neural network 410 in more detail, the first input data may be unlabeled data, which may be denoted X. In the embodiment, a span forming part of the words in X is replaced with a special token, and a corrupted sentence X' can be generated through this process.

In addition, as the learning process of the first neural network 410, the first neural network 410 receives X' as an input and learns to generate the set of replaced words (for example, a span set) in an auto-regressive manner.

Here, each replaced portion has a span length. When such a portion is defined as a masking span, the training objective for the model θ in this process can be expressed as Equation 1 below.

[Equation 1]
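The body of Equation 1 is not available in this text. As an assumption only, a span-corruption objective consistent with the surrounding description can be written as below. Apart from X' and θ, every symbol is introduced here for illustration: K is the number of masking spans, m_k is the length of the k-th span, and s_{k,t} is the t-th token of the k-th span.

```latex
\mathcal{L}(\theta) = -\sum_{k=1}^{K} \sum_{t=1}^{m_k}
  \log P\left(s_{k,t} \mid X',\, s_{<k},\, s_{k,<t};\, \theta\right)
```

Under this form, each masked token is predicted from the corrupted sentence and from the span tokens generated before it, which matches the auto-regressive generation described above.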

In the embodiment, in the step of pre-training the first neural network 410 with this objective, pre-training may be performed simultaneously for several sub-tasks. This means pre-processing all tasks, such as natural language understanding, into a single sequential input/output data format and training the neural network model in a text-to-text form on that data. Through this, in the embodiment, the first neural network 410 trained through the first process and the second neural network 420 and third neural network 430 trained through the subsequent second and third processes have a consistent training objective.

Meanwhile, in the first process, the output sequence generated by the model θ of the first neural network 410 from the preprocessed input sequence can be generated through a sequence-to-sequence based autoregressive process, as shown in Equation 2 below. Here, sequence-to-sequence (Seq2Seq) learning means training a model that converts a sequence from one domain (e.g., a general domain) to another domain (e.g., a patent domain).

[Equation 2]

The first neural network 410 may be trained in the direction of minimizing the cross-entropy loss between the embedding representation of the output sequence generated through Equation 2 and the actual reference output sequence.
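The body of Equation 2 is likewise not available in this text. As an assumption, the standard autoregressive factorization of a sequence-to-sequence model and the matching token-level cross-entropy loss are shown below. The symbols are introduced here for illustration: X is the preprocessed input sequence, Y = (y_1, ..., y_T) is the generated output sequence, and y^* is the reference output sequence.

```latex
P(Y \mid X; \theta) = \prod_{t=1}^{T} P\left(y_t \mid y_{<t},\, X;\, \theta\right),
\qquad
\mathcal{L}_{\mathrm{CE}}(\theta) = -\sum_{t=1}^{T}
  \log P\left(y^{*}_{t} \mid y^{*}_{<t},\, X;\, \theta\right)
```

Minimizing this loss raises the probability the model assigns to each reference token given the input and the preceding reference tokens.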

In the embodiment, a first process of learning the first neural network 410 may be performed through the above process.

Also, through the first process, the model structure of the first neural network 410 may be converted from a structure for analyzing documents in the general domain to a model structure for analyzing documents in the patent domain.

For example, through the first process, parameter values of the first neural network 410 may be updated to a state specialized for analyzing a document of a patent domain (S140).

Meanwhile, in the embodiment, the learning of the second neural network 420 and the third neural network 430 is performed using the parameter values of the first neural network 410 updated through the first process.

Next, a second process and a third process for training the second neural network 420 and the third neural network 430 will be described. FIG. 5 is a diagram for explaining a learning process of a second neural network according to an embodiment.

In this case, the second process of learning the second neural network 420 and the third process of learning the third neural network 430 may be performed simultaneously or sequentially.

The second process of learning the second neural network 420 will be described as follows.

In the embodiment, second input data may be input for learning of the second neural network 420 (S210). The second input data may include a patent document for learning and label data that is correct answer data in the patent document.

For example, the second input data may include a patent document and label data of the patent document as one set, and a plurality of such sets may be included.

For example, the second input data may include a first data set, which includes a first patent document and first label data labeling the key sentences in the first patent document, and a second data set, which includes a second patent document and second label data labeling the key sentences in the second patent document. Although the second input data has been described as including two data sets, it is not limited thereto. For example, the second input data may include three or more data sets in order to increase the learning accuracy of the second neural network 420.

Meanwhile, in the embodiment, when the second input data is input, a process of generating summary data summarizing the second input data may be performed (S220). Specifically, in the embodiment, a process of summarizing the patent document in each data set included in the second input data may be performed. For example, in the embodiment, summary data summarizing the patent document included in the second input data is obtained using a summary algorithm. In this case, the summary algorithm may include the TextRank algorithm, but is not limited thereto. The TextRank algorithm constructs a word graph or sentence graph and then summarizes a patent document using PageRank, a graph ranking algorithm.

That is, in the embodiment, the second neural network 420 is not trained on the entire range of the patent document included in the second input data, but on summary data summarizing the patent document. In the embodiment, it was confirmed through experiments that training the second neural network 420 with the summary data effectively reduces the number of sentences taken from the patent document, thereby increasing learning efficiency. Furthermore, in the embodiment, it can be seen that the performance of the second neural network 420 can be further improved by first filtering out, from the entire content of the patent document included in the second input data, content that adversely affects the training of the second neural network 420.

In conclusion, in the embodiment, when the second input data is input, it is not applied directly as the training data of the second neural network 420; to further improve performance and learning efficiency, the second input data is pre-processed before the second neural network 420 is trained. In the embodiment, a relative importance is assigned to each sentence in the patent document included in the second input data, and the patent document can be summarized efficiently by applying a graph-based method that summarizes the entire document. In this case, the summarization methodology of the summary algorithm may be as shown in Equation 3 below.

[Equation 3]
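The body of Equation 3 is not available in this text. Since the embodiment names TextRank, the score update from the published TextRank algorithm is given below as an assumed form. The notation is that of the algorithm's original description, not necessarily that of this specification: WS(V_i) is the importance score of sentence V_i, w_{ji} is the similarity-based weight of the edge between sentences V_j and V_i, and d is a damping factor (commonly 0.85).

```latex
WS(V_i) = (1 - d) + d \sum_{V_j \in \mathrm{In}(V_i)}
  \frac{w_{ji}}{\sum_{V_k \in \mathrm{Out}(V_j)} w_{jk}} \, WS(V_j)
```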

Here, Equation 3 gives the importance score of each sentence, and the weight between two sentences is based on the similarity between those sentences. In the embodiment, the importance scores of all sentences included in the patent document are computed, the sentences are sorted by score, and the summary data of the second input data may be obtained by selecting the sentences with the highest scores.

In the embodiment, unnecessary data in a document can be effectively reduced through the sentence summarization described above. By setting a minimum ratio so that as many important sentences as possible are retained and then training the second neural network 420 on the result, the learning time of the second neural network 420 can be effectively reduced.
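The graph-based summarization described above can be sketched in plain Python as follows. This is a simplified illustration of sentence-level TextRank under stated assumptions, not the embodiment's implementation: the word-overlap similarity (computed on unique words), the damping factor `d`, the fixed iteration count, and the names `similarity` and `textrank_summary` are all choices made here.

```python
import math
import re


def similarity(a, b):
    """Word-overlap similarity normalized by the log of each sentence's size."""
    wa, wb = set(a), set(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank_summary(sentences, ratio=0.5, d=0.85, iterations=50):
    """Return the top `ratio` fraction of sentences, kept in document order."""
    words = [re.findall(r"\w+", s.lower()) for s in sentences]
    n = len(sentences)
    # Weighted sentence graph: w[i][j] is the similarity of sentences i and j.
    w = [[similarity(words[i], words[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        # Weighted PageRank update applied to all sentences at once.
        scores = [
            (1 - d) + d * sum(w[j][i] / out_sum[j] * scores[j]
                              for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    keep = max(1, math.ceil(ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]


document = [
    "The lens module includes a lens barrel and a lens holder.",
    "The lens barrel is coupled to the lens holder.",
    "The lens holder supports the lens barrel in the lens module.",
    "Weather was nice yesterday.",
]
summary = textrank_summary(document, ratio=0.5)
print(summary)
```

A sentence that shares no words with the rest of the document receives only the base score of 1 - d, so it is the first to be dropped as the ratio decreases.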

Meanwhile, when the summary data is acquired, if the patent document is summarized excessively, label data may not be included in the obtained summary data. In other words, depending on the summary result of the patent document, correct answer data may be missing from the summary data. Accordingly, in the embodiment, summary data is obtained by applying TextRank with different threshold values. Based on the resulting summary data, the minimum summary ratio at which label data is not omitted from the summary data is identified, and the threshold value corresponding to that minimum summary ratio is used to acquire the summary data.

Meanwhile, in an embodiment, corresponding input data may be selectively used according to whether label data is included in summary data obtained through the summary algorithm. For example, even when a threshold value applied through a summary algorithm is low, label data may be missing from the summary data. And, in this case, it means that the label data is missing even though the patent document is not excessively summarized, and efficient learning of the second neural network 420 may not be achieved with this.

Therefore, in the embodiment, if the label data is included in the summary data summarized through the summary algorithm, it is used as learning data for learning of the second neural network 420 . And, when the label data is not included in the summary data summarized through the summary algorithm, it may be excluded from the learning data for learning of the second neural network 420 .
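The threshold search and the filtering rule above can be sketched as follows. The sketch assumes that each document's sentences are already ordered from most to least important, and that the threshold is expressed as the fraction of sentences kept. The function names, the candidate ratios, and the per-document search are hypothetical simplifications.

```python
import math


def min_ratio_keeping_labels(ranked_sentences, key_sentences, ratios):
    """Return the smallest candidate ratio whose summary keeps every key sentence.

    `ranked_sentences` is ordered from most to least important.
    Returns None when no candidate ratio keeps all key sentences.
    """
    n = len(ranked_sentences)
    for ratio in sorted(ratios):
        keep = max(1, math.ceil(ratio * n))
        if set(key_sentences) <= set(ranked_sentences[:keep]):
            return ratio
    return None


def build_training_set(documents, ratio):
    """Keep only documents whose summary at `ratio` still contains all labels."""
    kept = []
    for ranked, keys in documents:
        keep = max(1, math.ceil(ratio * len(ranked)))
        summary = ranked[:keep]
        if set(keys) <= set(summary):
            kept.append((summary, keys))  # usable as training data
        # otherwise the document is excluded from the training data
    return kept


ranked = ["s3", "s1", "s4", "s2"]
print(min_ratio_keeping_labels(ranked, ["s1"], [0.25, 0.5, 0.75]))
print(len(build_training_set([(ranked, ["s1"]), (ranked, ["s2"])], 0.5)))
```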

On the other hand, in the embodiment, it was confirmed whether the case of learning using the summary data acquired based on the text-rank algorithm has higher performance than the case of learning with the original data that is not summarized.

That is, in the embodiment, for the learning of the second neural network 420, the summarized summary data and the non-summarized original data were respectively applied for learning, and the result of the learning was confirmed.

In addition, in the embodiment, it was confirmed that the neural network of the model trained with the summary data can produce better performance than the model trained with the original data. And, this shows that the number of sentences in each document can be effectively reduced through the summary algorithm, thereby effectively reducing the training time as well as improving the overall performance of the model. Through this, computing power and GPU resources used for learning of the second neural network 420 can be efficiently reduced.

Next, in the embodiment, the second neural network 420 is trained using second input data composed of a set of the summary data and its label data.

To this end, in the embodiment, the parameter values obtained from the first neural network 410 are applied as the initial parameter values of the second neural network 420 (S230). Preferably, in the embodiment, the finally updated parameter values of the first neural network 410 are extracted and applied as the initial parameter values of the second neural network 420. In the embodiment, by applying the parameter values of the first neural network 410 as the initial parameter values of the second neural network 420, training that specializes the second neural network 420 in analyzing patent domain documents can proceed more efficiently, so that learning accuracy can be improved and learning time can be drastically reduced.
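Initializing one network from another's trained parameters can be illustrated with plain Python containers. The parameter names and values below are hypothetical, and `init_from` is a helper introduced here; a deep learning framework would provide its own mechanism for copying parameters between models of the same structure.

```python
import copy

# Hypothetical parameters of the first network after the first process.
first_network_params = {
    "encoder.weight": [[0.2, -0.1], [0.4, 0.3]],
    "encoder.bias": [0.0, 0.1],
}


def init_from(pretrained):
    """Return an independent copy to serve as another network's initial parameters."""
    return copy.deepcopy(pretrained)


second_network_params = init_from(first_network_params)
third_network_params = init_from(first_network_params)

# Updating one copy during its own training leaves the source and the other copy intact.
second_network_params["encoder.bias"][0] = 0.9
print(first_network_params["encoder.bias"][0], third_network_params["encoder.bias"][0])
```

A deep copy is used so that the second and third networks can be trained separately from the same starting point.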

Thereafter, in the embodiment, the second neural network 420 is trained using the second input data (preferably the summary data) in a state in which the parameter values of the first neural network 410 are applied as the initial parameter values of the second neural network 420 (S240).

To this end, in an embodiment, a model for quantitatively measuring the importance of a sentence may be utilized. That is, in the embodiment, for a data set including summary data and its label data, a sentence that is a key sentence of the document is labeled '1', and a sentence that is not a key sentence is labeled '0'. Also, in an embodiment, the second neural network 420 may be trained to regress the label value of each sentence. To this end, the second neural network 420 may have a model of an encoder structure. The second neural network 420 encodes the input sentence and scores it with a value between 0 and 1 according to whether the sentence is an important sentence (e.g., label = 1) or not (e.g., label = 0). In this case, the importance of the sentence as determined by the model can be calculated through a process such as Equation 4 below.

[Equation 4]
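The body of Equation 4 is not available in this text. As an assumption consistent with an encoder that scores each sentence between 0 and 1, one common form is given below. All symbols are introduced here for illustration: s_i is the i-th input sentence, h_i is its encoded representation, w and b are the parameters of a scoring layer, and σ is the logistic sigmoid.

```latex
h_i = \mathrm{Encoder}_{\theta}(s_i), \qquad
\hat{y}_i = \sigma\left(w^{\top} h_i + b\right), \qquad 0 < \hat{y}_i < 1
```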

In the embodiment, the second neural network 420 is trained in a direction that minimizes the cross entropy loss between the actual label data of the corresponding document and the label data obtained through the learning process. At this time, in the inference step of extracting key sentences from a specific patent document to be analyzed using the second neural network 420, the sentence importance calculated by the model can be used as the judgment criterion.
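
As an illustration of the cross entropy loss between 0/1 sentence labels and predicted importance, a minimal sketch is given below. It assumes that the model produces one raw score per sentence and that a sigmoid maps it into the range 0 to 1. The sigmoid, the function names, and the sample scores are assumptions for illustration, since Equation 4 is not reproduced in this text.

```python
import math

def sigmoid(z):
    """Map a raw score to an importance value between 0 and 1 (assumed form)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean cross entropy between 0/1 labels and predicted importance values."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

# Hypothetical document with four sentences, of which only the first is a key sentence.
labels = [1, 0, 0, 0]
raw_scores = [2.0, -1.5, -3.0, 0.3]
probs = [sigmoid(z) for z in raw_scores]
loss = binary_cross_entropy(labels, probs)
```

The loss shrinks as the predicted importance of key sentences approaches 1 and that of the other sentences approaches 0.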

Through this, the embodiment measures how important each sentence is in the patent document and selects the N sentences with the highest importance, thereby carrying out both learning and inference for key sentence extraction.
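
The selection of the N most important sentences can be sketched as below. The function name `select_key_sentences` is hypothetical, and returning the selected sentences in their original document order is a design assumption rather than something stated in the embodiment.

```python
def select_key_sentences(sentences, scores, n):
    """Return the n highest-scoring sentences, kept in document order."""
    # Rank sentence indices by importance, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Keep the top n and restore the original order of appearance.
    chosen = sorted(ranked[:n])
    return [sentences[i] for i in chosen]

sentences = ["s1", "s2", "s3", "s4"]
scores = [0.1, 0.9, 0.3, 0.8]
top2 = select_key_sentences(sentences, scores, 2)
```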

For example, in the learning process of the second neural network 420, the output data obtained through learning is compared with the label data, and a loss function is calculated. And, when the loss function is greater than a predetermined threshold value, re-learning of the second neural network 420 is performed. Alternatively, when the loss function is smaller than a predetermined threshold value, the learning process may be completed and the next step may be entered accordingly.
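
The threshold-based stopping rule described above can be sketched with a toy model. The example below is not the second neural network 420: it assumes a logistic model with a single weight trained by gradient descent, and the data, learning rate, and threshold are made-up values chosen only to show the loop.

```python
import math

def mean_bce(w, xs, ys):
    """Mean binary cross entropy of a one-weight logistic model p = sigmoid(w * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-w * x))
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(xs)

def train_until_threshold(xs, ys, threshold, lr=0.5, max_rounds=1000):
    """Repeat learning while the loss is not below the threshold."""
    w = 0.0
    for rounds in range(max_rounds + 1):
        loss = mean_bce(w, xs, ys)
        if loss < threshold:
            break  # learning is complete; move on to the next step
        # Otherwise perform another round of learning (one gradient step).
        grad = sum((1.0 / (1.0 + math.exp(-w * x)) - y) * x
                   for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w, loss, rounds

xs = [2.0, -1.0, 1.5, -2.0]
ys = [1, 0, 1, 0]
w, loss, rounds = train_until_threshold(xs, ys, threshold=0.2)
```

The `max_rounds` cap is an added safeguard so that the loop ends even if the threshold is never reached.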

However, the learning of the second neural network 420 proceeds through an extraction method. When learning by the extraction method, label imbalance can be a major constraint on performance improvement. This is because the number of sentences in a given section of the detailed description of a patent document, and in the entire document, is very large, whereas the number of key sentences to be extracted from the document is very small. As a result, the model may classify general sentences other than key sentences very well, while its accuracy in classifying key sentences decreases. That is, due to the nature of a patent document there is relatively more data with an importance of 0 than data with an importance of 1, so when the second neural network 420 is trained without an additional preprocessing process, learning accuracy may be lowered.

Accordingly, in an embodiment, upscaling may be performed on the key sentence data. This may be done by copying the label data that marks key sentences and adding the copies, a plurality of times, to the first summary data and its label data used for the learning of the second neural network 420. Through this, in the embodiment, learning is performed in a state in which the frequency of data having a label of 1 and the frequency of data having a label of 0 in the summary data are similar, and thus learning accuracy can be improved.
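
A minimal sketch of such upscaling is shown below, assuming the training data is a list of (sentence, label) pairs. The function name `upscale_key_sentences` and the sample sentences are hypothetical.

```python
def upscale_key_sentences(examples, scale):
    """Copy each label-1 example so that it appears `scale` times in total."""
    out = []
    for sentence, label in examples:
        copies = scale if label == 1 else 1
        out.extend([(sentence, label)] * copies)
    return out

# Hypothetical summary data: one key sentence among five sentences.
examples = [("k1", 1), ("g1", 0), ("g2", 0), ("g3", 0), ("g4", 0)]
balanced = upscale_key_sentences(examples, scale=4)

positives = sum(1 for _, label in balanced if label == 1)
negatives = sum(1 for _, label in balanced if label == 0)
```

With a scaling value of 4, the single key sentence appears four times, which matches the four sentences labeled 0.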

At this time, for the scaling value used for the upscaling (eg, the number of copies of the label data), a comparison experiment is performed with each candidate scaling value applied in order to obtain an optimal scaling value, and the obtained scaling value is applied. Through this, in the embodiment, the accuracy of determining the importance of each sentence in the document can be efficiently improved, and a high-performance model for extracting key sentences can be generated accordingly.
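
The comparison experiment over scaling values can be sketched as a simple search. In the sketch below, `evaluate` stands for training and validating the network with a given scaling value. The stand-in `toy_evaluate` is a placeholder formula, not a measured result, and all names are hypothetical.

```python
def choose_scaling_value(candidates, evaluate):
    """Evaluate every candidate scaling value and return the best one."""
    results = {scale: evaluate(scale) for scale in candidates}
    best = max(results, key=results.get)
    return best, results

def toy_evaluate(scale):
    """Placeholder for 'train with this scaling value and measure accuracy'."""
    return -(scale - 3) ** 2  # arbitrary stand-in that peaks at scale == 3

best_scale, results = choose_scaling_value([1, 2, 3, 4, 5], toy_evaluate)
```

In practice the evaluation would be run on held-out data so that the chosen scaling value is not tuned to the training set.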

And in the embodiment, the parameter value of the second neural network 420 is updated according to its learning result (S250). For example, in the embodiment, the initial value of the parameter value of the second neural network 420 is the same as the parameter value of the first neural network 410, that is, the parameter value of a model for analyzing documents in the patent domain.

Accordingly, in the embodiment, through the learning process of the second neural network 420 as described above, the parameter value of the second neural network 420 may be updated to a parameter value of a model for extracting key sentences from a patent document.

Next, the third process for learning the third neural network 430 will be described.

The third process for the third neural network 430 may be performed at the same time as the second process for the second neural network 420, or alternatively may proceed after the second process for learning of the second neural network 420 ends.
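
Because both processes start from the parameters of the first neural network 410 and do not depend on each other, they can run side by side. The sketch below illustrates this with Python's `concurrent.futures`. The two training functions are stubs that only tag a copy of the parameters, and all names and values are hypothetical.

```python
import copy
from concurrent.futures import ThreadPoolExecutor

def train_second(init_params):
    """Stub for the second process (key sentence extraction)."""
    params = copy.deepcopy(init_params)
    params["task"] = "key_sentence_extraction"
    return params

def train_third(init_params):
    """Stub for the third process (key noun phrase generation)."""
    params = copy.deepcopy(init_params)
    params["task"] = "key_noun_phrase_generation"
    return params

# Hypothetical final parameters of the first neural network 410.
first_params = {"weights": [0.1, 0.2]}

# Run both processes at the same time; running them one after another also works.
with ThreadPoolExecutor(max_workers=2) as pool:
    future_second = pool.submit(train_second, first_params)
    future_third = pool.submit(train_third, first_params)
    second_params = future_second.result()
    third_params = future_third.result()
```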

In the embodiment, third input data may be input for learning of the third neural network 430 (S310). The third input data may include a key sentence included in a patent document for learning and label data that is correct answer data for a key noun phrase in the key sentence.

For example, the third input data may include, as one set, a key sentence included in a patent document and label data for the key sentence, and a plurality of such sets may be included.

For example, the third input data may include a first data set including a first key sentence and first label data labeling a key noun phrase in the first key sentence, and a second data set including a second key sentence and second label data labeling a key noun phrase in the second key sentence. Although the third input data has been described as including two data sets, it is not limited thereto. For example, the third input data may include three or more data sets in order to increase the learning accuracy of the third neural network 430.
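
One possible in-memory layout for such data sets is sketched below. The class name `KeyPhraseExample` and its field names are hypothetical, and the two sentences are invented for illustration around noun phrases used as examples in this description.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KeyPhraseExample:
    """One data set: a key sentence and its key noun phrase label."""
    key_sentence: str
    key_noun_phrase: str  # label data (correct answer)

third_input_data = [
    KeyPhraseExample(
        key_sentence="The latching claw may break when excessive force is applied.",
        key_noun_phrase="breakage of the latching claw",
    ),
    KeyPhraseExample(
        key_sentence="A thickener is added so that the viscosity of the slurry is secured.",
        key_noun_phrase="secure the viscosity of the slurry",
    ),
]
```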

Next, in the embodiment, learning of the third neural network 430 may be performed using the third input data.

At this time, in the embodiment, corresponding to the second process for learning the second neural network 420, the parameter value obtained from the first neural network 410 is applied as the initial value of the parameter value of the third neural network 430 (S320). Preferably, in the embodiment, for learning of the third neural network 430, the final updated parameter value of the first neural network 410 is extracted and applied as the initial value of the parameter value of the third neural network 430. By applying the parameter values of the first neural network 410 as the initial parameter values of the third neural network 430, the learning of the third neural network 430, which is specialized in analyzing documents in the patent domain, can proceed more efficiently. As a result, learning accuracy can be improved and learning time can be drastically reduced.

Meanwhile, the third neural network 430 may proceed with learning using the third input data by setting the parameter value of the first neural network 410 as an initial value (S330).

At this time, the learning of the third neural network 430 generates key noun phrases in a manner similar to Equation 2 used for the learning of the first neural network 410, and learning on the third input data can proceed through a training objective such as Equation 5 below.

[Equation 5]

This means a method of preprocessing each sequence and its key noun phrase in the third input data, which is used for extracting key noun phrases, into a single pair, and learning it on a sequence-to-sequence basis.
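
A possible form of this preprocessing is sketched below, assuming each key sentence and its key noun phrase are combined into one source/target pair for sequence-to-sequence learning. The task prefix follows the common text-to-text convention and is an assumption, as are the function name and the dictionary keys.

```python
def to_seq2seq_pair(key_sentence, key_noun_phrase,
                    prefix="extract key noun phrase: "):
    """Build one source/target pair for sequence-to-sequence learning."""
    return {
        "source": prefix + key_sentence,  # input sequence given to the model
        "target": key_noun_phrase,        # output sequence the model should generate
    }

pair = to_seq2seq_pair(
    "The latching claw may break when excessive force is applied.",
    "breakage of the latching claw",
)
```

During learning, the model would read `source` and be trained to produce `target` token by token.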

For example, in the learning process of the third neural network 430, a loss function is calculated by comparing output data acquired through learning with the label data. In addition, when the loss function is greater than a predetermined threshold value, re-learning of the third neural network 430 is performed. Alternatively, when the loss function is smaller than a predetermined threshold value, the learning process may be completed and the next step may be entered accordingly.

In this case, in the embodiment, a generation method rather than a sentence extraction method is applied to the third neural network 430. For example, in the embodiment, the key noun phrase serving as label data may omit some words of the key sentence included in the third input data, and may include new words that do not appear in the key sentence. Through this, in an embodiment, the key noun phrase for a given key sentence can be generated more flexibly: key noun phrases are newly created based on the content of the entire sentence rather than found within the sentence. The key noun phrase is thus generated in a direction semantically similar to the reference key noun phrase, and the error rate can be lowered compared to the extraction method. In this way, compared to the extraction method of extracting some words included in a key sentence as a key noun phrase, the embodiment can generate a key noun phrase that is semantically richer and implicitly expresses the content of the entire document, thereby improving user satisfaction.

Through this, in the embodiment, the third neural network 430 can generate the key noun phrase based on words that take the entire context of the key sentence into account, rather than only some words included in the key sentence. For example, a noun phrase such as 'breakage of the latching claw' can also be generated as 'damage to the latching claw' or 'cutting of the latching claw', and a noun phrase such as 'secure the viscosity of the slurry' can also be generated as 'enhance the viscosity of the slurry'.
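
The difference between extraction and generation can be made concrete by checking which words of a noun phrase do not occur in the key sentence. The helper below is a hypothetical illustration using simple lowercase word matching. The key sentence is invented for the example.

```python
import re

def novel_words(key_sentence, key_noun_phrase):
    """Return words of the noun phrase that do not appear in the key sentence."""
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())
    sentence_words = set(tokenize(key_sentence))
    return [w for w in tokenize(key_noun_phrase) if w not in sentence_words]

sentence = "A thickener is added to secure the viscosity of the slurry."

# A generated phrase may contain a word that is absent from the sentence.
generated = novel_words(sentence, "enhance the viscosity of the slurry")

# An extracted phrase consists only of words taken from the sentence.
extracted = novel_words(sentence, "secure the viscosity of the slurry")
```

A non-empty result indicates that the phrase could not have been produced by extraction alone.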

And in the embodiment, the parameter value of the third neural network 430 is updated according to its learning result (S340). For example, in the embodiment, the initial value of the parameter value of the third neural network 430 is the same as the parameter value of the first neural network 410, that is, the parameter value of a model for analyzing documents in the patent domain.

Accordingly, in an embodiment, through the learning process of the third neural network 430 as described above, the parameter value of the third neural network 430 may be updated to a parameter value of a model for extracting a key noun phrase from a key sentence.

The learning process of the neural network of the present application as described above is summarized as follows.

Meanwhile, in an embodiment, an inference process may be performed using the second neural network 420 and the third neural network 430 whose parameter values are updated through the learning process. The inference process may refer to a process of performing an analysis on a specific analysis target requested by a user using a neural network for which learning has been completed, and proceeding with inference.

At this time, in the embodiment, the second neural network 420 and the third neural network 430 for which the learning has been completed may be used to proceed with the inference, but the embodiment is not limited thereto. For example, the second neural network 420 and the third neural network 430 may continue to perform learning and update their parameter values accordingly. In addition, the inference process may be performed using a separate neural network other than the learning neural networks such as the second neural network 420 and the third neural network 430. In this case, when the learning neural network and the inference neural network are configured separately, the inference neural network may receive the final updated parameter value from the learning neural network and apply it to perform inference on the analysis target. However, for a neural network for which the learning has been completed, additional learning is practically unnecessary, or simple additional learning may be performed only in specific situations. Accordingly, in the embodiment, the learning neural network and the inference neural network are not distinguished and are used in common as described above.

Accordingly, in an embodiment, inference may be performed using the second neural network 420 and the third neural network 430 in an inference process.

Referring to FIG. 7, in the embodiment, data for an object to be analyzed is input (S310). At this time, the input of the data may be performed in various ways. For example, when information on the analysis target (eg, a document number of a patent document) is input in the electronic device 100, the artificial intelligence device 300 may search the database 200 for the patent document corresponding to the information and receive it. Alternatively, the patent document to be analyzed may be directly input from the electronic device 100 to the artificial intelligence device 300.

Alternatively, information on the analysis target may be input from the electronic device 100 to the database 200. The database 200 may then provide the patent document corresponding to the input information to the artificial intelligence device 300 as input data.

Next, in the embodiment, summary data for the input data is obtained using a summary algorithm (S320). For example, in an embodiment, summary data of the input data may be obtained using a summary algorithm used in the learning process.

Next, in the embodiment, key sentences may be extracted from the summary data using the second neural network 420 for which the learning is completed (S330). The second neural network 420 may output the extracted core sentence as input data of the third neural network 430 .

Thereafter, in the embodiment, a key noun phrase may be generated from the key sentence using the third neural network 430 for which the learning has been completed (S340).
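
The inference flow of steps S320 to S340 can be sketched as a chain of three stages. In the sketch below the stages are passed in as functions. The `toy_` functions are trivial stand-ins for the summary algorithm, the second neural network 420, and the third neural network 430, and every name and sample text is hypothetical.

```python
def analyze_patent(document, summarize, extract_key_sentences, generate_noun_phrase):
    """Summarize, extract key sentences, then generate one noun phrase per sentence."""
    summary = summarize(document)                    # S320
    key_sentences = extract_key_sentences(summary)   # S330
    return [(s, generate_noun_phrase(s)) for s in key_sentences]  # S340

def toy_summarize(document):
    """Stand-in summary: split the document into sentences."""
    return [s.strip() for s in document.split(".") if s.strip()]

def toy_extract(sentences, n=1):
    """Stand-in extraction: keep the n longest sentences."""
    return sorted(sentences, key=len, reverse=True)[:n]

def toy_generate(sentence):
    """Stand-in generation: return the first three words."""
    return " ".join(sentence.split()[:3])

document = ("A claw is provided. "
            "The latching claw may break under excessive load. "
            "It is small.")
result = analyze_patent(document, toy_summarize, toy_extract, toy_generate)
```

Passing the stages as arguments keeps the flow independent of how each network is implemented, so trained models can replace the stand-ins without changing `analyze_patent`.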

The operations may be performed concurrently and are not bound by the order in which they are described. The present disclosure described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium includes all types of recording devices in which data that can be read by a computer system is stored. Examples of computer-readable media include a hard disk drive (HDD), solid state disk (SSD), silicon disk drive (SDD), ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device. Also, the computer may include a processor of a terminal.

Features, structures, effects, etc. described in the embodiments above are included in at least one embodiment, and are not necessarily limited to only one embodiment. Furthermore, the features, structures, effects, etc. illustrated in each embodiment can be combined or modified with respect to other embodiments by a person having ordinary knowledge in the field to which the embodiments belong. Therefore, the contents related to these combinations and variations should be interpreted as being included in the scope of the embodiments.

Although the above has been described centering on the embodiment, this is only an example and is not intended to limit the embodiment, and those skilled in the art to which the embodiment belongs will appreciate that various modifications and applications not exemplified above are possible without departing from the essential characteristics of the present embodiment. For example, each component specifically shown in the embodiment can be modified and implemented. Differences related to these modifications and applications should be interpreted as being included in the scope of the embodiments set forth in the appended claims.

Claims

1. A neural network training method comprising:

training a first neural network based on first input data;

training a second neural network based on second input data by applying a parameter value of the first neural network according to a training result of the first neural network; and

training a third neural network based on third input data by applying the parameter value of the first neural network,

wherein the first input data includes a document of a patent domain,

wherein the second input data includes a plurality of patent documents and first label data labeling key sentences corresponding to each of the plurality of patent documents, and

wherein the third input data includes a plurality of key sentences and second label data labeling key noun phrases respectively corresponding to the plurality of key sentences.

2. The method of claim 1, wherein training the first neural network includes:

acquiring a plurality of corpora using documents of the patent domain; and

training the first neural network using the plurality of corpora.

3. The method of claim 1, wherein training the second neural network includes:

applying parameter values of the first neural network as initial values of parameter values of the second neural network; and

updating the parameter values of the second neural network by performing training based on the second input data.

4. The method of claim 1, wherein training the second neural network includes upscaling the second input data, and wherein the upscaling includes:

copying first label data labeling key sentences corresponding to each of the plurality of patent documents; and

adding the copied label data as first label data of each patent document.

5. The method of claim 1, wherein training the third neural network includes:

applying parameter values of the first neural network as initial values of parameter values of the third neural network; and

updating the parameter values of the third neural network by performing training based on the third input data.

6. The method of claim 1, wherein training the second neural network includes acquiring summary data summarizing each of the plurality of patent documents included in the second input data.

7. The method of claim 1, wherein training the first neural network includes changing a parameter value of a neural network trained on documents of a general domain to a parameter value corresponding to documents of the patent domain.

8. The method of claim 1, wherein the first to third neural networks include a deep learning neural network (DNN).

9. The method of claim 8, wherein the model structures of the first to third neural networks are the same as each other.

10. The method of claim 9, wherein the model structure of the first to third neural networks includes a T5 model.