CN109325226B

CN109325226B - Deep learning network-based term extraction method and device and storage medium

Info

Publication number: CN109325226B
Application number: CN201811052429.XA
Authority: CN
Inventors: 杨旭; 杜翠凤; 周善明; 张添翔; 叶绍恩; 梁晓文
Original assignee: Guangzhou Jiesai Communication Planning And Design Institute Co ltd; GCI Science and Technology Co Ltd
Current assignee: Guangzhou Jiesai Communication Planning And Design Institute Co ltd; GCI Science and Technology Co Ltd
Priority date: 2018-09-10
Filing date: 2018-09-10
Publication date: 2023-04-14
Anticipated expiration: 2038-09-10
Also published as: CN109325226A

Abstract

The invention provides a term extraction method, a term extraction device and a storage medium based on a deep learning network, wherein the method comprises the following steps: carrying out term annotation on the target text; performing word segmentation processing on the labeled target text to obtain a word segmentation text and extracting keywords; training a pre-established RNN deep learning network according to the keywords to obtain a term prediction model, and obtaining a term prediction result output by the term prediction model; and training the pre-established CNN deep learning network according to the term prediction result and term label corresponding to the target text to obtain a term extraction model, and obtaining a term extraction result output by the term extraction model. The method integrates the RNN and the CNN deep learning networks to form a deeper deep learning network, and carries out term prediction and extraction on the target text according to the extracted keywords and the term labeling result of the target text, so that the term extraction rate can be effectively improved, and the extraction of Chinese terms of massive texts is realized.

Description

Deep learning network-based term extraction method and device and storage medium

Technical Field

The present invention relates to the field of term extraction technologies, and in particular, to a term extraction method and apparatus based on a deep learning network, and a storage medium.

Background

Terms denote the specialized concepts of a profession or a research field. Term extraction is of research significance in natural language processing and has particularly broad application prospects in machine translation and cross-language information retrieval.

Traditional term extraction is mostly carried out manually or semi-manually according to corpus information, so the term extraction rate is low. In today's era of information explosion, it is difficult to extract Chinese terms from massive texts in a manual or semi-manual way.

Disclosure of Invention

Based on this, the present invention provides a term extraction method, a term extraction device and a storage medium based on a deep learning network, which can improve the term extraction rate and thereby achieve the extraction of Chinese terms from massive texts.

In order to achieve the above object, an aspect of the embodiments of the present invention provides a term extraction method based on a deep learning network, including:

carrying out term annotation on the target text;

performing word segmentation processing on the labeled target text to obtain a word segmentation text, and extracting keywords from the word segmentation text;

training a pre-established RNN deep learning network according to the keywords to obtain a term prediction model, and obtaining a term prediction result corresponding to the target text output by the term prediction model;

and training a pre-established CNN deep learning network according to the term prediction result and term label corresponding to the target text to obtain a term extraction model, and acquiring a term extraction result corresponding to the target text output by the term extraction model.

Preferably, the training of the pre-established RNN deep learning network according to the keyword specifically includes:

the hidden layer of the RNN deep learning network adopts an RNN network, and the output layer of the RNN deep learning network adopts a Softmax multilayer network;

and training the RNN of a hidden layer in the RNN deep learning network by using the keywords, and inputting an output result of the RNN into a Softmax multilayer network of an output layer of the RNN for training.

Preferably, before training the pre-established RNN deep learning network according to the keyword, the method further comprises:

performing word vector conversion on the extracted keywords to obtain a word sequence;

and training the pre-established RNN deep learning network by using the word sequence.

Preferably, the term labeling of the target text specifically includes:

carrying out term annotation on the target text by using the HanLP open-source tool; the term labeling result of each word in the target text comprises the word, its part of speech and its term boundary.

Preferably, the performing word segmentation processing on the labeled target text to obtain a word segmented text, and extracting keywords from the word segmented text specifically includes:

performing word segmentation processing on the labeled target text by using the HanLP open-source tool to obtain a word segmentation text;

and extracting a term word and a plurality of words positioned in front of and behind the term word from the word segmentation text according to a term labeling result of each word in the target text to obtain a keyword corresponding to the target text.

Preferably, the training of the pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text specifically includes:

the hidden layer of the CNN deep learning network adopts a CNN network, and the output layer of the CNN deep learning network adopts a Softmax multilayer network;

and training a CNN network of a hidden layer in the CNN deep learning network by using a term prediction result and a term label corresponding to the target text, and inputting an output result of the CNN network into a Softmax multilayer network of an output layer of the CNN network for training.

In another aspect, an embodiment of the present invention further provides a term extraction device based on a deep learning network, including:

the term labeling module is used for carrying out term labeling on the target text;

the keyword extraction module is used for performing word segmentation processing on the labeled target text to obtain a word segmentation text and extracting keywords from the word segmentation text;

the first training module is used for training a pre-established RNN deep learning network according to the keywords to obtain a term prediction model and obtain a term prediction result corresponding to the target text output by the term prediction model;

and the second training module is used for training the pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text to obtain a term extraction model and acquiring the term extraction result corresponding to the target text output by the term extraction model.

Preferably, the keyword extraction module includes:

the word segmentation processing unit is used for performing word segmentation processing on the labeled target text by using the HanLP open-source tool to obtain a word segmentation text;

and the keyword acquisition unit is used for extracting term words and a plurality of words positioned in front of and behind the term words from the word segmentation text according to the term labeling result of each word in the target text to obtain keywords corresponding to the target text.

In another aspect, the present invention further provides a deep learning network-based term extraction apparatus, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, and when the processor executes the computer program, the deep learning network-based term extraction method is implemented.

In another aspect, the embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, a device in which the computer-readable storage medium is located is controlled to perform the term extraction method based on a deep learning network as described above.

Compared with the prior art, the embodiment of the invention has the beneficial effects that: the term extraction method based on the deep learning network comprises the following steps: carrying out term annotation on the target text; performing word segmentation processing on the labeled target text to obtain a word segmentation text, and extracting keywords from the word segmentation text; training a pre-established RNN deep learning network according to the keywords to obtain a term prediction model, and obtaining a term prediction result corresponding to the target text output by the term prediction model; and training a pre-established CNN deep learning network according to the term prediction result and term label corresponding to the target text to obtain a term extraction model, and acquiring a term extraction result corresponding to the target text output by the term extraction model. The method integrates the RNN and the CNN deep learning networks to form a deeper deep learning network, and carries out term prediction and extraction on the target text according to the extracted keywords and the term labeling result of the target text, so that the term extraction rate can be effectively improved, and the extraction of Chinese terms of massive texts is realized.

Drawings

Fig. 1 is a schematic flowchart of a term extraction method based on a deep learning network according to an embodiment of the present invention;

Fig. 2 is a block flow diagram of the deep learning network-based term extraction method of Fig. 1;

Fig. 3 is a schematic block diagram of a term extraction device based on a deep learning network according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Please refer to fig. 1, which is a flowchart illustrating a term extraction method based on a deep learning network according to an embodiment of the present invention. The method comprises the following steps:

S100: carrying out term annotation on the target text;

S200: performing word segmentation processing on the labeled target text to obtain a word segmentation text, and extracting keywords from the word segmentation text;

S300: training a pre-established RNN deep learning network according to the keywords to obtain a term prediction model, and acquiring a term prediction result corresponding to the target text output by the term prediction model;

S400: training a pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text to obtain a term extraction model, and acquiring a term extraction result corresponding to the target text output by the term extraction model.

The invention combines the RNN and CNN deep learning networks to form a deeper deep learning network: the RNN deep learning network, which has a strong ability to predict the order of word sequences, predicts the word that follows each word, and the CNN deep learning network, which has a strong feature extraction capability, extracts terms automatically. This effectively improves the term extraction rate, enables the extraction of Chinese terms from massive texts, and greatly improves the accuracy of term identification and extraction.
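The four steps S100 to S400 can be read as a simple data flow in which each step consumes the output of the previous one. The following is a minimal sketch of that flow only, not the patented implementation: `run_pipeline`, the `Labeled` tuple layout and the four callables passed in are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple

# One labeled word: (word, part_of_speech, boundary_tag). Hypothetical layout.
Labeled = Tuple[str, str, str]


def run_pipeline(
    text: str,
    annotate: Callable[[str], List[Labeled]],                    # S100
    extract_keywords: Callable[[List[Labeled]], List[str]],      # S200
    predict_next: Callable[[List[str]], List[str]],              # S300, trained RNN
    classify: Callable[[List[str], List[Labeled]], List[str]],   # S400, trained CNN
) -> List[str]:
    """Chain the four steps; every callable is a stand-in supplied by the caller."""
    labeled = annotate(text)                # term labeling of the target text
    keywords = extract_keywords(labeled)    # segmentation and keyword extraction
    candidates = predict_next(keywords)     # term prediction result
    # The CNN stage receives both the prediction result and the term labels.
    return classify(candidates, labeled)
```

Passing the labels from S100 directly into the last stage mirrors the statement that the CNN is trained on the term prediction result together with the term label of the target text.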

In an optional embodiment, the training of the pre-established RNN deep learning network according to the keyword specifically includes:

In this embodiment, all the extracted keywords are input into the RNN deep learning network for repeated training, so that the order of the word sequence is learned automatically. Training stops when the loss function satisfies a given condition, yielding a trained model (the term prediction model); once the term prediction model is complete, it performs prediction automatically.

Further, when the output of the loss function of the RNN deep learning network is obtained, the word prediction accuracy of the generated term prediction model is evaluated. If the accuracy reaches the system's preset threshold (for example, 80%), the model is considered ideal and training stops; otherwise, the model is considered not ideal, the parameters (for example, learning rate, number of epochs, batch size and Dropout) are readjusted, and training is repeated.
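The stopping rule described above (accept the model once accuracy reaches the preset threshold, otherwise readjust the parameters and train again) can be sketched as a small loop. This is an illustrative sketch under assumptions: `train_until_accurate` and `train_and_evaluate` are hypothetical names, and the caller is assumed to supply both the training routine and the list of parameter settings to try.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Params = Dict[str, float]


def train_until_accurate(
    train_and_evaluate: Callable[[Params], float],
    candidate_params: Iterable[Params],
    threshold: float = 0.80,  # example threshold given in the text
) -> Tuple[Optional[Params], float]:
    """Try parameter settings in order and stop at the first one that is good enough."""
    best_params: Optional[Params] = None
    best_accuracy = float("-inf")
    for params in candidate_params:
        accuracy = train_and_evaluate(params)  # one full training run
        if accuracy > best_accuracy:
            best_params, best_accuracy = params, accuracy
        if accuracy >= threshold:
            break  # model considered ideal: stop training
        # otherwise the model is not ideal: move on to the next setting
    return best_params, best_accuracy
```

If no setting reaches the threshold, the sketch returns the best setting seen so far; the source does not say what should happen in that case.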

In an optional embodiment, before training the pre-established RNN deep learning network according to the keyword, the method further includes:

performing word vector conversion (Word2Vec) on the extracted keywords to obtain a word sequence, where in this embodiment the vector of each keyword has dimensions of 1 × 128;

and training the pre-established RNN deep learning network by using the word sequence.
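To make the shape of the word sequence concrete, the sketch below maps each keyword to a 1 × 128 vector through a lookup table. It is only a shape illustration: the vectors are random placeholders standing in for trained Word2Vec vectors, which are not computed here, and `build_embedding_table` and `to_vector_sequence` are hypothetical helper names.

```python
import random
from typing import Dict, List

EMBEDDING_DIM = 128  # the embodiment states a 1 x 128 vector per keyword


def build_embedding_table(keywords: List[str], seed: int = 0) -> Dict[str, List[float]]:
    """Assign one placeholder vector per distinct keyword (NOT trained Word2Vec)."""
    rng = random.Random(seed)
    table: Dict[str, List[float]] = {}
    for word in keywords:
        if word not in table:
            table[word] = [rng.uniform(-0.5, 0.5) for _ in range(EMBEDDING_DIM)]
    return table


def to_vector_sequence(keywords: List[str], table: Dict[str, List[float]]) -> List[List[float]]:
    """Replace each keyword by its vector, preserving word order for the RNN."""
    return [table[word] for word in keywords]


keywords = ["artificial", "intelligent", "technology", "artificial"]
table = build_embedding_table(keywords)
sequence = to_vector_sequence(keywords, table)
```

Repeated keywords share the same vector, so the RNN sees a consistent representation of a word wherever it occurs in the sequence.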

In an optional embodiment, the term labeling for the target text specifically includes:

In an optional embodiment, the performing word segmentation processing on the labeled target text to obtain a word segmentation text, and extracting keywords from the word segmentation text specifically includes:

In this embodiment, the TF-IDF algorithm of the TensorFlow tool is used to extract terms from the word segmentation text. Further, all the extracted terms are ranked by weight, the top N terms by weight are selected, and the 3 words before and the 3 words after each of the N terms are extracted to serve as the keywords of the target text. The weight of a term can be calculated as the ratio of the frequency with which the term appears in the target text to the sum of the frequencies with which all terms appear in the target text; the top N terms are those whose weights rank in the top 20%.
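The weighting and windowing rule can be illustrated in a few lines. The sketch below assumes the text is already segmented into a token list and that the candidate terms are known; the function names are hypothetical, and rounding the 20% cut-off up to at least one term is an assumption, since the source does not specify how N is rounded.

```python
import math
from collections import Counter
from typing import Dict, List


def term_weights(tokens: List[str], terms: List[str]) -> Dict[str, float]:
    """Weight = frequency of a term / sum of the frequencies of all terms."""
    term_set = set(terms)
    counts = Counter(tok for tok in tokens if tok in term_set)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()} if total else {}


def top_terms(weights: Dict[str, float], fraction: float = 0.20) -> List[str]:
    """Rank terms by weight and keep the top fraction (assumed: rounded up, at least 1)."""
    ranked = sorted(weights, key=lambda term: (-weights[term], term))
    if not ranked:
        return []
    return ranked[: max(1, math.ceil(len(ranked) * fraction))]


def context_windows(tokens: List[str], selected: List[str], window: int = 3) -> List[List[str]]:
    """For each occurrence of a selected term, keep it plus `window` words on each side."""
    chosen = set(selected)
    return [
        tokens[max(0, i - window): i + window + 1]
        for i, tok in enumerate(tokens)
        if tok in chosen
    ]
```

Each window holds at most 7 tokens (the term itself plus 3 on each side), and fewer when the term is near the start or end of the text.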

According to the invention, preliminary keyword extraction is performed on the target text, and frequently used words are identified to form a new corpus. The RNN deep learning network then predicts the next word only for these frequently used words, which greatly improves computation speed and finds the words that both reflect the subject characteristics of the target text and are used frequently. Because the invention takes only the 3 words before and the 3 words after each term, the size of the corpus is greatly reduced and the term extraction time is shortened.

In an optional embodiment, the training of the pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text specifically includes:

In this embodiment, the term labeling result obtained in step S100 and the term prediction result output by the term prediction model in step S300 are input together into the CNN deep learning network for training, so that the features of terms are learned automatically. Training stops when the loss function satisfies a given condition, yielding the term extraction model; once the term extraction model is complete, it performs term extraction automatically.

Further, when the output of the loss function of the CNN deep learning network is obtained, the term extraction accuracy of the term extraction model is evaluated. If the accuracy reaches the system's preset threshold (for example, 80%), the model is considered ideal and training stops; otherwise, the model is considered not ideal, the parameters (for example, learning rate, number of epochs, batch size and Dropout) are readjusted, and training continues.

For convenience of understanding, the principle and process of the deep learning network-based term extraction method according to the embodiment of the present invention are described with reference to fig. 2:

In steps S100 and S200, term labeling uses predefined tags, for example: nx denotes a noun and v denotes a verb; B denotes the term boundary, while O and I denote other boundaries. In this embodiment, the terms to be extracted are the words whose term boundary tag is B. Taking "in the field of artificial intelligence" as the target text, term labeling yields the result "in/v/O artificial intelligence/nx/B field/nx/I", where the first item for each word is the word itself, the second item is the part of speech and the third item is the term boundary tag; the items are separated by "/" and the words are separated by spaces. The term to be extracted is then obtained from the term labeling result of the target text; here it is "artificial intelligence".
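The labeling format (word, part of speech and boundary tag joined by "/") is straightforward to parse. The sketch below is illustrative only: `LabeledWord`, `parse_labeled` and `terms_to_extract` are hypothetical names, and the labeled items are given as a list rather than one space-separated string because the English rendering of a single Chinese word ("artificial intelligence") itself contains a space.

```python
from typing import List, NamedTuple


class LabeledWord(NamedTuple):
    word: str       # the word itself
    pos: str        # part of speech, e.g. "nx" (noun) or "v" (verb)
    boundary: str   # term boundary tag: "B", "I" or "O"


def parse_labeled(items: List[str]) -> List[LabeledWord]:
    """Split each 'word/pos/boundary' item into its three fields."""
    parsed = []
    for item in items:
        # rsplit keeps any "/" inside the word itself intact
        word, pos, boundary = item.rsplit("/", 2)
        parsed.append(LabeledWord(word, pos, boundary))
    return parsed


def terms_to_extract(labeled: List[LabeledWord]) -> List[str]:
    """The terms to be extracted are the words tagged with boundary B."""
    return [item.word for item in labeled if item.boundary == "B"]


labeled = parse_labeled(["in/v/O", "artificial intelligence/nx/B", "field/nx/I"])
terms = terms_to_extract(labeled)
```

Run on the example from the text, `terms` contains only "artificial intelligence", matching the result stated above.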

In the keyword extraction of step S200, the open-source TensorFlow tool is used: terms are extracted using TF-IDF, the extracted terms are ranked by weight, the terms ranking in the top 20% are selected, and the 3 words before and the 3 words after each of these terms are extracted as keywords for RNN training. Take the following text as an example: "Artificial intelligence has been raised from the industry level to the national policy level, and government promotion means that artificial intelligence technology and applications are expected to achieve a breakthrough in 2017. In the industrial field, riding the express train of artificial intelligence, the development direction of industrial robots plus artificial intelligence has become a consensus among robot enterprises at home and abroad, and several major companies have begun to step up their layout of this strategic direction." The terms extracted by the above steps are: artificial/intelligent/industrial/robot. The 3 words before and the 3 words after each term are then extracted, i.e.: artificial/intelligent/from/industry/level, push/let/artificial/intelligent/technology/and/application,/carry/landing/artificial/intelligent/express/drive,/and/in/industry/field,/carry …; this yields the keyword corpus used to train the RNN deep learning network.

In step S300, the input and output of an RNN deep learning network can be related in various ways; in the present invention, the hidden layer of the RNN deep learning network adopts an RNN network, and the output layer adopts a Softmax multilayer network. When a keyword is input, the trained RNN deep learning network predicts the word that follows the keyword, and continues until the set output length is reached. All the extracted keywords are input into the RNN deep learning network for repeated training; training stops when the loss function satisfies a given condition, yielding a trained model (the term prediction model). Once the term prediction model is complete, it performs prediction automatically; for example, when "artificial" is input, the term prediction model predicts that the word after "artificial" is "intelligent". For the terms "artificial/intelligent/industrial/robot" from the above steps, the term prediction model can find the word that follows each word, with the specific results: artificial intelligence, industrial robot, intelligent realization, and robot enterprise. In this way, common phrase collocations in the text corpus are extracted through the RNN deep learning network.
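The structure described for step S300 (a recurrent hidden layer followed by a softmax output over the vocabulary) can be shown as a forward pass. This is a minimal sketch under assumptions: it uses a plain Elman-style recurrence with tanh, one-hot inputs instead of 128-dimensional word vectors, and random untrained weights, so the probabilities it produces carry no meaning until the weights are trained. All names are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (length cols)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def softmax(z: Vector) -> Vector:
    """Numerically stable softmax: shift by the maximum before exponentiating."""
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_next_word_probs(inputs: List[Vector], w_in: Matrix, w_rec: Matrix, w_out: Matrix) -> Vector:
    """Run the recurrent hidden layer over the inputs, then apply the softmax output layer."""
    hidden = [0.0] * len(w_rec)
    for x in inputs:
        from_input = matvec(w_in, x)
        from_state = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    return softmax(matvec(w_out, hidden))


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Toy setup: 4-word vocabulary, 8 hidden units, untrained weights.
VOCAB = ["artificial", "intelligent", "industrial", "robot"]
HIDDEN = 8
rng = random.Random(0)
w_in = random_matrix(HIDDEN, len(VOCAB), rng)
w_rec = random_matrix(HIDDEN, HIDDEN, rng)
w_out = random_matrix(len(VOCAB), HIDDEN, rng)
one_hot = [1.0 if word == "artificial" else 0.0 for word in VOCAB]
probs = rnn_next_word_probs([one_hot], w_in, w_rec, w_out)  # distribution over VOCAB
```

After training, the index of the largest entry in `probs` would be read as the predicted next word; generating a longer phrase means feeding each predicted word back in until the set output length is reached.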

In step S400, the term prediction results are labeled according to the term labeling result obtained in step S100, for example: artificial intelligence (term), industrial robot (term), intelligent realization, robot enterprise. The CNN network is then trained and performs classification through its final softmax layer, marking "artificial intelligence" and "industrial robot" as terms, while the other two are marked as their corresponding types. Finally, through the softmax classifier combined with the powerful feature extraction capability of the CNN, the CNN deep learning network learns the features of terms automatically and realizes term extraction.
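The CNN stage can likewise be sketched as a forward pass: convolution filters slide over the word vectors of a candidate phrase, the strongest response of each filter is kept, and a softmax layer turns those features into class probabilities (for instance "term" versus "other"). The sketch assumes a one-dimensional convolution with ReLU and max-over-time pooling, which the source does not specify, and uses untrained random weights; all names are hypothetical.

```python
import math
import random
from typing import List

Vector = List[float]
Filter = List[Vector]  # one weight vector per word position covered by the filter


def conv1d_max_pool(seq: List[Vector], filters: List[Filter]) -> Vector:
    """Slide each filter over the word vectors; keep its largest ReLU response."""
    features = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # ReLU floor: negative responses are clipped to zero
        for start in range(len(seq) - width + 1):
            response = sum(
                w * x
                for f_vec, s_vec in zip(filt, seq[start:start + width])
                for w, x in zip(f_vec, s_vec)
            )
            best = max(best, response)
        features.append(best)
    return features


def softmax(z: Vector) -> Vector:
    peak = max(z)
    exps = [math.exp(x - peak) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify_phrase(seq: List[Vector], filters: List[Filter], w_out: List[Vector]) -> Vector:
    """Convolutional features followed by a softmax output layer."""
    feats = conv1d_max_pool(seq, filters)
    logits = [sum(w * f for w, f in zip(row, feats)) for row in w_out]
    return softmax(logits)


# Toy setup: a 2-word phrase with 4-dimensional word vectors, 3 filters, 2 classes.
rng = random.Random(0)
phrase = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
filters = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)] for _ in range(3)]
w_out = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
class_probs = classify_phrase(phrase, filters, w_out)  # e.g. [P(term), P(other)]
```

In training, the class labels would come from the term labeling result of step S100, which is how the two candidate phrases tagged as terms end up separated from the other two.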

Compared with the prior art, the term extraction method based on the deep learning network provided by the embodiment of the invention has the following advantages:

(1) The RNN deep learning network has strong prediction capability on the sequence of word sequences, realizes the prediction of the next word of each word, and the CNN deep learning network has strong feature extraction function;

(2) Preliminary keyword extraction is performed on the target text, and frequently used words are identified to form a new corpus. The RNN deep learning network then predicts the next word only for these frequently used words, which greatly improves computation speed and finds the words that both reflect the subject characteristics of the target text and are used frequently;

(3) The invention takes only the 3 words before and the 3 words after each term, which greatly reduces the size of the corpus and shortens the term extraction time.

Please refer to fig. 3, which is a schematic block diagram of a deep learning network-based term extraction apparatus according to an embodiment of the present invention, the apparatus includes:

the term labeling module 1 is used for carrying out term labeling on the target text;

the keyword extraction module 2 is used for performing word segmentation processing on the labeled target text to obtain a word segmentation text and extracting keywords from the word segmentation text;

the first training module 3 is configured to train a pre-established RNN deep learning network according to the keyword to obtain a term prediction model, and obtain a term prediction result corresponding to the target text output by the term prediction model;

and the second training module 4 is configured to train a pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text, obtain a term extraction model, and obtain a term extraction result corresponding to the target text output by the term extraction model.

The invention combines the RNN and CNN deep learning networks to form a deeper deep learning network: the RNN deep learning network, which has a strong ability to predict the order of word sequences, predicts the word that follows each word, and the CNN deep learning network, which has a strong feature extraction capability, extracts terms automatically. This effectively improves the term extraction rate, enables the extraction of Chinese terms from massive texts, and greatly improves the accuracy of term identification and extraction.

In an optional embodiment, the hidden layer of the RNN deep learning network is an RNN network, and the output layer thereof is a Softmax multilayer network;

the first training module 3 is configured to train the RNN network of the hidden layer in the RNN deep learning network by using the keyword, and input an output result of the RNN network to the Softmax multilayer network of the output layer of the RNN network for training.

In an alternative embodiment, the apparatus further comprises:

the word vector conversion module is used for carrying out word vector conversion on the extracted keywords to obtain a word sequence;

the first training module 3 is configured to train a pre-established RNN deep learning network by using the word sequence.

In an optional embodiment, the term labeling module 1 is configured to perform term labeling on the target text by using the HanLP open-source tool; the term labeling result of each word in the target text comprises the word, its part of speech and its term boundary.

In an alternative embodiment, the keyword extraction module 2 includes:

and the keyword acquisition unit is used for extracting the term words and a plurality of words positioned in front of and behind the term words from the word segmentation text according to the term labeling result of each word in the target text to obtain the keywords corresponding to the target text.

In an optional embodiment, the hidden layer of the CNN deep learning network adopts a CNN network, and the output layer thereof adopts a Softmax multilayer network;

the second training module 4 is configured to train a CNN network of a hidden layer in the CNN deep learning network by using the term prediction result and the term label corresponding to the target text, and input an output result of the CNN network to a Softmax multilayer network of an output layer of the CNN network for training.

The term extraction device based on the deep learning network according to this embodiment is the apparatus counterpart of the term extraction method based on the deep learning network according to the foregoing embodiment. Its implementation principle and technical effects are the same as those of the method and are not repeated here.

Illustratively, the computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used for describing the execution process of the computer program in the deep learning network based term extraction device. For example, the computer program may be divided into a term labeling module 1, a keyword extraction module 2, a first training module 3, and a second training module 4, and each module has the following specific functions: the term labeling module 1 is used for carrying out term labeling on the target text; the keyword extraction module 2 is used for performing word segmentation processing on the labeled target text to obtain a word segmentation text and extracting keywords from the word segmentation text; the first training module 3 is configured to train a pre-established RNN deep learning network according to the keyword to obtain a term prediction model, and obtain a term prediction result corresponding to the target text output by the term prediction model; and the second training module 4 is configured to train the pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text to obtain a term extraction model, and obtain a term extraction result corresponding to the target text output by the term extraction model.

The term extraction device based on the deep learning network can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The term extraction device based on the deep learning network can comprise a processor and a memory, but is not limited to the processor and the memory. It will be understood by those skilled in the art that the schematic diagram is merely an example of the term extraction apparatus based on the deep learning network, and does not constitute a limitation of the term extraction apparatus based on the deep learning network, and may include more or less components than those shown in the figure, or combine some components, or different components, for example, the term extraction apparatus based on the deep learning network may further include an input and output device, a network access device, a bus, and the like.

The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. The general-purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the deep learning network-based term extraction device and connects the various parts of the whole device through various interfaces and lines.

The memory may be used to store the computer programs and/or modules, and the processor implements the various functions of the deep learning network-based term extraction device by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage device.

The modules/units integrated in the term extraction device based on the deep learning network may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as stand-alone products. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

It should be noted that the above-described embodiments of the apparatus are merely illustrative, where the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A term extraction method based on a deep learning network, characterized by comprising the following steps:

carrying out term annotation on the target text;

training a pre-established CNN deep learning network according to term prediction results and term labels corresponding to the target text to obtain a term extraction model, and obtaining term extraction results corresponding to the target text output by the term extraction model;

wherein performing word segmentation processing on the labeled target text to obtain a word segmentation text, and extracting keywords from the word segmentation text, specifically comprises the following steps:

extracting term words from the word segmentation text according to term labeling results of all words in the target text;

and sorting all the extracted term words by weight, selecting the N term words with the highest weights, and extracting a plurality of words positioned before and after the N term words to be used as the keywords of the target text.
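For illustration only, the keyword extraction recited in claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `extract_keywords`, the per-token boolean flags `is_term`, the `weights` dictionary, and the `window` parameter are all hypothetical, and this section does not specify how the weights are computed or how many neighbouring words are taken. The sketch also assumes that the selected term words themselves are kept among the keywords.

```python
from typing import Dict, List


def extract_keywords(
    tokens: List[str],
    is_term: List[bool],
    weights: Dict[str, float],
    top_n: int,
    window: int,
) -> List[str]:
    """Pick the top-N weighted term words plus their neighbouring words.

    tokens  : the word segmentation text, one entry per word (assumed input)
    is_term : term labeling result per word, True if labeled as a term
    weights : hypothetical word -> weight mapping; its source is not
              specified here, so it is passed in as an assumption
    top_n   : the N of the claim
    window  : assumed number of words taken before and after each term word
    """
    if len(tokens) != len(is_term):
        raise ValueError("tokens and is_term must have the same length")

    # Step 1: collect the distinct term words, in order of first appearance.
    term_words = list(
        dict.fromkeys(tok for tok, flag in zip(tokens, is_term) if flag)
    )

    # Step 2: sort by weight, highest first, and keep the first N.
    # sorted() is stable, so equal weights keep first-appearance order.
    ranked = sorted(term_words, key=lambda w: weights.get(w, 0.0), reverse=True)
    top_terms = set(ranked[:top_n])

    # Step 3: for every labeled occurrence of a top term word, take the word
    # itself and up to `window` words on each side, without duplicates.
    keywords: List[str] = []
    seen = set()
    for i, tok in enumerate(tokens):
        if not is_term[i] or tok not in top_terms:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for word in tokens[start:end]:
            if word not in seen:
                seen.add(word)
                keywords.append(word)
    return keywords
```

Under these assumptions, a call such as `extract_keywords(tokens, is_term, weights, top_n=2, window=1)` returns the two highest-weighted term words together with the word immediately before and after each occurrence; the resulting list is what would then be fed to the RNN training step.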

2. The deep learning network-based term extraction method according to claim 1, wherein the training of the pre-established RNN deep learning network according to the keyword specifically comprises:

3. The deep learning network-based term extraction method according to claim 1 or 2, wherein before training the pre-established RNN deep learning network according to the keyword, the method further comprises:

4. The deep learning network-based term extraction method according to claim 1, wherein the term annotation of the target text specifically comprises:

5. The deep learning network-based term extraction method according to claim 1, wherein the training of the pre-established CNN deep learning network according to the term prediction result and the term label corresponding to the target text specifically comprises:

6. A term extraction device based on a deep learning network, characterized by comprising:

the second training module is used for training a pre-established CNN deep learning network according to term prediction results and term labels corresponding to the target text to obtain a term extraction model and obtain term extraction results corresponding to the target text output by the term extraction model;

the keyword extraction module comprises:

a keyword obtaining unit, configured to extract term words from the word segmentation text according to a term labeling result of each word in the target text; and to sort all the extracted term words by weight, select the N term words with the highest weights, and extract a plurality of words positioned before and after the N term words to be used as the keywords of the target text.

7. A deep learning network-based term extraction apparatus comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the deep learning network-based term extraction method according to any one of claims 1 to 5 when executing the computer program.

8. A computer-readable storage medium, comprising a stored computer program, wherein, when the computer program runs, a device in which the computer-readable storage medium is located is controlled to execute the deep learning network-based term extraction method according to any one of claims 1 to 5.