WO2022267453A1 - Training method for key information extraction model, extraction method, device and medium - Google Patents
Training method for key information extraction model, extraction method, device and medium
- Publication number
- WO2022267453A1 (PCT/CN2022/071360)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- key information
- word
- cross
- text
- words
- Prior art date
Links
- 238000012549 training Methods 0.000 title claims abstract description 121
- 238000000605 extraction Methods 0.000 title claims abstract description 102
- 238000000034 method Methods 0.000 title claims abstract description 70
- 230000006870 function Effects 0.000 claims description 81
- 239000013598 vector Substances 0.000 claims description 65
- 238000004590 computer program Methods 0.000 claims description 38
- 230000015654 memory Effects 0.000 claims description 14
- 238000002372 labelling Methods 0.000 claims 1
- 238000012545 processing Methods 0.000 description 8
- 230000008569 process Effects 0.000 description 7
- 238000010586 diagram Methods 0.000 description 5
- 238000013136 deep learning model Methods 0.000 description 4
- 230000009286 beneficial effect Effects 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- 230000002265 prevention Effects 0.000 description 3
- 238000004891 communication Methods 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000002457 bidirectional effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000010365 information processing Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000006403 short-term memory Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Definitions
- the present application relates to the technical field of machine-learning data processing, and specifically relates to a training method for a key information extraction model, a method for extracting key information, a terminal device, and a computer-readable storage medium.
- the key information of a text can reflect its main content, and accurately locating the key information makes it possible to accurately reflect the content of the text.
- at present, the key information of a text is mostly extracted through a deep learning model.
- the inventor realized that the key information extracted by current deep learning models is not accurate enough; therefore, how to improve the accuracy of deep learning models is a problem that needs to be solved.
- One of the objectives of the embodiments of the present application is to provide a key information extraction model training method, a key information extraction method, a terminal device, and a computer-readable storage medium.
- a training method for a key information extraction model including:
- the implementation word is a word that has a first preset relationship with the relational word and/or a word that has a second preset relationship with the relational word; the first information includes the position of the implementation word in the training text and the relationship between the implementation word and the relational word;
- the second aspect provides a method for extracting key information, which is applied to the trained key information extraction model obtained by the training method for the key information extraction model described in the first aspect above, including:
- the key information of the text to be processed is obtained.
- a device for extracting key information including:
- a text obtaining module, used for obtaining the text to be processed;
- a result output module, configured to obtain key information of the text to be processed based on the trained key information extraction model, wherein the trained key information extraction model is the model obtained by the training method for the key information extraction model described in the first aspect above.
- a terminal device including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the training method for the key information extraction model described in any one of the first aspect above.
- a terminal device including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the method for extracting key information described in any one of the second aspect above.
- a computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the training method for the key information extraction model described in any one of the first aspect above is implemented.
- a computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the method for extracting key information described in any one of the second aspect above is implemented.
- a computer program product is provided, which, when run on a terminal device, causes the terminal device to execute the training method for the key information extraction model described in any one of the first aspect above, and/or the method for extracting key information described in any one of the second aspect above.
- the beneficial effects of the training method for the key information extraction model are as follows: the training text is obtained; the first position, in the training text, of the relational word is determined; based on the first position, the first information of the implementation word in the training text is determined, wherein the implementation word is a word that has a first preset relationship with the relational word and/or a word that has a second preset relationship with the relational word,
- and the first information includes the position of the implementation word in the training text and the relationship between the implementation word and the relational word; a cross-entropy loss function is obtained based on the first position and the first information; and the parameters in the key information extraction model are updated based on the cross-entropy loss function to obtain a trained key information extraction model.
- the beneficial effects of the method for extracting key information are: acquiring text to be processed; and obtaining key information of the text to be processed based on the trained key information extraction model.
- FIG. 1 is a schematic diagram of an application scenario of a training method for a key information extraction model provided by an embodiment of the present application.
- FIG. 2 is a schematic flowchart of a training method for a key information extraction model provided by an embodiment of the present application.
- FIG. 3 is a schematic flowchart of a method for determining the first information of an implementation word provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of a relationship result tensor provided by an embodiment of the present application.
- FIG. 5 is a schematic flowchart of a method for determining a cross-entropy loss function provided by an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a method for extracting key information provided by an embodiment of the present application.
- FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
- FIG. 1 is a schematic diagram of an application scenario of a training method for a key information extraction model provided in an embodiment of the present application.
- the above training method for a key information extraction model can be used to train the model.
- the storage device 10 is used to store the text
- the electronic device 20 is used to obtain the text from the storage device 10, and based on the obtained text, train the key information extraction model to be trained to obtain the trained key information extraction model.
- FIG. 2 shows a schematic flowchart of the training method for the key information extraction model provided by the present application; with reference to FIG. 2, the method is described in detail as follows:
- when training samples are acquired for the key information extraction model to be trained, the training text can be obtained from a storage device or from a database, and the training text can be a sentence or a paragraph.
- the training text may also be an article title or the like.
- the training text may also be a policy text issued by a government or a regulatory agency.
- the key information extraction model to be trained can be a BERT model (Bidirectional Encoder Representation from Transformers).
- the relational word may be a word whose part of speech is a verb in the training text.
- after the training text is obtained, it is first checked whether a verb exists in the training text; if so, the position of the verb can be found, which is recorded as the first position in this application.
- the processing of the training text is actually processing of the text vector of the training text.
- the text vector consists of the vectors of the characters in the training text; therefore, after the training text is obtained, the vector of each character in the training text can be determined first, and the first position of the relational word can then be determined in the text vector.
- the first position of the relational word in the training text can be obtained from the Dense layer (fully connected layer) in the key information extraction model to be trained.
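As an illustration of how a Dense (fully connected) layer can expose the first position, the sketch below scores each character vector with two sigmoid units, one for the head and one for the tail of the relational word; the weights and toy vectors are hypothetical, since a real model learns them during training:

```python
import math

def dense_sigmoid(char_vectors, w_head, w_tail, b_head=0.0, b_tail=0.0):
    """For each character vector, output (head_prob, tail_prob), each in (0, 1)."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    scores = []
    for v in char_vectors:
        head = sigmoid(sum(x * w for x, w in zip(v, w_head)) + b_head)
        tail = sigmoid(sum(x * w for x, w in zip(v, w_tail)) + b_tail)
        scores.append((head, tail))
    return scores

# Toy 2-dimensional character vectors for a 4-character text.
chars = [[0.1, 0.2], [0.9, 0.8], [0.8, 0.9], [0.0, 0.1]]
probs = dense_sigmoid(chars, w_head=[2.0, 1.0], w_tail=[1.0, 2.0])
# The character with the highest head probability is taken as the head position.
first_position = max(range(len(probs)), key=lambda i: probs[i][0])
```

In a trained model the same per-character scoring would be applied on top of the BERT-encoded text vector rather than hand-set toy values.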
- the first information includes the second position of the implementing word in the training text and the relationship between the implementing word and the related word.
- the implementation word is a word that has a subject-predicate relationship with the relational word in the training text
- the implementation word can also be a word that has a verb-object relationship with the relational word in the training text.
- when the first position of the relational word has been determined and the relational word is a verb, it can be determined whether a subject of the verb exists in the training text; if so, the subject and its position can be extracted, recorded in this application as the second position, and the subject-predicate relationship between the subject and the verb is output.
- for example, if the training text is "I eat an apple", "eat" is the relational word and "I" is the implementation word, yielding "I - subject-predicate relationship - eat".
- when the first position of the relational word has been determined and the relational word is a verb, it can also be determined whether an object of the verb exists in the training text; if so, the object and its position can be extracted, recorded in this application as the second position, and the verb-object relationship between the object and the verb is output.
- for example, if the training text is "Strictly implement the responsibility for epidemic prevention and control", "implement" is the relational word and "the responsibility for epidemic prevention and control" is the implementation word, yielding "implement - verb-object relationship - responsibility for epidemic prevention and control".
- Object words related to verbs can be extracted even when no subject exists in the training text.
- two kinds of relationships can be extracted from the training text, one is the verb-object relationship, and the other is the subject-predicate relationship.
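A minimal sketch of extracting these two relation types, using a toy pre-annotated token list rather than the trained model (the token format and role labels are hypothetical, for illustration only):

```python
def extract_relations(tokens):
    """tokens: list of (word, role) pairs with role in {'subj', 'verb', 'obj', 'other'}.
    Returns triples for the subject-predicate and verb-object relations
    around the verb (the relational word)."""
    verb = next((w for w, r in tokens if r == "verb"), None)
    triples = []
    if verb is None:
        return triples
    for word, role in tokens:
        if role == "subj":
            triples.append((word, "subject-predicate", verb))
        elif role == "obj":
            triples.append((verb, "verb-object", word))
    return triples

# "I eat an apple": "eat" is the relational word, "I" the implementation word.
result = extract_relations([("I", "subj"), ("eat", "verb"), ("an apple", "obj")])
```

Note that a text with no subject still yields the verb-object triple, matching the behavior described above.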
- the LSTM layer (Long Short-Term Memory network) and the Dense layer in the key information extraction model to be trained can be used to obtain the first information.
- two cross-entropy functions can be obtained through the first position and the first information, and a cross-entropy loss function can be obtained based on the two cross-entropy functions obtained above.
- this application uses the first position of the relational word, the second position of the implementation word, and the relationship between the implementation word and the relational word to determine the cross-entropy loss function, uses the cross-entropy loss function to update the parameters in the key information extraction model to be trained, and obtains the trained key information extraction model.
- using multiple parameters to determine the cross-entropy loss function can make the trained key information extraction model more accurate, and thus make the key information it extracts more accurate.
- step S102 may include:
- the first position includes the head position and the tail position, wherein the relational word is a word whose part of speech is a verb.
- the text vector can be BIO-marked (B-begin, I-inside, O-outside): the vector at the head position of the relational word is marked with B, the vector at the tail position of the relational word is marked with I, and the vectors of non-relational words in the training text are marked with O.
- determining the first position of the relational word may be determining a vector of the head position and a vector of the tail position of the relational word.
- when the relational word includes at least three characters, the vector of each character in the relational word can also be determined.
- alternatively, the first position may include the vector of each character in the relational word.
- determining only the vector of the head position and the vector of the tail position of the relational word can reduce the data to be processed when the first information is subsequently determined, and speed up the training of the model.
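The BIO marking described above can be sketched as follows for a character-level text; the tagging function is a hand-written illustration of the marking scheme, not the model's learned output:

```python
def bio_tags(text, rel_start, rel_end):
    """Mark the relational word spanning [rel_start, rel_end] (inclusive):
    B at its head position, I inside it (including the tail), O elsewhere."""
    tags = []
    for i in range(len(text)):
        if i == rel_start:
            tags.append("B")
        elif rel_start < i <= rel_end:
            tags.append("I")
        else:
            tags.append("O")
    return tags

# "I eat an apple" with the relational word "eat" occupying characters 2-4.
tags = bio_tags("I eat an apple", 2, 4)
```

The B tag therefore locates the head position and the last I tag the tail position of the relational word.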
- step S103 may include:
- the average of the vectors of the characters in the relational word may be calculated, and this average used as the average vector.
- the average vector is used to represent the vector of the relational word.
- alternatively, the vector of the head position and the vector of the tail position may be used directly to determine the first information.
- the average vector may be added to the vector of each character in the text vector, after which the first information is determined.
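The averaging and broadcasting steps above can be sketched with toy vectors (element-wise mean of the head-position and tail-position vectors, then added to every character vector):

```python
def average_vector(head_vec, tail_vec):
    """Element-wise mean of the head-position and tail-position vectors."""
    return [(h + t) / 2.0 for h, t in zip(head_vec, tail_vec)]

def add_to_text_vector(text_vectors, avg):
    """Add the average vector to the vector of each character in the text."""
    return [[x + a for x, a in zip(vec, avg)] for vec in text_vectors]

avg = average_vector([1.0, 3.0], [3.0, 5.0])  # -> [2.0, 4.0]
shifted = add_to_text_vector([[0.0, 0.0], [1.0, 1.0]], avg)
```

The shifted text vector then carries the relational-word information into the layers that predict the implementation word.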
- the above method may further include:
- the relationship result tensor of the implementation words is obtained.
- the relationship result tensor of the implementation word can be determined according to the first information, and the relationship result tensor is used to represent the position of the head position of the implementation word in the training text, the implementation The position of the tail position of the word in the training text, the relationship between the implementation word and the relative word.
- the first two rows are the positions of the implementation words that have a subject-predicate relationship with the relationship words in the training text
- the position of the 1 in the first row indicates the position of the head of the implementation word in the training text.
- the position of the 1 in the second row indicates the position of the tail of the implementation word in the training text.
- the last two rows are the positions of the implementation words that have a verb-object relationship with the relation words in the training text.
- if no position is marked as 1 in the two rows of the verb-object relationship, it means that no implementation word in the training text has a verb-object relationship with the relational word.
- the information of the implementation word is represented by the relationship result tensor, which can make information processing more convenient.
- the position of the implementation word and its relationship with the relational word can be represented compactly, which reduces the amount of data to be processed and improves the speed of data processing.
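A sketch of the relationship result tensor as a 4 x n binary grid, following the row layout described for FIG. 4 (rows 0-1: head/tail of the subject-predicate implementation word; rows 2-3: head/tail of the verb-object implementation word); the exact layout is an illustrative assumption:

```python
def relation_result_tensor(text_len, subj_span=None, obj_span=None):
    """Build a 4 x text_len grid of 0/1 marks.
    Rows 0-1 mark the head/tail of the subject-predicate implementation word;
    rows 2-3 mark the head/tail of the verb-object implementation word.
    All-zero rows mean the relation is absent from the training text."""
    tensor = [[0] * text_len for _ in range(4)]
    if subj_span is not None:
        tensor[0][subj_span[0]] = 1  # head position
        tensor[1][subj_span[1]] = 1  # tail position
    if obj_span is not None:
        tensor[2][obj_span[0]] = 1
        tensor[3][obj_span[1]] = 1
    return tensor

# Subject implementation word at character 0; no verb-object implementation word.
t = relation_result_tensor(5, subj_span=(0, 0), obj_span=None)
```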
- step S104 may include:
- before the training text is input into the key information extraction model to be trained, the training text can be marked: the head position and the tail position of the relational word are marked, with the marked head position of the relational word recorded in this application as the first target position and the marked tail position as the second target position.
- the marked position of the implementation word that has the first preset relationship with the relational word is recorded as the third target position, and/or the position of the implementation word that has the second preset relationship with the relational word is marked in the training text and recorded in this application as the fourth target position.
- since the first target position of the relational word is marked, the real vector of the head position of the relational word in the training text can be determined.
- the first cross-entropy function can be obtained from the real vector of the head position of the relational word and the vector of the head position of the relational word output by the key information extraction model.
- since the second target position of the relational word is marked, the real vector of the tail position of the relational word in the training text can be determined; therefore, the second cross-entropy function can be obtained from the real vector of the tail position of the relational word and the vector of the tail position of the relational word output by the key information extraction model.
- when the third target position and/or the fourth target position of the implementation word are marked and the third target position exists in the training text, a real relationship tensor can be obtained from the third target position and the relationship between the implementation word at that position and the relational word.
- when the fourth target position exists in the training text, the real relationship tensor can likewise be obtained from the fourth target position and the relationship between the implementation word at that position and the relational word.
- the third cross-entropy function is obtained according to the real relationship tensor and the relationship result tensor output by the key information extraction model.
- the sum of the first cross-entropy function, the second cross-entropy function and the third cross-entropy function may be used as the cross-entropy loss function.
- the cross-entropy loss function can also be obtained from the first cross-entropy function and its weight, the second cross-entropy function and its weight, and the third cross-entropy function and its weight.
- the cross-entropy loss function thus obtained provides a multi-source basis for updating the parameters in the key information extraction model, which can make the updated parameters more accurate and, in turn, make the key information extracted by the model more accurate.
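The combination of the three cross-entropy terms can be sketched as below; the binary cross-entropy form and the example weights are assumptions, since the document does not fix either:

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

def combined_loss(head, tail, relation, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the head-position, tail-position, and relation-tensor losses.
    Each argument is a (y_true, y_pred) pair."""
    losses = [binary_cross_entropy(*pair) for pair in (head, tail, relation)]
    return sum(w * l for w, l in zip(weights, losses))

loss = combined_loss(
    head=([1, 0, 0], [0.9, 0.1, 0.2]),      # real vs. predicted head positions
    tail=([0, 1, 0], [0.2, 0.8, 0.1]),      # real vs. predicted tail positions
    relation=([1, 0], [0.7, 0.3]),          # flattened relation tensors
)
```

Equal weights reproduce the plain sum mentioned first; tuning the weights corresponds to the weighted variant.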
- FIG. 6 shows the key information extraction method provided by the embodiment of the present application.
- the extraction method includes:
- the trained key information extraction model is a model trained based on the above key information extraction model training method.
- key information may include keywords and relationships between keywords.
- step S202 may include:
- the position of the relational word in the text to be processed may include a vector of the head position of the relational word and a vector of the tail position of the relational word.
- the key information includes the relational word, the implementation word, and the relationship between the implementation word and the relational word.
- the average vector of the vector of the head position and the vector of the tail position of the relational word is calculated, and the implementation word and its relationship with the relational word are determined based on the average vector.
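At inference time, a simple way to turn the model's per-character head/tail probabilities back into a word span is thresholding, as in this hypothetical decoding sketch (the threshold value is an assumption, not specified by the document):

```python
def decode_span(head_probs, tail_probs, threshold=0.5):
    """Pick the first position whose head probability exceeds the threshold,
    then the first tail position at or after it; return (head, tail) or None."""
    for i, hp in enumerate(head_probs):
        if hp > threshold:
            for j in range(i, len(tail_probs)):
                if tail_probs[j] > threshold:
                    return (i, j)
    return None

span = decode_span([0.1, 0.9, 0.2, 0.1], [0.0, 0.1, 0.8, 0.2])
# span covers character positions 1..2 of the text
```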
- the embodiment of the present application provides a device for extracting key information, including:
- a text obtaining module, used for obtaining the text to be processed;
- a result output module, configured to obtain key information of the text to be processed based on the trained key information extraction model, wherein the trained key information extraction model is a model obtained by the above training method for the key information extraction model.
- the result output module is also used for:
- the implementation words in the text to be processed and the relationships between the implementation words and the relational words are obtained; the key information includes the relational words, the implementation words, and the relationships between the implementation words and the relational words.
- the relational word is a word whose part of speech is a verb.
- the relationship between the implementation word and the relational word includes a subject-predicate relationship and/or a verb-object relationship.
- the embodiment of the present application also provides a terminal device.
- the terminal device 400 may include: at least one processor 410, a memory 420, and a computer program stored in the memory 420 and runnable on the processor 410; when the processor 410 executes the computer program, it implements the steps in any of the above method embodiments, such as steps S101 to S105 in the embodiment shown in FIG. 2 or steps S201 to S202 in the embodiment shown in FIG. 6.
- when the processor 410 executes the computer program, the functions of the modules/units in the foregoing device embodiments are implemented.
- the computer program can be divided into one or more modules/units, and one or more modules/units are stored in the memory 420 and executed by the processor 410 to complete the present application.
- the one or more modules/units may be a series of computer program segments capable of accomplishing specific functions, and the program segments are used to describe the execution process of the computer program in the terminal device 400 .
- FIG. 7 is only an example of a terminal device and does not constitute a limitation on the terminal device; it may include more or fewer components than shown in the figure, combine certain components, or have different components, such as input and output devices, network access devices, buses, etc.
- the processor 410 can be a central processing unit (Central Processing Unit, CPU), and can also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
- a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
- the memory 420 may be an internal storage unit of the terminal device, or an external storage device of the terminal device, such as a plug-in hard disk, a smart memory card, a secure digital (SD) card, a flash card, and the like.
- the memory 420 is used to store the computer program and other programs and data required by the terminal device.
- the memory 420 can also be used to temporarily store data that has been output or will be output.
- the bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
- the bus can be divided into address bus, data bus, control bus and so on.
- the buses in the drawings of the present application are not limited to only one bus or one type of bus.
- the training method of the key information extraction model and the extraction method of key information provided in the embodiments of the present application can be applied to terminal devices such as computers, tablet computers, notebook computers, netbooks, and personal digital assistants (personal digital assistants, PDAs).
- the embodiment of the present application also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps in the various embodiments of the above training method for the key information extraction model or the above method for extracting key information are implemented.
- An embodiment of the present application provides a computer program product.
- when the computer program product is run on a mobile terminal, the mobile terminal implements the steps in the various embodiments of the above training method for the key information extraction model or the above method for extracting key information.
- if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the computer-readable storage medium may be non-volatile or volatile. Based on this understanding, all or part of the procedures in the methods of the above embodiments can be implemented by a computer program instructing related hardware, and the computer program can be stored in a computer-readable storage medium.
- when the computer program is executed by a processor, the steps in the above method embodiments can be implemented.
- the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form.
- the computer-readable medium may at least include: any entity or device capable of carrying computer program codes to a photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), electrical carrier signals, telecommunication signals, and software distribution media.
- the software distribution media include, for example, a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk.
- in some jurisdictions, under legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
- the disclosed device/network device and method may be implemented in other ways.
- the device/network device embodiments described above are only illustrative.
- the division of the modules or units is only a logical function division.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
A training method for a key information extraction model, an extraction method, a device, and a medium. The method includes: obtaining a training text (S101); determining a first position, in the training text, of a relational word in the training text (S102); determining, based on the first position, first information of an implementation word in the training text (S103); obtaining a cross-entropy loss function based on the first position and the first information (S104); and updating parameters in the key information extraction model based on the cross-entropy loss function to obtain a trained key information extraction model (S105). By using multiple parameters to determine the cross-entropy loss function, the method makes the trained key information extraction model more accurate and makes the key information extracted by the model more accurate.
Description
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 24, 2021, with application number 202110704690.9 and invention title "Training method for key information extraction model, extraction method, device and medium", the entire contents of which are incorporated herein by reference.
The present application relates to the technical field of machine-learning data processing, and specifically relates to a training method for a key information extraction model, a method for extracting key information, a terminal device, and a computer-readable storage medium.
The key information of a text can reflect its main content, and accurately locating the key information makes it possible to accurately reflect the content of the text.
At present, the key information of a text is mostly extracted through a deep learning model. However, the inventor realized that the key information extracted by current deep learning models is not accurate enough; therefore, how to improve the accuracy of deep learning models is a problem that needs to be solved.
One of the objectives of the embodiments of the present application is to provide a training method for a key information extraction model, a method for extracting key information, a terminal device, and a computer-readable storage medium.
The technical solutions adopted in the embodiments of the present application are as follows:
In a first aspect, a training method for a key information extraction model is provided, including:
obtaining a training text;
determining a first position, in the training text, of a relational word in the training text;
determining, based on the first position, first information of an implementation word in the training text, wherein the implementation word is a word that has a first preset relationship with the relational word and/or a word that has a second preset relationship with the relational word, and the first information includes the position of the implementation word in the training text and the relationship between the implementation word and the relational word;
obtaining a cross-entropy loss function based on the first position and the first information;
updating parameters in the key information extraction model based on the cross-entropy loss function to obtain a trained key information extraction model.
In a second aspect, a method for extracting key information is provided, applied to the trained key information extraction model obtained by the training method for the key information extraction model described in the first aspect above, including:
obtaining a text to be processed;
obtaining key information of the text to be processed based on the trained key information extraction model.
In a third aspect, a device for extracting key information is provided, including:
a text obtaining module, used for obtaining the text to be processed;
a result output module, used for obtaining key information of the text to be processed based on the trained key information extraction model, wherein the trained key information extraction model is the model obtained by the training method for the key information extraction model described in the first aspect above.
In a fourth aspect, a terminal device is provided, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the training method for the key information extraction model described in any one of the first aspect above.
In a fifth aspect, a terminal device is provided, including: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the method for extracting key information described in any one of the second aspect above.
In a sixth aspect, a computer-readable storage medium is provided, which stores a computer program, wherein, when the computer program is executed by a processor, the training method for the key information extraction model described in any one of the first aspect above is implemented.
In a seventh aspect, a computer-readable storage medium is provided, which stores a computer program, wherein, when the computer program is executed by a processor, the method for extracting key information described in any one of the second aspect above is implemented.
In an eighth aspect, a computer program product is provided, which, when run on a terminal device, causes the terminal device to execute the training method for the key information extraction model described in any one of the first aspect above and/or the method for extracting key information described in any one of the second aspect above.
The beneficial effects of the training method for the key information extraction model provided by the embodiments of the present application are as follows: the training text is obtained; the first position, in the training text, of the relational word is determined; based on the first position, the first information of the implementation word in the training text is determined, wherein the implementation word is a word that has a first preset relationship with the relational word and/or a word that has a second preset relationship with the relational word, and the first information includes the position of the implementation word in the training text and the relationship between the implementation word and the relational word; a cross-entropy loss function is obtained based on the first position and the first information; and the parameters in the key information extraction model are updated based on the cross-entropy loss function to obtain the trained key information extraction model.
The beneficial effects of the method for extracting key information provided by the embodiments of the present application are as follows: the text to be processed is obtained, and the key information of the text to be processed is obtained based on the trained key information extraction model.
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or exemplary technologies are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the training method for a key information extraction model provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of the training method for a key information extraction model provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of the method for determining the first information of an implementation word provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a relationship result tensor provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of the method for determining a cross-entropy loss function provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of the method for extracting key information provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
To make the objectives, technical solutions, and advantages of this application clearer, this application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain this application, not to limit it.
The terms "first" and "second" are used only for ease of description and shall not be understood as indicating or implying relative importance or implicitly specifying the number of technical features. "Multiple" means two or more, unless explicitly and specifically defined otherwise.
To illustrate the technical solutions provided by this application, a detailed description follows with reference to the specific drawings and embodiments.
FIG. 1 is a schematic diagram of an application scenario of the training method for a key information extraction model provided by an embodiment of this application; the training method can be used to train a model. The storage device 10 is used to store texts; the electronic device 20 is used to acquire texts from the storage device 10 and to train the key information extraction model to be trained on the acquired texts, obtaining a trained key information extraction model.
The training method for a key information extraction model of the embodiments of this application is described in detail below with reference to FIG. 1.
FIG. 2 shows a schematic flowchart of the training method for a key information extraction model provided by this application. Referring to FIG. 2, the method is detailed as follows:
S101: acquire a training text.
In this embodiment, when acquiring training samples, the key information extraction model to be trained may obtain them from a storage device or from a database. A training text may be a sentence or a paragraph; it may also be an article title, or a policy text issued by a government or regulatory body.
In this embodiment, the key information extraction model to be trained may be a BERT (Bidirectional Encoder Representations from Transformers) model.
S102: determine a first position of a relation word of the training text in the training text.
In this embodiment, a relation word may be a word in the training text whose part of speech is a verb.
Specifically, after the training text is acquired, it is first checked whether the training text contains a verb; if it does, the position of the verb can be located, denoted herein as the first position.
Specifically, processing the training text actually means processing the text vector of the training text. The text vector consists of the vectors of the individual characters of the training text. Therefore, after the training text is acquired, the vector of each character of the training text may first be determined, and the first position of the relation word is then determined within the text vector.
Optionally, the first position of the relation word in the training text may be obtained from a Dense (fully connected) layer of the key information extraction model to be trained.
S103: determine, based on the first position, first information of an implementation word in the training text, wherein the implementation word is a word having a first preset relationship with the relation word and/or a word having a second preset relationship with the relation word, and the first information comprises a second position of the implementation word in the training text and the relationship between the implementation word and the relation word.
In this embodiment, when the relation word is a verb, the implementation word is a word in the training text that stands in a subject-predicate relationship with the relation word; the implementation word may also be a word that stands in a verb-object relationship with the relation word.
Specifically, once the first position has been determined and the relation word is a verb, it can be determined whether the training text contains a subject of the verb; if so, the subject and its position, denoted herein as the second position, can be extracted, and the subject-predicate relationship between the subject and the verb is output. For example, if the training text is "我吃苹果" ("I eat apples"), where "吃" (eat) is the relation word and "我" (I) is the implementation word, the output "我-主谓关系-吃" (I - subject-predicate - eat) can be obtained.
Specifically, once the first position has been determined and the relation word is a verb, it can likewise be determined whether the training text contains an object of the verb; if so, the object and its position, also denoted as the second position, can be extracted, and the verb-object relationship between the object and the verb is output. For example, if the training text is "从严落实疫情防控责任" ("strictly implement epidemic prevention and control responsibilities"), where "落实" (implement) is the relation word and "疫情防控责任" (epidemic prevention and control responsibilities) is the implementation word, the output "落实-动宾关系-疫情防控责任" (implement - verb-object - epidemic prevention and control responsibilities) can be obtained. Even when the training text contains no subject, the object related to the verb can still be extracted.
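The two worked examples above can be sketched as a small formatting routine. This is an illustrative assumption, not the patent's implementation; the function name and the English relation labels are invented for the example, while the Chinese words are the sample texts from the examples:

```python
def format_triples(relation_word, pairs):
    """Render (word, relation-type) pairs as the 'word-relation-word'
    strings shown in the examples above (illustrative only)."""
    out = []
    for word, rel_type in pairs:
        if rel_type == "subject-predicate":
            # subject comes first: subject - relation - verb
            out.append(f"{word}-{rel_type}-{relation_word}")
        else:
            # verb-object: verb - relation - object
            out.append(f"{relation_word}-{rel_type}-{word}")
    return out

print(format_triples("吃", [("我", "subject-predicate")]))
print(format_triples("落实", [("疫情防控责任", "verb-object")]))
```

With the first call the subject precedes the verb, mirroring "我-主谓关系-吃"; with the second the verb precedes its object, mirroring "落实-动宾关系-疫情防控责任".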
In the embodiments of this application, two kinds of relationships can be extracted from the training text: the verb-object relationship and the subject-predicate relationship.
Optionally, the first information may be obtained using an LSTM (Long Short-Term Memory) layer and a Dense layer of the key information extraction model to be trained.
S104: obtain a cross-entropy loss function based on the first position and the first information.
In this embodiment, cross-entropy functions can be obtained from the first position and the first information, and the cross-entropy loss function is obtained based on these cross-entropy functions.
S105: update parameters of the key information extraction model based on the cross-entropy loss function to obtain a trained key information extraction model.
In this embodiment, after the cross-entropy loss function is obtained, it can be used to update the parameters of the key information extraction model, and training of the key information extraction model continues until the cross-entropy loss function meets a preset requirement, at which point training stops and the trained key information extraction model is obtained.
In the embodiments of this application, a training text is first acquired, and the first position of a relation word of the training text in the training text is determined; based on the first position, first information of an implementation word in the training text is determined, wherein the implementation word is a word having a first preset relationship and/or a second preset relationship with the relation word, and the first information comprises the position of the implementation word in the training text and the relationship between the implementation word and the relation word; a cross-entropy loss function is obtained based on the first position and the first information; and the parameters of the key information extraction model are updated based on the cross-entropy loss function to obtain a trained key information extraction model. Because the cross-entropy loss function is determined from multiple parameters, namely the first position of the relation word, the second position of the implementation word, and the relationship between the implementation word and the relation word, the trained key information extraction model is more accurate, and so is the key information it extracts.
In a possible implementation, step S102 may be implemented as follows:
S1021: determine a head position and a tail position of the relation word in the training text, the first position comprising the head position and the tail position, wherein the relation word is a word whose part of speech is a verb.
Optionally, after the text vector of the training text is obtained, BIO tagging (B-begin, I-inside, O-outside) may be applied to the text vector: the vector at the head position of the relation word is tagged B, the vector at the tail position of the relation word is tagged I, and the vectors of non-relation words in the training text are tagged O.
Specifically, determining the first position of the relation word may consist of determining the vector of its head position and the vector of its tail position. In addition, if the relation word contains at least three characters, the vector of each character of the relation word may also be determined, in which case the first position may comprise the vector of each character of the relation word.
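The tagging scheme above can be sketched as follows. This is a minimal illustration assuming character-level positions; the patent applies the tags to character vectors rather than to tag strings, and the helper name is invented:

```python
def bio_tags(text_len, head, tail):
    """Tag the relation word's span: B at its head position, I from the
    position after the head through the tail, O everywhere else."""
    tags = []
    for i in range(text_len):
        if i == head:
            tags.append("B")
        elif head < i <= tail:
            tags.append("I")
        else:
            tags.append("O")
    return tags

# "从严落实疫情防控责任": the relation word "落实" occupies characters 2-3
print(bio_tags(10, 2, 3))
```

The B and I positions together recover the head and tail of the relation word, which is exactly the information step S1021 needs.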
In the embodiments of this application, determining only the vectors of the head and tail positions of the relation word reduces the amount of data to be processed when the first information is subsequently determined, speeding up model training.
As shown in FIG. 3, in a possible implementation, step S103 may be implemented as follows:
S1031: compute the average vector of the vector of the head position and the vector of the tail position.
In this embodiment, if the first position comprises the vector of each character of the relation word, the average of the vectors of all characters may be computed and taken as the average vector. The average vector is used to characterize the vector of the relation word.
In this embodiment, after the vectors of the head and tail positions are obtained, they may be passed back in order to determine the first information.
S1032: determine the first information of the implementation word based on the average vector.
In this embodiment, after the average vector is obtained, it may be added to the vector of each character in the text vector, and the first information is then determined.
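Steps S1031 and S1032 can be sketched with plain Python lists. This is a minimal sketch under the assumption of two-dimensional toy vectors; the function name is invented, and real character vectors would come from the model's encoder:

```python
def fuse_average_vector(char_vecs, head_vec, tail_vec):
    """Average the head-position and tail-position vectors of the relation
    word (S1031) and add the result to every character vector of the
    text (S1032)."""
    avg = [(h + t) / 2 for h, t in zip(head_vec, tail_vec)]
    return [[c + a for c, a in zip(vec, avg)] for vec in char_vecs]

chars = [[1.0, 0.0], [0.0, 1.0]]
fused = fuse_average_vector(chars, head_vec=[2.0, 0.0], tail_vec=[0.0, 2.0])
print(fused)  # every character vector is shifted by the average [1.0, 1.0]
```

After fusion, each character's representation carries information about the relation word, which is what allows the implementation word and its relationship to be predicted in the next step.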
In a possible implementation, after step S1032, the above method may further comprise:
obtaining a relation result tensor of the implementation word based on the first information of the implementation word.
In this embodiment, after the first information of the implementation word is obtained, the relation result tensor of the implementation word may be determined from the first information. The relation result tensor characterizes the position of the head of the implementation word in the training text, the position of the tail of the implementation word in the training text, and the relationship between the implementation word and the relation word.
As an example, in the relation result tensor shown in FIG. 4, the first two rows give the positions of implementation words that stand in a subject-predicate relationship with the relation word: the positions of the 1s in the first row indicate where the head of the implementation word lies in the training text, and the positions of the 1s in the second row indicate where its tail lies. The last two rows give the positions of implementation words that stand in a verb-object relationship with the relation word. In FIG. 4, no position in the two verb-object rows is marked 1, which indicates that the training text contains no implementation word in a verb-object relationship with the relation word.
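The layout of FIG. 4 can be reproduced with nested lists. This is a sketch assuming at most one implementation word per relation type; the function name and span convention (head index, tail index) are invented for the illustration:

```python
def relation_result_tensor(seq_len, subj_span=None, obj_span=None):
    """Build the 4 x seq_len relation result tensor of FIG. 4:
    rows 0/1 mark the head/tail of a subject-predicate implementation word,
    rows 2/3 mark the head/tail of a verb-object implementation word."""
    t = [[0] * seq_len for _ in range(4)]
    if subj_span is not None:
        t[0][subj_span[0]] = 1   # subject head
        t[1][subj_span[1]] = 1   # subject tail
    if obj_span is not None:
        t[2][obj_span[0]] = 1    # object head
        t[3][obj_span[1]] = 1    # object tail
    return t

# A subject at character 0 and no verb-object word: the last two rows
# stay all-zero, matching the situation described for FIG. 4.
print(relation_result_tensor(4, subj_span=(0, 0)))
```

Reading the tensor back is the reverse operation: the column index of each 1 gives a head or tail position, and the row it sits in gives the relation type.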
In the embodiments of this application, representing the information of the implementation word with a relation result tensor makes information processing more convenient: both the position of the implementation word and its relationship with the relation word can be represented in the tensor, which reduces the amount of data to be processed and increases processing speed.
As shown in FIG. 5, in a possible implementation, step S104 may be implemented as follows:
S1041: obtain a first cross-entropy function based on the vector of the head position.
In this embodiment, before the training text is input into the key information extraction model to be trained, it may be annotated: the head and tail positions of the relation word are marked, the marked head position being denoted herein as the first target position and the marked tail position as the second target position. In addition, the positions of words having the first preset relationship with the relation word may be marked, denoted herein as third target positions, and/or the positions of words having the second preset relationship with the relation word may be marked, denoted herein as fourth target positions.
In this embodiment, since the first target position of the relation word has been marked, the true vector of the head position of the relation word in the training text can be determined. The first cross-entropy function can then be obtained from the true vector of the head position of the relation word and the vector of the head position of the relation word output by the key information extraction model.
S1042: obtain a second cross-entropy function based on the vector of the tail position.
In this embodiment, since the second target position of the relation word has been marked, the true vector of the tail position of the relation word in the training text can be determined. The second cross-entropy function can then be obtained from the true vector of the tail position of the relation word and the vector of the tail position of the relation word output by the key information extraction model.
S1043: obtain a third cross-entropy function based on the relation result tensor.
In this embodiment, since the third and/or fourth target positions of the implementation words have been marked, a true relation tensor can be constructed. If the training text contains a third target position, the true relation tensor can be obtained from the third target position and the relationship between the implementation word at that position and the relation word; if it contains a fourth target position, from the fourth target position and the corresponding relationship; and if it contains both, from both together.
In this embodiment, the third cross-entropy function is obtained from the true relation tensor and the relation result tensor output by the key information extraction model.
S1044: obtain the cross-entropy loss function based on the first cross-entropy function, the second cross-entropy function, and the third cross-entropy function.
In this embodiment, the sum of the first, second, and third cross-entropy functions may be taken as the cross-entropy loss function.
Optionally, the cross-entropy loss function may also be obtained from the first cross-entropy function and its weight, the second cross-entropy function and its weight, and the third cross-entropy function and its weight.
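Step S1044 and its weighted variant can be sketched as follows. This is a minimal sketch assuming per-position probability distributions; the function names and the epsilon floor are assumptions, and a real implementation would use a framework's built-in loss:

```python
import math

def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted one;
    eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_dist, pred_dist))

def combined_loss(l_head, l_tail, l_tensor, weights=(1.0, 1.0, 1.0)):
    """Combine the three cross-entropy terms. With unit weights this is
    the plain sum of S1044; other weights give the weighted variant."""
    w1, w2, w3 = weights
    return w1 * l_head + w2 * l_tail + w3 * l_tensor

l1 = cross_entropy([1.0, 0.0], [0.9, 0.1])  # head-position term
l2 = cross_entropy([0.0, 1.0], [0.2, 0.8])  # tail-position term
l3 = cross_entropy([1.0, 0.0], [0.7, 0.3])  # relation-tensor term
print(combined_loss(l1, l2, l3))
```

The weights let training emphasize, say, relation prediction over span prediction, without changing the overall structure of the loss.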
In the embodiments of this application, obtaining the cross-entropy loss function from the first, second, and third cross-entropy functions provides multiple sources of evidence for updating the parameters of the key information extraction model, which makes the updated parameters more accurate and, in turn, the key information extracted by the model more accurate.
Corresponding to the training method of the above embodiments, FIG. 6 shows the key information extraction method provided by an embodiment of this application. Referring to FIG. 6, the extraction method comprises:
S201: acquire a text to be processed.
S202: obtain key information of the text to be processed based on the trained key information extraction model.
In this embodiment, the trained key information extraction model is a model trained by the training method for a key information extraction model described above.
In this embodiment, the key information may comprise key words and the relationships between them.
Specifically, step S202 may be implemented as follows:
S2021: extract a relation word of the text to be processed and the position of the relation word in the text to be processed.
In this embodiment, the position of the relation word in the text to be processed may comprise the vector of the head position and the vector of the tail position of the relation word.
S2022: obtain, based on the position of the relation word in the text to be processed, the implementation word of the text to be processed and the relationship between the implementation word and the relation word, the key information comprising the relation word, the implementation word, and the relationship between the implementation word and the relation word.
In this embodiment, the average vector of the vectors of the head and tail positions of the relation word is computed, and the implementation word and its relationship with the relation word are determined based on the average vector.
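At inference time, the model's per-character head and tail scores must be turned back into word spans. The patent does not fix a decoding rule, so the following is an assumed sketch: a probability threshold and a nearest-tail pairing rule, both of which are illustrative choices:

```python
def decode_spans(head_probs, tail_probs, threshold=0.5):
    """Decode word spans from per-character head/tail probabilities by
    pairing each head with the nearest tail at or after it."""
    heads = [i for i, p in enumerate(head_probs) if p >= threshold]
    tails = [i for i, p in enumerate(tail_probs) if p >= threshold]
    spans = []
    for h in heads:
        t = next((t for t in tails if t >= h), None)
        if t is not None:
            spans.append((h, t))
    return spans

# a relation word predicted at characters 2-3
print(decode_spans([0.1, 0.2, 0.9, 0.1], [0.1, 0.1, 0.2, 0.8]))
```

The same decoding can be applied to each row pair of the relation result tensor to recover the spans of implementation words.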
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and shall not limit the implementation of the embodiments of this application in any way.
Corresponding to the training method of the above embodiments, an embodiment of this application provides a key information extraction apparatus, comprising:
a text acquisition module, configured to acquire a text to be processed; and
a result output module, configured to obtain key information of the text to be processed based on the trained key information extraction model, wherein the trained key information extraction model is a model obtained by the training method for a key information extraction model described above.
In a possible implementation, the result output module is further configured to:
extract a relation word of the text to be processed and the position of the relation word in the text to be processed; and
obtain, based on the position of the relation word in the text to be processed, the implementation word of the text to be processed and the relationship between the implementation word and the relation word, the key information comprising the relation word, the implementation word, and the relationship between the implementation word and the relation word.
In a possible implementation, the relation word is a word whose part of speech is a verb.
In a possible implementation, the relationship between the implementation word and the relation word comprises a subject-predicate relationship and/or a verb-object relationship.
An embodiment of this application further provides a terminal device. Referring to FIG. 7, the terminal device 400 may comprise at least one processor 410, a memory 420, and a computer program stored in the memory 420 and executable on the at least one processor 410. When executing the computer program, the processor 410 implements the steps of any of the method embodiments above, for example steps S101 to S105 of the embodiment shown in FIG. 2 or steps S201 to S202 of the embodiment shown in FIG. 6. Alternatively, when executing the computer program, the processor 410 implements the functions of the modules/units of the apparatus embodiments above.
Exemplarily, the computer program may be divided into one or more modules/units, which are stored in the memory 420 and executed by the processor 410 to carry out this application. The one or more modules/units may be a series of computer program segments capable of performing specific functions, the segments describing the execution of the computer program in the terminal device 400.
Those skilled in the art will understand that FIG. 7 is merely an example of a terminal device and does not limit it; the terminal device may comprise more or fewer components than shown, combine certain components, or use different components, for example input/output devices, network access devices, or buses.
The processor 410 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 420 may be an internal storage unit of the terminal device, or an external storage device of the terminal device, such as a plug-in hard disk, a smart media card, a secure digital (SD) card, or a flash card. The memory 420 is used to store the computer program and other programs and data required by the terminal device, and may also be used to temporarily store data that has been or is about to be output.
The bus may be an Industry Standard Architecture (ISA) bus, a peripheral component interconnect bus, or an Extended Industry Standard Architecture (EISA) bus, among others. Buses may be divided into address buses, data buses, control buses, etc. For ease of representation, the buses in the drawings of this application are not limited to a single bus or a single type of bus.
The training method and the extraction method provided by the embodiments of this application may be applied to terminal devices such as computers, tablet computers, laptop computers, netbooks, and personal digital assistants (PDAs); the embodiments of this application place no restriction on the specific type of terminal device.
An embodiment of this application further provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the embodiments of the training method for a key information extraction model or of the key information extraction method above.
An embodiment of this application provides a computer program product which, when run on a mobile terminal, causes the mobile terminal to implement the steps of the embodiments of the training method for a key information extraction model or of the key information extraction method above.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium; the computer-readable storage medium may be non-volatile or volatile. Based on this understanding, all or part of the processes of the method embodiments above may be carried out by a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, etc. The computer-readable medium may comprise at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, according to legislation and patent practice, computer-readable media may not be electrical carrier signals or telecommunication signals.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed or recorded in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations shall not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the apparatus/network device embodiments described above are merely illustrative; the division into modules or units is only a logical functional division, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all be included within the scope of protection of this application.
Claims (20)
- A training method for a key information extraction model, comprising: acquiring a training text; determining a first position of a relation word of the training text in the training text; determining, based on the first position, first information of an implementation word in the training text, wherein the implementation word is a word having a first preset relationship with the relation word and/or a word having a second preset relationship with the relation word, and the first information comprises a second position of the implementation word in the training text and the relationship between the implementation word and the relation word; obtaining a cross-entropy loss function based on the first position and the first information; and updating parameters of the key information extraction model based on the cross-entropy loss function to obtain a trained key information extraction model.
- The training method for a key information extraction model of claim 1, wherein determining the first position of the relation word of the training text in the training text comprises: determining a head position and a tail position of the relation word in the training text, the first position comprising the head position and the tail position, wherein the relation word is a word whose part of speech is a verb.
- The training method for a key information extraction model of claim 2, wherein determining the head position and the tail position of the relation word in the training text comprises: determining a vector of the head position and a vector of the tail position of the relation word.
- The training method for a key information extraction model of claim 3, wherein determining, based on the first position, the first information of the implementation word in the training text comprises: computing an average vector of the vector of the head position and the vector of the tail position; and determining the first information of the implementation word based on the average vector, wherein the first preset relationship is a subject-predicate relationship and/or the second preset relationship is a verb-object relationship.
- The training method for a key information extraction model of claim 4, wherein, after determining the first information of the implementation word based on the average vector, the method comprises: obtaining a relation result tensor of the implementation word based on the first information of the implementation word.
- The training method for a key information extraction model of claim 5, wherein obtaining the cross-entropy loss function based on the first position and the first information comprises: obtaining a first cross-entropy function based on the vector of the head position; obtaining a second cross-entropy function based on the vector of the tail position; obtaining a third cross-entropy function based on the relation result tensor; and obtaining the cross-entropy loss function based on the first cross-entropy function, the second cross-entropy function, and the third cross-entropy function.
- The training method for a key information extraction model of claim 3, wherein determining the vector of the head position and the vector of the tail position of the relation word comprises: obtaining a text vector of the training text; and applying BIO tagging to the text vector to obtain the vector of the head position and the vector of the tail position of the relation word.
- The training method for a key information extraction model of claim 6, wherein obtaining the cross-entropy loss function based on the first cross-entropy function, the second cross-entropy function, and the third cross-entropy function comprises: taking the sum of the first, second, and third cross-entropy functions as the cross-entropy loss function.
- The training method for a key information extraction model of claim 6, wherein obtaining the cross-entropy loss function based on the first cross-entropy function, the second cross-entropy function, and the third cross-entropy function comprises: obtaining the cross-entropy loss function based on the first cross-entropy function and its weight, the second cross-entropy function and its weight, and the third cross-entropy function and its weight.
- The training method for a key information extraction model of claim 6, wherein obtaining the first cross-entropy function based on the vector of the head position comprises: obtaining the first cross-entropy function from the true vector of the head position of the relation word and the vector of the head position of the relation word output by the key information extraction model.
- A key information extraction method, applied to a trained key information extraction model obtained by the training method for a key information extraction model of any one of claims 1 to 10, the extraction method comprising: acquiring a text to be processed; and obtaining key information of the text to be processed based on the trained key information extraction model.
- The key information extraction method of claim 11, wherein obtaining the key information of the text to be processed based on the trained key information extraction model comprises: extracting a relation word of the text to be processed and a position of the relation word in the text to be processed; and obtaining, based on the position of the relation word in the text to be processed, an implementation word of the text to be processed and the relationship between the implementation word and the relation word, the key information comprising the relation word, the implementation word, and the relationship between the implementation word and the relation word.
- The key information extraction method of claim 11, wherein the relation word is a word whose part of speech is a verb.
- The key information extraction method of claim 11, wherein the relationship between the implementation word and the relation word comprises a subject-predicate relationship and/or a verb-object relationship.
- A key information extraction apparatus, comprising: a text acquisition module, configured to acquire a text to be processed; and a result output module, configured to obtain key information of the text to be processed based on a trained key information extraction model, wherein the trained key information extraction model is a model obtained by the training method for a key information extraction model of any one of claims 1 to 10.
- The key information extraction apparatus of claim 15, wherein the result output module is further configured to: extract a relation word of the text to be processed and a position of the relation word in the text to be processed; and obtain, based on the position of the relation word in the text to be processed, an implementation word of the text to be processed and the relationship between the implementation word and the relation word, the key information comprising the relation word, the implementation word, and the relationship between the implementation word and the relation word.
- A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the training method for a key information extraction model of any one of claims 1 to 10.
- A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the key information extraction method of any one of claims 11 to 14.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the training method for a key information extraction model of any one of claims 1 to 10.
- A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the key information extraction method of any one of claims 11 to 14.