CN111209348B - Method and device for outputting information - Google Patents



Publication number
CN111209348B
Authority
CN
China
Prior art keywords
relationship
entities
entity
data
identified
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811392563.4A
Other languages
Chinese (zh)
Other versions
CN111209348A (en
Inventor
刘畅
张阳
谢奕
杨双全
郑灿祥
季昆鹏
张雪婷
熊云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811392563.4A priority Critical patent/CN111209348B/en
Publication of CN111209348A publication Critical patent/CN111209348A/en
Application granted granted Critical
Publication of CN111209348B publication Critical patent/CN111209348B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The embodiment of the application discloses a method and a device for outputting information. One embodiment of the method includes obtaining data to be identified; performing entity identification on the data to be identified, and determining an entity set in the data to be identified; performing relationship identification on at least two entities in the entity set, and determining the relationship between the at least two entities; and outputting the at least two entities and the relation between the at least two entities correspondingly. This embodiment reduces the labor cost of data structuring.

Description

Method and device for outputting information
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for outputting information.
Background
With the continuing development of big data technology, more and more enterprises need to analyze data of various kinds in order to mine the value it contains. However, semi-structured or unstructured data cannot be analyzed directly, so it is underused. Data structuring techniques are therefore important for data mining. Extracting entities and the relationships between entities from semi-structured or unstructured data by means of information extraction is one of the important data structuring techniques.
With the rapid development of artificial intelligence technology, machine learning methods are widely applied to extracting relationships between entities. Typically, a large amount of data must be annotated manually (labeling the entities in the data and the relationships between entities), and a model is trained using the annotated data. The trained model is capable of extracting structured data, i.e., entities and relationships between entities, from semi-structured or unstructured data.
Disclosure of Invention
The embodiment of the application provides a method and a device for outputting information.
In a first aspect, an embodiment of the present application provides a method for outputting information, including: acquiring data to be identified; performing entity identification on the data to be identified, and determining an entity set in the data to be identified; performing relationship identification on at least two entities in the entity set, and determining the relationship between the at least two entities; and outputting the at least two entities and the relation between the at least two entities correspondingly.
In some embodiments, performing entity identification on the data to be identified and determining the entity set in the data to be identified includes: performing word segmentation on the data to be identified by using a natural language processing lexical analysis technique to obtain the entity set in the data to be identified.
In some embodiments, performing relationship identification on at least two entities in the entity set and determining the relationship between the at least two entities includes: performing relationship matching on the entity set based on a pre-configured relationship template, and determining the relationship between at least two entities in the entity set, wherein the relationship template includes the categories and slots of entities, and the relationship words between entities and the slots of the relationship words.
In some embodiments, the relationship template is configured by: configuring the categories of the entities and the parts of speech of the relationship words; configuring the slots of the entities and the slots of the relationship words; configuring the relationship words; and configuring the relationships between entities corresponding to the relationship words.
In some embodiments, performing relationship matching on the entity set based on a pre-configured relationship template and determining the relationship between at least two entities in the entity set includes: matching the entity set according to the categories and slots of the entities in the relationship template, and determining the successfully matched entities; and matching the data to be identified according to the relationship words and the slots of the relationship words in the relationship template, and, if the matching succeeds, taking the relationship between entities that corresponds to the successfully matched relationship word in the relationship template as the relationship between the successfully matched entities.
In a second aspect, an embodiment of the present application provides an apparatus for outputting information, including: an acquisition unit configured to acquire data to be identified; an entity identification unit configured to perform entity identification on the data to be identified and determine an entity set in the data to be identified; a relationship identification unit configured to perform relationship identification on at least two entities in the entity set and determine the relationship between the at least two entities; and an output unit configured to output the at least two entities and the relationship between the at least two entities in correspondence.
In some embodiments, the entity identification unit is further configured to: perform word segmentation on the data to be identified by using a natural language processing lexical analysis technique to obtain the entity set in the data to be identified.
In some embodiments, the relationship identification unit is further configured to: perform relationship matching on the entity set based on a pre-configured relationship template, and determine the relationship between at least two entities in the entity set, wherein the relationship template includes the categories and slots of entities, and the relationship words between entities and the slots of the relationship words.
In some embodiments, the relationship template is configured by: configuring the categories of the entities and the parts of speech of the relationship words; configuring the slots of the entities and the slots of the relationship words; configuring the relationship words; and configuring the relationships between entities corresponding to the relationship words.
In some embodiments, the relationship identification unit is further configured to: match the entity set according to the categories and slots of the entities in the relationship template, and determine the successfully matched entities; and match the data to be identified according to the relationship words and the slots of the relationship words in the relationship template, and, if the matching succeeds, take the relationship between entities that corresponds to the successfully matched relationship word in the relationship template as the relationship between the successfully matched entities.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; a storage device having one or more programs stored thereon; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.
The method and the device for outputting information provided by the embodiment of the application first perform entity identification on the acquired data to be identified so as to determine an entity set in the data to be identified; then perform relationship identification on at least two entities in the entity set to determine the relationship between the at least two entities; and finally output the at least two entities and the relationship between the at least two entities in correspondence. No large volume of data needs to be labeled manually to train a model for extracting structured data; instead, the structured data is extracted from unstructured or semi-structured data through entity identification and relationship identification, so that the labor cost of data structuring is reduced.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method for outputting information in accordance with the present application;
FIG. 3A is a schematic diagram of an application scenario of the method for outputting information provided in FIG. 2;
FIG. 3B is a schematic diagram of yet another application scenario of the method for outputting information provided in FIG. 2;
FIG. 4 is a flow chart of yet another embodiment of a method for outputting information in accordance with the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for outputting information in accordance with the present application;
FIG. 6 is a schematic diagram of a computer system suitable for use with a server implementing an embodiment of the application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
Fig. 1 shows an exemplary system architecture 100 to which an embodiment of a method for outputting information or an apparatus for outputting information of the present application may be applied.
As shown in fig. 1, a terminal device 101, a network 102, and a server 103 may be included in a system architecture 100. Network 102 is the medium used to provide communication links between terminal device 101 and server 103. Network 102 may include various connection types such as wired, wireless communication links, or fiber optic cables, among others.
Terminal device 101 may interact with server 103 via network 102 to receive or send messages, etc. The terminal device 101 may be hardware or software. When the terminal device 101 is hardware, it may be any of a variety of electronic devices, including, but not limited to, smartphones, tablets, laptop and desktop computers, and the like. When the terminal device 101 is software, it may be installed in the above-described electronic devices, and may be implemented as a plurality of software programs or software modules, or as a single software program or software module. The present application is not particularly limited herein.
The server 103 may provide various services, for example, the server 103 may perform processing such as analysis on data such as data to be identified acquired from the terminal device 101, and feed back processing results (e.g., at least two entities and a relationship between at least two entities) to the terminal device 101.
The server 103 may be hardware or software. When the server 103 is hardware, it may be implemented as a distributed server cluster composed of a plurality of servers, or may be implemented as a single server. When the server 103 is software, it may be implemented as a plurality of software or software modules (for example, to provide distributed services), or may be implemented as a single software or software module. The present application is not particularly limited herein.
It should be noted that, the method for outputting information provided by the embodiment of the present application is generally performed by the server 103, and accordingly, the device for outputting information is generally disposed in the server 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for outputting information in accordance with the present application is shown. The method for outputting information comprises the following steps:
Step 201, acquiring data to be identified.
In the present embodiment, the execution subject of the method for outputting information (e.g., the server 103 shown in fig. 1) may acquire data to be identified from a terminal device (e.g., the terminal device 101 shown in fig. 1) through a wired or wireless connection. The data to be identified may be semi-structured or unstructured data. In general, the data to be identified is text data containing one or more sentences. Its content may include not only a plurality of specific entities, but also a relationship between at least two of those entities. The categories of entities may include, but are not limited to, persons, events, places, objects, time, and so forth. A person refers to a human being. An event refers to something that occurs. A place refers to an address, e.g., the address where an event occurs or the address where a person appears. An object refers to an article, which may be an article operated by a person or an article related to an event. Time refers to the time at which an event occurs, the time at which a person appears, and so forth. The relationship between entities may be a specific relationship among any several entities of the categories person, event, place, object, and time, for example, a person-event relationship, a person-place relationship, and so on.
In the case where the data to be recognized includes a plurality of sentences, the execution body may first divide the data to be recognized into a plurality of sentences, and execute the following steps for each sentence.
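The sentence-splitting step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_sentences` and the choice of sentence-ending punctuation are assumptions, and a production system would more likely rely on a dedicated tokenizer.

```python
import re
from typing import List

# Assumption: a sentence ends at Chinese or English full stops, question
# marks, or exclamation marks. The source does not specify the delimiters.
_SENTENCE = re.compile(r"[^。！？.!?]+[。！？.!?]*")


def split_sentences(text: str) -> List[str]:
    """Split the data to be identified into sentences, keeping punctuation."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]
```

Each returned sentence would then be passed through the remaining steps independently. Note that this naive rule also splits at decimal points and abbreviations.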
Step 202, performing entity recognition on the data to be recognized, and determining an entity set in the data to be recognized.
In this embodiment, the executing body may perform entity recognition on the data to be recognized to determine the entity set in the data to be recognized. The entities in the entity set may be specific entities present in the content of the data to be identified. In some embodiments, the executing body may first segment the data to be identified to obtain a keyword set of the data to be identified; then perform part-of-speech analysis on each keyword in the keyword set and screen out the keywords whose part of speech matches that of an entity; then obtain the specific description information of each screened keyword and determine, from that information, the description information that describes entities such as persons, events, places, objects, and time; and finally take the keywords corresponding to the determined description information as entities to generate the entity set. The parts of speech of an entity are typically nouns and time words: persons, events, places, and objects are nouns, and time is a time word.
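The screening procedure above (segment, filter by part of speech, look up a description, collect entities) might look roughly like the sketch below. Everything concrete here is hypothetical: the toy `LEXICON`, the tag names `n`, `t`, `v`, `p`, and whitespace tokenization stand in for a real lexical analyzer, which the source does not specify.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> (part-of-speech tag, entity category).
# An empty category means the word does not describe an entity.
LEXICON: Dict[str, Tuple[str, str]] = {
    "Xiaoming": ("n", "person"),
    "Beijing": ("n", "place"),
    "Shanghai": ("n", "place"),
    "yesterday": ("t", "time"),
    "born": ("v", ""),
    "lives": ("v", ""),
    "in": ("p", ""),
}

# Assumption: only these part-of-speech tags can carry an entity.
ENTITY_POS = {"n", "t"}


def recognize_entities(sentence: str) -> List[Tuple[str, str]]:
    """Return (word, category) pairs for the entities found in a sentence."""
    entities = []
    for token in sentence.rstrip(".").split():  # naive whitespace segmentation
        pos, category = LEXICON.get(token, ("", ""))
        if pos in ENTITY_POS and category:
            entities.append((token, category))
    return entities
```

For example, `recognize_entities("Xiaoming was born in Beijing.")` yields the person "Xiaoming" and the place "Beijing"; unknown words such as "was" are simply skipped.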
Step 203, performing relationship identification on at least two entities in the entity set, and determining the relationship between the at least two entities.
In this embodiment, the executing body may perform relationship identification on at least two entities in the entity set to determine a relationship between the at least two entities. In some embodiments, the executing body may perform relationship identification on any several entities in the entity set to determine a relationship between those entities. In general, entities that have a relationship are either adjacent to each other or separated by only a very small number of other entities. For example, for any two adjacent entities, the executing body may first analyze the sentence in the data to be identified that contains the two adjacent entities, extract a relationship word from the sentence, and then determine the relationship between the two adjacent entities according to the relationship word.
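As a rough illustration of the adjacent-entity example above, the sketch below looks for a relationship word between each pair of neighboring entities. The `RELATION_WORDS` table and the function name are assumptions made for the example; the source does not say how relationship words are stored.

```python
from typing import Dict, List, Tuple

# Hypothetical table mapping relationship words to relationship names.
RELATION_WORDS: Dict[str, str] = {
    "born": "birth place",
    "lives": "living place",
}


def relate_adjacent(tokens: List[str], entities: List[str]) -> List[Tuple[str, str, str]]:
    """For each pair of adjacent entities, find a relationship word between them.

    `entities` must be listed in sentence order; the first occurrence of each
    entity in `tokens` is used.
    """
    triples = []
    for left, right in zip(entities, entities[1:]):
        start, end = tokens.index(left), tokens.index(right)
        for word in tokens[start + 1:end]:
            if word in RELATION_WORDS:
                triples.append((left, RELATION_WORDS[word], right))
                break  # keep the first relationship word found
    return triples
```

This only covers relationship words that sit between the two entities; words placed before or after the pair would need additional handling.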
Step 204, outputting the at least two entities and the relationship between the at least two entities in correspondence.
In this embodiment, the execution body may output at least two entities and a relationship between the at least two entities. For example, it may be sent to a terminal device for viewing by the user. For another example, it may be sent to a database server for storage.
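One way to output the entities together with their relationship is to serialize them as a single record before sending it to a terminal device or a database server. The record layout below (`entities`, `relation`) is an assumption for illustration; the source does not define an output format.

```python
import json


def format_output(head: str, relation: str, tail: str) -> str:
    """Serialize two entities and their relationship as one JSON record."""
    record = {"entities": [head, tail], "relation": relation}
    # ensure_ascii=False keeps Chinese entity names readable in the output.
    return json.dumps(record, ensure_ascii=False)
```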
With continued reference to fig. 3A, fig. 3A is a schematic diagram of an application scenario of the method for outputting information provided in fig. 2. In the application scenario shown in fig. 3A, if the user wants to determine the entities and the relationship between the entities in the sentence "Xiaoming was born in Beijing", the user may send the sentence to the server by using the terminal device. The server may first perform entity identification on the sentence "Xiaoming was born in Beijing" to determine the entity "Xiaoming" and the entity "Beijing", then perform relationship identification on the entity "Xiaoming" and the entity "Beijing" to determine that the relationship between them is "birth place", and finally send the entity "Xiaoming", the entity "Beijing", and the relationship "birth place" to the terminal device of the user. At this time, the content displayed on the terminal device of the user may be as shown in fig. 3A.
With continued reference to fig. 3B, fig. 3B is a schematic diagram of yet another application scenario of the method for outputting information provided in fig. 2. In the application scenario shown in fig. 3B, if the user wants to determine the entities and the relationship between the entities in the sentence "Xiaoming lives in Shanghai", the user may send the sentence to the server by using the terminal device. The server may first perform entity identification on the sentence "Xiaoming lives in Shanghai" to determine the entity "Xiaoming" and the entity "Shanghai", then perform relationship identification on the entity "Xiaoming" and the entity "Shanghai" to determine that the relationship between them is "living place", and finally send the entity "Xiaoming", the entity "Shanghai", and the relationship "living place" to the terminal device of the user. At this time, the content displayed on the terminal device of the user may be as shown in fig. 3B.
The method for outputting information provided by the embodiment of the application first performs entity identification on the acquired data to be identified so as to determine an entity set in the data to be identified; then performs relationship identification on at least two entities in the entity set to determine the relationship between the at least two entities; and finally outputs the at least two entities and the relationship between the at least two entities in correspondence. No large volume of data needs to be labeled manually to train a model for extracting structured data; instead, the structured data is extracted from unstructured or semi-structured data through entity identification and relationship identification, so that the labor cost of data structuring is reduced.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for outputting information in accordance with the present application is shown. The method for outputting information comprises the following steps:
Step 401, acquiring data to be identified.
In this embodiment, the specific operation of step 401 is substantially the same as that of step 201 in the embodiment shown in fig. 2, and will not be described herein.
Step 402, word segmentation is performed on the data to be identified by using a natural language processing lexical analysis technology, so as to obtain an entity set in the data to be identified.
In this embodiment, the execution subject of the method for outputting information may perform word segmentation on the data to be identified using a natural language processing (Natural Language Processing, NLP) lexical analysis technique to obtain the entity set in the data to be identified. NLP is a way for computers to analyze, understand, and derive meaning from human language. By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relation extraction, sentiment analysis, speech recognition, topic segmentation, and the like. The current approach to NLP is based on deep learning, a sub-field of artificial intelligence that examines and uses patterns in data to improve a program's understanding. Deep learning models require a large amount of labeled data for training and for identifying relevant correlations.
In general, the information of an entity returned after NLP word segmentation is in JSON (JavaScript Object Notation) format. JSON is a lightweight data interchange format that stores and represents data as text, completely independent of the programming language. Its compact and clear hierarchical structure makes JSON an ideal data exchange language.
Here, the information of the entity may include, but is not limited to, a status of the request, a model version, a request text, a specific description of each word, an entity number, a part of speech, and the like. Wherein the status of the request is whether the entity was successfully identified from the data to be identified. The model version is a version of the NLP model. The request text is the data to be identified. The specific description of each word is a specific description of an entity. The entity number may characterize the class of the entity, with the number of the entities of different classes being different. For example, the format of the information of the entity may be as shown in table 1 below, and the entity number may be as shown in table 2 below:
Field      Description
Status     Status of the request
Version    Model version
Text       Request text
Items      Detailed description of each word
Ne         Entity sequence number
Pos        Part of speech
TABLE 1
Entity class              Person    Event    Place    Object    Time
Entity sequence number    10        37       22       13, 33    31, 80
TABLE 2
Step 403, performing relationship matching on the entity set based on a pre-configured relationship template, and determining a relationship between at least two entities in the entity set.
In this embodiment, the executing body may perform relationship matching on the entity set based on a pre-configured relationship template, so as to determine a relationship between at least two entities in the entity set. The relationship templates may include, but are not limited to, a person-person relationship template, a person-event relationship template, a person-place relationship template, a place-object relationship template, and the like. A relationship template may include the categories and slots of entities, and the relationship words between entities and the slots of those relationship words. The categories of entities may include, but are not limited to, persons, events, places, objects, time, and so forth. The slot of an entity may characterize the order of the entity in the data to be identified. For example, if the category of an entity in the relationship template is person and the slot of the entity is 1, the entities whose relationship is to be determined include the first entity of category person in the data to be identified. The relationship words between entities may be words derived from the relationships between entities. For example, if the place is the person's birth place, the relationship words may include, but are not limited to: birth, etc. If the place is the person's place of residence, the relationship words may include, but are not limited to: living, home, etc. If the place is a location where the person is present, the relationship words may include, but are not limited to: pass, arrive, departure, etc. The slot of a relationship word may characterize the relative position of the relationship word with respect to the entities. For example, a relationship word may lie between the entities whose relationship is to be determined, may precede them, may follow them, and so on.
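The idea of a relationship-word slot as a relative position can be sketched as a small helper that reports whether the word lies before, between, or after two entities. The function name and the three labels are assumptions made for illustration only.

```python
from typing import List


def relation_word_slot(tokens: List[str], entity_a: str, entity_b: str,
                       relation_word: str) -> str:
    """Classify where a relationship word sits relative to two entities."""
    ia = tokens.index(entity_a)
    ib = tokens.index(entity_b)
    ir = tokens.index(relation_word)
    first, last = min(ia, ib), max(ia, ib)
    if ir < first:
        return "before"
    if ir > last:
        return "after"
    return "between"
```

A template could then require, for instance, that the relationship word be found in the "between" position for a person-predicate-place sentence.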
In some alternative implementations of the present embodiment, the relationship template may be configured by:
First, the categories of the entities and the parts of speech of the relationship words are configured.
Then, the slots of the entities and the slots of the relationship words are configured.
Then, the relationship words are configured.
Finally, the relationships between entities corresponding to the relationship words are configured.
For example, the person-place relationship template may be configured as follows:
person indicates that the category of the entity is a person, person_ph is a placeholder for the person, site indicates that the category of the entity is a place, site_ph is a placeholder for the place, v indicates that the part of speech is a verb, verb_ph is a placeholder for the verb, p indicates that the part of speech is a preposition, prep_ph is a placeholder for the preposition, c indicates that the part of speech is a conjunction, conj_ph is a placeholder for the conjunction, n indicates that the part of speech is a noun, and noun_ph is a placeholder for the noun.
rule represents a set of rules. Each rule consists of pattern and slot parts.
pattern represents the order in which slots are matched in a sentence. slot represents the values of a slot and the corresponding specific relationship. pattern: [person_1, v, site_1] represents the basic structure of a sentence, i.e., an ordered arrangement of person, predicate, and place. The person in person_1 indicates that the category of the entity is person, the 1 in person_1 indicates the serial number of the person, and person_1 therefore indicates the first person. The site in site_1 indicates that the category of the entity is place, the 1 in site_1 indicates the serial number of the place, and site_1 therefore indicates the first place.
slots contains three classes of slot values. In the first class, the slot v is defined as a list of words such as 'birth'; if the predicate contains one of the listed words, the first class of this pattern is hit, and the specific relationship is given by the results part, i.e., the relationship between the first person and the first place is birth place. In the second class, the slot v is defined as a list of words such as 'living' and 'family'; if the predicate contains one of the listed words, the second class of this pattern is hit, and the results part gives the relationship between the first person and the first place as living place. In the third class, the slot v is defined as ['pass', 'arrive', 'go']; if the predicate contains one of these three words, the third class of this pattern is hit, and the results part gives the relationship between the first person and the first place as location.
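The template described above can be approximated in code as follows. This is a hedged reconstruction, not the patent's actual configuration: the dictionary layout, the key names `rule`, `pattern`, `slots`, `v`, and `results` are modeled on the description, the English word lists are illustrative stand-ins, and the matcher only handles the three-part pattern [entity, v, entity].

```python
from typing import List, Optional, Tuple

# Hypothetical person-place template with three slot classes.
TEMPLATE = {
    "rule": [
        {
            "pattern": ["person_1", "v", "site_1"],
            "slots": [
                {"v": ["born"], "results": "birth place"},
                {"v": ["lives", "resides"], "results": "living place"},
                {"v": ["passes", "arrives", "goes"], "results": "location"},
            ],
        }
    ]
}


def split_slot(name: str) -> Tuple[str, int]:
    """'person_1' -> ('person', 1)."""
    category, index = name.rsplit("_", 1)
    return category, int(index)


def nth_with_tag(tagged: List[Tuple[str, str]], tag: str, n: int) -> Optional[int]:
    """Index of the n-th (1-based) token carrying `tag`, or None."""
    seen = 0
    for i, (_, t) in enumerate(tagged):
        if t == tag:
            seen += 1
            if seen == n:
                return i
    return None


def match_template(tagged: List[Tuple[str, str]],
                   template: dict = TEMPLATE) -> Optional[Tuple[str, str, str]]:
    """Return (entity, relationship, entity) for the first rule that hits."""
    for rule in template["rule"]:
        pattern = rule["pattern"]
        if len(pattern) != 3 or pattern[1] != "v":
            continue  # this sketch supports only [entity, v, entity]
        cat_a, n_a = split_slot(pattern[0])
        cat_b, n_b = split_slot(pattern[2])
        ia = nth_with_tag(tagged, cat_a, n_a)
        ib = nth_with_tag(tagged, cat_b, n_b)
        if ia is None or ib is None or ia >= ib:
            continue
        # The predicate is every verb between the two matched entities.
        verbs = {w for w, t in tagged[ia + 1:ib] if t == "v"}
        for slot in rule["slots"]:
            if verbs & set(slot["v"]):
                return tagged[ia][0], slot["results"], tagged[ib][0]
    return None
```

The input is a list of (word, tag) pairs, where the tag is either an entity category such as `person` or `site`, or a part-of-speech tag such as `v`. Keeping the word lists in data rather than code means a new relationship can be added by editing the template alone.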
At present, the relationships supported by the relationship templates may include, but are not limited to, person-event relationships, place-object relationships, and the like. These relationships may in turn include one-to-one relationships, one-to-many relationships, many-to-one relationships, many-to-many relationships, and the like.
For example, for a one-to-one person-place relationship, the rule of its relationship template may be configured as follows:
For example, for a many-to-one person-place relationship, the rule of its relationship template may be configured as follows:
For example, for a many-to-one person-place relationship, the rule of its relationship template may be configured as follows:
In some optional implementations of this embodiment, the executing body may first match the entity set according to the categories and slots of the entities in the relationship template to determine the successfully matched entities; and then match the data to be identified according to the relationship words and the slots of the relationship words in the relationship template, and, if the matching succeeds, take the relationship between entities that corresponds to the successfully matched relationship word in the relationship template as the relationship between the successfully matched entities. Generally, if the data to be identified includes a plurality of sentences, the sentences that contain the successfully matched entities can be found first, and only those sentences are then matched against the relationship words and the slots of the relationship words in the relationship template.
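The two-stage matching described above, including the restriction to sentences that contain the matched entities, might be sketched like this. Substring tests stand in for real slot matching, and the function and parameter names are hypothetical.

```python
from typing import Dict, List, Optional


def find_relation(sentences: List[str], matched_entities: List[str],
                  relation_words: Dict[str, str]) -> Optional[str]:
    """Return the relationship for the matched entities, or None.

    Stage 1 keeps only sentences that mention every matched entity.
    Stage 2 searches those sentences for a configured relationship word.
    """
    candidates = [s for s in sentences
                  if all(entity in s for entity in matched_entities)]
    for sentence in candidates:
        for word, relation in relation_words.items():
            if word in sentence:  # naive substring test, for illustration
                return relation
    return None
```

Filtering first keeps a relationship word in one sentence from being attributed to entities that only appear in another.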
Step 404, outputting the at least two entities and the relationship between the at least two entities in correspondence.
In this embodiment, the specific operation of step 404 is substantially the same as that of step 204 in the embodiment shown in fig. 2, and will not be described herein.
As can be seen from fig. 4, the flow 400 of the method for outputting information in this embodiment highlights the steps of entity identification and relationship identification, compared to the corresponding embodiment of fig. 2. Therefore, the entity is identified by using a natural language processing lexical analysis technology, and the identification accuracy of the entity is improved. And the relationship among the entities is identified by utilizing the relationship template, so that the identification accuracy of the relationship among the entities is improved.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for outputting information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for outputting information of the present embodiment may include an acquisition unit 501, an entity identification unit 502, a relationship identification unit 503, and an output unit 504. The acquisition unit 501 is configured to acquire data to be identified; the entity identification unit 502 is configured to perform entity identification on the data to be identified and determine an entity set in the data to be identified; the relationship identification unit 503 is configured to perform relationship identification on at least two entities in the entity set and determine the relationship between the at least two entities; and the output unit 504 is configured to output the at least two entities and the relationship between the at least two entities correspondingly.
In the apparatus 500 for outputting information of the present embodiment, the specific processing of the acquisition unit 501, the entity identification unit 502, the relationship identification unit 503, and the output unit 504, and the technical effects thereof, may refer to the descriptions of step 201, step 202, step 203, and step 204 in the embodiment corresponding to fig. 2, and are not repeated here.
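As a rough illustration of how the four units could be chained, the sketch below models each unit as a plain callable. The class name `InformationOutputApparatus`, the `Triple` alias, and the callable signatures are assumptions introduced for this example; the application does not prescribe any particular software structure.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (entity, relationship, entity)


class InformationOutputApparatus:
    """Chains the units in the order acquisition -> entity -> relationship -> output."""

    def __init__(self,
                 acquisition_unit: Callable[[], str],
                 entity_unit: Callable[[str], List[str]],
                 relationship_unit: Callable[[str, List[str]], List[Triple]],
                 output_unit: Callable[[List[Triple]], List[str]]) -> None:
        self.acquisition_unit = acquisition_unit
        self.entity_unit = entity_unit
        self.relationship_unit = relationship_unit
        self.output_unit = output_unit

    def run(self) -> List[str]:
        data = self.acquisition_unit()       # acquire the data to be identified
        entities = self.entity_unit(data)    # determine the entity set
        if len(entities) < 2:
            # A relationship needs at least two entities.
            return self.output_unit([])
        triples = self.relationship_unit(data, entities)
        return self.output_unit(triples)     # output entities with their relationship
```

Passing the units in as callables keeps each stage replaceable, for example swapping a template-based relationship unit for another approach without touching the rest of the pipeline.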
In some optional implementations of the present embodiment, the entity identification unit 502 is further configured to perform word segmentation on the data to be identified using a natural language processing lexical analysis technique to obtain the entity set in the data to be identified.
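To make the idea of obtaining an entity set through segmentation concrete, here is a toy sketch. It substitutes a simple dictionary-based longest-match scan for a real lexical analyzer, which would normally be a trained model or an external service; the function name `extract_entities`, the lexicon contents, and the category labels are all hypothetical.

```python
from typing import Dict, List, Tuple


def extract_entities(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Greedy longest-match scan; `lexicon` maps a surface form to an entity category."""
    entities = []
    max_len = max(map(len, lexicon), default=0)
    i = 0
    while i < len(text):
        # Try the longest candidate first so "Beijing University" beats "Beijing".
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                entities.append((candidate, lexicon[candidate]))
                i += size
                break
        else:
            # No lexicon entry starts at this position; advance one character.
            i += 1
    return entities
```

The scan works character by character, so it also applies to text without spaces between words, but it will match lexicon entries that occur inside longer words, which a real lexical analyzer would avoid.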
In some optional implementations of the present embodiment, the relationship identification unit 503 is further configured to perform relationship matching on the entity set based on a pre-configured relationship template and determine the relationship between at least two entities in the entity set, wherein the relationship template includes the categories and slots of entities, and the relationship words between entities and the slots of those relationship words.
In some optional implementations of the present embodiment, the relationship template is configured by: configuring the categories of the entities and the part of speech of the relationship words; configuring the slots of the entities and the slots of the relationship words; configuring the relationship words; and configuring the relationships between entities that correspond to the relationship words.
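The four configuration steps can be pictured as filling in a small record. The sketch below is only an assumed representation: the dictionary keys, the slot names `E1`, `E2`, and `REL`, the part-of-speech label, and the sample relationship words are hypothetical and do not come from the application.

```python
from typing import Any, Dict


def configure_relation_template() -> Dict[str, Any]:
    template: Dict[str, Any] = {}
    # Step 1: entity categories and the part of speech of the relationship words.
    template["entity_categories"] = {"E1": "PERSON", "E2": "LOCATION"}
    template["relation_word_pos"] = "verb"
    # Step 2: slots, modelled here as the left-to-right order of the elements.
    template["slots"] = ["E1", "REL", "E2"]
    # Step 3: the relationship words themselves.
    template["relation_words"] = ["arrived in", "travelled to"]
    # Step 4: the relationship between entities that those words stand for.
    template["relation"] = "located_in"
    return template


def validate_template(template: Dict[str, Any]) -> bool:
    """Check that every configured entity has a slot and that no step was left empty."""
    slots = template["slots"]
    entity_slots = [s for s in slots if s != "REL"]
    return (
        slots.count("REL") == 1
        and sorted(entity_slots) == sorted(template["entity_categories"])
        and bool(template["relation_words"])
        and bool(template["relation"])
    )
```

A validation step like this is a design convenience rather than something the application requires; it catches a template whose slots and entity categories have drifted apart before the template is used for matching.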
In some optional implementations of the present embodiment, the relationship identification unit 503 is further configured to: match the entity set against the entity categories and entity slots in the relationship template to determine the successfully matched entities; and match the data to be identified against the relationship words and relationship-word slots in the relationship template and, if the match succeeds, take the relationship between entities that corresponds to the matched relationship word in the relationship template as the relationship between the successfully matched entities.
Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use with a server (e.g., server 103 of FIG. 1) for implementing an embodiment of the present application. The server illustrated in fig. 6 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the Central Processing Unit (CPU) 601, the above-described functions defined in the method of the present application are performed. It should be noted that the computer readable medium according to the present application may be a computer readable signal medium, a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present application, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium other than a computer readable storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, for example, described as: a processor includes an acquisition unit, an entity identification unit, a relationship identification unit, and an output unit. The names of these units do not constitute a limitation on the unit itself in some cases, and the acquisition unit may also be described as "a unit that acquires data to be identified", for example.
As another aspect, the present application also provides a computer-readable medium that may be contained in the server described in the above embodiment; or may exist alone without being assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquiring data to be identified; performing entity identification on the data to be identified, and determining an entity set in the data to be identified; performing relationship identification on at least two entities in the entity set, and determining the relationship between the at least two entities; and outputting the at least two entities and the relation between the at least two entities correspondingly.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in the present application is not limited to technical solutions formed by the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the inventive concept described above, for example, technical solutions formed by replacing the above features with technical features having similar functions that are disclosed in (but not limited to) the present application.

Claims (8)

1. A method for outputting information, comprising:
acquiring data to be identified;
performing entity identification on the data to be identified, and determining an entity set in the data to be identified;
performing relationship identification on at least two entities in the entity set, and determining the relationship between the at least two entities;
outputting the at least two entities and the relation between the at least two entities correspondingly;
wherein the identifying the relationship between at least two entities in the entity set, and determining the relationship between the at least two entities, includes:
performing relationship matching on the entity set based on a pre-configured relationship template, and determining the relationship between at least two entities in the entity set, wherein the relationship template comprises the category and the slot of the entity, and the relationship words and the slot of the relationship words between the entities;
wherein, the relation template is configured by the following steps:
configuring the category of the entity and the part of speech of the relation word;
configuring the slot positions of the entities and the slot positions of the relation words;
configuring the relation words;
and configuring the relation among the entities corresponding to the relation words.
2. The method of claim 1, wherein the entity identification of the data to be identified, determining the set of entities in the data to be identified, comprises:
segmenting the data to be identified by using a natural language processing lexical analysis technology to obtain an entity set in the data to be identified.
3. The method of claim 1, wherein the relationship matching the set of entities based on a pre-configured relationship template, determining a relationship between at least two entities in the set of entities, comprises:
matching the entity set according to the category of the entity in the relation template and the slot position of the entity, and determining successfully matched entity;
and matching the data to be identified according to the relationship words and the slots of the relationship words in the relationship templates, and taking the relationship between the entities corresponding to the relationship words successfully matched in the relationship templates as the relationship between the entities successfully matched if the matching is successful.
4. An apparatus for outputting information, comprising:
an acquisition unit configured to acquire data to be identified;
an entity identification unit configured to perform entity identification on the data to be identified and determine an entity set in the data to be identified;
a relationship identification unit configured to perform relationship identification on at least two entities in the entity set, and determine a relationship between the at least two entities;
an output unit configured to output the at least two entities and a relationship between the at least two entities correspondingly;
wherein the relationship-identifying unit is further configured to:
performing relationship matching on the entity set based on a pre-configured relationship template, and determining the relationship between at least two entities in the entity set, wherein the relationship template comprises the category and the slot of the entity, and the relationship words and the slot of the relationship words between the entities;
wherein, the relation template is configured by the following steps:
configuring the category of the entity and the part of speech of the relation word;
configuring the slot positions of the entities and the slot positions of the relation words;
configuring the relation words;
and configuring the relation among the entities corresponding to the relation words.
5. The apparatus of claim 4, wherein the entity identification unit is further configured to:
segmenting the data to be identified by using a natural language processing lexical analysis technology to obtain an entity set in the data to be identified.
6. The apparatus of claim 4, wherein the relationship identification unit is further configured to:
matching the entity set according to the category of the entity in the relation template and the slot position of the entity, and determining successfully matched entity;
and matching the data to be identified according to the relationship words and the slots of the relationship words in the relationship templates, and taking the relationship between the entities corresponding to the relationship words successfully matched in the relationship templates as the relationship between the entities successfully matched if the matching is successful.
7. A server, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-3.
8. A computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of any of claims 1-3.
CN201811392563.4A 2018-11-21 2018-11-21 Method and device for outputting information Active CN111209348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811392563.4A CN111209348B (en) 2018-11-21 2018-11-21 Method and device for outputting information

Publications (2)

Publication Number Publication Date
CN111209348A CN111209348A (en) 2020-05-29
CN111209348B true CN111209348B (en) 2023-09-29

Family

ID=70788260

Country Status (1)

Country Link
CN (1) CN111209348B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015080561A1 (en) * 2013-11-27 2015-06-04 Mimos Berhad A method and system for automated relation discovery from texts
CN105468605A (en) * 2014-08-25 2016-04-06 济南中林信息科技有限公司 Entity information map generation method and device
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 Method and device for constructing a medical knowledge graph
WO2018072563A1 (en) * 2016-10-18 2018-04-26 中兴通讯股份有限公司 Knowledge graph creation method, device, and system
CN107977379A (en) * 2016-10-25 2018-05-01 百度国际科技(深圳)有限公司 Method and apparatus for mined information

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070067320A1 (en) * 2005-09-20 2007-03-22 International Business Machines Corporation Detecting relationships in unstructured text
EP1983444A1 (en) * 2007-04-16 2008-10-22 The European Community, represented by the European Commission A method for the extraction of relation patterns from articles
US9892208B2 (en) * 2014-04-02 2018-02-13 Microsoft Technology Licensing, Llc Entity and attribute resolution in conversational applications

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
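Liu Mingtong et al. A paraphrase template acquisition method based on deep semantic computation in the open domain. Journal of Chinese Information Processing (《中文信息学报》). 2018, Vol. 32 (No. 32), full text. *
Chinese entity relation extraction for enterprise knowledge graph construction; Sun Chen; Journal of East China Normal University (Natural Science) (《华东师范大学学报(自然科学版)》); full text *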

Also Published As

Publication number Publication date
CN111209348A (en) 2020-05-29

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant