WO2021157897A1 - System and method for efficient understanding and extraction of a multi-relational entity - Google Patents


Info

Publication number
WO2021157897A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
entity
relationship
data
electronic device
Prior art date
Application number
PCT/KR2021/000579
Other languages
English (en)
Inventor
Dalkandura Arachchige Kalpa Shashika Silva Gunaratna
Hongxia Jin
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 16/900,664 (US11687570B2)
Priority claimed from KR 10-2020-0106571 (KR20210098820A)
Application filed by Samsung Electronics Co., Ltd. filed Critical Samsung Electronics Co., Ltd.
Publication of WO2021157897A1

Classifications

    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06F40/30 Semantic analysis
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9024 Graphs; Linked lists
    • G06N3/045 Combinations of networks
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G06N5/022 Knowledge engineering; Knowledge acquisition

Definitions

  • This disclosure relates generally to intelligent understanding of things to support intelligent search, retrieval, and personal assistant tools, such as a question answering system. More specifically, this disclosure relates to entity-relationship embeddings using automatically generated entity graphs.
  • A knowledge graph (KG) is composed of facts and information about inter-related entities. It is extremely important to store this entity-based information in a special representation format.
  • The structured data in the KG enables next-generation intelligent search, retrieval, and personal assistant tools, which is achieved by transforming this structured (symbolic) data into a continuous vector format (namely, embeddings).
  • Certain Web search companies have started a community-driven project (a general ontology that describes entities) to tag entities in Web pages so that search engines can optimize search results by using this entity metadata. This also shows the real-world need to understand entities to enable better services.
  • Question answering systems utilize entity facts by linking question phrases to KGs and then search KGs to find answers.
  • A KG defines the relationships between entities as a fixed set of short labels.
  • This disclosure provides systems and methods for creating an entity graph from text descriptions and computing embeddings.
  • In a first embodiment, a method includes receiving, by a processor, an input text. The method also includes identifying a primary entity, a secondary entity, and a context from the input text, wherein the context comprises a relationship between the primary entity and the secondary entity. The method additionally includes generating, by the processor, an entity context graph based on the primary entity, the secondary entity, and the context by: extracting, from the context, one or more text segments comprising a plurality of words describing one or more additional relationships between the primary entity and the secondary entity, and generating a plurality of context triples from the one or more text segments, each of the plurality of context triples defining a respective relationship between the primary entity and the secondary entity.
  • In a second embodiment, an electronic device includes a processor and a memory operably coupled to the processor.
  • The memory includes instructions executable by the processor to: receive an input text; identify a primary entity, a secondary entity, and a context from the input text, wherein the context comprises a relationship between the primary entity and the secondary entity; and generate an entity context graph based on the primary entity, the secondary entity, and the context by: 1) extracting, from the context, one or more text segments comprising a plurality of words describing one or more additional relationships between the primary entity and the secondary entity, and 2) generating a plurality of context triples from the one or more text segments, each of the plurality of context triples defining a respective relationship between the primary entity and the secondary entity.
  • In a third embodiment, a non-transitory machine-readable medium contains instructions that, when executed, cause at least one processor of an electronic device to: receive an input text; identify a primary entity, a secondary entity, and a context from the input text, wherein the context comprises a relationship between the primary entity and the secondary entity; and generate an entity context graph based on the primary entity, the secondary entity, and the context by: 1) extracting, from the context, one or more text segments comprising a plurality of words describing one or more additional relationships between the primary entity and the secondary entity, and 2) generating a plurality of context triples from the one or more text segments, each of the plurality of context triples defining a respective relationship between the primary entity and the secondary entity.
  • An electronic device includes a memory storing at least one instruction, and a processor executing the at least one instruction, wherein the processor may, based on first text data being input, extract a first relation text including a first entity text, a second entity text, and a plurality of words describing a relationship between the first entity text and the second entity text from the first text data, generate first triple data defining the relationship between the first entity text and the second entity text based on the first entity text, the second entity text, and the first relation text, and generate a first knowledge graph based on the generated first triple data.
  • A method of controlling an electronic device may include the steps of, based on first text data being input, extracting a first relation text including a first entity text, a second entity text, and a plurality of words describing a relationship between the first entity text and the second entity text from the first text data, generating first triple data defining the relationship between the first entity text and the second entity text based on the first entity text, the second entity text, and the first relation text, and generating a first knowledge graph based on the generated first triple data.
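The extraction described in these embodiments can be sketched as follows (a simplified illustration with hypothetical names; a real system would use the disclosed NLP pipeline rather than plain string matching):

```python
import re

def extract_context_triples(text, entity1, entity2):
    """Extract <entity, relation_text, entity> context triples from free text.

    Illustrative sketch: the relation is the free-text span between two
    entity mentions in the same sentence, not a label from a fixed schema.
    """
    triples = []
    # Split into rough sentences; a real system would use an NLP pipeline.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if entity1 in sentence and entity2 in sentence:
            start = sentence.index(entity1) + len(entity1)
            end = sentence.index(entity2)
            if start < end:
                relation_text = sentence[start:end].strip(" ,;:")
                if relation_text:
                    triples.append((entity1, relation_text, entity2))
    return triples

text = "Marie Curie discovered and isolated radium. Radium glows faintly."
print(extract_context_triples(text, "Marie Curie", "radium"))
```

The relation text kept in each triple is free text, which is what distinguishes these context triples from traditional KG triples with fixed relation labels.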
  • The term "couple" and its derivatives refer to any direct or indirect communication between two or more elements, whether or not those elements are in physical contact with one another.
  • The terms "transmit," "receive," and "communicate," as well as derivatives thereof, encompass both direct and indirect communication.
  • The term "or" is inclusive, meaning and/or.
  • The term "controller" means any device, system, or part thereof that controls at least one operation. Such a controller may be implemented in hardware or a combination of hardware and software and/or firmware. The functionality associated with any particular controller may be centralized or distributed, whether locally or remotely.
  • The phrase "at least one of," when used with a list of items, means that different combinations of one or more of the listed items may be used, and only one item in the list may be needed.
  • For example, "at least one of: A, B, and C" includes any of the following combinations: A; B; C; A and B; A and C; B and C; and A and B and C.
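The seven combinations listed above are exactly the non-empty subsets of the list, which can be enumerated directly:

```python
from itertools import combinations

items = ["A", "B", "C"]
# "At least one of A, B, and C": every non-empty subset of the list.
subsets = [set(c) for r in range(1, len(items) + 1)
           for c in combinations(items, r)]
print(subsets)  # 7 combinations: {A}, {B}, {C}, {A,B}, {A,C}, {B,C}, {A,B,C}
```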
  • Various functions described below can be implemented or supported by one or more computer programs, each of which is formed from computer readable program code and embodied in a computer readable medium.
  • The terms "application" and "program" refer to one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, related data, or a portion thereof adapted for implementation in suitable computer readable program code.
  • The phrase "computer readable program code" includes any type of computer code, including source code, object code, and executable code.
  • The phrase "computer readable medium" includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory.
  • a non-transitory computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals.
  • a non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.
  • FIGURE 1 illustrates an example network configuration according to embodiments of the present disclosure
  • FIGURE 2 illustrates an example triple representation according to embodiments of the present disclosure
  • FIGURES 3A and 3B illustrate an Entity Context Graph (ECG) generator and an embedding learner according to embodiments of the present disclosure
  • FIGURE 4 illustrates sub-components of the Entity Context Graph (ECG) generator and embedding learner according to embodiments of the present disclosure
  • FIGURE 5 illustrates a process for context text extraction for entities according to embodiments of the present disclosure
  • FIGURE 6 illustrates a process for generating an entity context graph according to embodiments of the present disclosure
  • FIGURE 7 illustrates a process for combined relationship encoder learning and embedding learning according to embodiments of the present disclosure
  • FIGURE 8 illustrates an example for combined relationship encoder learning and embedding learning according to embodiments of the present disclosure
  • FIGURE 9 illustrates a relationship encoding neural network according to embodiments of the present disclosure.
  • FIGURE 10 illustrates a process for using entity context graph-based embedding learning to improve KG-based embeddings according to embodiments of the present disclosure
  • FIGURE 11 illustrates a process for generating additional traditional knowledge graph triples from context triples according to embodiments of the present disclosure
  • FIGURE 12 is a diagram for illustrating a method of controlling an electronic device according to an embodiment of the disclosure.
  • FIGURE 13 is a diagram for illustrating a process wherein an electronic device updates a first model and a second model according to an embodiment of the disclosure
  • FIGURE 14 is a diagram for illustrating a process wherein an electronic device updates a first knowledge graph according to an embodiment of the disclosure
  • FIGURE 15 is a diagram for illustrating a process wherein an electronic device updates a third knowledge graph according to an embodiment of the disclosure.
  • FIGURE 16 is a diagram for illustrating a process wherein an electronic device updates a third knowledge graph according to an embodiment of the disclosure.
  • FIGURES 1 through 16 discussed below, and the various embodiments used to describe the principles of the present disclosure in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the disclosure. Those skilled in the art will understand that the principles of the present disclosure may be implemented in any suitably-arranged system or device.
  • A knowledge graph (KG) helps predict the relationship between two entities by mining existing triples, where a triple is a subject-relationship-object tuple.
  • a triple includes KG components of two entities (subject and object) and a relationship between the entities.
  • KG components, i.e. entities and relationships, are transformed or embedded into a continuous vector format called embeddings. The data structured in these KGs enables next-generation intelligent search, retrieval, and personal assistant tools.
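For illustration, a triple and its embeddings can be represented as follows (toy values; real embeddings are learned, not hand-assigned):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str    # entity
    relation: str   # short relation label in a traditional KG
    obj: str        # entity

# A tiny KG as a set of subject-relationship-object tuples.
kg = {
    Triple("Paris", "capital_of", "France"),
    Triple("France", "located_in", "Europe"),
}

# Embeddings map each symbolic KG component to a continuous vector.
# Toy 3-dimensional vectors stand in for learned embeddings.
embeddings = {
    "Paris":      [0.1, 0.8, 0.3],
    "France":     [0.2, 0.7, 0.4],
    "capital_of": [0.9, 0.0, 0.1],
}
print(embeddings["Paris"])
```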
  • Certain embodiments of the present disclosure provide a capability to construct an entity graph automatically from a textual knowledge source.
  • Embodiments of the present disclosure provide a system and method that can then learn entity-relationship embeddings using this graph instead of a traditional KG. Further, this automatic process supports changing data environments and domain-specific entity centric knowledge capturing for entity-based search and retrieval (with optional given context text).
  • an entity graph (or, an entity context graph) may mean a first knowledge graph or a second knowledge graph described with reference to FIGURE 12.
  • a knowledge graph generated by a conventional method may mean a third knowledge graph described with reference to the drawings below FIGURE 12.
  • context triple data may mean first triple data described with reference to FIGURE 12.
  • A mobile device or computing device is configured to receive an inquiry regarding one or more entities.
  • The mobile device or computing device can access and search an existing KG to obtain a response to the inquiry.
  • The mobile device searches KG triples in the existing KG to identify a relationship mentioned in the inquiry. If no match is found or the KG does not contain a fact responsive to the inquiry, the mobile device accesses, obtains, or creates intermediate post-processed context triples.
  • the mobile device can create one or more context triples as disclosed herein below.
  • The mobile device can extract the most prominent/suitable relation word or verb as the relation to represent a traditional KG triple by joining the subject and object entities of the context triple.
  • The KG then has additional triples extracted from context triples.
  • The mobile device or computing device is further configured to process the context triples and KG triples to extract more triples to add to the KG triples.
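A hypothetical sketch of this fallback flow (function and variable names are illustrative, not from the patent):

```python
def answer_inquiry(kg_triples, context_triples, subject, relation_word):
    """Answer an inquiry from KG triples, falling back to context triples.

    Illustrative only: if no KG triple matches, scan context triples whose
    free-text relation mentions the relation word, and promote the match
    to a traditional KG triple by joining the subject and object entities.
    """
    for s, r, o in kg_triples:
        if s == subject and r == relation_word:
            return o
    for s, rel_text, o in context_triples:
        if s == subject and relation_word in rel_text.split():
            kg_triples.append((s, relation_word, o))  # extracted KG triple
            return o
    return None

kg = [("Paris", "capital_of", "France")]
ctx = [("Marie Curie", "discovered and isolated", "radium")]
print(answer_inquiry(kg, ctx, "Marie Curie", "discovered"))  # radium
```

After the call, the KG contains the additional triple ("Marie Curie", "discovered", "radium") extracted from the context triple.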
  • Certain embodiments provide for inputs that include a structured text corpus that describes entities/concepts using documents/text. Additional or alternative inputs include a list of entities or an entity detection method.
  • Certain embodiments provide for outputs that include a graph structure connecting entities via links expressed in free text. Outputs also include a continuous vector space representation (embeddings) for entities using the created graph (for machine understanding). Certain embodiments of the present disclosure also provide a model that learns the entities and the relationship links between entities using free text. Certain embodiments include a method to retrieve similar/contextually related entities without a KG.
  • Embodiments of the present disclosure create a Dynamic Entity Context Graph (ECG) that contains context information of the relationship from entity-centric text.
  • Certain embodiments create triples of the form <entity> <relation text> <entity> to form the entity context graph.
  • Certain embodiments provide for a system and method that is able to extract the triples, namely the entities and the relation text.
  • the extracted relation in a triple is free text.
  • The system is configured to learn entity and relationship embeddings from the ECG without a semantic KG.
  • the system can dynamically encode the relationship text in ECG triples in the learning and testing processes. Therefore, certain embodiments provide a system and method that can enable related/similar entity retrieval for a given entity and a context text.
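As a stand-in for the learned relationship encoder (the hashing scheme below is an assumption for illustration, not the disclosed neural encoder), free-text relations can be encoded into fixed-size vectors at both training and test time:

```python
import hashlib
import math

DIM = 16

def encode_relation(relation_text):
    """Encode free-form relation text as a fixed-size vector.

    Stand-in for the learned relationship encoder: each word is hashed
    into one of DIM buckets, and the count vector is L2-normalized, so
    any unseen relation text can be encoded dynamically at test time.
    """
    vec = [0.0] * DIM
    for word in relation_text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# ECG triples keep the relation as free text rather than a fixed label.
ecg = [("Marie Curie", "discovered and isolated", "radium")]
v = encode_relation(ecg[0][1])
print(len(v))  # 16
```

Because the encoder operates on arbitrary text, the same function handles relation texts never seen during training, which is what enables the dynamic encoding described above.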
  • Certain embodiments provide for a graph creation and entity and relationship embedding learning process that is automated. Certain embodiments provide for a system and method that can support learning entity dynamics in changing data environments and in sub-domain areas in which no entity-centric structured knowledge is available. Since building a traditional knowledge graph is costly, certain embodiments of the present disclosure do not need a KG. Moreover, existing semi-automated systems require seed patterns (at least a relaxed ontology) to extract triples, such as the NELL system described in Mitchell, T., Cohen, W., Hruschka, E., Talukdar, P., Yang, B., Betteridge, J., Carlson, A., Dalvi, B., Gardner, M., Kisiel, B., and Krishnamurthy, J., 2018.
  • Certain embodiments provide for a system and method that learns context between two entities and can use context to associate with an entity.
  • FIGURE 1 illustrates an example network configuration 100 in accordance with this disclosure.
  • an electronic device 101 is included in the network configuration 100.
  • the electronic device 101 may include at least one of a bus 110, a processor 120, a memory 130, an input/output (I/O) interface 150, a display 160, a communication interface 170, or an event processing module 180.
  • the electronic device 101 may exclude at least one of the components or may add another component.
  • the bus 110 may include a circuit for connecting the components 120-180 with one another and transferring communications (such as control messages and/or data) between the components.
  • the processor 120 may include one or more of a central processing unit (CPU), an application processor (AP), or a communication processor (CP).
  • the processor 120 may perform control on at least one of the other components of the electronic device 101 and/or perform an operation or data processing relating to communication.
  • the processor 120 may analyze input audio data, automatically identify one or more languages used in the input audio data, and process the input audio data based on the identified language(s).
  • the memory 130 may include a volatile and/or non-volatile memory.
  • the memory 130 may store commands or data related to at least one other component of the electronic device 101.
  • the memory 130 may store software and/or a program 140.
  • the program 140 may include, for example, a kernel 141, middleware 143, an application programming interface (API) 145, and/or an application program (or Application) 147. At least a portion of the kernel 141, middleware 143, or API 145 may be denoted an operating system (OS).
  • the kernel 141 may control or manage system resources (such as the bus 110, processor 120, or memory 130) used to perform operations or functions implemented in other programs (such as the middleware 143, API 145, or application program 147).
  • the kernel 141 may provide an interface that allows the middleware 143, API 145, or application 147 to access the individual components of the electronic device 101 to control or manage the system resources.
  • the application 147 includes one or more applications for processing input data based on automated language detection as discussed below. These functions can be performed by a single application or by multiple applications that each carries out one or more of these functions.
  • the middleware 143 may function as a relay to allow the API 145 or the application 147 to communicate data with the kernel 141, for example.
  • a plurality of applications 147 may be provided.
  • the middleware 143 may control work requests received from the applications 147, such as by allocating the priority of using the system resources of the electronic device 101 (such as the bus 110, processor 120, or memory 130) to at least one of the plurality of applications 147.
  • the API 145 is an interface allowing the application 147 to control functions provided from the kernel 141 or the middleware 143.
  • the API 145 may include at least one interface or function (such as a command) for file control, window control, image processing, or text control.
  • the input/output interface 150 may serve as an interface that may, for example, transfer commands or data input from a user or other external devices to other component(s) of the electronic device 101. Further, the input/output interface 150 may output commands or data received from other component(s) of the electronic device 101 to the user or the other external device.
  • the display 160 may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a quantum light emitting diode (QLED) display, a microelectromechanical systems (MEMS) display, or an electronic paper display.
  • the display 160 can also be a depth-aware display, such as a multi-focal display.
  • the display 160 may display various contents (such as text, images, videos, icons, or symbols) to the user.
  • the display 160 may include a touchscreen and may receive, for example, a touch, gesture, proximity, or hovering input using an electronic pen or a body portion of the user.
  • the communication interface 170 may set up communication between the electronic device 101 and an external electronic device (such as a first electronic device 102, a second electronic device 104, or a server 106).
  • the communication interface 170 may be connected with a network 162 or 164 through wireless or wired communication to communicate with the external electronic device.
  • the wireless communication may use at least one of, for example, long term evolution (LTE), long term evolution-advanced (LTE-A), code division multiple access (CDMA), wideband code division multiple access (WCDMA), universal mobile telecommunication system (UMTS), wireless broadband (WiBro), or global system for mobile communication (GSM), as a cellular communication protocol.
  • the wired connection may include at least one of, for example, universal serial bus (USB), high definition multimedia interface (HDMI), recommended standard 232 (RS-232), or plain old telephone service (POTS).
  • the network 162 or 164 may include at least one communication network, such as a computer network (like a local area network (LAN) or wide area network (WAN)), the Internet, or a telephone network
  • the first external electronic device 102 or the second external electronic device 104 may be a wearable device or an electronic device 101-mountable wearable device (such as a head mounted display (HMD)).
  • the electronic device 101 may detect the mounting in the HMD and operate in a virtual reality mode.
  • the electronic device 101 may communicate with the electronic device 102 through the communication interface 170.
  • the electronic device 101 may be directly connected with the electronic device 102 to communicate with the electronic device 102 without involving a separate network.
  • the first and second external electronic devices 102 and 104 each may be a device of the same type or a different type from the electronic device 101.
  • the server 106 may include a group of one or more servers. Also, according to embodiments of this disclosure, all or some of the operations executed on the electronic device 101 may be executed on another or multiple other electronic devices (such as the electronic devices 102 and 104 or server 106). Further, according to embodiments of this disclosure, when the electronic device 101 should perform some function or service automatically or at a request, the electronic device 101, instead of executing the function or service on its own or additionally, may request another device (such as electronic devices 102 and 104 or server 106) to perform at least some functions associated therewith.
  • the other electronic device may execute the requested functions or additional functions and transfer a result of the execution to the electronic device 101.
  • the electronic device 101 may provide a requested function or service by processing the received result as it is or additionally.
  • a cloud computing, distributed computing, or client-server computing technique may be used, for example.
  • FIGURE 1 shows that the electronic device 101 includes the communication interface 170 to communicate with the external electronic device 102 or 104 or server 106 via the network(s) 162 and 164, the electronic device 101 may be independently operated without a separate communication function, according to embodiments of this disclosure. Also, note that the electronic device 102 or 104 or the server 106 could be implemented using a bus, a processor, a memory, an I/O interface, a display, a communication interface, and an event processing module (or any suitable subset thereof) in the same or similar manner as shown for the electronic device 101.
  • the server 106 may operate to drive the electronic device 101 by performing at least one of the operations (or functions) implemented on the electronic device 101.
  • the server 106 may include an event processing server module (not shown) that may support the event processing module 180 implemented in the electronic device 101.
  • the event processing server module may include at least one of the components of the event processing module 180 and perform (or instead perform) at least one of the operations (or functions) conducted by the event processing module 180.
  • the event processing module 180 may process at least part of the information obtained from other elements (such as the processor 120, memory 130, input/output interface 150, or communication interface 170) and may provide the same to the user in various manners.
  • Although the event processing module 180 is shown as a module separate from the processor 120 in FIGURE 1, at least a portion of the event processing module 180 may be included or implemented in the processor 120 or at least one other module, or the overall function of the event processing module 180 may be included or implemented in the processor 120 shown or another processor.
  • the event processing module 180 may perform operations according to embodiments of this disclosure in interoperation with at least one program 140 stored in the memory 130.
  • FIGURE 1 illustrates one example of a network configuration 100
  • the network configuration 100 could include any number of each component in any suitable arrangement.
  • computing and communication systems come in a wide variety of configurations, and FIGURE 1 does not limit the scope of this disclosure to any particular configuration.
  • FIGURE 1 illustrates one operational environment in which various features disclosed in this patent document can be used, these features could be used in any other suitable system.
  • a user of the electronic device 101 is interacting with an intelligent assistant, such as BIXBY, and based on a user action on a certain entity, such as a selection or request for information, the electronic device 101 attempts to retrieve similar entities.
  • The electronic device 101, namely through the processor 120, is configured to measure the similarity of an entity given another entity, to handle changing data environments in order to learn entity dynamics, and to retrieve an entity for a given context, enabling conversational search over entities using free text.
  • The processor 120 is configured to measure the similarity of an entity given another entity and to handle changing data environments to learn entity dynamics. Assume a scenario where entity-related data changes frequently (e.g., every day or every few days), such that keeping up with the changing data to update and maintain a knowledge graph becomes difficult but the system needs to adapt to support user needs. For example, a personal assistant needs to model a person's conversational contexts relevant to interacting entities (movies, actors, shopping products, etc.) and general knowledge of entities. Moreover, the assistant needs to support this capability across many other users, and automatically building traditional KGs and maintaining them is impractical (due to manual verification requirements and costs).
  • the disclosed system can represent the user preferences, and interactions using our textual context triples and build the entity context graph.
  • the system can periodically process this graph and learn the entity and relation representations to support contextual search and retrieval operations with higher quality as the data is represented in a graph and the relationship encoder learns the latent relationships automatically in the training step.
  • The system can handle frequently changing data environments because it does not need to generate and maintain a traditional knowledge graph and ontology.
  • The processor 120 can retrieve an entity for a given context, enabling conversational search over entities using free text.
  • The processor 120 can easily retrieve contextually similar entities in relation to the given entity and a context text that the user may describe using free text (hence inherently avoiding the vocabulary mismatch problem). Since triples are represented in the entity graph using context text, the system can encode the user-given context text (which may describe a complex relation of the given entity) using the learned relation encoder and search the entity graph using the given entity and the encoded relation text to find the most similar entities.
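A minimal sketch of this retrieval step, using word overlap (Jaccard) as a stand-in for similarity between learned relation encodings (all names and data are illustrative):

```python
def retrieve_similar(ecg, query_entity, context_text, top_k=3):
    """Rank entities linked to query_entity by how well their free-text
    relation matches the user's context text.

    Word-overlap (Jaccard) stands in for similarity between the learned
    encodings of the two relation texts.
    """
    q = set(context_text.lower().split())
    scored = []
    for subj, rel_text, obj in ecg:
        if subj != query_entity:
            continue
        r = set(rel_text.lower().split())
        score = len(q & r) / len(q | r) if q | r else 0.0
        scored.append((score, obj))
    scored.sort(reverse=True)
    return [obj for _, obj in scored[:top_k]]

ecg = [
    ("Marie Curie", "discovered and isolated", "radium"),
    ("Marie Curie", "was awarded", "Nobel Prize"),
]
print(retrieve_similar(ecg, "Marie Curie", "discovered by"))
```

In the disclosed system, the learned relation encoder would replace the word-overlap score, so a user's free-text context could match a relation even when the two texts share no vocabulary.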
  • the disclosed system and method can model entity dynamics in sub domains that are too costly and impractical to build KGs.
  • a scenario exists in which links between cross-domain entities are required.
  • cross-domain links between entities are important to support cross-domain search and recommendation applications. Obtaining a relatedness link between two products that are under two different categories (in the ontology) is difficult because the vocabularies they share are quite different. Further, it would otherwise be necessary to customize an existing ontology to reflect these links, or to build one. Certain embodiments of the present disclosure are able to avoid customizing or building a knowledge graph for this purpose.
  • Embodiments of the present disclosure can capture these latent cross-domain links between entities by processing user reviews on them and creating triples in the form '<user> <review_text> <aspect>' by mining aspects from the reviews (this can be reviews or product descriptions). Then the disclosed system and method can learn the vector space representations for aspects. Since each product (entity) has a set of aspects assigned, by measuring the similarity of aspects that belong to two cross-domain products, we can obtain relatedness links between the entities.
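The aspect-based relatedness can be illustrated with a simple set-overlap measure in place of the learned vector-space similarity; the product names and mined aspects below are hypothetical.

```python
def aspect_similarity(aspects_a, aspects_b):
    """Jaccard overlap between the aspect sets of two products; a simple
    proxy for the embedding-space similarity described above."""
    a, b = set(aspects_a), set(aspects_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical aspects mined from reviews of two cross-domain products.
laptop_bag = ["padding", "laptop fit", "strap comfort", "water resistance"]
hiking_pack = ["strap comfort", "water resistance", "capacity"]

score = aspect_similarity(laptop_bag, hiking_pack)
print(round(score, 2))  # 2 shared aspects out of 5 distinct -> 0.4
```

A relatedness link would be created between the two products when this score exceeds a chosen threshold, even though they sit under different ontology categories.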
  • Certain embodiments provide a system and method that support question answering (QA) using context and KG triples and improve coverage of KG triples. Updating KGs for new documents takes time because the process requires correctness and consistency checking and verification (most of the time, manual verification is needed if new facts and patterns are to be extracted). Hence, a question answering system that uses KG triples to find answers may not be able to support new facts appearing in new documents.
  • the disclosed context triple extraction from entity-based text documents is straightforward and can improve a question answering system, which makes use of context triples to enhance its capabilities in the following ways:
  • Context triples can be used to answer factual questions by having a matching component to align questions to textual relationships in the context triples and if a match is found, the matching context triples' object entities are the answers to the question. This matching/alignment can be done by mining or understanding the intent of the question and context triple description.
  • context triples can be used to extract traditional KG triples by extracting labeled relationships from the textual relationships of context triples and then forming KG triples (subject and object entities are the same in both context and KG triples and the extracted labeled relationship makes the KG triple).
  • supervised extractors used in the KG triple extraction process can be used with modifications (since only relation label extraction is required and the two entities of the triple are already identified).
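The relation-label extraction described in the preceding points can be sketched with a crude heuristic standing in for a trained extractor or dependency parser; the stopword list and suffix rule are illustrative assumptions, not the claimed method.

```python
import re

# Hypothetical stop list; a production system would use dependency parsing
# or a supervised extractor to find the prominent relation verb.
STOPWORDS = {"is", "an", "a", "the", "who", "as", "of", "from", "to", "and", "was"}

def extract_relation_label(context_text):
    """Pick the first non-stopword token ending in 'ed' as a crude
    stand-in for prominent-verb detection."""
    for tok in re.findall(r"[a-z]+", context_text.lower()):
        if tok not in STOPWORDS and tok.endswith("ed"):
            return tok
    return None

context = ("an American politician who served as the 44th President of the "
           "United States from January 20, 2009, to January 20, 2017")
# Subject and object entities are already known from the context triple.
kg_triple = ("Barack Obama", extract_relation_label(context), "44th President of US")
print(kg_triple)  # ('Barack Obama', 'served', '44th President of US')
```

The resulting labeled triple can then be added to the traditional KG alongside triples produced by the regular extraction pipeline.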
  • Certain embodiments improve and support KG completion tasks through better KG embeddings and enhance QA capabilities through KG completion.
  • KG coverage can be improved using context triples.
  • Better KG entity embeddings can be generated by learning from both KG and context triples.
  • KGs contain incomplete knowledge because of errors and shortcomings in the complex extraction process. That means some entities may be missing some obvious information. For example, where the 'Barack Obama' entity has the 'profession' relation value 'politician,' the 'Joe Biden' entity does not, even though it should have that information.
  • Context triples can be created in an automated fashion.
  • the entity embeddings can be used to determine whether two entities are similar to each other by comparing their embeddings. This may not be possible to do with high accuracy without context triples because incomplete KG triples may not reflect the true similarity of the entities (context triples may contain the missing information and help the entity embeddings to capture it in the embedding space). If two entities are similar, they most probably have similar relationships (i.e., edges in the KG).
  • link prediction is a technique used in KG completion tasks and we can use the context triples to learn better KG embeddings and predict missing links accordingly. Furthermore, this will enable wider coverage for KG construction and enable better QA support as discussed herein above.
  • FIGURE 2 illustrates an example triple representation according to embodiments of the present disclosure.
  • the embodiment of the triple representation 200 shown in FIGURE 2 is for illustration only. Other embodiments of triples could be used without departing from the scope of this disclosure.
  • a first triple representation 205 includes a first entity 210, which in the example is "Barack Obama", a second entity 215, which in the example is "44th President of US", and a context 220.
  • the context 220 includes multiple words, namely free text, that describe a relationship between the first entity 210 and the second entity 215.
  • the free text recites "Barack Obama is an American politician who served as the 44th President of the United States from January 20, 2009, to January 20, 2017."
  • a triple extraction using n-ary relationships 250 also is shown in the example of FIGURE 2.
  • the triple extraction using n-ary relationships 250 includes the first entity 210, which in the example is "Barack Obama", the second entity 215, which in the example is "44th President of US", and a relationship 255.
  • the relationship 255 comprises singular text, in this case "served", with other multi-relational data 260, which can further define the relationship 255.
  • the multi-relational data 260 can include "start date" and "end date".
  • Embodiments of the present disclosure enable the triple extraction to capture complex relationships in a relatively easy process in text form, such as the free text of the context 220, and learn to encode the textual relation and then learn the entity dynamics of the multi-relational data.
  • Multi-relational data means that a data point (i.e., entity) has many relations to other data points.
  • FIGURES 3A and 3B illustrate an Entity Context Graph (ECG) generator and an embedding learner according to embodiments of the present disclosure.
  • the embodiment of the ECG generator and embedding learner 300 shown in FIGURES 3A and 3B is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • One or more of the components illustrated in FIGURES 3A and 3B can be included in one or more processors of a system configured to generate the ECG.
  • processor 120 can include one or more of a context text extractor 310, an entity context graph generator 315, combined relationship encoder and entity embedding learner 320.
  • the ECG generator 300 is configured to perform triple extraction to capture complex relationships in a relatively easy process in text form.
  • the ECG generator 300 also learns to encode the textual relations.
  • the ECG generator 300 further learns the entity dynamics of the multi-relational data.
  • the ECG generator 300 uses an 'entity-relation-entity' representation model.
  • the relationships that link entities with meaning are the focus.
  • a graph representation is extracted from a given textual entity centric dataset and is operated on to compute the embeddings.
  • the ECG generator includes a content triple extraction block 301 and a learning network, namely, combined relationship encoder and entity embedding learner 320.
  • the content triple extraction block 301 is configured to build ECGs 303.
  • the content triple extraction block 301 performs textual knowledge processing, i.e., triple extraction, on one or more text strings for one or more topic entities included in one or more entity documents 304 obtained from a textual knowledge source 305.
  • the content triple extraction block 301 receives the one or more text strings from the knowledge source 305 and performs context triple extraction, entity detection, and text segmentation in a context text extractor 310.
  • the learning network is embodied as a combined relationship encoder and entity embedding learner 320 and comprises embedding learning for entities using the ECGs 303.
  • the combined relationship encoder and entity embedding learner 320 uses the ECGs 303 to obtain context text 311, topic entity 312, and a target entity 313.
  • the combined relationship encoder and entity embedding learner 320 extracts word level embeddings and dynamically encodes the relationship text in ECG 303 triples. Additionally, the combined relationship encoder and entity embedding learner 320 looks up the entities in the embedding matrix.
  • the textual knowledge source 305 includes information regarding entities and their relationships.
  • the textual knowledge source 305 can be one or more databases that describe entities.
  • the textual knowledge source 305 can be a document, also referenced as a topic page, that describes an entity with reference to another entity.
  • the textual knowledge source 305 also can be an input text string received via an input interface, such as a keyboard, touchscreen, receiver, transceiver, or the like.
  • a context text extractor 310 is configured to extract text descriptions from the textual knowledge source 305.
  • the processor 120 can extract the context 220 text from an input text string or from a document.
  • the processor 120 extracts the contexts 220 that appear between two entities, such as between the first entity 210 and the second entity 215.
  • the processor 120 extracts the first entity 210, the second entity 215, and the contexts 220.
  • the contexts 220 can include free form text comprising multiple words or text strings.
  • An entity context graph generator 315 is configured to create an entity context graph (ECG).
  • the processor 120 processes the extracted first entity 210, second entity 215, and contexts 220 to create entity context triples to form a graph representation of the relationships, namely the ECG 303.
  • the processor 120 uses the ECG generated by the entity context graph generator 315 to learn multi-relational entity embeddings using a relation encoder network 325, initialized entity representations in an entity lookup matrix 335, and representation learning network 330. Thereafter, the processor 120 generates an output 340 of the combined relationship encoder and entity embedding learner 320.
  • FIGURE 4 illustrates sub-components of the Entity Context Graph (ECG) generator and embeddings learner according to embodiments of the present disclosure.
  • the embodiment of the ECG generator and embeddings learner 300 shown in FIGURE 4 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • Functions of the sub-components of the Entity Context Graph (ECG) generator and embeddings learner are further detailed with respect to FIGURES 5, 6, 7, and 8.
  • the context text extractor 310 includes Text Pre-Processing 405, Entity Tagger 410, and Text Segmentation & Extraction 415.
  • the processor 120 includes each of the components in the context text extractor 310.
  • one or more of the components in the context text extractor 310 are performed by different processors or different systems.
  • FIGURE 5 illustrates a process for context text extraction for entities according to embodiments of the present disclosure.
  • the embodiment of the process 500 shown in FIGURE 5 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • a document 505 is received, accessed, or obtained from a textual knowledge source 305.
  • the document 505 can be a topic page about "Blue Origin, LLC" received from a textual knowledge source 305, in this case "Wikipedia".
  • the document 505 describes a primary entity 515, here "Blue Origin", using a secondary entity 520, here "Jeff Bezos".
  • the processor 120 extracts text descriptions, free form text segments 510, as relations between a primary entity 515 and a secondary entity 520.
  • the processor 120 creates context triples including the extracted text segments 510, primary entity 515, and secondary entity 520.
  • a conventional triple extraction in KGs has a fixed set of relation labels to construct triples.
  • Embodiments of the present disclosure extract and put free form text segments 510 as relationships between two entities 515, 520.
  • a text segment is extracted and, for each entity in the extracted text segment, context triples are created.
  • the processor 120 can create a context triple 550 illustrating the extracted relationship between Blue Origin and Jeff Bezos.
  • the context triple 550 includes the primary entity 515, which is "Blue Origin", the secondary entity 520, which is "Jeff Bezos", and the extracted text segment 510 as the relationship.
  • the entity context graph generator 315 includes Entity Assignment for Document 420, Entity Assignment for Text 425, and Context Triple Creation 430.
  • the processor 120 includes each of the components in the entity context graph generator 315.
  • one or more of the components in the entity context graph generator 315 are performed by different processors or different systems.
  • FIGURE 6 illustrates a process for generating an entity context graph according to embodiments of the present disclosure.
  • the embodiment of the process 600 shown in FIGURE 6 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • triple ← getTriple(primary_entity, context, secondary_entity)
  • the algorithm processes a collection of documents (topic_pages) describing facts or entities. Then, for each entity or secondary entity mentioned in a document, text contexts are extracted. Then, by following the pattern <primary_entity, context, secondary_entity>, context triples are generated to create the ECG.
  • a fixed length sliding window is used on the text in which the secondary entity appears.
  • the surrounding text area of an entity provides a latent relationship between the entity and topic (page focus) entity. That is, in block 605, a document is selected from one or more documents that describe an entity.
  • references to entities are extracted from the text document.
  • for each entity reference: 1. a fixed window length text segment is obtained as the context text in which the entity appears; 2. the primary entity, i.e., the subject of the document, is identified and/or extracted; 3. one or more entities of focus are identified and/or extracted as the secondary entities, i.e., objects; 4. a triple is created from the text segment, primary entity, and secondary entity; and 5. the triple is added to the graph. Thereafter, the process is repeated 620 for all documents that describe other topic entities. Accordingly, by following this method of creating triples, a given text can be input to represent an edge between two entities.
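The triple-creation steps above can be sketched as follows. The document text, entity list, and window size are illustrative assumptions; a real implementation would use an entity tagger rather than exact string matching.

```python
def extract_context_triples(topic_entity, text, entities, window=8):
    """Build context triples <primary_entity, context_text, secondary_entity>
    using a fixed-length word window around each secondary entity mention."""
    words = text.split()
    triples = []
    for entity in entities:
        ent_words = entity.split()
        for i in range(len(words) - len(ent_words) + 1):
            if words[i:i + len(ent_words)] == ent_words:
                start = max(0, i - window)
                end = min(len(words), i + len(ent_words) + window)
                context = " ".join(words[start:end])
                triples.append((topic_entity, context, entity))
    return triples

doc = ("Blue Origin is an American privately funded aerospace manufacturer "
       "founded in 2000 by Jeff Bezos in Kent Washington")
ecg = extract_context_triples("Blue Origin", doc, ["Jeff Bezos"], window=4)
print(ecg[0])
```

Each extracted triple becomes one edge of the entity context graph, with the windowed text serving as the free-form relationship.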
  • the combined relationship encoder and entity embedding learner 320 includes Pre-Trained Word Embedding 435, Initialize Entity Embedding 440, Learn Relationship Encoder 445; and Learn Entity Embedding 450.
  • the processor 120 includes each of the components in the combined relationship encoder and entity embedding learner 320.
  • one or more of the components in the combined relationship encoder and entity embedding learner 320 are performed by different processors or different systems.
  • FIGURE 7 illustrates a process for combined relationship encoder learning and embedding learning according to embodiments of the present disclosure.
  • the embodiment of the process 700 shown in FIGURE 7 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • a convolution network is used to encode the relationship text in context triples. Previous learning methods consider the set of relations fixed and known, but since certain representations of a relationship contain lengthy text, the relationship encoding network 325 learns to output the relationship embeddings for this text to work with the KG representation learning network 330 together with the initialized entity representations lookup matrix 335. The KG representation learning network 330 adjusts the relationship encoding network 325 weights and the initialized entity representations in the lookup matrix 335 accordingly.
  • the learning happens in three steps in the training stage, as follows.
  • Step 1: for a context triple, get the relationship, which is free text, and input it to the relationship encoding network 325.
  • Step 2: look up the entities in the embedding matrix 335.
  • Step 3: input the output of the relation encoding network 325 and the looked-up entity vectors from the entity lookup matrix 335 into the entity and relationship learning network 330. Based on the optimization function, the weights of the relationship encoding network 325 and the entity embedding vectors in the entity lookup matrix 335 are updated 705.
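The three steps can be wired together as in the following sketch. A deterministic pseudo-encoder stands in for the relation encoding network 325, random vectors stand in for the entity lookup matrix 335, and only the forward scoring pass is shown (the back-propagation of step 3 is noted in a comment); all names and dimensions are hypothetical.

```python
import random

DIM = 4
random.seed(0)

def relation_encoder(text):
    """Toy stand-in for the relation encoding network (step 1): a
    deterministic pseudo-embedding derived from the text."""
    rnd = random.Random(sum(ord(ch) for ch in text))
    return [rnd.uniform(-1, 1) for _ in range(DIM)]

# Step 2: entity embedding lookup matrix, randomly initialized.
entity_matrix = {e: [random.uniform(-1, 1) for _ in range(DIM)]
                 for e in ("Bill Gates", "MICROSOFT")}

def training_step(triple):
    head, relation_text, tail = triple
    r = relation_encoder(relation_text)              # step 1: encode free text
    h, t = entity_matrix[head], entity_matrix[tail]  # step 2: look up entities
    # Step 3: translation-based score ||h + r - t||; the optimizer would
    # back-propagate this into both the encoder weights and the matrix.
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

score = training_step(("Bill Gates", "founded the company", "MICROSOFT"))
print(score >= 0)  # True
```

In training, this score would feed the optimization function, which jointly updates the encoder weights and the entity vectors.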
  • FIGURE 8 illustrates an example for combined relationship encoder learning and embedding learning according to embodiments of the present disclosure.
  • the example of the combined relationship encoder learning and embedding learning 800 shown in FIGURE 8 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • the first triple 805 and the second triple 810 each express a person-company "founder" relationship; but if the literal text representation is taken as the relationship label, the two triples 805, 810 have two distinct relationships in terms of general KG triple representation.
  • the relation encoding network 325 learns to encode these relationship texts to be similar when trained with many triples.
  • the combined relationship encoder and entity embedding learner 320 uses these similarities in the triples to help the relation encoder 325 learn to produce similar outputs when processing the free form text in the first context 835, between Bill Gates 815 as the primary entity and MICROSOFT 825 as the secondary entity, and the second context 840, between Jeff Bezos 820 as the primary entity and AMAZON 830 as the secondary entity.
  • FIGURE 9 illustrates a relationship encoding neural network according to embodiments of the present disclosure.
  • the example of the relationship encoding neural network 900 shown in FIGURE 9 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • the relationship encoding neural network 900 can be implemented by, the same as, or similar to, the relation encoder network 325. In certain embodiments, relation encoder network 325 can be configured differently.
  • the relationship encoding neural network 900 includes a deep convolution neural network (CNN) configured to perform the relationship encoding.
  • the input text description, such as the first context 835, is input to the network.
  • For embedding the input text, let x_i ∈ R^d be the d-dimensional word vector representation corresponding to the i-th word in the relationship text.
  • a textual relationship of length m (padded when necessary) is represented as the concatenation x_{1:m} = x_1 ⊕ x_2 ⊕ ... ⊕ x_m, where ⊕ denotes the concatenation operator.
  • the relationship encoding neural network 900 mines different length features by applying different window size convolutions 910 and max pooling. Thereafter, the relation encoder network 325 combines 915 all the mined features (stack) to perform another convolution operation 920.
  • the convolution layer 910 operation performs as a filter w ∈ R^{y·d}, which applies to a window of size y words to produce a new feature.
  • a non-linear activation f is applied according to c_i = f(w · x_{i:i+y-1} + b), where b is a bias term.
  • the following feature map is obtained: h = [c_1, c_2, ..., c_{m-y+1}], where h ∈ R^{m-y+1}.
  • with padding, the input length (m) can be used as the dimension for h.
  • a max pooling operation over window size z on feature map h is performed to select maximum values corresponding to the window size z.
  • Padding can be applied for length consistency and the window (y and z) can have sliding strides other than 1.
  • Additional convolution and max pooling layers are applied 920. Thereafter, a fully connected layer outputs 925 the encoded text relationship.
  • the CNN is used to encode the textual relationships in the context triples; any encoding network can also be used, and weight adjustment of this network will be done together with the entity and relationship representation algorithm.
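The single-filter convolution and max pooling operations described above can be sketched in plain Python; a real implementation would use a deep learning framework, and the filter weights, tanh activation, and toy word vectors below are illustrative assumptions.

```python
import math

def conv_feature_map(word_vecs, w, b, y):
    """Apply a single filter w (length y*d) over windows of y word vectors,
    producing feature map h of length m - y + 1."""
    d = len(word_vecs[0])
    assert len(w) == y * d  # filter spans y concatenated d-dim vectors
    h = []
    for i in range(len(word_vecs) - y + 1):
        window = [v for vec in word_vecs[i:i + y] for v in vec]  # x_{i:i+y-1}
        c = sum(wi * xi for wi, xi in zip(w, window)) + b
        h.append(math.tanh(c))  # non-linear activation f
    return h

def max_pool(h, z):
    """Max pooling with window size z (stride z) over the feature map."""
    return [max(h[i:i + z]) for i in range(0, len(h), z)]

# Toy relationship text of m = 4 words embedded in d = 2 dimensions.
x = [[0.1, 0.2], [0.4, 0.0], [0.3, 0.3], [0.0, 0.5]]
w = [0.5, -0.5, 0.25, 0.25]   # one filter, window y = 2 -> length y*d = 4
h = conv_feature_map(x, w, b=0.0, y=2)
print(len(h), len(max_pool(h, z=2)))  # 3 2
```

With m = 4 and y = 2 the feature map has m − y + 1 = 3 entries, and pooling with z = 2 reduces it to 2, matching the dimensions given in the equations above.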
  • representation learning network 330 uses a translation-based KG embedding technique.
  • the representation learning network 330 learns to represent a triple (h, r, t) with head (h), relation (r), and tail (t) using corresponding vectors h, r, and t, under the translation objective h + r ≈ t.
  • the KG learning method optimizes the following margin-based loss, where S' is the set of corrupted triples derived from the correct set S, E is the set of entities, and γ is the margin: L = Σ_{(h,r,t)∈S} Σ_{(h',r,t')∈S'} [γ + d(h + r, t) - d(h' + r, t')]_+.
  • the normalized vector representation is used for each part of the triple, together with the mean loss.
  • the relationships (r) are dynamically encoded using the relation encoder network 325 and entity vectors are from the entity embedding look up matrix 335.
  • In Equation 6, λ is the regularization coefficient and x̂ is the normalized vector of x.
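The margin-based loss over correct and corrupted triples can be illustrated numerically; the toy 2-dimensional vectors and margin value below are hypothetical, and the distance is the Euclidean norm of h + r − t.

```python
def distance(h, r, t):
    """Translation distance ||h + r - t|| for a triple (h, r, t)."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

def margin_loss(correct, corrupted, gamma=1.0):
    """Margin-based ranking loss over correct triples S and corrupted
    triples S': mean of max(0, gamma + d_correct - d_corrupted)."""
    total = 0.0
    for (h, r, t), (h2, r2, t2) in zip(correct, corrupted):
        total += max(0.0, gamma + distance(h, r, t) - distance(h2, r2, t2))
    return total / len(correct)  # mean loss, as described above

# One correct triple where h + r == t exactly, one corrupted triple far away.
correct = [((0.0, 0.0), (1.0, 0.0), (1.0, 0.0))]
corrupted = [((0.0, 0.0), (1.0, 0.0), (5.0, 5.0))]
print(margin_loss(correct, corrupted))  # 0.0: corrupted triple is far enough
```

The loss is zero here because the corrupted triple already scores worse than the correct one by more than the margin; during training, non-zero losses drive the updates to the entity vectors and encoder weights.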
  • the relation encoder network 325 includes a long short-term memory (LSTM) neural network comprising digital circuitry configured to perform the relationship encoding using a sequence modeling approach. Additionally, in certain embodiments, the representation learning network 330 can be configured to perform different optimization methods, including optimization methods based on a graph convolution.
  • FIGURE 10 illustrates a process for using entity context graph-based embedding learning to improve KG-based embeddings according to embodiments of the present disclosure.
  • the embodiment of the process 1000 shown in FIGURE 10 is for illustration only. Other embodiments could be used without departing from the scope of the present disclosure.
  • these embodiments are able to model 'founder_of' relation between entity pairs and also the transformation of the two entities along the direction of the vector 'founder_of' in the embedding space. This is due to additional knowledge obtained from processing text documents and modelling context triples.
  • entity context graph and representation learning are used to improve pure KG-based embeddings by jointly learning where the entity space is shared between the two learning approaches, but not the relationship space. That is, a knowledge representation format system, such as based on or using ECG generator 300, can be configured to use the generated ECG 303 to improve a KG-based embedding by sharing the entity spaces for both the KG-based and ECG-based networks and processing KG triples and ECG triples.
  • the KG-based triples include a head (h), relation (r), and tail (t).
  • the KG representation learning network uses a translation-based KG embedding technique.
  • the KG representation learning network represents the KG-based triple (h, r, t) using corresponding vectors h, r, and t and outputs to the KG Entity and Relationship block 1015.
  • ECG-based triples are obtained.
  • the ECG-based triples can be in the form of an ECG 303 generated by the context text extractor 310 and the entity context graph generator 315.
  • the ECG-based triples include a head (h), free-form text relation (r), and tail (t).
  • the Context Triple Representation Learning Network, such as combined relationship encoder and embedding learner 320, obtains the ECG-based triples and only the KG entity embeddings from block 1015 and dynamically encodes the relationships as discussed herein above with respect to FIGURES 3B, 7, and 9.
  • the Context Triple Representation Learning Network outputs to the Context Triple Relationship space block 1030 (representation of context text relationships in the vector space), which is used in further iterations and convolutions in the Context Triple Representation Learning Network block 1025.
  • a CNN is used to encode the textual relationships in the context triples; any encoding network (e.g., LSTM) can also be used, and weight adjustment of this network will be done together with the entity and relationship representation algorithm.
  • FIGURE 11 illustrates a process for generating additional traditional knowledge graph triples from context triples according to embodiments of the present disclosure.
  • the embodiment of the process 1100 shown in FIGURE 11 is for illustration only. Other embodiments could be used without departing from the present disclosure.
  • a memory or database for a knowledge reference system contains KG triples and Context triples. No post-processing is performed for the Context triples.
  • the knowledge reference system searches the KG to find the answer.
  • a processor such as processor 120, searches the KG triples in the memory, such as memory 130, for a relationship mentioned in the question. If no match is found or if the KG does not contain the fact, the processor is unable to provide an answer to the question.
  • the processor determines that the memory or database contains KG triples and intermediate post-processed context triples.
  • the processor extracts a most prominent/suitable relation word/verb as the relation to represent a traditional KG triple by joining subject and object entities of the Context triple.
  • the KG has additional triples extracted from Context triples.
  • the KG is searched for triples with a matching relation identified in the question. For example, when a question is received about Barack Obama 1115, the processor can search for a triple to identify 44th president of the US 1120 as a tail, and serve 1125 as a relation.
  • the processor determines that the memory or database contains KG triples and fully post-processed Context triples.
  • the processor processes the Context text in Context triples to extract more KG triples and add to the KG. Now the KG has more additional triples extracted from Context triples.
  • the enhanced KG is searched for triples with a matching relation identified in the question. For example, when a question is received about Barack Obama 1115, the processor can search for a triple to identify 44th president of the US 1120 as a tail, and serve 1125 as a relation, as well as a start date 1135 of January 20, 2009 1140 and an end date 1145 of January 20, 2017 1150.
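The enhanced-KG lookup can be sketched as a simple triple match. The triples mirror the example above, with the n-ary details ('start date', 'end date') linked to the main relation; the exact storage layout is an illustrative assumption.

```python
# KG triples after post-processing context triples: the main relation plus
# n-ary details linked to the 'served' relationship.
kg = [
    ("Barack Obama", "served", "44th President of US"),
    ("served", "start date", "January 20, 2009"),
    ("served", "end date", "January 20, 2017"),
]

def answer(subject, relation, triples):
    """Return tails of triples whose subject and relation match the question."""
    return [t for s, r, t in triples if s == subject and r == relation]

print(answer("Barack Obama", "served", kg))  # ['44th President of US']
print(answer("served", "start date", kg))    # ['January 20, 2009']
```

Without the triples derived from context triples, the second query would return nothing; the enhanced KG makes the date details answerable.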
  • Embodiments of the present disclosure can extract traditional KG triples by processing the textual description of the context triple.
  • the context triple already has its subject and object entities identified.
  • the relationship text has more details including the prominent main relationship.
  • Embodiments of the present disclosure can extract this prominent relationship by performing dependency parsing, learning-based extraction, or text mining for focus/root term detection in the relationship text.
  • embodiments of the present disclosure can extract more KG triples within the context text description if available.
  • embodiments of the present disclosure use existing triple extraction techniques (supervised learning based or pattern based). If the extracted triples have the first triple's subject or object, links are made to those entities using subject or object entity.
  • the disclosed system and method can determine whether the main relationship that was extracted earlier (e.g., 'served' in this example) has a dependency on other extracted relationships in the text description. Note that this may require dependency parsing and further processing that requires algorithm training to identify such dependencies. Such processing may determine 'start date' and 'end date' in this example, which actually link to/depend on the 'served' relationship.
  • context triples may support extraction of this type of complex KG triple pattern because they may contain additional information within the relationship text, whereas in traditional KG triple extraction, most of the time, a single-sentence-based extraction is performed. Hence, traditional extractors may miss such knowledge.
  • the main triple representing the context triple is obtained by extracting the main relationship term (described in the first point above). Also, if the additional triples extracted do not depend on the main relationship extracted first, the newly extracted triples will link to the subject or object entities of the first extracted triple (i.e., no n-ary triple pattern).
  • FIGURE 12 is a diagram for illustrating a method of controlling an electronic device 100 according to an embodiment of the disclosure.
  • the electronic device 100 may receive input of first text data at operation S1210.
  • the first text data may be a document including a plurality of words for describing a specific entity, but is not limited thereto.
  • the first text data may be implemented as various text data such as a text input through an input/output interface from a user, a structured text corpus describing a specific entity (or a concept), etc.
  • the electronic device 100 may not only receive input of the first text data from a user, but also receive the first text data from an external server storing various kinds of text data.
  • the electronic device 100 may receive the first text data from an external server wherein a database is constructed for documents describing specific entities.
  • the electronic device 100 may extract a first relation text including a first entity text, a second entity text, and a plurality of words describing the relationship between the first entity text and the second entity text from the first text data at operation S1220.
  • the electronic device 100 may extract the specific entity as the first entity text. Also, the electronic device 100 may extract at least one entity that can describe the first entity text in the first text data as the second entity text. In this case, there may be one second entity text, but the number is not limited thereto, and there may be a plurality of second entity texts.
  • a text located around the second entity text in the first text data may have a high probability of including at least one word describing the relationship between the first entity and the second entity.
  • the electronic device 100 may extract text segments, in units of a fixed window, from the text in the area surrounding the location of the second entity text in the first text data.
  • the electronic device 100 may identify a text segment including a plurality of words describing the relationship between the first entity text and the second entity text among the extracted text segments as the first relation text.
  • the electronic device 100 may extract the first relation text by applying a sliding window of a fixed length to the area wherein the second entity text is located in the first text data.
  • the first relation text may be a text in a free form including a plurality of words describing the relationship between the first entity text and the second entity text.
  • the electronic device 100 may generate a plurality of first triple data defining the relationship between the first entity text and the second entity text based on the first entity text, the second entity text, and the first relation text at operation S1230.
  • the first triple data is triple data wherein the subject is constituted by the first entity text, the object is constituted by the second entity text, and the relationship is constituted by the first relation text.
  • the electronic device 100 may generate a first knowledge graph (or, an entity context graph (ECG)) based on the generated first triple data at operation S1240. That is, the electronic device 100 may automatically generate a first knowledge graph based on an entity text included in the input first text and a relation text in a free form describing the relationship between each entity.
  • in case there are a plurality of second entity texts, the electronic device 100 may generate first triple data defining the relationship between each of the plurality of second entity texts and the first entity text.
  • the electronic device 100 may generate a first knowledge graph that can describe the relationship between the first entity and the plurality of second entities based on the plurality of first triple data.
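Assembling the first knowledge graph (entity context graph) from the generated triples can be sketched minimally as an adjacency structure keyed by subject entity. The dictionary-of-lists representation is an assumption for illustration, not the disclosed data structure.

```python
from collections import defaultdict

def build_entity_context_graph(triples):
    """Build an entity context graph: nodes are entity texts, and each edge
    is labeled with the free-form relation text of a triple."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)
```

With this shape, the first entity's node collects one labeled edge per second entity, which matches the description of one first-triple per second entity text.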
  • a process wherein the electronic device 100 embeds components included in the generated first knowledge graph will be described with reference to FIGURE 13. Also, a process wherein the electronic device 100 updates a knowledge graph when the first text data was updated will be described with reference to FIGURE 14. In addition, an operation of the electronic device 100 when it received input of an inquiry related to the first entity text will be described with reference to FIGURE 15. Further, a process wherein the electronic device 100 updates the prestored conventional third knowledge graph based on the first knowledge graph will be described with reference to FIGURE 16.
  • FIGURE 13 is a flow chart for illustrating an operation of the electronic device 100 of embedding the first knowledge graph according to an embodiment of the disclosure.
  • the electronic device 100 may input the first relation text into a first model and obtain an encoded first relation text at operation S1310.
  • the first model may be an artificial intelligence model trained to encode the first relation text in a free form.
  • the first model may be trained to convert (embed) an input text to vector data corresponding to the input text.
  • the electronic device 100 may input each of the first entity text and the second entity text into a second model and obtain data corresponding to each of the first entity text and the second entity text in an embedding matrix at operation S1320.
  • the second model is a module identifying data corresponding to an entity text in an embedding matrix.
  • the second model may be implemented as an artificial intelligence model trained through a predefined algorithm or a software/hardware module that executes a predefined instruction.
  • An embedding matrix means a matrix including vector data obtained by performing embedding for a plurality of entity texts or characters.
  • the electronic device 100 may identify vector data corresponding to each of the first entity text and the second entity text in the embedding matrix, and obtain the identified vector data as data corresponding to each entity text.
  • the electronic device 100 may update a weight of the first model based on the encoded first relation text, and update the embedding matrix based on the data corresponding to each of the first entity text and the second entity text at operation S1330.
  • the electronic device 100 may update the weight of the first model based on the encoded first relation text on the basis of a predefined optimization function or loss function.
  • the electronic device 100 may update the embedding matrix based on the data corresponding to each of the first entity text and the second entity text on the basis of the predefined optimization function or loss function.
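The encode-then-update loop of operations S1310 to S1330 can be sketched as follows. The hash-based relation encoder stands in for the trained first model, and the TransE-style squared-distance loss is an assumed choice of optimization function; both are illustrations, not the disclosed models.

```python
import numpy as np

DIM = 8

def encode_relation(relation_text, dim=DIM):
    # Stand-in for the first model: a deterministic vector derived from the
    # free-form relation text (a real first model would be a trained encoder).
    seed = abs(hash(relation_text)) % (2 ** 32)
    return np.random.default_rng(seed).normal(size=dim)

def train_step(E, idx_subject, idx_object, relation_vec, lr=0.01):
    """One update of the embedding matrix E: push subject embedding plus
    relation vector toward the object embedding (TransE-style score)."""
    diff = E[idx_subject] + relation_vec - E[idx_object]
    loss = float(np.sum(diff ** 2))
    # Gradient of the squared distance: dL/dE_s = 2*diff, dL/dE_o = -2*diff.
    E[idx_subject] -= lr * 2 * diff
    E[idx_object] += lr * 2 * diff
    return loss
```

Repeating `train_step` over the generated triples decreases the loss, which corresponds to updating the embedding matrix on the basis of a predefined loss function.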
  • One or a plurality of processors 120 perform control to process input data according to predefined operation rules or artificial intelligence models stored in the memory 130.
  • in case the one or plurality of processors are artificial intelligence-dedicated processors, they may be designed with a hardware structure specialized for the processing of a specific artificial intelligence model.
  • the predefined operation rules or artificial intelligence models are characterized in that they are made through learning.
  • here, being made through learning means that predefined operation rules or artificial intelligence models set to perform desired characteristics (or a purpose) are made by being trained with a plurality of learning data through a learning algorithm.
  • Such learning may be performed in a device itself wherein artificial intelligence is performed according to the disclosure, or through a separate server/system.
  • as examples of learning algorithms, there are supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, but learning algorithms are not limited to the aforementioned examples.
  • An artificial intelligence model may consist of a plurality of neural network layers.
  • Each of the plurality of neural network layers has a plurality of weight values, and performs an operation of the neural network layer through an operation between the operation result of the previous layer and the plurality of weight values.
  • the plurality of weight values included by the plurality of neural network layers may be optimized by the learning result of the artificial intelligence model. For example, the plurality of weight values may be updated such that a loss value or a cost value obtained at the artificial intelligence model during a learning process is reduced or minimized.
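The layer-by-layer weight operation described above can be sketched minimally. A plain matrix product followed by a ReLU stands in for "an operation between the operation result of the previous layer and the plurality of weight values"; this is an illustration, not the disclosed model.

```python
import numpy as np

def forward(x, layers):
    """Run input x through a stack of layers; each layer holds a weight
    matrix and applies a matrix product followed by a ReLU nonlinearity."""
    out = x
    for W in layers:
        out = np.maximum(0.0, out @ W)  # operation between previous result and weights
    return out
```

Training would adjust each weight matrix so that a loss computed on `forward`'s output is reduced or minimized, as described above.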
  • FIGURE 14 is a diagram for illustrating a process wherein the electronic apparatus 100 updates a first knowledge graph according to an embodiment of the disclosure.
  • the electronic device 100 may identify whether the first text data was updated to the second text data at operation S1410.
  • the second text data means text data wherein some texts in the first text data were amended.
  • the electronic device 100 may identify that the first text data was updated to the second text data.
  • the electronic device 100 may extract a second relation text including a first entity text, a third entity text, and a plurality of words describing the relationship between the first entity text and the third entity text from the second text data.
  • the third entity text means an entity text newly included in the updated part in the first text data.
  • the second relation text means text data in a free form including a plurality of words describing the relationship between the first entity text and the third entity text in the updated part in the first text data.
  • the electronic device 100 may generate second triple data defining the relationship between the first entity text and the third entity text based on the first entity text, the third entity text, and the second relation text at operation S1430.
  • the second triple data means triple data wherein a subject is constituted with the first entity text, an object is constituted with the third entity text, and a relationship is constituted with the second relation text.
  • the electronic device 100 may update the first knowledge graph to the second knowledge graph based on the generated second triple data at operation S1440.
  • the second knowledge graph means a knowledge graph wherein the third entity text and the second relation text were added to the first knowledge graph.
  • the electronic device 100 may automatically update a knowledge graph based on the updated part.
  • there may also be a case wherein a relation text was changed in the first text data. If it is identified that the part regarding the first relation text was updated in the first text data, the electronic device 100 may replace the first relation text in the first triple data with the newly identified text. Then, the electronic device 100 may update the first knowledge graph based on the updated first triple data.
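The update behavior for a changed or newly added relation can be sketched as follows. The triple tuple shape and the replace-or-append policy are assumptions for illustration, not the disclosed update procedure.

```python
def update_graph(triples, subject, obj, new_relation):
    """Replace the relation text of an existing (subject, object) triple,
    or add a new triple when the pair is not yet in the graph."""
    updated = []
    found = False
    for s, r, o in triples:
        if s == subject and o == obj:
            updated.append((s, new_relation, o))  # changed relation text
            found = True
        else:
            updated.append((s, r, o))
    if not found:
        updated.append((subject, new_relation, obj))  # newly added entity/relation
    return updated
```

Applied to the scenarios above: a changed relation text replaces the relation slot of the existing first triple, while a third entity text newly appearing in the updated text yields an additional triple.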
  • FIGURE 15 is a diagram for illustrating a process wherein the electronic device 100 provides a response to an input inquiry based on a knowledge graph according to an embodiment of the disclosure.
  • the electronic device 100 may receive input of an inquiry related to the first entity text at operation S1510.
  • the electronic device 100 may receive input of an inquiry related to the first entity text from a user.
  • the inquiry may be implemented as an inquiry in a voice form or an inquiry in a text form.
  • the electronic device 100 may identify whether information about a relationship matched to the inquiry is included in a prestored third knowledge graph at operation S1520.
  • the electronic device 100 may identify whether information about a relationship matched to the inquiry is included among a plurality of triple data included in the third knowledge graph.
  • the electronic device 100 may search for triple data including the first entity text and a specific relationship connected to the first entity text in the third knowledge graph. For example, in case an inquiry inquiring about the age of 'politician A' is input, the electronic device 100 may search for triple data including the first entity text 'politician A' in the third knowledge graph, and identify whether a relationship of 'age' is connected to the first entity text in the searched triple data.
  • the third knowledge graph means a knowledge graph that is not the first knowledge graph (or an entity context graph) automatically generated by the electronic device 100 according to the disclosure, but a knowledge graph generated by a conventional method and stored in the memory 130 in advance.
  • the electronic device 100 may receive the third knowledge graph from an external server that generated the graph, and store it in the memory 130.
  • the electronic device 100 may provide a response to the inquiry by using the information about a relationship matched to the inquiry in the third knowledge graph at operation S1530.
  • the electronic device 100 may provide a response to the inquiry by using at least one of the first triple data included in the first knowledge graph at operation S1540.
  • the electronic device 100 may search triple data including the first entity text among the first triple data, and search information connected with the first entity text in a specific relationship among the searched triple data.
  • the electronic device 100 may provide a response to the inquiry by using the ultimately searched triple data.
  • the electronic device 100 may add the information about a relationship matched to the inquiry among the first triple data to the third knowledge graph and thereby update the third knowledge graph at operation S1550. That is, the electronic device 100 may update the third knowledge graph by adding the information about a relationship matched to the inquiry to the third knowledge graph.
  • the electronic device 100 may provide a response to the inquiry by using the triple data including the first entity text which is 'a politician A' and information connected with the first entity text by a relationship which is 'age' among the triple data included in the first knowledge graph.
  • the electronic device 100 may update the third knowledge graph by adding the first entity text which is 'a politician A' and information connected with the first entity text by a relationship which is 'age' to the third knowledge graph.
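The lookup-with-fallback flow of operations S1520 to S1550 can be sketched as follows. The substring match against the free-form relation text and the list-of-tuples graph representation are assumptions for illustration only.

```python
def answer_query(entity, relation, third_kg, first_kg):
    """Answer an inquiry from the prestored third knowledge graph first;
    on a miss, fall back to the first knowledge graph (entity context
    graph) and copy the matched triple into the third knowledge graph."""
    for s, r, o in third_kg:
        if s == entity and r == relation:  # S1520/S1530: match in third graph
            return o
    for s, r, o in first_kg:
        if s == entity and relation in r:  # S1540: free-form relation text match
            third_kg.append((s, r, o))     # S1550: update the third graph
            return o
    return None
```

So an 'age' inquiry about 'politician A' that misses the third knowledge graph is answered from the entity context graph, and that triple is added to the third knowledge graph for future inquiries.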
  • FIGURE 16 is a flow chart for illustrating a process wherein the electronic device 100 updates the third knowledge graph based on the generated first knowledge graph according to an embodiment of the disclosure.
  • the electronic device 100 may identify a fourth entity text related to the first entity text in the first knowledge graph by using data corresponding to the first entity text at operation S1610. For example, the electronic device 100 may store vector data wherein each of a plurality of entity texts included in the first knowledge graph is embedded. The electronic device 100 may then identify, among the plurality of entities, the fourth entity text whose vector data has a similarity to the vector data corresponding to the first entity text exceeding a threshold.
  • the electronic device 100 may identify information about a relationship connected to the fourth entity text in the first knowledge graph at operation S1620. Specifically, the electronic device 100 may search triple data including the fourth entity text as a subject or an object among the plurality of triple data included in the first knowledge graph. Then, the electronic device 100 may identify a relationship connected with the fourth entity text and another entity text connected with the fourth entity text by a specific relationship by using the searched triple data.
  • the electronic device 100 may identify whether information about the identified relationship is included in the third knowledge graph at operation S1630. That is, the electronic device 100 may identify whether a relationship connected with the fourth entity text and another entity text connected with the fourth entity text by a specific relationship are included in the third knowledge graph.
  • the electronic device 100 may repeat the operation S1610.
  • the electronic device 100 may identify an entity text connected with that other entity text in the first knowledge graph by using data corresponding to that other entity text.
  • the electronic device 100 may update the third knowledge graph by adding the information about the identified relationship to the third knowledge graph at operation S1640.
  • the electronic device 100 may update the third knowledge graph by adding a relationship connected to the fourth entity text and another entity text connected with the fourth entity text by a specific relationship to the third knowledge graph.
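The similarity step of operation S1610 can be sketched with a cosine-similarity threshold over stored entity embeddings. The embedding dictionary and the threshold value are assumptions for illustration, not the disclosed representation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def find_related_entities(entity, embeddings, threshold=0.8):
    """Return entity texts whose embedding similarity to `entity` exceeds
    the threshold (candidate fourth entity texts of operation S1610)."""
    target = embeddings[entity]
    return [e for e, v in embeddings.items()
            if e != entity and cosine(target, v) > threshold]
```

The triples connected to each returned entity would then be checked against the third knowledge graph (S1620 to S1630) and added when missing (S1640).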
  • the electronic device 100 and an external device may include, for example, at least one of a smartphone, a tablet PC, a desktop PC, a laptop PC, a netbook computer, a server, a PDA, a medical device, or a wearable device.
  • the electronic device may include, for example, at least one of a television, a refrigerator, an air conditioner, an air purifier, a set top box, or a media box (e.g.: Samsung HomeSyncTM, Apple TVTM, or Google TVTM).
  • the user equipment can include any number of each component in any suitable arrangement.
  • the figures do not limit the scope of this disclosure to any particular configuration(s).
  • figures illustrate operational environments in which various user equipment features disclosed in this patent document can be used, these features can be used in any other suitable system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a method, an electronic device, and a computer-readable medium for entity-relation embeddings using automatically generated entity graphs instead of a conventional knowledge graph. A method includes receiving, by a processor, an input text. The method also includes identifying a primary entity, a secondary entity, and a context from the input text, the context including a relationship between the primary entity and the secondary entity. The method further includes generating, by the processor, an entity context graph based on the primary entity, the secondary entity, and the context by: extracting, from the context, one or more text segments including a plurality of words describing one or more additional relationships between the primary entity and the secondary entity; and generating a plurality of context triples from the one or more text segments, each of the plurality of context triples defining a respective relationship between the primary entity and the secondary entity.
PCT/KR2021/000579 2020-02-03 2021-01-15 Système et procédé pour la compréhension et l'extraction efficaces d'une entité multi-relationnelle WO2021157897A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202062969515P 2020-02-03 2020-02-03
US62/969,515 2020-02-03
US16/900,664 US11687570B2 (en) 2020-02-03 2020-06-12 System and method for efficient multi-relational entity understanding and retrieval
US16/900,664 2020-06-12
KR10-2020-0106571 2020-08-24
KR1020200106571A KR20210098820A (ko) 2020-02-03 2020-08-24 전자 장치, 전자 장치의 제어 방법 및 판독 가능한 기록 매체

Publications (1)

Publication Number Publication Date
WO2021157897A1 true WO2021157897A1 (fr) 2021-08-12

Family

ID=77200703

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/000579 WO2021157897A1 (fr) 2020-02-03 2021-01-15 Système et procédé pour la compréhension et l'extraction efficaces d'une entité multi-relationnelle

Country Status (1)

Country Link
WO (1) WO2021157897A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116049326A (zh) * 2022-12-22 2023-05-02 广州奥咨达医疗器械技术股份有限公司 医疗器械知识库构建方法、电子设备及存储介质
CN116186295A (zh) * 2023-04-28 2023-05-30 湖南工商大学 基于注意力的知识图谱链接预测方法、装置、设备及介质
CN116610820A (zh) * 2023-07-21 2023-08-18 智慧眼科技股份有限公司 一种知识图谱实体对齐方法、装置、设备及存储介质
CN117610562A (zh) * 2024-01-23 2024-02-27 中国科学技术大学 一种结合组合范畴语法和多任务学习的关系抽取方法
CN117633328A (zh) * 2024-01-25 2024-03-01 武汉博特智能科技有限公司 基于数据挖掘的新媒体内容监测方法及系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030236764A1 (en) * 2002-06-19 2003-12-25 Lev Shur Data architecture to support shared data resources among applications
KR20150132860A (ko) * 2013-03-15 2015-11-26 로버트 해드독 지식으로의 원-스텝 액세스를 제공하는 적응적 사용자 인터페이스를 갖춘 지능형 인터넷 시스템
US20160012110A1 (en) * 2014-07-08 2016-01-14 International Business Machines Corporation General and automatic approach to incrementally computing sliding window aggregates in streaming applications
US20190287006A1 (en) * 2018-03-16 2019-09-19 Accenture Global Solutions Limited Integrated monitoring and communications system using knowledge graph based explanatory equipment management
US20190312869A1 (en) * 2018-04-05 2019-10-10 Accenture Global Solutions Limited Data security and protection system using distributed ledgers to store validated data in a knowledge graph

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030236764A1 (en) * 2002-06-19 2003-12-25 Lev Shur Data architecture to support shared data resources among applications
KR20150132860A (ko) * 2013-03-15 2015-11-26 로버트 해드독 지식으로의 원-스텝 액세스를 제공하는 적응적 사용자 인터페이스를 갖춘 지능형 인터넷 시스템
US20160012110A1 (en) * 2014-07-08 2016-01-14 International Business Machines Corporation General and automatic approach to incrementally computing sliding window aggregates in streaming applications
US20190287006A1 (en) * 2018-03-16 2019-09-19 Accenture Global Solutions Limited Integrated monitoring and communications system using knowledge graph based explanatory equipment management
US20190312869A1 (en) * 2018-04-05 2019-10-10 Accenture Global Solutions Limited Data security and protection system using distributed ledgers to store validated data in a knowledge graph

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MESGAR MOHSEN, STRUBE MICHAEL: "A Neural Local Coherence Model for Text Quality Assessment Heidelberg Institute for Theoretical Studies (HITS) and Research Training Group AIPHES", PROCEEDINGS OF THE 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 4 November 2018 (2018-11-04), pages 4328 - 4339, XP055833664, Retrieved from the Internet <URL:https://aclanthology.org/D18-1464.pdf> *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116049326A (zh) * 2022-12-22 2023-05-02 广州奥咨达医疗器械技术股份有限公司 医疗器械知识库构建方法、电子设备及存储介质
CN116049326B (zh) * 2022-12-22 2024-03-08 广州奥咨达医疗器械技术股份有限公司 医疗器械知识库构建方法、电子设备及存储介质
CN116186295A (zh) * 2023-04-28 2023-05-30 湖南工商大学 基于注意力的知识图谱链接预测方法、装置、设备及介质
CN116610820A (zh) * 2023-07-21 2023-08-18 智慧眼科技股份有限公司 一种知识图谱实体对齐方法、装置、设备及存储介质
CN116610820B (zh) * 2023-07-21 2023-10-20 智慧眼科技股份有限公司 一种知识图谱实体对齐方法、装置、设备及存储介质
CN117610562A (zh) * 2024-01-23 2024-02-27 中国科学技术大学 一种结合组合范畴语法和多任务学习的关系抽取方法
CN117633328A (zh) * 2024-01-25 2024-03-01 武汉博特智能科技有限公司 基于数据挖掘的新媒体内容监测方法及系统
CN117633328B (zh) * 2024-01-25 2024-04-12 武汉博特智能科技有限公司 基于数据挖掘的新媒体内容监测方法及系统

Similar Documents

Publication Publication Date Title
WO2021157897A1 (fr) Système et procédé pour la compréhension et l&#39;extraction efficaces d&#39;une entité multi-relationnelle
US11704492B2 (en) Method, electronic device, and storage medium for entity linking by determining a linking probability based on splicing of embedding vectors of a target and a reference text
US11687570B2 (en) System and method for efficient multi-relational entity understanding and retrieval
WO2019225837A1 (fr) Procédé d&#39;apprentissage de vocabulaire personnalisé inter-domaines et dispositif électronique associé
US11593364B2 (en) Systems and methods for question-and-answer searching using a cache
WO2021132927A1 (fr) Dispositif informatique et procédé de classification de catégorie de données
WO2020111647A1 (fr) Apprentissage continu basé sur des tâches multiples
EP3811234A1 (fr) Dispositif électronique et procédé de commande du dispositif électronique
CN111737559B (zh) 资源排序方法、训练排序模型的方法及对应装置
WO2017209571A1 (fr) Procédé et dispositif électronique de prédiction de réponse
WO2015050321A1 (fr) Appareil pour générer un corpus d&#39;alignement basé sur un alignement d&#39;auto-apprentissage, procédé associé, appareil pour analyser un morphème d&#39;expression destructrice par utilisation d&#39;un corpus d&#39;alignement et procédé d&#39;analyse de morphème associé
WO2020190103A1 (fr) Procédé et système de fourniture d&#39;objets multimodaux personnalisés en temps réel
US20210049203A1 (en) Methods and systems for depth-aware image searching
WO2021080175A1 (fr) Procédé de traitement de contenu
WO2017115994A1 (fr) Procédé et dispositif destinés à fournir des notes au moyen d&#39;un calcul de corrélation à base d&#39;intelligence artificielle
WO2019107674A1 (fr) Appareil informatique et procédé d&#39;entrée d&#39;informations de l&#39;appareil informatique
KR20210145811A (ko) 지리적 위치를 검색하는 방법, 장치, 기기 및 컴퓨터 기록 매체
WO2021246642A1 (fr) Procédé de recommandation de police de caractères et dispositif destiné à le mettre en œuvre
WO2022030670A1 (fr) Système et procédé d&#39;apprentissage profond par cadre utilisant une requête
KR20210098820A (ko) 전자 장치, 전자 장치의 제어 방법 및 판독 가능한 기록 매체
WO2020141706A1 (fr) Procédé et appareil pour générer des phrases en langage naturel annotées
WO2022250354A1 (fr) Système de récupération d&#39;informations et procédé de récupération d&#39;informations
WO2022060066A1 (fr) Dispositif électronique, système et procédé de recherche de contenu
WO2020246862A1 (fr) Procédé et appareil d&#39;interaction avec un système de réponse intelligent
WO2024029939A1 (fr) Procédé permettant de construire une base de données d&#39;esg contenant des données de guide esg structurées à l&#39;aide d&#39;un outil auxiliaire de guide esg, et système de fourniture de service de guide esg pour sa mise en œuvre

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21749982

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21749982

Country of ref document: EP

Kind code of ref document: A1