WO2021098491A1 - Method, apparatus and terminal for generating a knowledge graph, and storage medium - Google Patents

Method, apparatus and terminal for generating a knowledge graph, and storage medium

Info

Publication number
WO2021098491A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
translated
name
target
relationship
Prior art date
Application number
PCT/CN2020/125592
Other languages
English (en)
Chinese (zh)
Inventor
陈开济
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2021098491A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00: Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03: Data mining
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, terminal, and storage medium for generating a knowledge graph based on artificial intelligence (AI).
  • AI artificial intelligence
  • A knowledge graph, also known as a semantic network, uses visualization technology to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the interconnections between knowledge items.
  • The knowledge graph is used as a carrier to gather diverse knowledge resources and provide knowledge references for artificial intelligence decision-making. Therefore, the depth and accuracy of each knowledge resource in the knowledge graph directly affect the accuracy of artificial intelligence processing results.
  • Existing knowledge graph generation methods are mainly built on a single language. The knowledge graphs of different languages are independent of each other, which reduces the depth of the knowledge graph. When another language is used as the input of artificial intelligence, the accuracy of the processing results drops sharply, which affects the quality of service response.
  • The embodiments of the application provide a method, device, terminal and storage medium for generating a knowledge graph, which can solve the problem of the existing knowledge graph generation technology.
  • When processing different vehicle service requests, they are all handled by the same server, which easily leads to processing logic conflicts, increases the service response time and reduces the success rate of service responses.
  • an embodiment of the present application provides a method for generating a knowledge graph, including:
  • The number of occurrences of each co-occurring entity associated with the alias name is counted, high-frequency co-occurring entities are selected based on the number of occurrences, and an artificial-intelligence-based natural language generation (NLG) algorithm combines the alias name with each high-frequency co-occurring entity to obtain the source language sentence.
  • NLG Natural Language Generation
  • the determining the translated name of each alias name of the target entity in the target language, and generating the translation relationship of the target entity according to the alias name and the translated name include:
  • the separately obtaining source language sentences containing each of the alias names includes:
  • one sentence template may be configured for each alias name based on a random allocation algorithm, thereby generating multiple source language sentences.
  • the extracting the translated names of the alias names in the target language from each of the target language sentences respectively includes:
  • if the target language sentence contains the phrase corresponding to the target entity, identifying the target language sentence as a valid sentence
  • the separately generating the co-occurrence relationship of each of the alias names in the target entity through a preset corpus includes:
  • the co-occurrence relationship between the alias name and the associated entity is obtained.
  • the method for generating the knowledge graph further includes:
  • the calculating the matching degree between the sentence to be translated and the translated name according to the entity relationship and the co-occurrence relationship of the translated name includes:
  • the matching degree calculation function is specifically:
  • $\mathrm{sim}_{\mathrm{entity}}(e_i, e_j) = \sum_{p \in \mathrm{Prop}(e_i) \cap \mathrm{Prop}(e_j)} \omega_p \cdot \mathrm{Similarity}_{\mathrm{type}(p)}(e_i[p], e_j[p])$
  • $\mathrm{Sim}(E_1, E_2)$ is the degree of matching between the entity to be translated and the translated name
  • $\mathrm{Context}(E_1)$ is the set of associated entities included in the co-occurrence relationship of the entity to be translated $E_1$ in the knowledge graph
  • $\mathrm{Context}(E_2)$ is the set of associated entities included in the co-occurrence relationship of the translated name $E_2$
  • $e_i$ is the i-th associated entity in the co-occurrence relationship of the entity to be translated $E_1$
  • $e_j$ is the j-th associated entity in the co-occurrence relationship of the translated name $E_2$
  • $\mathrm{Prop}(e_i)$ is the entity type of the i-th associated entity in the co-occurrence relationship of the entity to be translated $E_1$
  • $\mathrm{Prop}(e_j)$ is the entity type of the j-th associated entity in the co-occurrence relationship of the translated name $E_2$
  • $\omega_p$ is the weight value corresponding to the entity type
  • $\mathrm{Similarity}_{\mathrm{type}(p)}(e_i[p], e_j[p])$ is the matching degree function corresponding to the entity type
  • $e_i[p]$ is the parameter value of the
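A minimal sketch of the per-entity matching term may help make the weighted sum concrete. It assumes that each associated entity is represented as a dictionary of properties, that the sum runs over the properties shared by both entities, and that each property has its own similarity function; the names `sim_entity`, `exact_match`, `weights` and `similarity_by_prop` are illustrative and do not come from the application. How the per-entity scores are aggregated into Sim(E1, E2) is not shown here.

```python
from typing import Any, Callable, Dict

Entity = Dict[str, Any]  # hypothetical representation: property name -> value


def exact_match(a: Any, b: Any) -> float:
    """Hypothetical default similarity: 1.0 when the values are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def sim_entity(
    ei: Entity,
    ej: Entity,
    weights: Dict[str, float],
    similarity_by_prop: Dict[str, Callable[[Any, Any], float]],
) -> float:
    """Weighted sum of per-property similarities over the shared properties."""
    shared = ei.keys() & ej.keys()  # properties present in both entities
    return sum(
        weights.get(p, 0.0) * similarity_by_prop.get(p, exact_match)(ei[p], ej[p])
        for p in shared
    )


# Usage with made-up property values and weights.
e_i = {"type": "stadium", "city": "Beijing"}
e_j = {"type": "stadium", "city": "Shanghai", "year": 2008}
score = sim_entity(e_i, e_j, {"type": 0.5, "city": 0.5}, {})
```

Properties that only one entity carries (here `year`) do not contribute, which matches summing over the intersection of the two property sets.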
  • the method for generating the knowledge graph further includes:
  • an apparatus for generating a knowledge graph including:
  • the translation relationship establishment unit is used to establish the translation relationship of multiple alias names of the target entity based on the target language
  • the co-occurrence relationship generation unit is configured to generate the co-occurrence relationship of each of the alias names in the target entity through a preset corpus;
  • the knowledge graph construction unit is configured to construct a knowledge graph according to the translation relationship and the co-occurrence relationship corresponding to all the target entities.
  • Embodiments of the present application provide a terminal device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for generating the knowledge graph in any one of the above-mentioned first aspects.
  • An embodiment of the present application provides a computer-readable storage medium that stores a computer program, wherein, when the computer program is executed by a processor, the method for generating the knowledge graph in any one of the above-mentioned first aspects is implemented.
  • the embodiments of the present application provide a computer program product, which when the computer program product runs on a terminal device, causes the terminal device to execute the method for generating the knowledge graph in any one of the above-mentioned first aspects.
  • The embodiments of the application obtain the translated name of each alias name of the target entity in other languages, where the target entity can be identified as a knowledge node, and generate the translation relationship of the target entity with respect to the target language according to the correspondence between each alias name and its translated names. The co-occurrence relationship of each alias name of the target entity is established through the corpus, in order to mine the relationships between each alias name of the target entity and other entities and to expand the depth of association of each knowledge node in the knowledge graph. According to the translation relationships and co-occurrence relationships of all target entities, a knowledge graph that supports multiple languages is constructed.
  • The embodiments of this application can establish a translation relationship for each knowledge node in the knowledge graph, that is, for each target entity, to connect knowledge nodes across different languages, and can expand the depth of knowledge of each knowledge node by constructing co-occurrence relationships. Because a knowledge node is not limited to the attributes of the target entity itself, the associative ability of each knowledge node and the breadth and depth of the knowledge graph are improved, thereby improving the accuracy of artificial intelligence output results and the quality of service response.
  • FIG. 1 is an implementation flowchart of a method for generating a knowledge graph provided by the first embodiment of the present application.
  • FIG. 2 is an entity diagram of the translation relationship of the target entity provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a co-occurrence relationship provided by an embodiment of the present application.
  • FIG. 5 is a structural block diagram of a neural machine translation model provided by an embodiment of the present application.
  • FIG. 6 is a specific implementation flow chart of a method S1011 for generating a knowledge graph provided by the third embodiment of the present application.
  • FIG. 7 is a specific implementation flowchart of a method S1013 for generating a knowledge graph provided by the fourth embodiment of the present application.
  • FIG. 8 is a specific implementation flowchart of a method S102 for generating a knowledge graph provided by the fifth embodiment of the present application.
  • FIG. 9 is a specific implementation flowchart of a method for generating a knowledge graph provided by the sixth embodiment of the present application.
  • FIG. 10 is a flowchart of translation based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a translation system based on a knowledge graph provided by an embodiment of the present application.
  • FIG. 12 is a corresponding interaction flowchart of each unit in the apparatus for generating a knowledge graph provided by an embodiment of the present application when responding to a translation operation;
  • FIG. 13 is a specific implementation flowchart of a method for generating a knowledge graph provided by a seventh embodiment of the present application.
  • FIG. 14 is a structural block diagram of a device for generating a knowledge graph provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a terminal device provided by another embodiment of the present application.
  • The term "if" can be construed, depending on the context, as "when", "once", "in response to determination" or "in response to detecting".
  • Similarly, the phrase "if determined" or "if detected [described condition or event]" can be interpreted, depending on the context, as "once determined", "in response to determination", "once detected [described condition or event]" or "in response to detection of [described condition or event]".
  • The method for generating the knowledge graph can be applied to mobile phones, tablet computers, wearable devices, in-vehicle devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA) and other terminal devices, and can also be applied to databases, servers, and service response systems based on terminal artificial intelligence. The embodiments of this application place no restrictions on the specific type of terminal device.
  • the execution subject of the process is the generating device of the knowledge graph.
  • The device for generating a knowledge graph can specifically be a database server that receives knowledge resources input by users or obtained from other databases, and generates a knowledge graph based on all the received knowledge data in order to support the related logic operations of terminal artificial intelligence.
  • Fig. 1 shows an implementation flow chart of the method for generating a knowledge graph provided by the first embodiment of the present application, and the details are as follows:
  • the translated name of each alias name of the target entity in the target language is determined, and the translation relationship of the target entity is generated according to the alias name and the translated name.
  • An entity may also be referred to as an object.
  • An entity can specifically be an objectively existing object, a concept, or a virtual object that can be interacted with and operated on.
  • For example, computers, mobile phones, servers, etc. are objectively existing objects.
  • Virtual objects that exist in the field of electronic information, such as databases, middleware, and software programs, can also be entities.
  • Different entities may have multiple alias names according to different usage scenarios, and the above alias names are used to indicate the same entity object. For example, for the entity "orange”, there are other alias names used to indicate the same entity, such as "citrus" and "orange”, that is, there are three alias names for the entity "orange” mentioned above.
  • the generating device can obtain the alias name corresponding to each entity through user input, database download, corpus-based intelligent learning, etc.
  • A corresponding name list can be established for each entity, and the name list stores the alias names of the target entity.
  • All the alias names in the name list are alias names in the same language.
  • The above examples of "citrus", "mandarin" and "mandarin orange" are alias names in Chinese. For the entity "Orange" in English, there can be three different terms, "orange", "tangerine" and "citrus", and a name list is constructed from these three alias names.
  • The generating device can set a certain language as the source language and obtain a name list of each entity based on the source language; the name list contains all the alias names of the entity in the source language.
  • When the device for generating the knowledge graph establishes the translation relationship, another language different from the source language can be selected as the target language, and the translated name corresponding to each alias name in the target language can be determined.
  • the method for obtaining the translated name of the alias name may be to determine the translated name associated with the alias name through a preset translation algorithm between the source language and the target language.
  • The device for generating the knowledge graph can obtain multiple reference texts containing the alias name, obtain the target-language translation of each reference text, and locate in each translated text the phrase corresponding to the alias name. Each such phrase is identified as a candidate translated name of the alias name. The device counts the number of occurrences of each candidate translated name across all translated texts and identifies the translated name corresponding to the alias name according to those counts: for example, the candidate translated names whose occurrence probability is greater than a preset probability threshold are selected as translated names of the alias name, or the candidate translated name with the highest occurrence probability is selected as the translated name corresponding to the alias name.
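The two selection rules just described (probability threshold, or most frequent candidate) can be sketched as follows. This is an illustration only: it assumes the candidate phrases have already been located in the translated texts, and the function name `select_translated_names` and its `min_probability` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Optional


def select_translated_names(
    candidates: Iterable[str], min_probability: Optional[float] = None
) -> List[str]:
    """Pick translated names from candidate phrases found in translated texts.

    With min_probability set, every candidate whose relative frequency exceeds
    the threshold is kept; otherwise only the most frequent candidate is kept.
    """
    counts = Counter(candidates)
    total = sum(counts.values())
    if total == 0:
        return []
    if min_probability is None:
        return [counts.most_common(1)[0][0]]
    return [name for name, n in counts.most_common() if n / total > min_probability]


# Usage with made-up candidate phrases, one per translated reference text.
found = ["orange", "orange", "tangerine", "orange", "citrus"]
best = select_translated_names(found)
above_threshold = select_translated_names(found, min_probability=0.15)
```

The threshold variant can return several translated names for one alias name, which is consistent with an alias name having more than one translated name in the target language.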
  • an alias name based on the source language can have multiple translated names in the target language.
  • Likewise, when different alias names are mapped to the target language, they can correspond to the same translated name.
  • the generating device may use the alias name as the node, establish a mapping relationship between each alias name and the associated translated name, and construct the translation relationship of the target entity by all the mapping relationships established above.
  • When each node in the knowledge graph combines the alias names of all languages into the same node, the mapping relationship between the different alias names cannot be determined, which reduces the accuracy of the output results in scenarios such as translation or semantic analysis.
  • this application can establish an independent knowledge node for each alias name, and record its corresponding translated name in the knowledge node, thereby constructing a mapping relationship between the translated name and the alias name.
  • FIG. 2 shows an entity diagram of a translation relationship of a target entity provided by an embodiment of the present application.
  • The entity "Orange" has three different alias names in Chinese, namely "Orange", "Orange" and "Citrus".
  • "Orange" and "Orange" are both translated as "orange".
  • "Citrus" has two translated names, namely "tangerine" and "citrus".
  • the mapping relationship between Chinese and English of each alias name can be established, so that all the mapping relationships are aggregated to obtain the translation relationship corresponding to the target entity.
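One way to hold such a translation relationship is a mapping from each alias name to its set of translated names. The sketch below assumes that representation; the alias labels `alias_a`, `alias_b` and `alias_c` are placeholders rather than the source-language names, and they only mirror the shape of the example (two alias names sharing one translated name, one alias name with two translated names).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_translation_relationship(
    pairs: Iterable[Tuple[str, str]]
) -> Dict[str, Set[str]]:
    """Aggregate (alias name, translated name) mappings for one target entity."""
    relation: Dict[str, Set[str]] = defaultdict(set)
    for alias, translated in pairs:
        relation[alias].add(translated)
    return dict(relation)


def aliases_for(relation: Dict[str, Set[str]], translated: str) -> Set[str]:
    """Reverse lookup: which alias names map to a given translated name."""
    return {alias for alias, names in relation.items() if translated in names}


# Placeholder alias labels; translated names taken from the example above.
relation = build_translation_relationship([
    ("alias_a", "orange"),
    ("alias_b", "orange"),
    ("alias_c", "tangerine"),
    ("alias_c", "citrus"),
])
```

Keeping one key per alias name, rather than one merged node per entity, is what allows the many-to-many mapping to be queried in both directions.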
  • the corpus can be stored in the knowledge graph generation device.
  • The generation device can obtain the text data pre-stored in the corpus by local calling and generate the co-occurrence relationship from the text data. The corpus can also be stored on a corpus server.
  • In that case, the knowledge graph generation device can establish a communication connection with the corpus server, generate a data query instruction about the target entity, and send the data query instruction to the corpus server. After receiving the data query instruction, the corpus server can extract all text data containing the target entity and feed it back to the knowledge graph generating device.
  • The corpus server can extract from the text data the sentences or paragraphs containing the target entity and feed only those back to the generating device, without sending paragraphs or sentences that do not contain the target entity, thereby improving the accuracy of the subsequent co-occurrence relationship establishment operations.
  • The device for generating the knowledge graph obtains training sentences containing the target entity from the corpus, identifies the associated entities contained in each training sentence through an entity labeling algorithm, and, according to the alias name under which the target entity appears in the current training sentence, creates an association relationship between that alias name and each associated entity, thereby generating the co-occurrence relationship of the alias name.
  • the training sentence extracted from the corpus can be a sentence containing the target entity appearing under each alias name.
  • each training sentence can be divided into different sentence groups according to different alias names, and the alias names for the target entities in the same sentence group are consistent, and then the co-occurrence relationship corresponding to the alias names can be determined through the sentence group.
  • FIG. 3 shows a schematic diagram of a co-occurrence relationship provided by an embodiment of the present application.
  • a certain target entity is "National Stadium", and the target entity has two alias names, namely "National Stadium” and "Bird's Nest".
  • A training sentence stored in the corpus is "The Bird's Nest is located opposite the Water Cube and is the stadium of the 2008 Beijing Olympics".
  • the entities other than “Bird’s Nest” in the training sentence can be identified as “Water Cube”, “Gymnasium”, “Beijing” and “Olympics”.
  • When the device for generating the knowledge graph establishes the co-occurrence relationship, it constructs the co-occurrence relationship per alias name, that is, it distinguishes the co-occurrence relationships of different alias names.
  • The co-occurrence relationship can be used to determine the common usage scenarios of each alias name and the entity objects related to it. Besides improving the accuracy of translation operations, this has high application value in fields such as information recommendation and word association.
  • Mining the associated entities of each alias name in this way increases the depth of the knowledge graph.
  • An associated entity may occur multiple times. When establishing the co-occurrence relationship between the target entity and each associated entity, the device for generating the knowledge graph can count the number of sentences in which each associated entity appears together with the target entity, that is, the number of co-occurrences, and configure a corresponding association weight for each associated entity based on the number of co-occurrences.
  • the number of co-occurrences may be marked on the connecting line between the target entity and the associated entity.
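The per-alias counting described above can be sketched as follows. The sketch assumes that entity labeling has already been applied, so each training sentence arrives as a list of entity mentions; the function name `build_cooccurrence` and the second and third sample sentences are made up for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set


def build_cooccurrence(
    tagged_sentences: Iterable[List[str]], aliases: Set[str]
) -> Dict[str, Counter]:
    """Count, per alias name, the sentences it shares with each associated entity."""
    relation: Dict[str, Counter] = defaultdict(Counter)
    for mentions in tagged_sentences:
        present = set(mentions)  # count each entity at most once per sentence
        for alias in aliases & present:
            for other in present - aliases:
                relation[alias][other] += 1
    return relation


aliases = {"National Stadium", "Bird's Nest"}
tagged = [
    ["Bird's Nest", "Water Cube", "Gymnasium", "Beijing", "Olympics"],
    ["Bird's Nest", "Beijing"],       # made-up second sentence
    ["National Stadium", "Beijing"],  # made-up third sentence
]
cooccurrence = build_cooccurrence(tagged, aliases)
```

The resulting counts can serve directly as the association weights, or as the numbers marked on the connecting lines between the target entity and its associated entities.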
  • the device for generating the knowledge graph can perform the operations of S101 and S102 on all target entities, establish the translation relationship of each target entity, and the co-occurrence relationship of each alias name of the target entity, and set it in the preset
  • On the page with the alias name as the granularity, an independent knowledge node is created for each alias name.
  • The co-occurrence relationship and the translated names corresponding to the alias name are added to the knowledge node corresponding to that alias name.
  • The knowledge nodes corresponding to the alias names are encapsulated into the knowledge node of the corresponding target entity.
  • The knowledge node corresponding to the target entity is created on the page with the entity as the granularity, and the knowledge graph is constructed according to the association relationships between the target entities.
  • the knowledge graph includes at least two levels, the first graph level with the entity as the granularity, and the second graph level with the alias name as the granularity.
  • The user can click on any target entity on the first graph level, and the knowledge graph will switch to the second graph level with the alias name as the granularity and display the semantic network of each alias name under the target entity in the second graph level.
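The two-level structure can be pictured with a small data model: entity nodes on the first level, each encapsulating alias nodes that carry their own translated names and co-occurrence counts. The class names `AliasNode`, `EntityNode` and `KnowledgeGraph`, and the sample values, are assumptions made for this sketch and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AliasNode:
    """Second-level node: one alias name with its own relationships."""
    alias: str
    translated_names: Set[str] = field(default_factory=set)
    cooccurrence: Dict[str, int] = field(default_factory=dict)


@dataclass
class EntityNode:
    """First-level node: a target entity encapsulating its alias nodes."""
    entity: str
    aliases: Dict[str, AliasNode] = field(default_factory=dict)

    def add_alias(self, node: AliasNode) -> None:
        self.aliases[node.alias] = node


@dataclass
class KnowledgeGraph:
    entities: Dict[str, EntityNode] = field(default_factory=dict)

    def first_level(self) -> List[str]:
        """Entity-granularity view."""
        return sorted(self.entities)

    def second_level(self, entity: str) -> Dict[str, AliasNode]:
        """Alias-granularity view for one selected target entity."""
        return self.entities[entity].aliases


# Usage with illustrative values.
stadium = EntityNode("National Stadium")
stadium.add_alias(AliasNode("Bird's Nest", {"Bird's Nest"}, {"Water Cube": 1}))
stadium.add_alias(AliasNode("National Stadium", {"National Stadium"}, {"Beijing": 1}))
graph = KnowledgeGraph({stadium.entity: stadium})
```

Selecting an entity on the first level then corresponds to calling `second_level` with that entity's name.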
  • The method for generating a knowledge graph obtains the translated names of each alias name of the target entity in other languages, where the target entity can be identified as a knowledge node, and generates the translation relationship of the target entity with respect to the target language according to the correspondence between each alias name and its translated names. The co-occurrence relationship of each alias name of the target entity is established through the corpus, in order to mine the relationships between each alias name of the target entity and other entities and to expand the depth of association of each knowledge node in the knowledge graph. According to the translation relationships and co-occurrence relationships of all target entities, a knowledge graph that supports multiple languages is constructed.
  • The embodiments of this application can establish a translation relationship for each knowledge node in the knowledge graph, that is, for each target entity, to connect knowledge nodes across different languages, and can expand the depth of knowledge of each knowledge node by constructing co-occurrence relationships. Because a knowledge node is not limited to the attributes of the target entity itself, the associative ability of each knowledge node and the breadth and depth of the knowledge graph are improved, thereby improving the accuracy of artificial intelligence output results and the quality of service response.
  • FIG. 4 shows a specific implementation flow chart of a method S101 for generating a knowledge graph provided by the second embodiment of the present application.
  • S101 in a method for generating a knowledge graph provided by this embodiment includes: S1011 to S1014, which are detailed as follows:
  • the device for generating the knowledge graph can extract source language sentences containing each alias name from the corpus corresponding to the source language, that is, each source language sentence is recorded in the historical text data.
  • the generating device may also be provided with a sentence template, import each alias name into the sentence template, and output the source language sentence corresponding to each alias name.
  • The device for generating a knowledge graph can count the number of occurrences of each co-occurring entity associated with the alias name according to the co-occurrence relationship corresponding to the alias name, and select high-frequency co-occurring entities based on the number of occurrences.
  • The source language sentence is obtained by combining the alias name with each high-frequency co-occurring entity through an artificial-intelligence-based natural language generation (NLG) algorithm.
  • the device for generating the knowledge graph can select any language other than the source language as the target language, and obtain a translation model between the source language and the target language.
  • the translation model can be generated based on a machine translation (MT) algorithm.
  • MT machine translation
  • the MT algorithm uses computer programs or computer-readable instructions to translate one natural language text (source language) into another natural language text (target language).
  • neural Machine Translation NMT
  • NMT can construct a translation model through a Long Short-Term Memory recurrent neural network (LSTM-RNN).
  • The translation model is good at modeling natural language and at transforming sentences of any length into floating-point vectors of a specific dimension.
  • Converting text data into vector data allows computer programs to "understand" the semantics of the text and translate sentences based on the semantics.
  • the generating device can import the obtained source language sentence into the translation model, and output the corresponding target language sentence.
  • The target language sentence can be output as follows: divide the source language sentence into multiple phrases, import each phrase into the encoding module of the NMT model to obtain the encoded value corresponding to each phrase, generate a sentence vector for the source language sentence, obtain the decoding module of the target language, and use the generated sentence vector as the input vector of the decoding module to generate the target language sentence.
  • Fig. 5 shows a structural block diagram of a neural machine translation model provided by an embodiment of the present application.
  • The NMT model includes an encoding module (Encoder) based on the source language and a decoding module (Decoder) based on the target language. The encoding module maps each word in the source language sentence to a corresponding vector value according to the word meaning, and the decoding module associates the vector value with a word in the target language to complete the translation operation.
  • the device for generating the knowledge graph can mark the phrase corresponding to each entity contained in the target language sentence through the entity tagging algorithm corresponding to the target language, and select the phrase corresponding to the target entity as the alias name under the target language The translated name.
  • Because the translated name is output based on the semantics of the entire sentence and matches the context and the current language environment, the accuracy of translation is improved. In particular, when the target entity has multiple translated names in the target language, the translated name associated with the alias name currently being translated can be determined accurately.
  • the device for generating the knowledge graph determines the translated name associated with the alias name, the translation relationship between the above two can be established.
  • the translated name corresponding to the alias name can be determined based on the context and the actual use context, and the translation relationship can be established, which can improve the accuracy of the translation relationship.
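The flow from source language sentence to translated name can be sketched as a small pipeline: translate the sentence, tag the entities in the target language sentence, and keep the phrase that belongs to the target entity. The translation model and the entity tagger are passed in as callables because the application does not tie them to a specific library; the stubs `toy_translate` and `toy_tagger`, the identifier `E_ORANGE` and the placeholder source sentence are hypothetical stand-ins.

```python
from typing import Callable, List, Optional, Tuple

Translate = Callable[[str], str]
TagEntities = Callable[[str], List[Tuple[str, str]]]  # (phrase, entity id)


def extract_translated_name(
    source_sentence: str,
    target_entity_id: str,
    translate: Translate,
    tag_entities: TagEntities,
) -> Optional[str]:
    """Return the phrase for the target entity in the translated sentence.

    Returns None when the target language sentence contains no phrase for the
    target entity, i.e. when it is not a valid sentence for this purpose.
    """
    target_sentence = translate(source_sentence)
    for phrase, entity_id in tag_entities(target_sentence):
        if entity_id == target_entity_id:
            return phrase
    return None


def toy_translate(sentence: str) -> str:
    # Stand-in for an NMT model: a lookup table of pre-translated sentences.
    return {"SRC_SENTENCE_WITH_ALIAS": "this is an orange tree"}.get(sentence, "")


def toy_tagger(sentence: str) -> List[Tuple[str, str]]:
    # Stand-in for a target-language entity tagging algorithm.
    return [("orange", "E_ORANGE")] if "orange" in sentence.split() else []


name = extract_translated_name(
    "SRC_SENTENCE_WITH_ALIAS", "E_ORANGE", toy_translate, toy_tagger
)
```

Because the phrase is read out of a whole translated sentence, the choice among several possible translated names follows the sentence context rather than a word-by-word dictionary lookup.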
  • FIG. 6 shows a specific implementation flowchart of a method S1011 for generating a knowledge graph provided by the third embodiment of the present application.
  • S1011 in a method for generating a knowledge graph provided by this embodiment includes: S601 to S602, which are detailed as follows:
  • the obtaining the source language sentences containing each of the alias names respectively includes:
  • the device for generating the knowledge graph can manually configure corresponding sentence templates for different entity types, and build a sentence template library.
  • the device for generating the knowledge graph may use a remote supervision algorithm to identify entities contained in each training text from the corpus, determine the entity type of each entity, select multiple training texts with the same entity type, and identify each training text Corresponding sentence structure, selecting a sentence structure whose occurrence number of sentence structure is greater than a preset occurrence threshold as a common structure corresponding to the entity type, and generating at least one sentence template about the entity type based on the common structure.
  • the device for generating the knowledge graph extracts sentence templates matching the entity type from the sentence template library according to the entity type corresponding to the target entity associated with the alias name.
  • the number of sentence templates can be one or more.
  • If multiple sentence templates are available, as many sentence templates as there are alias names can be extracted and a separate sentence template configured for each alias name, so that different alias names can be assigned different sentence templates.
  • The sentence template is provided with an import area for the entity type. The device for generating the knowledge graph can import the alias name into the preset import area of the sentence template, thereby generating a sentence with complete meaning, that is, the aforementioned source language sentence.
  • each alias name can be imported into the same sentence template to generate multiple source language sentences with different alias names but the same other content.
  • For example, a sentence template is "this is a [fruit type entity] tree" and the target entity is "orange". The entity type of the target entity is the fruit type, so it matches the sentence template above. The target entity has three alias names, namely "Orange", "Orange" and "Citrus".
  • one sentence template may be configured for each alias name based on a random allocation algorithm, thereby generating multiple source language sentences.
  • For example, there are 3 sentence templates for fruit type entities: "this is a [fruit type entity] tree", "eat some [fruit type entity]" and "buy a [fruit type entity]". Importing each of the three alias names of the target entity "orange" into one of these sentence templates gives "this is a [orange] tree", "eat some [citrus]" and "buy an [orange]".
  • Preferably, the other entities included in each sentence template are identified, the number of occurrences of each of these entities is obtained from the co-occurrence relationship corresponding to the alias name, and the matching degree between the sentence template and the alias name is calculated from these numbers of occurrences. The sentence template with the highest matching degree is selected as the sentence template associated with the alias name, and the alias name is imported into that sentence template to generate a source language sentence.
  • multiple source language sentences can be output for each alias name, that is, the same alias name is imported into each sentence template to generate multiple source language sentences of the alias name. For example, if the number of sentence templates is M and the number of alias names is N, then M*N source language sentences can be output.
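  • The M*N case above can be sketched as follows. This is an illustrative sketch, not the implementation of the application: the function name `generate_source_sentences` and the alias names are hypothetical, and the import area is assumed to be the literal text `[fruit type entity]` used in the example templates.

```python
from itertools import product

SLOT = "[fruit type entity]"  # import area, written as in the example templates

def generate_source_sentences(templates, alias_names):
    """Import every alias name into every template (M templates x N alias names)."""
    return [
        (alias, template.replace(SLOT, alias))
        for alias, template in product(alias_names, templates)
    ]

templates = [
    "this is a [fruit type entity] tree",
    "eat some [fruit type entity]",
    "buy a [fruit type entity]",
]
alias_names = ["orange", "mandarin", "citrus"]  # hypothetical alias names
sentences = generate_source_sentences(templates, alias_names)
print(len(sentences))  # M * N source language sentences
```

  • Keeping the alias name next to each generated sentence makes it possible to map a translated sentence back to the alias name it was generated for.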
  • FIG. 7 shows a specific implementation flowchart of a method S1013 for generating a knowledge graph provided by the fourth embodiment of the present application.
  • S1013 in a method for generating a knowledge graph provided by this embodiment includes: S701 to S702, which are detailed as follows:
  • extracting the translated names of the alias names in the target language from each of the target language sentences respectively includes:
  • the target language sentence contains the phrase corresponding to the target entity, then the target language sentence is identified as a valid sentence.
  • The device for generating the knowledge graph can filter the generated target language sentences, delete the target language sentences that do not contain the target entity, and recognize translated names only in the target language sentences that contain the target entity, which improves the accuracy of translated name recognition. When a source language sentence is translated into a target language sentence, the alias name and the adjacent characters of the sentence template may combine into a new word. This makes the source language sentence ambiguous during translation and causes errors when it is converted into its vector encoding, so the output target language sentence may not contain the target entity.
  • For example, the alias name of a target entity is "sentence", and importing "sentence" into a sentence template yields "generating sentence".
  • In the source language text, adjacent characters may then be recognized as the phrase "idiom", which splits the target entity "sentence", so that the translated target language sentence no longer contains the target entity.
  • The device for generating the knowledge graph can identify the entities contained in each target language sentence. If a target language sentence does not contain the target entity, it is identified as an invalid sentence; otherwise, if the target language sentence contains the target entity, it is identified as a valid sentence, and the phrase corresponding to the target entity in the target language sentence is marked.
  • the device for generating the knowledge graph can identify the source language sentence corresponding to the invalid sentence, and determine the alias name corresponding to the source language sentence. If there are multiple sentence templates, the source language sentence is regenerated from another template different from the previous sentence template for the aforementioned alias name to re-identify the translated name corresponding to the alias name.
  • the generating device of the knowledge graph uses the phrase corresponding to the target entity in the effective sentence as the translated name of the alias name, and establishes the mapping relationship between the alias name and the translated name.
  • By filtering out invalid sentences, the recognition of the translated name can be made more accurate, thereby improving the accuracy of the translation relationship.
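  • The valid-sentence check and the fallback to another sentence template can be sketched as follows. This is a minimal sketch under stated assumptions: `translate` and `find_target_phrase` stand in for the translation model and the entity recognition step, and the lookup table and name list below are toy data, not real model output.

```python
def extract_translated_name(alias, templates, translate, find_target_phrase):
    """Try templates in turn until the translated sentence still contains the target entity."""
    for template in templates:
        source_sentence = template.replace("[ENTITY]", alias)
        target_sentence = translate(source_sentence)
        phrase = find_target_phrase(target_sentence)
        if phrase is not None:   # valid sentence: the marked phrase is the translated name
            return phrase
    return None                  # every template produced an invalid sentence

# Toy stand-ins (assumptions): a lookup table instead of a translation model,
# and a list of known target-language names instead of entity recognition.
TOY_TRANSLATIONS = {
    "eat some citrus": "mangez-en un peu",
    "this is a citrus tree": "c'est un agrume",
}
KNOWN_TARGET_NAMES = ["agrume", "orange"]

def toy_translate(sentence):
    return TOY_TRANSLATIONS.get(sentence, "")

def toy_find_target_phrase(sentence):
    return next((name for name in KNOWN_TARGET_NAMES if name in sentence), None)

name = extract_translated_name(
    "citrus",
    ["eat some [ENTITY]", "this is a [ENTITY] tree"],
    toy_translate,
    toy_find_target_phrase,
)
print(name)
```

  • In this toy run the first template yields an invalid sentence, so the source language sentence is regenerated from the second template, as described above.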
  • FIG. 8 shows a specific implementation flowchart of a method S102 for generating a knowledge graph provided by the fifth embodiment of the present application.
  • S102 in a method for generating a knowledge graph provided by this embodiment includes: S1021 to S1023, which are detailed as follows:
  • the respectively generating the co-occurrence relationship of each of the alias names in the target entity through a preset corpus includes:
  • training texts collected from multiple different channels can be stored in the corpus.
  • the corpus can receive text data input by the user, such as articles imported by the user, interaction records of social applications (including chat records and interaction information), and can also automatically download text data from the Internet.
  • the generating device of the knowledge graph can identify the entities contained in the training text, establish the corresponding relationship between the entities and the training text, and establish the entity index table.
  • the device for generating the knowledge graph can extract the target text containing the target entity from the corpus based on the above-mentioned entity index table.
  • the device for generating the knowledge graph can locate the entities contained in the target text through a named entity recognition (NER) algorithm, and recognize entities other than the target entity as related entities of the target entity.
  • a certain target text is specifically "The Bird's Nest is located opposite the Water Cube, which is the stadium of the 2008 Beijing Olympic Games", and the target entity is "Bird's Nest".
  • The NER algorithm can identify the entities contained in the above target text as "Bird's Nest", "Water Cube", "Beijing", "Olympics" and "WORK"; therefore, the entities other than "Bird's Nest" can be determined to be related entities of the target entity "Bird's Nest".
  • the relationship between related entities is bidirectional, that is, the "Water Cube” is the related entity of the "Bird's Nest", and the "Bird's Nest” is also the related entity of the "Water Cube”.
  • the co-occurrence relationship between the alias name and the associated entity is obtained according to the alias name corresponding to the target entity in the target text.
  • the device for generating the knowledge graph can identify the alias name used by the target entity in the target text based on the source language, create a name node for the alias name, and create a co-occurrence relationship between the alias name and the associated entity. If there are multiple target texts for an alias name, all associated entities recorded in each target text can be added to the co-occurrence relationship corresponding to the name node.
  • The target text containing the alias name is extracted from the text data recorded in the corpus, and the co-occurrence relationship of the alias name is established based on the associated entities recorded in the target text. This builds the co-occurrence relationship at name granularity, so the context and scenario in which each alias name is used can be identified accurately, thereby improving the response accuracy of the artificial intelligence service.
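  • The name-granularity co-occurrence relationship can be sketched as follows. This is an illustration only: a substring lookup against a small gazetteer stands in for the NER algorithm, and the function name, the entity identifier `national_stadium` and the second sample text are hypothetical.

```python
from collections import Counter, defaultdict

def build_cooccurrence(texts, alias_to_entity, gazetteer):
    """Count, per alias name, the associated entities that appear in the same text."""
    cooccurrence = defaultdict(Counter)
    for text in texts:
        found = [name for name in gazetteer if name in text]  # stand-in for NER
        for alias in found:
            target = alias_to_entity.get(alias)
            if target is None:
                continue  # this name is not an alias of a target entity
            for other in found:
                # skip the alias itself and other aliases of the same target entity
                if alias_to_entity.get(other) != target:
                    cooccurrence[alias][other] += 1
    return cooccurrence

texts = [
    "The Bird's Nest is located opposite the Water Cube, "
    "which is the stadium of the 2008 Beijing Olympic Games",
    "The National Stadium in Beijing hosted the Olympic Games",
]
alias_to_entity = {
    "Bird's Nest": "national_stadium",
    "National Stadium": "national_stadium",
}
gazetteer = ["Bird's Nest", "National Stadium", "Water Cube", "Beijing", "Olympic Games"]
cooccurrence = build_cooccurrence(texts, alias_to_entity, gazetteer)
print(dict(cooccurrence["Bird's Nest"]))
```

  • Because the counts are keyed by alias name rather than by entity, the two alias names of the same target entity end up with different sets of associated entities.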
  • FIG. 9 shows a specific implementation flowchart of a method for generating a knowledge graph provided by the sixth embodiment of the present application.
  • the method for generating a knowledge graph provided by this embodiment further includes: S901 to S904, which are detailed as follows:
  • the method further includes:
  • a sentence to be translated based on the source language is received, and the entity to be translated included in the sentence to be translated is identified to construct an entity relationship of the sentence to be translated.
  • After the knowledge graph generating device constructs a knowledge graph containing multiple target entities, it can use the knowledge graph to provide technical support for translation services, thereby improving translation quality.
  • the commonly used translation technology is the NMT model based on LSTM-RNN.
  • the NMT model can adopt an end-to-end translation scheme.
  • In the encoder-decoder model, the encoding module converts the source language sentence into a hidden state vector, and the decoding module of the target language then converts the hidden state vector into natural language text in the target language.
  • FIG. 10 shows a translation flow chart based on a knowledge graph provided by an embodiment of the present application.
  • The text data is first preprocessed: it is imported into the translation preprocessing module, which identifies the source language of the text data and the target language into which it needs to be translated.
  • After determining the source language and the target language, the preprocessing module sends this information to the knowledge graph module, which switches the knowledge graph to the detection mode corresponding to the source language, that is, selects the natural language understanding (NLU) algorithm corresponding to the source language. The knowledge graph module combines the knowledge data to perform NLU analysis on the text data, marks the entities contained in the text data, determines in the generated knowledge graph the entity name of each entity in the target language, and returns it to the preprocessing module.
  • The preprocessing module removes the entities from the text data according to the entity list returned by the knowledge graph module and replaces them with agreed special characters.
  • The special characters can be determined according to the entity type. The text data with the special characters substituted is sent to the NMT module, which performs the standard translation process and produces the translation result. The substituted special characters are retained in the result, so that the correspondence between the entities in the text data and the entities in the translated text can be determined. Finally, the entity translation results returned by the knowledge graph and the original translation result returned by the NMT module are merged to obtain the final translation result. It can be seen that if the knowledge graph is constructed at entity granularity, the translated names corresponding to different alias names are not distinguished when the translated name of each entity in the text data is obtained in the target language, which reduces the accuracy of the translation operation.
  • By contrast, this application constructs the translation relationship between alias names and translated names at alias name granularity, so that the alias name used by an entity in the text data can be identified and the translated name corresponding to that alias name determined. The translated name therefore matches the current context and grammatical habits, which makes the translation more accurate.
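  • The replace-translate-merge flow of FIG. 10 can be sketched as follows. This is a simplified sketch under stated assumptions: index-based tokens such as `__ENT0__` are used as the agreed special characters (the text above allows them to depend on the entity type), `nmt_translate` is a toy stand-in for the NMT module, and the target-language names are illustrative rather than authoritative translations.

```python
def translate_with_graph(sentence, entity_translations, nmt_translate):
    """Mask entities, translate the remaining text, then restore the translated names."""
    placeholders = {}
    masked = sentence
    for index, (alias, translated_name) in enumerate(entity_translations.items()):
        if alias in masked:
            token = f"__ENT{index}__"           # agreed special characters
            masked = masked.replace(alias, token)
            placeholders[token] = translated_name
    translated = nmt_translate(masked)          # tokens are expected to survive translation
    for token, translated_name in placeholders.items():
        translated = translated.replace(token, translated_name)
    return translated

def toy_nmt(text):
    """Toy stand-in for the NMT module: rewrites one fixed phrase only."""
    return text.replace(" is opposite ", " est en face de ")

# Alias name -> translated name, as it would be looked up in the knowledge graph.
entity_translations = {"Bird's Nest": "Nid d'Oiseau", "Water Cube": "Cube d'Eau"}
result = translate_with_graph("Bird's Nest is opposite Water Cube", entity_translations, toy_nmt)
print(result)
```

  • Because the lookup table is keyed by alias name, two alias names of the same entity can be restored to different translated names.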
  • the device for generating the knowledge graph can perform semantic analysis on the sentence to be translated, identify the translation entities contained in the sentence to be translated through the NLU algorithm, and construct the entity relationship of all the identified translation entities with respect to the sentence to be translated.
  • For example, the NLU algorithm can identify translation entities including "China", "National Grand Theatre", "France", "architect", "Asia", "theatre" and "complex", and establish the co-occurrence relationship among these translation entities; this co-occurrence relationship is the entity relationship of the sentence to be translated.
  • a translation relationship corresponding to the entity to be translated based on the target language is extracted from the knowledge graph; the translation relationship includes at least one translated name of the entity to be translated.
  • the knowledge graph generating device can query the knowledge graph for the entity node corresponding to each translation entity, and extract the corresponding translation relationship from the entity node.
  • the translation relationship records at least one translated name of the translation entity.
  • The device for generating the knowledge graph can identify the alias name used in the sentence to be translated and, based on the translation relationship between the alias name and the translated name, determine the target translated name corresponding to the translation entity in the sentence to be translated without performing the matching degree calculation operation of S903. If the translation relationship between each alias name of the translation entity and the translated name is not recorded in the knowledge graph, or one alias name corresponds to multiple translated names, the operation of S903 is performed to determine the specific translated name used for the sentence to be translated.
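  • The decision between a direct lookup and the matching degree calculation of S903 can be sketched as follows. This is a hedged illustration: the function name `resolve_translated_name`, the dictionary layout and the `score` callback (standing in for the S903 matching degree) are assumptions of this sketch.

```python
def resolve_translated_name(alias, alias_to_names, entity_names, score):
    """Return the target translated name for the alias name used in the sentence."""
    names = alias_to_names.get(alias, [])
    if len(names) == 1:
        return names[0]                  # recorded one-to-one relation: S903 is skipped
    candidates = names or entity_names   # no recorded relation: consider all translated names
    return max(candidates, key=score)    # stand-in for the S903 matching degree

# Hypothetical data: "alias_a" has exactly one recorded translated name,
# "alias_b" has none, so all translated names of the entity are scored.
alias_to_names = {"alias_a": ["name_x"]}
entity_names = ["name_x", "name_y"]
toy_score = {"name_x": 0.2, "name_y": 0.9}.get

print(resolve_translated_name("alias_a", alias_to_names, entity_names, toy_score))
print(resolve_translated_name("alias_b", alias_to_names, entity_names, toy_score))
```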
  • the degree of matching between the sentence to be translated and the translated name is calculated according to the entity relationship and the co-occurrence relationship of the translated name.
  • The device for generating the knowledge graph can determine the degree of matching between each translated name and the current sentence to be translated based on the entity relationship and the co-occurrence relationship of each translated name corresponding to the translation entity. Since different translated names are used in different contexts, the degree of matching between each translated name and the sentence to be translated must be determined in the context of that sentence, so that the translated name that best fits the context is selected, thereby improving the accuracy of the translation operation.
  • The degree of matching between the sentence to be translated and the translated name may be calculated as follows: the knowledge graph generating device identifies the entity to be translated that corresponds to the translated name as the base entity, and identifies the entities in the entity relationship other than the base entity as reference entities. It then judges whether each reference entity exists in the co-occurrence relationship of the translated name. If it exists, the co-occurrence relationship is used to determine the number of co-occurrences between the reference entity and the translated name. The degree of matching between the sentence to be translated and the translated name is determined from the number of co-occurrences between the translated name and all reference entities and from the number of reference entities that have a co-occurrence relationship with it.
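  • A count-based version of this matching can be sketched as follows. The text above does not state how the two quantities are combined, so this sketch assumes, purely for illustration, that candidates are ranked first by the number of matched reference entities and then by the total number of co-occurrences; the function name and the sample counts are hypothetical.

```python
def match_degree(reference_entities, cooccurrence_counts):
    """Return (matched reference entities, total co-occurrences) for one translated name."""
    matched = [e for e in reference_entities if cooccurrence_counts.get(e, 0) > 0]
    total = sum(cooccurrence_counts[e] for e in matched)
    return len(matched), total

# Reference entities: entities of the sentence to be translated other than the base entity.
reference_entities = ["Beijing", "Olympic Games", "Water Cube"]

# Hypothetical co-occurrence counts of two candidate translated names.
candidates = {
    "translated name A": {"Beijing": 5, "Olympic Games": 3},
    "translated name B": {"Beijing": 9},
}
best = max(candidates, key=lambda name: match_degree(reference_entities, candidates[name]))
print(best)
```

  • Tuples compare element by element, so a candidate that co-occurs with more of the reference entities wins even if another candidate has a larger raw count with a single entity.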
  • S903 may specifically be:
  • the matching degree calculation function is specifically:
  • $$\mathrm{sim}_{entity}(e_i, e_j) = \sum_{p \in Prop(e_i) \cap Prop(e_j)} w_p \cdot Similarity_{type(p)}(e_i[p], e_j[p])$$
  • Sim(E1, E2) is the degree of matching between the entity to be translated and the translated name
  • Context(E1) is the set of associated entities included in the co-occurrence relationship of the entity to be translated E1 in the knowledge graph
  • Context(E2) is the set of associated entities included in the co-occurrence relationship of the translated name E2
  • ei is the i-th associated entity in the co-occurrence relationship of the entity to be translated E1
  • ej is the j-th associated entity in the co-occurrence relationship of the translated name E2
  • Prop(ei) is the entity type of the i-th associated entity in the co-occurrence relationship of the entity to be translated E1
  • Prop(ej) is the entity type of the j-th associated entity in the co-occurrence relationship of the translated name E2
  • w_p is the weight value corresponding to the entity type
  • Similarity type(p)(ei[p], ej[p]) is the matching degree function corresponding to the entity type
  • E1 is the entity to be translated based on the source language
  • E2 is the translated name of the entity to be translated based on the target language.
  • The device for generating the knowledge graph can calculate the similarity between each entity in the entity set of the co-occurrence relationship of the entity to be translated in the source language and each entity in the co-occurrence relationship of the translated name, and select the maximum matching degree as the feature matching degree. The matching degrees of all features are then accumulated to obtain the degree of matching between the translated name and the entity to be translated in the sentence to be translated.
  • the matching degree calculation between different entities can refer to the sim entity (ei, ej) function.
  • The knowledge graph generating device calculates the similarity only between two entities of the same entity type. If an entity in the entity relationship and an entity in the co-occurrence relationship of the translated name are of different entity types, the similarity between them is not calculated, which avoids a large number of invalid similarity calculations.
  • The device for generating the knowledge graph selects the corresponding similarity calculation model according to the entity type, namely Similarity type(p)(ei[p], ej[p]). For example, the two entities are "old man" and "teenager", and the entity type corresponding to each entity is "age"; the age similarity calculation model is then used to calculate the similarity between the two entities.
  • Here ei[p] is the parameter value of the entity type of the i-th entity to be translated, and ej[p] is the parameter value of the entity type of the j-th associated entity. Continuing with the two entities "old man" and "teenager": the age corresponding to "old man" is 70 years or above, so the parameter value of its entity type can be set to 70, while the age corresponding to "teenager" is 18 to 30, so the parameter value of its entity type can be set to 20. The two parameter values are imported into the age similarity calculation model to calculate the similarity between the two entities.
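  • The type-wise similarity and its accumulation can be sketched as follows. This is a sketch under stated assumptions, not the formula of the application: entities are represented as dictionaries from entity type to parameter value, the age similarity model and the weight of 1.0 are invented for illustration, and `sim` follows the verbal description above (take the maximum over the associated entities of the translated name, then accumulate) without any normalization.

```python
def age_similarity(a, b):
    """Hypothetical age model: 1.0 for equal ages, falling linearly to 0.0 at 100 years apart."""
    return max(0.0, 1.0 - abs(a - b) / 100.0)

SIMILARITY = {"age": age_similarity}   # one similarity model per entity type
WEIGHTS = {"age": 1.0}                 # weight value per entity type (assumed)

def sim_entity(ei, ej):
    """Weighted similarity over the entity types shared by both entities."""
    shared = ei.keys() & ej.keys()     # different types are never compared
    return sum(WEIGHTS[p] * SIMILARITY[p](ei[p], ej[p]) for p in shared)

def sim(context_e1, context_e2):
    """For each entity of E1's context take the best match in E2's context, then accumulate."""
    return sum(
        max((sim_entity(ei, ej) for ej in context_e2), default=0.0)
        for ei in context_e1
    )

old_man = {"age": 70}     # parameter value from the example above
teenager = {"age": 20}    # parameter value from the example above
print(sim_entity(old_man, teenager))
print(sim([old_man], [teenager, {"age": 65}]))
```

  • Entities that share no entity type contribute 0, which mirrors the rule that the similarity of entities of different types is not calculated.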
  • The translated name with the highest matching degree can be selected as the target translated name of the entity to be translated in this translation operation. Each target translated name is then imported into the corresponding area of the entity-free translation output by the NMT algorithm, which gives the translated sentence of the sentence to be translated in the target language and completes the sentence translation operation.
  • The device for generating the knowledge graph may, based on the alias to be translated that appears in the sentence to be translated, establish the translation relationship between the alias to be translated and the target translated name and add it to the knowledge graph, so that translation relationships are learned automatically.
  • The translated name of the entity to be translated in the current context is determined, and the knowledge graph is used to support the translation decision, which improves the accuracy of translation.
  • FIG. 11 shows a schematic structural diagram of a translation system based on a knowledge graph provided by an embodiment of the present application.
  • The knowledge graph-based translation system includes: a translation service cloud system 111, a knowledge graph generating device 112, an intelligent annotation server 113, a cloud database server 114, a user terminal 115, and a third-party application platform 116.
  • the translation service cloud system 111 includes a text retrieval module, a translation service response module, and a data access module.
  • The data access module is used to exchange data with the other devices; the translation service response module is used to receive the translation request sent by the user terminal, encapsulate the data, obtain the translation result and return it to the user terminal; the text retrieval module is used to extract the text data in the translation request and perform preprocessing operations on the text data.
  • the knowledge graph generating device 112 includes a translation error correction module, a knowledge graph module, a translation module, and a data management module.
  • The translation error correction module is used to detect whether the sentence to be translated contained in the translation request contains content that needs to be corrected, correct the sentence to be translated through terminology correction, name correction, whole-sentence correction and the like, and send the corrected sentence to be translated to the translation module, which performs the translation operation.
  • the specific translation process can be referred to the translation process shown in FIG. 10, which will not be repeated here.
  • the data management module can be used to cache the received data and shield the sensitive fields containing the user's identity information, so as to protect the user's private information.
  • The intelligent annotation server 113 includes a login authentication module, a web module, and a server service module. The login authentication module verifies the user's identity and determines the validity of the service request; the web module displays the data tables of the cloud database server; and the server service module updates the data stored in the cloud database server with the collected data.
  • The cloud database server 114 can include a database based on the MySQL framework, a database based on the Hadoop framework, etc.
  • The cloud database server can be used to store the cloud data required for translation operations, such as corpora learned from various channels, historical translation records initiated by user terminals, and the knowledge required to construct the knowledge graph.
  • the user terminal 115 can initiate a service request through a built-in application, and the intelligent translation engine can determine the translation channel required for the service request.
  • For voice translation, the voice data of the corresponding translated word can be obtained through the corresponding third-party platform;
  • for word text translation, the translation data of the corresponding word can be obtained through the corresponding third-party platform;
  • for sentence text translation, the translated sentence of the sentence to be translated can be output through the built-in translation module of the knowledge graph generating device. That is, for different types of translation request, the corresponding translation response path is determined through the intelligent translation engine.
  • the third-party application platform 116 may include multiple different third-party translation applications to support part of the translation operations of the entire translation system, such as word translation, word voice query, and so on.
  • The workflow of the translation system is illustrated below, taking the process of a user initiating a sentence translation request as an example.
  • The user terminal 115 receives the sentence translation request initiated by the user through the application program, and the intelligent translation engine of the user terminal 115 then determines the translation channel required for this translation operation. Since this operation is sentence translation, it needs to be supported by the knowledge graph generating device 112, so the sentence translation request carrying the channel identifier is sent to the translation service cloud system 111.
  • The translation service cloud system 111 obtains the sentence translation request through the data access module and sends it to the knowledge graph generating device 112, which uses the translation error correction module to correct errors in the sentence to be translated contained in the request.
  • The preprocessing unit identifies the source language and the target language of the sentence to be translated and uses the knowledge graph to identify the entities and alias names in the sentence to be translated. According to the translation relationship, the translated name of each alias name in the target language is determined and fed back to the translation unit, which outputs the translated sentence of the sentence to be translated. The translated sentence is processed and returned through the data management module to the data access module of the translation service cloud system 111; the translation service response module of the translation service cloud system encapsulates the translation result and returns it to the user terminal.
  • FIG. 12 shows a corresponding interaction flow chart of each unit in the apparatus for generating a knowledge graph provided by an embodiment of the present application when responding to a translation operation.
  • The device for generating a knowledge graph may include a translation preprocessing unit, a knowledge graph service unit, a knowledge graph index unit, and a knowledge graph engine unit.
  • After the knowledge graph generating device receives the translation request, it extracts the sentence to be translated from the translation request and sends it to the translation preprocessing unit. The translation preprocessing unit identifies the source language and the target language of the sentence to be translated and sends the preprocessed sentence to be translated together with these two parameters to the knowledge graph service unit. The knowledge graph service unit selects the NLP model corresponding to the source language and uses it to perform NER on the sentence to be translated, determining each entity to be translated contained in the sentence. Each entity to be translated is sent to the knowledge graph index unit, which locates the entity node of each entity to be translated in the knowledge graph and determines the name list associated with each entity node, thereby obtaining the translated names of each entity to be translated in the target language.
  • the knowledge graph service unit sends a co-occurrence relationship query request to the knowledge graph engine unit to determine the associated entities that have a co-occurrence relationship with each translated name.
  • The knowledge graph engine unit returns the queried co-occurrence relationships to the knowledge graph service unit, which selects the target translated name from the translated names corresponding to the different alias names and generates the translated sentence of the sentence to be translated from all the target translated names. The translated sentence is returned to the translation preprocessing unit, and the translation result is output.
  • FIG. 13 shows a specific implementation flowchart of a method for generating a knowledge graph provided by the seventh embodiment of the present application.
  • the method for generating a knowledge graph provided by this embodiment further includes: S1301 to S1302, which are detailed as follows:
  • the method further includes:
  • After the knowledge graph generating device constructs a knowledge graph containing multiple target entities, it can use the knowledge graph to provide technical support for the recommendation service. Because the knowledge graph records, according to the corpus, the co-occurrence relationship of each alias name of the target entity, the knowledge graph is explored in greater depth: the co-occurrence relationships of different alias names can be mined to determine how the entities associated with different alias names differ, which improves the accuracy of the recommendation information. For example, the entity "fen" has two different alias names, "rice noodles" and "rice noodles", and different alias names are often paired with different other entities, as in "fatchang rice noodles" and "crossing bridge rice noodles".
  • The paired entities that differ between alias names can reveal the user's associated tastes, eating habits and so on. Compared with determining the recommendation information at "entity" granularity, the recommendation information mined from co-occurrence relationships established at "alias name" granularity is more accurate.
  • the device for generating the knowledge graph can receive the keyword input by the user, and identify the entity corresponding to the keyword, and the alias name used by the keyword, and obtain the knowledge associated with the alias name in the knowledge graph Node, and extract the co-occurrence relationship of the alias name from the knowledge node.
  • the device for generating a knowledge graph can select a corresponding recommended entity according to the number of co-occurrences of each associated entity in the co-occurrence relationship, and output recommendation information based on the recommended entity.
  • the recommendation information can take different forms in different scenarios. For example, in a search scenario, keywords associated with the input keyword can be output, where an associated keyword is the keyword corresponding to an entity with a higher number of co-occurrences.
  • Product keywords are obtained based on the co-occurrence relationship corresponding to the alias name used by the input keyword. As another example, in a user-portrait output scenario, multiple co-occurring entities can be identified from the co-occurrence relationship based on the keywords entered by the user, and the user tag of the user is output based on the co-occurring entities and the keywords.
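  • The selection of recommended entities by co-occurrence count described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the in-memory `CO_OCCURRENCE` table, the placeholder alias and entity names, the counts, and the `recommend` function are all assumptions introduced for the example.

```python
from collections import Counter

# Hypothetical knowledge-graph fragment: each alias name maps to the number of
# times each associated entity co-occurs with it. Names and counts are invented.
CO_OCCURRENCE = {
    "alias_a": Counter({"entity_x": 12, "entity_y": 3}),
    "alias_b": Counter({"entity_z": 9, "entity_x": 1}),
}


def recommend(alias_name, top_k=1):
    """Return the top_k associated entities with the most co-occurrences."""
    counts = CO_OCCURRENCE.get(alias_name, Counter())
    return [entity for entity, _ in counts.most_common(top_k)]
```

  • Keying the table by alias name rather than by entity is what lets two aliases of the same entity produce different recommendations.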
  • FIG. 14 shows a structural block diagram of a device for generating a knowledge graph provided by an embodiment of the present application. For ease of description, only the parts related to the embodiment of the present application are shown.
  • the device for generating the knowledge graph includes:
  • the translation relationship establishment unit 141 is configured to establish a translation relationship of multiple alias names of the target entity based on the target language;
  • the co-occurrence relationship generating unit 142 is configured to generate the co-occurrence relationship of each of the alias names in the target entity through a preset corpus;
  • the knowledge graph construction unit 143 is configured to construct a knowledge graph according to the translation relationship and the co-occurrence relationship corresponding to all the target entities.
  • the translation relationship establishment unit 141 includes:
  • the source language sentence acquiring unit is used to separately acquire the source language sentence including each of the alias names;
  • the target language sentence acquiring unit is configured to output the target language sentence corresponding to each source language sentence according to the translation model between the source language and the target language;
  • the translated name recognition unit is configured to extract the translated name of the alias name in the target language from each sentence in the target language;
  • the translation relationship determining unit is used to establish the translation relationship between the alias name and the translated name.
  • the source language sentence acquisition unit includes:
  • the sentence template obtaining unit is used to obtain the sentence template associated with the entity type according to the entity type of the target entity;
  • the sentence template importing unit is used to import each of the alias names into the sentence template to generate the source language sentences.
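  • A minimal sketch of importing alias names into a sentence template chosen by entity type. The template texts, the entity type labels, and the function name `build_source_sentences` are hypothetical; the application does not specify them.

```python
# Hypothetical sentence templates keyed by entity type (assumed for illustration).
SENTENCE_TEMPLATES = {
    "food": "I would like to eat {name}.",
    "place": "I am travelling to {name}.",
}


def build_source_sentences(entity_type, alias_names):
    """Fill the template for entity_type with each alias name."""
    template = SENTENCE_TEMPLATES[entity_type]
    return {alias: template.format(name=alias) for alias in alias_names}
```

  • Placing every alias in the same sentence frame gives the translation model context, and keeps the translated sentences comparable across aliases.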
  • the translated name recognition unit includes:
  • a valid sentence selection unit is used to identify the target language sentence as a valid sentence if it is detected that the target language sentence contains the phrase corresponding to the target entity;
  • the keyword group recognition unit is used to identify the phrase corresponding to the target entity in the valid sentence as the translated name.
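  • The valid-sentence check and translated-name extraction can be sketched as below. How the phrase corresponding to the target entity is detected is not detailed here, so this sketch assumes a caller-supplied list of candidate phrases and uses plain case-insensitive substring matching; both are assumptions, as is the function name.

```python
def extract_translated_name(target_sentence, candidate_phrases):
    """Return the translated name found in target_sentence, or None.

    A sentence counts as valid only if it contains at least one candidate
    phrase; the longest matching phrase is taken as the translated name.
    """
    lowered = target_sentence.lower()
    matches = [p for p in candidate_phrases if p.lower() in lowered]
    if not matches:
        return None  # not a valid sentence: no phrase for the target entity
    return max(matches, key=len)
```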
  • the co-occurrence relationship generation unit 142 includes:
  • the target text extraction unit is used to extract the target text containing the target entity from the corpus;
  • the associated entity recognition unit is used to identify the associated entities other than the target entity in the target text;
  • the co-occurrence relationship establishment unit is used to obtain the co-occurrence relationship between the alias name and the associated entity according to the alias name corresponding to the target entity in the target text.
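  • The three steps of the co-occurrence relationship generation unit can be sketched as a single counting pass over the corpus. The sketch assumes the corpus is a list of strings and that entities are found by naive substring matching against a known entity list; a real system would use an entity recognizer. All names are hypothetical.

```python
from collections import Counter, defaultdict


def build_co_occurrence(corpus, alias_names, known_entities):
    """Count, per alias name, how often each other entity shares a text with it."""
    relations = defaultdict(Counter)
    for text in corpus:
        for alias in alias_names:
            if alias not in text:
                continue  # this text is not a target text for the alias
            for entity in known_entities:
                # Associated entities exclude the target entity's own aliases.
                if entity in text and entity not in alias_names:
                    relations[alias][entity] += 1
    return relations
```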
  • the device for generating the knowledge graph further includes:
  • an entity-to-be-translated recognition unit, used to receive a sentence to be translated in the source language and identify the entities to be translated contained in the sentence, so as to construct the entity relationship of the sentence to be translated;
  • a translation relationship extraction unit, used to extract from the knowledge graph the translation relationship of the entity to be translated with respect to the target language, where the translation relationship includes at least one translated name of the entity to be translated;
  • a matching degree calculation unit, used to calculate the matching degree between the sentence to be translated and each translated name based on the entity relationship and the co-occurrence relationship of the translated name;
  • a translated sentence output unit, used to determine, according to the matching degree, the target translated name of the entity to be translated from all the translated names, and to output the target-language translation of the sentence to be translated according to all the target translated names.
  • the matching degree calculation unit is specifically configured to calculate the matching degree with a matching degree calculation function, which is specifically:
  • $\mathrm{sim}_{\mathrm{entity}}(e_i, e_j) = \sum_{p \in \mathrm{Prop}(e_i) \cap \mathrm{Prop}(e_j)} w_p \cdot \mathrm{Similarity}_{\mathrm{type}(p)}(e_i[p], e_j[p])$
  • Sim(E1, E2) is the degree of matching between the entity to be translated and the translated name
  • Context(E1) is the associated entity included in the co-occurrence relationship of the entity to be translated E1 in the knowledge graph
  • Context(E2) is the associated entity included in the co-occurrence relationship of the translated name E2
  • $e_i$ is the i-th associated entity in the co-occurrence relationship of the entity to be translated E1
  • $e_j$ is the j-th associated entity in the co-occurrence relationship of the translated name E2
  • $\mathrm{Prop}(e_i)$ is the entity type of the i-th associated entity in the co-occurrence relationship of the entity to be translated E1
  • $\mathrm{Prop}(e_j)$ is the entity type of the j-th associated entity in the co-occurrence relationship of the translated name E2
  • $w_p$ is the weight value corresponding to the entity type
  • $\mathrm{Similarity}_{\mathrm{type}(p)}(e_i[p], e_j[p])$ is the matching degree function corresponding to the entity type
  • ei[p] is the parameter value of the
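  • As a hedged illustration of the entity-level matching function, the sketch below reads it as a weighted sum, over the properties p shared by two associated entities, of a type-specific similarity between their values. The reading of the sum as running over shared properties, the two similarity functions, the dictionary representation of entities, and all names are assumptions made for the example.

```python
def exact_match(a, b):
    """Assumed similarity for string-typed properties."""
    return 1.0 if a == b else 0.0


def numeric_closeness(a, b):
    """Assumed similarity for number-typed properties, in [0, 1] for same-sign values."""
    scale = max(abs(a), abs(b), 1.0)
    return 1.0 - abs(a - b) / scale


# Hypothetical mapping from property type to its similarity function.
SIMILARITY_BY_TYPE = {"string": exact_match, "number": numeric_closeness}


def sim_entity(e_i, e_j, weights, type_of):
    """Weighted sum of per-property similarities over properties shared by e_i and e_j."""
    shared = e_i.keys() & e_j.keys()
    return sum(
        weights[p] * SIMILARITY_BY_TYPE[type_of[p]](e_i[p], e_j[p])
        for p in shared
    )
```

  • Properties present in only one of the two entities contribute nothing under this reading, so entities with no shared property score zero.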
  • the device for generating the knowledge graph further includes:
  • the keyword receiving unit is configured to receive keywords input by the user, and query the co-occurrence relationship corresponding to the keywords from the knowledge graph;
  • the recommendation information output unit is configured to output the recommendation information of the user according to the co-occurrence relationship.
  • the device for generating a knowledge graph can likewise establish a translation relationship for each knowledge node in the knowledge graph, that is, each target entity, so as to connect knowledge nodes across different languages, and can expand the depth of knowledge of each knowledge node by constructing co-occurrence relationships.
  • the knowledge of a knowledge node is therefore not limited to the attributes of the target entity itself, which improves the associative ability of each knowledge node and the breadth and depth of the knowledge graph, thereby improving the accuracy of artificial intelligence output and the quality of service responses.
  • FIG. 15 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • the terminal device 15 of this embodiment includes: at least one processor 150 (only one is shown in FIG. 15), a memory 151, and a computer program 152 stored in the memory 151 and executable on the at least one processor 150.
  • when the processor 150 executes the computer program 152, the steps in any of the above-mentioned methods for generating the knowledge graph are implemented.
  • the terminal device 15 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the terminal device may include, but is not limited to, a processor 150 and a memory 151.
  • FIG. 15 is only an example of the terminal device 15 and does not constitute a limitation on the terminal device 15. It may include more or fewer components than shown in the figure, a combination of certain components, or different components. For example, it can also include input and output devices, network access devices, and so on.
  • the processor 150 may be a central processing unit (Central Processing Unit, CPU), and the processor 150 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the memory 151 may be an internal storage unit of the terminal device 15 in some embodiments, such as a hard disk or a memory of the terminal device 15. In other embodiments, the memory 151 may also be an external storage device of the terminal device 15, for example, a plug-in hard disk equipped on the terminal device 15, a smart memory card (Smart Media Card, SMC), a Secure Digital (SD) card, a flash card, etc. Further, the memory 151 may also include both an internal storage unit of the terminal device 15 and an external storage device. The memory 151 is used to store an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 151 can also be used to temporarily store data that has been output or will be output.
  • An embodiment of the present application also provides a network device, which includes: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor. When the processor executes the computer program, the steps in any of the foregoing method embodiments are implemented.
  • the embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be realized.
  • the embodiments of the present application provide a computer program product. When the computer program product runs on a mobile terminal, the mobile terminal implements the steps in the foregoing method embodiments.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • the computer-readable medium may at least include: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example, a USB flash drive, a mobile hard disk, a floppy disk or a CD-ROM.
  • computer-readable media cannot be electrical carrier signals and telecommunication signals.
  • the disclosed apparatus/network equipment and method may be implemented in other ways.
  • the device/network device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation. For example, multiple units or components can be combined or integrated into another system, or some features can be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Abstract

The invention relates to an artificial-intelligence-based method, apparatus and terminal for generating a knowledge graph, and to a storage medium. The method comprises: determining, in a target language, the translated name of each alias name of a target entity, and generating a translation relationship of the target entity according to the alias name and the translated name (S101); generating, by means of a preset corpus, a co-occurrence relationship for each alias name of the target entity (S102); and constructing a knowledge graph according to the translation relationships and co-occurrence relationships corresponding to all target entities (S103). The method, apparatus, terminal and storage medium can construct a knowledge graph that supports multiple languages, and improve the associative ability of each knowledge node in the knowledge graph as well as the breadth and depth of the knowledge graph, thereby improving the accuracy of artificial intelligence output and the quality of service responses.
PCT/CN2020/125592 2019-11-22 2020-10-30 Method, apparatus and terminal for generating knowledge graph, and storage medium WO2021098491A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911156483.3 2019-11-22
CN201911156483.3A CN112836057B (zh) 2019-11-22 2019-11-22 Method, apparatus, terminal and storage medium for generating knowledge graph

Publications (1)

Publication Number Publication Date
WO2021098491A1 (fr) 2021-05-27

Family

ID=75921937

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/125592 WO2021098491A1 (fr) 2019-11-22 2020-10-30 Method, apparatus and terminal for generating knowledge graph, and storage medium

Country Status (2)

Country Link
CN (1) CN112836057B (fr)
WO (1) WO2021098491A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113204651A (zh) * 2021-05-28 2021-08-03 华侨大学 Multi-source knowledge graph fusion method and apparatus for the field of Chinese-language education

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070233656A1 (en) * 2006-03-31 2007-10-04 Bunescu Razvan C Disambiguation of Named Entities
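CN105677913A (zh) * 2016-02-29 2016-06-15 哈尔滨工业大学 Method for constructing a Chinese semantic knowledge base based on machine translation
CN106598947A (zh) * 2016-12-15 2017-04-26 山西大学 Bayesian word sense disambiguation method based on synonym expansion
CN107038158A (zh) * 2016-02-01 2017-08-11 松下知识产权经营株式会社 Parallel translation corpus creation method, apparatus, program, and machine translation system
CN108460026A (zh) * 2017-02-22 2018-08-28 华为技术有限公司 Translation method and apparatus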

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101840397A (zh) * 2009-03-20 2010-09-22 日电(中国)有限公司 Word sense disambiguation method and system
CN108170662A (zh) * 2016-12-07 2018-06-15 富士通株式会社 Disambiguation method and disambiguation device for abbreviations
US20190188324A1 (en) * 2017-12-15 2019-06-20 Microsoft Technology Licensing, Llc Enriching a knowledge graph

Also Published As

Publication number Publication date
CN112836057A (zh) 2021-05-25
CN112836057B (zh) 2024-03-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20888914

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20888914

Country of ref document: EP

Kind code of ref document: A1