CN105550253B - Method and device for acquiring type relationship - Google Patents

Publication number
CN105550253B
Authority
CN
China
Prior art keywords
type
types
groups
description text
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510900876.6A
Other languages
Chinese (zh)
Other versions
CN105550253A (en)
Inventor
葛宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing duxiaoman Youyang Technology Co.,Ltd.
Original Assignee
Shanghai Youyang New Media Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Youyang New Media Information Technology Co ltd
Priority to CN201510900876.6A
Publication of CN105550253A
Application granted
Publication of CN105550253B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5854 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques

Abstract

Embodiments of the invention provide a method and a device for acquiring a type relation. In the embodiments, each entity and the description text of each entity are obtained; the type corresponding to each entity is then obtained, and a description text is generated for each type from the description texts of the entities corresponding to that type; finally, according to a specified type relation, M groups of types that conform to the specified type relation are extracted from the description texts of the types, where M is a positive integer. The technical solution provided by the embodiments can therefore obtain the relationships between entity types automatically, improving the efficiency and reducing the cost of obtaining them.

Description

Method and device for acquiring type relationship
[ technical field ]
The invention relates to the technical field of internet application, in particular to a method and a device for acquiring a type relation.
[ background of the invention ]
A knowledge graph mainly consists of entities and the relationships between them, so obtaining the relationship between two entities is an essential step in building and refining a knowledge graph. More entities and inter-entity relationships can be mined from the types of the entities and the relationships between those types, so that the knowledge graph is continuously improved.
In the prior art, the relationships between types must be collected manually and added to the knowledge graph. Manual collection relies mainly on experience and knowledge for gathering and analysis, places high demands on the knowledge level of the personnel involved, and is a complex process, so its acquisition efficiency is low and its acquisition cost is high.
[ summary of the invention ]
In view of this, embodiments of the present invention provide a method and an apparatus for acquiring a type relationship, which can automatically acquire a relationship between types of an entity, improve acquisition efficiency of the relationship between the types of the entity, and reduce acquisition cost of the relationship between the types of the entity.
In one aspect of the embodiments of the present invention, a method for obtaining a type relationship is provided, including:
obtaining each entity and description texts of each entity;
obtaining the type corresponding to each entity;
generating a description text of each type according to the description text of each entity corresponding to each type;
and according to the specified type relation, extracting M groups of types which accord with the specified type relation from the description text of each type, wherein M is a positive integer.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where the obtaining the type corresponding to each entity includes:
aggregating the entities by type according to type classification knowledge, to obtain the type corresponding to each entity; or,
inputting each entity into a type classification model, so that the type classification model classifies each entity by type to obtain the type corresponding to each entity.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where generating the description text of each type according to the description text of each entity corresponding to each type includes:
performing word segmentation on the description text of each entity corresponding to each type to obtain word segmentation results;
matching in each word segmentation result by using a type knowledge base;
if a word segmentation result contains a keyword defined in the type knowledge base, extracting a text segment containing that word segmentation result;
and generating the description text of each type from the extracted text segments.
The foregoing aspect and any possible implementation manner further provide an implementation manner, where the extracting, according to a specified type relation, M groups of types that conform to the specified type relation from the description text of each type includes:
obtaining a specified relation template, wherein the relation template corresponds to a type relation and comprises text content indicating the type relation between two types;
performing character matching in the description text of each type by using the relation template, and extracting N groups of types from the description text of each type, wherein N is a positive integer greater than or equal to M;
and obtaining, from the extracted N groups of types, M groups of types that conform to the specified type relation.
The above-mentioned aspect and any possible implementation manner further provide an implementation manner, where obtaining, from the extracted N groups of types, M groups of types that conform to the specified type relation includes:
performing name normalization on P types in the N groups of types, wherein P is a positive integer;
and merging the normalized N groups of types into the M groups of types according to the types shared by different groups and the specified type relation.
The above-described aspects and any possible implementations further provide an implementation, where the method further includes: adding the specified type relation and the M groups of types that conform to the specified type relation to a knowledge graph.
In one aspect of the embodiments of the present invention, an apparatus for obtaining a type relationship is provided, including:
the receiving module is used for obtaining each entity and the description text of each entity;
the classification module is used for obtaining the type corresponding to each entity;
the generating module is used for generating the description text of each type according to the description text of each entity corresponding to each type;
and the acquisition module is used for extracting M groups of types which accord with the specified type relation from the description text of each type according to the specified type relation, wherein M is a positive integer.
The above-described aspect and any possible implementation further provide an implementation, where the classification module is specifically configured to:
aggregating the entities by type according to type classification knowledge, to obtain the type corresponding to each entity; or,
inputting each entity into a type classification model, so that the type classification model classifies each entity by type to obtain the type corresponding to each entity.
The above-described aspect and any possible implementation further provide an implementation, where the generating module is specifically configured to:
performing word segmentation on the description text of each entity corresponding to each type to obtain word segmentation results;
matching in each word segmentation result by using a type knowledge base;
if a word segmentation result contains a keyword defined in the type knowledge base, extracting a text segment containing that word segmentation result;
and generating the description text of each type from the extracted text segments.
The above-described aspect and any possible implementation manner further provide an implementation manner, where the obtaining module is specifically configured to:
obtaining a designated relation template, wherein the relation template corresponds to a type relation and comprises text content indicating the type relation between two types;
performing character matching in the description text of each type by using the relation template, and extracting N groups of types from the description text of each type, wherein N is a positive integer greater than or equal to M;
and obtaining M groups of types according with the specified type relation according to the extracted N groups of types.
The above-mentioned aspects and any possible implementation manner further provide an implementation manner, where the obtaining module, when obtaining from the extracted N groups of types the M groups of types that conform to the specified type relation, is specifically configured to:
perform name normalization on P types in the N groups of types, wherein P is a positive integer;
and merge the normalized N groups of types into the M groups of types according to the types shared by different groups and the specified type relation.
The above-described aspects and any possible implementations further provide an implementation, where the apparatus further includes: a processing module, configured to add the specified type relation and the M groups of types that conform to the specified type relation to the knowledge graph.
According to the technical scheme, the embodiment of the invention has the following beneficial effects:
Compared with the prior-art approach of manually acquiring the relationships between entity types, the embodiments of the invention acquire these relationships automatically and thus avoid manual collection, which improves the efficiency and reduces the cost of acquiring the relationships between entity types.
[ description of the drawings ]
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for acquiring a type relationship according to an embodiment of the present invention;
Fig. 2 is a functional block diagram of an apparatus for acquiring a type relationship according to an embodiment of the present invention.
[ detailed description ]
For better understanding of the technical solutions of the present invention, the following detailed descriptions of the embodiments of the present invention are provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only some embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the examples of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
Example one
An embodiment of the present invention provides a method for acquiring a type relationship. Please refer to Fig. 1, which is a schematic flow chart of the method for acquiring a type relationship according to an embodiment of the present invention. As shown in the figure, the method includes the following steps:
S101, obtaining each entity and the description text of each entity.
Specifically, a given number of entities, and a description text for each of those entities, may be obtained.
Taking a company as the entity, for example, the description text of the company may be a description of the company's business. Such a description may include the company's scope of operations, the company's name, the products it produces, the names of the suppliers of the raw materials and necessary parts it requires, and the like.
For example, for the entity "Huawei Technologies Co., Ltd.", the description text may be: "Huawei Technologies Co., Ltd. is a privately owned communication technology company that produces and sells communication equipment, headquartered in Bantian, Longgang District, Shenzhen, Guangdong Province, China. Huawei's products mainly cover switching networks, transmission networks, wireless and wired fixed access networks, data communication networks, and wireless terminal products in communication networks, and provide hardware equipment, software, services, and solutions for communication operators and professional network owners worldwide."
S102, obtaining the corresponding type of each entity.
Specifically, for each given entity, the type corresponding to each entity needs to be obtained first.
For example, the method for obtaining the type corresponding to each entity may include, but is not limited to, the following two methods:
Method one: aggregating the entities by type according to type classification knowledge, to obtain the type corresponding to each entity.
It is understood that the type classification knowledge includes a type classification tree, nodes in the type classification tree are types, and child nodes of the types are type expansion words.
In a specific implementation, word segmentation may be performed on the given description text of each entity to obtain the word segmentation result for that description text. Character matching is then performed in each word segmentation result using the types or the type expansion words in the type classification tree. If a word segmentation result hits a type or a type expansion word, the type of the entity corresponding to that word segmentation result can be taken to be the type that was hit, or the type to which the hit type expansion word belongs. In this way, the type of each entity can be determined from the word segmentation result of its description text, and the entities can be aggregated by type to obtain the type corresponding to each entity, that is, the plurality of entities corresponding to each type.
In the following example, the entity is a company and the type of the entity is the industry to which the company belongs.
For the entity "Huawei Technologies Co., Ltd.", the description text is: "Huawei Technologies Co., Ltd. is a privately owned communication technology company that produces and sells communication equipment, headquartered at the Huawei base in Longgang District, Shenzhen, Guangdong Province, China. Huawei's products mainly cover switching networks, transmission networks, wireless and wired fixed access networks, data communication networks, and wireless terminal products in communication networks, and provide hardware equipment, software, services, and solutions for communication operators and professional network owners worldwide."
After word segmentation is performed on the description text, character matching is performed in the word segmentation result using the types or type expansion words. The word segmentation result is found to hit the type "communication equipment manufacturing industry", which is a subtype of "manufacturing industry"; in the embodiment of the invention, the subtype "communication equipment manufacturing industry" can be used as the type of the entity. Alternatively, the word segmentation result may hit type expansion words such as "telephone", "mobile phone", or "network switch", and the type of the entity can likewise be determined to be "communication equipment manufacturing industry" from the type to which the hit expansion words belong. The type of the entity "Huawei Technologies Co., Ltd." is thus obtained, and the other entities corresponding to the type "communication equipment manufacturing industry" can be obtained in the same way, so that aggregating the entities yields the type of each entity and the plurality of entities corresponding to each type.
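The matching described for method one can be sketched as follows. This is a minimal illustration, not the patent's implementation: the tree contents, the function names, and the use of a lowercase regular-expression tokenizer in place of a real word segmenter are all assumptions made for the example.

```python
import re

# Hypothetical type classification tree: each type maps to its expansion words.
TYPE_TREE = {
    "communication equipment manufacturing": ["telephone", "mobile phone", "network switch"],
    "beverage manufacturing": ["soft drink", "bottled water"],
}

def segment(text):
    """Stand-in word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_phrase(tokens, phrase):
    """True if the tokens of `phrase` occur contiguously in `tokens`."""
    p = segment(phrase)
    return any(tokens[i:i + len(p)] == p for i in range(len(tokens) - len(p) + 1))

def entity_type(description, tree=TYPE_TREE):
    """Return the first type whose name or expansion word is hit, else None."""
    tokens = segment(description)
    for type_name, expansions in tree.items():
        if any(contains_phrase(tokens, term) for term in [type_name, *expansions]):
            return type_name
    return None

def aggregate(entities, tree=TYPE_TREE):
    """Group entity names by matched type: {type: [entity, ...]}."""
    by_type = {}
    for name, description in entities.items():
        t = entity_type(description, tree)
        if t is not None:
            by_type.setdefault(t, []).append(name)
    return by_type
```

Returning the first type that is hit is a simplification; an entity whose text hits several types would need a tie-breaking rule that the description does not specify.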
Method two: inputting each entity into a type classification model, so that the type classification model classifies each entity by type to obtain the type corresponding to each entity.
In a specific implementation, a large number of entities, together with their description texts and corresponding types, can be used as training samples, and a type classification model is generated by machine learning on these samples. Then, for each given entity, the name or the description text of the entity may be input into the type classification model, which identifies the type of the entity and outputs it. This is equivalent to obtaining the plurality of entities corresponding to each type.
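The description does not name a learning algorithm for method two. As one possible illustration, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing on (description text, type) pairs. The choice of Naive Bayes, the class name `TypeClassifier`, and the regular-expression tokenizer are assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict

def tokens(text):
    """Stand-in word segmentation: lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

class TypeClassifier:
    """Tiny multinomial Naive Bayes with add-one smoothing (an assumed choice)."""

    def fit(self, samples):
        # samples: iterable of (description_text, type_label) pairs
        self.word_counts = defaultdict(Counter)
        self.doc_counts = Counter()
        self.vocab = set()
        for text, label in samples:
            words = tokens(text)
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokens(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A typical use is `TypeClassifier().fit(training_pairs).predict(description)`; grouping the predictions by label then gives the entities corresponding to each type.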
S103, generating a description text of each type according to the description text of each entity corresponding to each type.
Specifically, since the type of each entity is obtained in S102, the plurality of entities corresponding to each type can be derived from the types of the entities. In this step, a description text is generated for each type from the description texts of the entities corresponding to that type.
For example, according to the description text of each entity corresponding to each type, the method for generating the description text of each type may include, but is not limited to:
First, word segmentation is performed on the description text of each entity corresponding to each type to obtain word segmentation results. Then, matching is performed in each word segmentation result using a type knowledge base; if a word segmentation result contains a keyword defined in the type knowledge base, a text segment containing that word segmentation result is extracted. Finally, the description text of each type is generated from the extracted text segments.
It should be noted that the type knowledge base may include names, alternative names, type expansion words, and the like of the types.
In a specific implementation, for each type, word segmentation may be performed on the description text of each entity corresponding to the type to obtain the word segmentation result for that description text. Character matching is then performed in each word segmentation result using the type knowledge base; if a word segmentation result hits a name, an alternative name, or a type expansion word of the type, a text segment containing that word segmentation result, such as a sentence or a passage, can be extracted from the description text of the entity. Text segments are extracted in this way from the description text of every entity corresponding to the type, and the set of extracted text segments is taken as the description text of the type. Repeating this for each type yields the description texts of all types.
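The segment extraction above might look like the following sketch. The knowledge-base contents, the function names, and the choice of a sentence as the unit of extraction are assumptions for illustration; splitting on sentence-ending punctuation stands in for real word segmentation.

```python
import re

# Hypothetical type knowledge base: type -> names, aliases and expansion words.
TYPE_KB = {
    "steel industry": ["steel industry", "steelmaking"],
}

def split_sentences(text):
    """Split after sentence-ending punctuation; a stand-in for real segmentation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def type_description(type_name, entity_descriptions, kb=TYPE_KB):
    """Collect every sentence that mentions a keyword of `type_name`."""
    keywords = [k.lower() for k in kb[type_name]]
    segments = []
    for description in entity_descriptions:
        for sentence in split_sentences(description):
            words = " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))
            # Pad with spaces so that keywords match whole words only.
            if any(f" {k} " in f" {words} " for k in keywords):
                if sentence not in segments:
                    segments.append(sentence)
    return segments
```

The returned list plays the role of the set of text segments that is taken as the description text of the type.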
Taking a company as the entity and the industry to which the company belongs as the type, for example, text such as "the raw material of industry a is provided by industry b", "industry a depends on industry b", or "industry a is influenced by industry b" can serve as the description text of industry a, where industry a is the type of the company.
S104, according to the specified type relation, extracting M groups of types which accord with the specified type relation from the description text of each type, wherein M is a positive integer.
Specifically, after the description text of each type is obtained, a plurality of groups of types that conform to the specified type relation may be extracted from the description texts according to the specified type relation, where each group may include two different types.
For example, in the embodiment of the present invention, according to a specified type relationship, a method for extracting, from a description text of each type, M groups of types that conform to the specified type relationship may include, but is not limited to:
First, a specified relation template is obtained; the relation template corresponds to a type relation and includes text content indicating the type relation between two types. Then, character matching is performed in the description text of each type using the relation template, and N groups of types are extracted from the description text of each type, where N is a positive integer greater than or equal to M. Finally, M groups of types that conform to the specified type relation are obtained from the extracted N groups of types.
It can be understood that the relation template defines the type relation between two types. When the relation template is matched against the description text of a type, a group of types can be extracted wherever the content of the description text matches the characters of the template. Because the description text of each type may include several text segments, several groups of types may be extracted from it. The two types in each extracted group conform to the type relation corresponding to the relation template, so the type relation between the two types is obtained.
Taking a company as the entity and the industry to which the company belongs as the type, for example, the type relation corresponding to the specified relation template is an industry relation. The relation templates may include "xx raw material is provided by xx", "xx depends on xx", or "xx is influenced by xx", and the industry relation corresponding to these templates is the "downstream and upstream" relation: of the two industries in each group extracted from the description text of a type with these templates, the former is a downstream industry of the latter and the latter is an upstream industry of the former.
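Template matching of this kind can be sketched with regular expressions. In the sketch below the English template wording, the `{a}` and `{b}` slot markers standing for the two "xx" positions, and the restriction of each slot to a known list of type names are all assumptions made for illustration.

```python
import re

# Hypothetical English renderings of the relation templates.
UPSTREAM_TEMPLATES = [
    "the raw material of {a} is provided by {b}",
    "{a} depends on {b}",
    "{a} is influenced by {b}",
]

def compile_templates(templates, type_names):
    """Turn each template into a regex whose slots match known type names."""
    # Longest names first so that a longer name wins over a name it contains.
    alt = "|".join(re.escape(n) for n in sorted(type_names, key=len, reverse=True))
    patterns = []
    for t in templates:
        body = re.escape(t)
        body = body.replace(re.escape("{a}"), f"(?P<a>{alt})")
        body = body.replace(re.escape("{b}"), f"(?P<b>{alt})")
        patterns.append(re.compile(body, re.IGNORECASE))
    return patterns

def extract_groups(text_segments, patterns):
    """Return one (downstream, upstream) pair per template hit."""
    groups = []
    for segment in text_segments:
        for pattern in patterns:
            for m in pattern.finditer(segment):
                groups.append((m.group("a").lower(), m.group("b").lower()))
    return groups
```

Limiting the slots to known type names keeps the match from swallowing surrounding words; an implementation over segmented Chinese text would delimit the slots differently.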
For example, in the embodiment of the present invention, the method for obtaining the M groups of types meeting the specified type relationship according to the extracted N groups of types may include, but is not limited to:
First, name normalization is performed on P types in the N groups of types, where P is a positive integer smaller than 2N. Then, the normalized N groups of types are merged into the M groups of types according to the types shared by different groups and the specified type relation.
For example, if two groups contain the same type under different names, one being the exact name and the other an alias, the name of that type in both groups can be unified to the exact name. Once the names are unified, two groups that contain the same type can readily be merged into a new group, and the new group still conforms to the type relation of the two original groups.
Taking industries as the types, for example, if the first group conforming to the upstream and downstream industry relation is "industry 1-industry 2" and the second group is "industry 2-industry 3", the two groups can be merged into a single group, "industry 1-industry 2-industry 3", in which each industry is the upstream industry of the one that follows it and the downstream industry of the one that precedes it.
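The normalization and merging steps can be sketched as below. The alias table, the function names, and the greedy end-to-start joining strategy are assumptions for illustration; the description does not say how branching or cyclic groups should be handled, so the sketch simply leaves such groups unmerged.

```python
def normalize(groups, aliases):
    """Replace each alias with its exact name (aliases: alias -> exact name)."""
    return [tuple(aliases.get(t, t) for t in g) for g in groups]

def merge_groups(groups):
    """Join groups end to start: (1, 2) + (2, 3) -> (1, 2, 3)."""
    # dict.fromkeys drops exact duplicates while keeping the original order.
    chains = [list(g) for g in dict.fromkeys(groups)]
    merged = True
    while merged:
        merged = False
        for i, left in enumerate(chains):
            for j, right in enumerate(chains):
                # Join only when the shared type is the junction and no cycle forms.
                if i != j and left[-1] == right[0] and not set(left[:-1]) & set(right):
                    chains[i] = left + right[1:]
                    del chains[j]
                    merged = True
                    break
            if merged:
                break
    return [tuple(c) for c in chains]
```

With three extracted groups of which two share "industry 2" after normalization, N = 3 groups are reduced to M = 2.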
Optionally, in a possible implementation manner of this embodiment, after obtaining M groups of types that conform to the specified type relationship in S104, the type relationship and the M groups of types that conform to the type relationship may be added to the knowledge graph.
For example, the obtained M groups of types may be added under the type relation in the knowledge graph, and a group among the M groups may include more than two types. Alternatively, each type in the knowledge graph may be labeled with another type and with its type relation to that type.
Taking industries as the types, for example, the relationship between two industries can be added to a business-related or market-related knowledge graph, and multiple groups of industries can be added under the industry relation, meaning that the two or more industries in each group have that industry relation. The relationships between industries can thus be mined and the relationship between any two industries obtained. Industry relationships can reveal the supply chain of a business and are an important component of competitive business intelligence, so they are of considerable practical value.
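One simple way to hold such relations is an adjacency structure keyed by relation name. The sketch below is a toy in-memory stand-in, not a real knowledge-graph store; the class name `KnowledgeGraph`, the relation label `upstream_of`, and the choice to record each adjacent pair of a group as one edge are assumptions for illustration.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy graph: edges[relation][source] is the set of related target types."""

    def __init__(self):
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_groups(self, relation, groups):
        """Record each adjacent pair in every group under `relation`."""
        for group in groups:
            for source, target in zip(group, group[1:]):
                self.edges[relation][source].add(target)

    def related(self, relation, source):
        """Sorted targets of `source` under `relation`; empty list if none."""
        return sorted(self.edges.get(relation, {}).get(source, ()))
```

For a group such as ("industry 1", "industry 2", "industry 3"), `add_groups` stores two edges, so each type ends up labeled with the neighboring type and the relation between them.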
The embodiment of the invention further provides an embodiment of a device for realizing the steps and the method in the embodiment of the method.
Please refer to fig. 2, which is a functional block diagram of an apparatus for obtaining type relationships according to an embodiment of the present invention. As shown, the apparatus comprises:
a receiving module 21, configured to obtain each entity and a description text of each entity;
a classification module 22, configured to obtain a type corresponding to each entity;
the generating module 23 is configured to generate a description text of each type according to the description text of each entity corresponding to each type;
and the obtaining module 24 is configured to extract M groups of types meeting the specified type relationship from the description text of each type according to the specified type relationship, where M is a positive integer.
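The data flow between the four modules listed above can be sketched as a chain of callables. All names and data shapes here are assumptions made for illustration (dictionaries keyed by entity or type name, and stub implementations of each module); the patent describes the modules functionally and does not fix these interfaces.

```python
def acquire_type_relations(receive, classify, generate, acquire, relation):
    """Wire the four modules together in the order the apparatus describes."""
    entity_texts = receive()                           # module 21: {entity: description text}
    entity_types = classify(entity_texts)              # module 22: {entity: type}
    type_texts = generate(entity_texts, entity_types)  # module 23: {type: description text}
    return acquire(type_texts, relation)               # module 24: list of groups of types


# Stub modules, for demonstration only.
def receive():
    return {"Company A": "Steel is the upstream industry of automobiles."}


def classify(entity_texts):
    return {entity: "steel" for entity in entity_texts}


def generate(entity_texts, entity_types):
    return {entity_types[e]: text for e, text in entity_texts.items()}


def acquire(type_texts, relation):
    return [("steel", "automobiles")] if type_texts else []


groups = acquire_type_relations(receive, classify, generate, acquire, "upstream_of")
```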
In a specific implementation process, the classification module 22 is specifically configured to:
classifying knowledge according to types, and aggregating the entities according to the types to obtain the types corresponding to the entities; or respectively inputting each entity into the type classification model, so that the type classification model performs type classification on each entity to obtain the type corresponding to each entity.
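Both alternatives of the classification module end with a mapping from entities to types. The sketch below assumes the type classification model can be treated as a plain callable from an entity to a type name; `toy_model`, the entity names, and the function names are hypothetical stand-ins, not part of the patent.

```python
from collections import defaultdict


def classify_entities(entities, model):
    """Feed each entity to the model separately and record the predicted type."""
    return {entity: model(entity) for entity in entities}


def aggregate_by_type(entity_types):
    """Group entities under their type: {type: [entities]}."""
    by_type = defaultdict(list)
    for entity, type_name in entity_types.items():
        by_type[type_name].append(entity)
    return dict(by_type)


def toy_model(entity):
    """Hypothetical stand-in for a trained type classification model."""
    return "steel" if "Steel" in entity else "automobile"


entity_types = classify_entities(["Acme Steel", "Best Motors", "Delta Steel"], toy_model)
by_type = aggregate_by_type(entity_types)
```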
In a specific implementation process, the generating module 23 is specifically configured to:
performing word segmentation on the description text of each entity corresponding to each type to obtain word segmentation results;
matching in each word segmentation result by utilizing a type knowledge base;
if a word segmentation result contains a keyword defined in the type knowledge base, extracting the text segments containing that word segmentation result;
and generating a description text of each type according to the extracted text segments.
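The steps above can be sketched as follows, under several assumptions: a text segment is taken to be a sentence, word segmentation is approximated with a simple regular expression (Chinese text would need a proper word segmenter), and the type knowledge base is reduced to a set of keywords. `TYPE_KEYWORDS` and all function names are hypothetical.

```python
import re

# Hypothetical keyword set standing in for the type knowledge base.
TYPE_KEYWORDS = {"upstream", "downstream", "industry"}


def split_segments(text):
    """Split a description text into sentence-like text segments."""
    return [s.strip() for s in re.split(r"[.!?。！？]", text) if s.strip()]


def segment_words(segment):
    """Stand-in word segmentation: lowercase runs of word characters."""
    return re.findall(r"\w+", segment.lower())


def type_description(entity_texts, keywords=TYPE_KEYWORDS):
    """Keep the segments whose words include a keyword and join them into one text."""
    kept = []
    for text in entity_texts:
        for segment in split_segments(text):
            if keywords.intersection(segment_words(segment)):
                kept.append(segment)
    return ". ".join(kept)


description = type_description([
    "Company A was founded in 1990. It belongs to the steel industry.",
    "Company B sells coffee.",
])
```

Only the segment that mentions a keyword survives, so the generated type description omits entity details that say nothing about the type.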
In a specific implementation process, the obtaining module 24 is specifically configured to:
obtaining a designated relation template, wherein the relation template corresponds to a type relation and comprises text content indicating the type relation between two types;
performing character matching in the description text of each type by using the relation template, and extracting N groups of types from the description text of each type; n is greater than or equal to M and is a positive integer;
and obtaining M groups of types according with the specified type relation according to the extracted N groups of types.
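Character matching with a relation template can be sketched with a regular expression that has one slot for each of the two types. The template text `UPSTREAM_TEMPLATE`, the group names `head` and `tail`, and the lowercasing of extracted names are assumptions made for illustration; the patent only states that a template contains text indicating the relation between two types.

```python
import re

# Hypothetical template for the upstream/downstream relation, with two type slots.
UPSTREAM_TEMPLATE = r"(?P<head>[\w ]+?) is the upstream industry of (?P<tail>[\w ]+)"


def extract_groups(type_texts, pattern=UPSTREAM_TEMPLATE):
    """Match the template in each type's description text and collect (head, tail) groups."""
    groups = []
    for text in type_texts.values():
        for match in re.finditer(pattern, text):
            head = match.group("head").strip().lower()  # lowercased so names compare equal
            tail = match.group("tail").strip().lower()
            groups.append((head, tail))
    return groups


type_texts = {
    "steel": "Steel is the upstream industry of automobile manufacturing.",
    "automobile manufacturing": "Automobile manufacturing is the upstream industry of auto retail.",
}
extracted = extract_groups(type_texts)
```

The extracted groups correspond to the N groups of types; normalization and merging then reduce them to the M groups.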
In a specific implementation process, when the obtaining module 24 is configured to obtain, according to the extracted N groups of types, M groups of types that conform to the specified type relationship, specifically:
carrying out name normalization processing on P types in the N groups of types, wherein P is a positive integer;
and for the N groups of types after the normalization processing, combining the N groups of types into the M groups of types according to the same type belonging to different groups and the specified type relation.
Optionally, in a possible implementation manner of this embodiment, the apparatus further includes:
and the processing module 25 is used for adding the specified type relation and the M groups of types which accord with the specified type relation to the knowledge graph.
Since each unit in the present embodiment can execute the method shown in fig. 1, reference may be made to the related description of fig. 1 for a part of the present embodiment that is not described in detail.
The technical scheme of the embodiment of the invention has the following beneficial effects:
in the embodiment of the invention, each entity and the description text of each entity are obtained; thus, the types corresponding to the entities are obtained, and the description text of each type is generated according to the description text of each entity corresponding to each type; and further, according to the specified type relation, extracting M groups of types which accord with the specified type relation from the description text of each type, wherein M is a positive integer.
Compared with the prior-art approach of manually acquiring the relationships between the types of entities, the technical scheme provided by the embodiments of the invention acquires these relationships automatically. This avoids manual acquisition, improves acquisition efficiency, reduces acquisition cost, and saves manpower and material resources.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions in actual implementation, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method for obtaining a type relationship, the method comprising:
obtaining each entity and description texts of each entity;
obtaining the type corresponding to each entity;
generating a description text of each type according to the description text of each entity corresponding to each type;
according to the specified type relation, extracting M groups of types which accord with the specified type relation from the description text of each type, wherein M is a positive integer;
wherein generating the description text of each type according to the description text of each entity corresponding to each type comprises:
performing word segmentation on the description text of each entity corresponding to each type to obtain word segmentation results;
matching in each word segmentation result by utilizing a type knowledge base;
if a word segmentation result contains a keyword defined in the type knowledge base, extracting the text segments containing that word segmentation result;
and generating a description text of each type according to the extracted text segments.
2. The method of claim 1, wherein the obtaining the type corresponding to each entity comprises:
classifying knowledge according to types, and aggregating the entities according to the types to obtain the types corresponding to the entities; or
and respectively inputting each entity into a type classification model so as to enable the type classification model to classify the types of the entities to obtain the types corresponding to the entities.
3. The method according to claim 1, wherein the extracting, according to the specified type relationship, M groups of types that conform to the specified type relationship from the description text of each type includes:
obtaining a designated relation template, wherein the relation template corresponds to a type relation and comprises text content indicating the type relation between two types;
performing character matching in the description text of each type by using the relation template, and extracting N groups of types from the description text of each type; n is greater than or equal to M and is a positive integer;
and obtaining M groups of types according with the specified type relation according to the extracted N groups of types.
4. The method according to claim 3, wherein said obtaining M groups of types that conform to the specified type relationship according to the extracted N groups of types comprises:
carrying out name normalization processing on P types in the N groups of types, wherein P is a positive integer;
and for the N groups of types after the normalization processing, combining the N groups of types into the M groups of types according to the same type belonging to different groups and the specified type relation.
5. The method according to any one of claims 1 to 4, further comprising: and adding the specified type relation and the M groups of types which accord with the specified type relation to a knowledge graph.
6. An apparatus for obtaining a type relationship, the apparatus comprising:
the receiving module is used for obtaining each entity and the description text of each entity;
the classification module is used for obtaining the type corresponding to each entity;
the generating module is used for generating the description text of each type according to the description text of each entity corresponding to each type;
the acquisition module is used for extracting M groups of types which accord with the specified type relation from the description text of each type according to the specified type relation, wherein M is a positive integer;
the generation module is specifically configured to:
performing word segmentation on the description text of each entity corresponding to each type to obtain word segmentation results;
matching in each word segmentation result by utilizing a type knowledge base;
if a word segmentation result contains a keyword defined in the type knowledge base, extracting the text segments containing that word segmentation result;
and generating a description text of each type according to the extracted text segments.
7. The apparatus according to claim 6, wherein the classification module is specifically configured to:
classifying knowledge according to types, and aggregating the entities according to the types to obtain the types corresponding to the entities; or
and respectively inputting each entity into a type classification model so as to enable the type classification model to classify the types of the entities to obtain the types corresponding to the entities.
8. The apparatus of claim 6, wherein the obtaining module is specifically configured to:
obtaining a designated relation template, wherein the relation template corresponds to a type relation and comprises text content indicating the type relation between two types;
performing character matching in the description text of each type by using the relation template, and extracting N groups of types from the description text of each type; n is greater than or equal to M and is a positive integer;
and obtaining M groups of types according with the specified type relation according to the extracted N groups of types.
9. The apparatus according to claim 8, wherein the obtaining module, when obtaining, according to the extracted N groups of types, M groups of types that conform to the specified type relationship, is specifically configured to:
carrying out name normalization processing on P types in the N groups of types, wherein P is a positive integer;
and for the N groups of types after the normalization processing, combining the N groups of types into the M groups of types according to the same type belonging to different groups and the specified type relation.
10. The apparatus of any one of claims 6 to 9, further comprising:
and the processing module is used for adding the specified type relation and the M groups of types which accord with the specified type relation to the knowledge graph.
CN201510900876.6A 2015-12-09 2015-12-09 Method and device for acquiring type relationship Active CN105550253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510900876.6A CN105550253B (en) 2015-12-09 2015-12-09 Method and device for acquiring type relationship

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510900876.6A CN105550253B (en) 2015-12-09 2015-12-09 Method and device for acquiring type relationship

Publications (2)

Publication Number Publication Date
CN105550253A CN105550253A (en) 2016-05-04
CN105550253B true CN105550253B (en) 2021-02-12

Family

ID=55829442

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510900876.6A Active CN105550253B (en) 2015-12-09 2015-12-09 Method and device for acquiring type relationship

Country Status (1)

Country Link
CN (1) CN105550253B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156286B (en) * 2016-06-24 2019-09-17 广东工业大学 Type extraction system and method towards technical literature knowledge entity
CN107844471B (en) * 2016-09-20 2021-06-04 科大讯飞股份有限公司 Text description type identification method and device
CN107031617B (en) * 2017-02-22 2019-06-14 湖北文理学院 A kind of method and device that automobile intelligent drives
CN108154198B (en) * 2018-01-25 2021-07-13 北京百度网讯科技有限公司 Knowledge base entity normalization method, system, terminal and computer readable storage medium
CN109558584A (en) * 2018-10-26 2019-04-02 平安科技(深圳)有限公司 Business connection prediction technique, device, computer equipment and storage medium
CN111209389B (en) * 2019-12-31 2023-08-11 天津外国语大学 Movie story generation method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049490B (en) * 2012-12-05 2016-09-07 北京海量融通软件技术有限公司 Between knowledge network node, attribute generates system and the method for generation
CN103488789B (en) * 2013-10-08 2017-08-18 百度在线网络技术(北京)有限公司 Recommendation method, device and search engine

Also Published As

Publication number Publication date
CN105550253A (en) 2016-05-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191209

Address after: 201210 room j1328, floor 3, building 8, No. 55, Huiyuan Road, Jiading District, Shanghai

Applicant after: SHANGHAI YOUYANG NEW MEDIA INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 100085 Baidu building, No. 10, ten Street, Haidian District, Beijing

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

GR01 Patent grant
CP03 Change of name, title or address

Address after: 401120 b7-7-2, Yuxing Plaza, No.5, Huangyang Road, Yubei District, Chongqing

Patentee after: Chongqing duxiaoman Youyang Technology Co.,Ltd.

Address before: 201210 room j1328, 3 / F, building 8, 55 Huiyuan Road, Jiading District, Shanghai

Patentee before: SHANGHAI YOUYANG NEW MEDIA INFORMATION TECHNOLOGY Co.,Ltd.
