CN111753020A - Method and device for establishing relational network model - Google Patents

Info

Publication number
CN111753020A
Authority
CN
China
Prior art keywords
structured data
attribute
knowledge base
words
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910246823.5A
Other languages
Chinese (zh)
Inventor
崔莉
李圣
李非凡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201910246823.5A priority Critical patent/CN111753020A/en
Publication of CN111753020A publication Critical patent/CN111753020A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a method and a device for establishing a relational network model. The method comprises the following steps: acquiring structured data to be processed; obtaining entity information and the relationships between entities according to the structured data and a corresponding knowledge base; and establishing a relational network model for the corresponding scene according to the entity information and the relationships between the entities. The method addresses the low modeling efficiency of the prior art, in which the structured data must be labeled manually.

Description

Method and device for establishing relational network model
Technical Field
The present application relates to the field of data processing, and in particular, to a method and an apparatus for establishing a relational network model.
Background
Many application scenarios on the Internet generate massive amounts of data. An online shopping mall, for example, produces large volumes of sales and payment data every day. Analyzing and using such data to understand the real world it reflects is an important task in the field of big data. One main goal of this analysis is to abstract from the data the real-world entities it reflects, together with the attributes of those entities and the relationships among them; this analysis is called relational network analysis.
Relational network analysis presupposes a relational network model established for a preset scene; on the basis of that model, specific data can then be used effectively to analyze specific entities and entity relationships. The model must reflect which types of entities exist in the scene, which relationships hold among the entity types, and which attributes the entities and relationships have. Currently, the main form of the relational network model is the OLP relational network model.
OLP is an abbreviation for Object-Link-Property. Object represents an entity in a business scenario, either a "person" (e.g., member, retailer) or an "object" (e.g., merchandise, store). Link represents a relationship in a business scenario; relationships are divided into behavior relationships and fact relationships. For example, in "a user purchases commodities", purchasing is a behavior relationship; in "a shelf belongs to a store", belonging is a fact relationship. Property represents the attributes of entities and relationships: a "store" (object) has the attributes address and city, and "purchase" (relationship) has the attributes payment time and payment form.
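The Object-Link-Property structure described above can be illustrated with a minimal Python sketch. The class names `OlpObject`, `OlpLink`, and `OlpModel` are hypothetical and chosen only for illustration; the application does not prescribe any particular data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OlpObject:
    """An entity type ("Object"), e.g. a store or a user."""
    name: str
    properties: List[str] = field(default_factory=list)


@dataclass
class OlpLink:
    """A relationship type ("Link") between two entity types."""
    name: str
    source: str
    target: str
    kind: str  # "behavior" or "fact"
    properties: List[str] = field(default_factory=list)


@dataclass
class OlpModel:
    """A container holding entity types and the links between them."""
    objects: Dict[str, OlpObject] = field(default_factory=dict)
    links: List[OlpLink] = field(default_factory=list)

    def add_object(self, obj: OlpObject) -> None:
        self.objects[obj.name] = obj

    def add_link(self, link: OlpLink) -> None:
        # A link may only connect entity types that are already in the model.
        if link.source not in self.objects or link.target not in self.objects:
            raise ValueError(f"unknown endpoint in link {link.name!r}")
        self.links.append(link)


# The examples from the text: purchase (behavior) and belongs-to (fact).
model = OlpModel()
for obj in (OlpObject("user"), OlpObject("commodity"),
            OlpObject("store", ["address", "city"]), OlpObject("shelf")):
    model.add_object(obj)
model.add_link(OlpLink("purchase", "user", "commodity", "behavior",
                       ["payment time", "payment form"]))
model.add_link(OlpLink("belongs to", "shelf", "store", "fact"))
```

Keeping properties on both objects and links mirrors the statement that attributes belong to entities and to relationships alike.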
Currently, structured data (e.g., an Excel table, a CSV file, a database table) generated in a preset scene is generally analyzed, and the OLP model is obtained according to the analysis of the structured data.
In the prior art, when a user needs to perform relational network analysis on an application scene in which the user is interested, the user needs to manually label the obtained typical structured data to obtain an OLP model in the scene, which is used as a basis for data analysis.
In the prior art, a user needs to perform manual modeling in a background management system, that is, to construct a desired entity and relationship by analyzing structured data, which is used as a basis for subsequent data analysis.
However, in a complex application scenario where the user has a large number of data tables, the workload of manual modeling is huge, which seriously hinders the adoption of the OLP relational network model.
Disclosure of Invention
The application provides a method and a device for establishing a relational network model, which aim to solve the low modeling efficiency of the prior art, in which the structured data must be labeled manually in order to obtain the attributes, entities, and relationships it contains.
The application provides a method for establishing a relational network model, which comprises the following steps:
acquiring structured data to be processed;
obtaining entity information and the relationships between entities according to the structured data and the corresponding knowledge base;
and establishing a relational network model for the corresponding scene according to the entity information and the relationships between the entities.
Optionally, the structured data is structured data of a corresponding scene.
Optionally, obtaining the entity information and the relationships between entities according to the structured data and the corresponding knowledge base includes:
identifying, according to the attribute information of the structured data and on the basis of the knowledge base, the entity information contained in the structured data;
and determining, according to the attribute information, the identified entity information, and the knowledge base, the relationships between the entities corresponding to the entity information.
Optionally, the attribute information of the structured data is obtained by the following step:
identifying the structured data according to the knowledge provided by the pre-prepared knowledge base to obtain the attribute information of the structured data.
Optionally, the pre-prepared knowledge base is constructed by the following steps:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
Optionally, the attribute words are obtained by expanding the basic attribute words of the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumeration words of the basic attribute words corresponding to the knowledge base.
Optionally, the method for establishing a relational network model further includes:
setting, for the expansion mode of each dimension, a weight value range corresponding to that dimension; and setting, for each specific attribute word, its specific weight within the weight value range of the corresponding expansion mode.
Optionally, identifying the attribute information of the structured data according to the knowledge provided by the pre-prepared knowledge base includes:
searching, for the structured data, among the synonyms, the regular expressions, and the enumeration words of the basic attribute words to obtain corresponding attribute information;
and identifying the attribute information contained in the structured data according to the corresponding attribute information.
Optionally, searching, for the structured data, among the synonyms, the regular expressions, and the enumeration words of the basic attribute words to obtain corresponding attribute information includes:
searching among the synonyms of the basic attribute words to obtain first candidate attribute information corresponding to the structured data;
searching among the regular expressions of the basic attribute words to obtain second candidate attribute information corresponding to the structured data;
searching among the enumeration words of the basic attribute words to obtain third candidate attribute information corresponding to the structured data;
and obtaining the attribute information corresponding to the structured data according to the first candidate attribute information, the second candidate attribute information, the third candidate attribute information, and the corresponding weight values.
Optionally, identifying, according to the attribute information of the structured data and on the basis of the knowledge base, the entity information contained in the structured data includes:
searching the knowledge base according to the primary key attribute in the attribute information of the structured data to obtain the entity information of the structured data.
Optionally, determining, according to the attribute information, the identified entity information, and the knowledge base, the relationships between the entities corresponding to the entity information includes:
searching the knowledge base according to the attribute information and the entity information of the structured data to determine candidate relationships between the entities corresponding to the entity information;
and determining, from the candidate relationships, the relationships between the entities corresponding to the entity information.
Optionally, establishing the relational network model for the corresponding scene according to the entity information and the relationships between the entities includes:
establishing a mapping between the structured data and the entities and a mapping between the structured data and the relationships, according to the attribute information of the structured data, the entity information of the structured data, and the relationships between the entities corresponding to the entity information;
and establishing the relational network model for the corresponding scene according to the mapping between the structured data and the entities and the mapping between the structured data and the relationships.
Optionally, the method for establishing a relational network model further includes:
receiving a request from a client to query a network relationship;
in response to the request, acquiring the relational network model from the electronic device that stores it;
acquiring network relationship data from the relational network model according to the request;
and providing the network relationship data to the client.
The application provides a device for establishing a relational network model, which comprises:
a data acquisition unit, configured to acquire structured data to be processed;
a relation determining unit, configured to obtain entity information and the relationships between entities according to the structured data and the corresponding knowledge base;
and a model establishing unit, configured to establish a relational network model for the corresponding scene according to the entity information and the relationships between the entities.
The application provides an electronic device, including:
a processor;
and a memory for storing a computer program, wherein the device, after running the computer program through the processor, executes the above method for establishing a relational network model or the method for constructing a knowledge base.
The present application provides a computer storage medium storing a computer program that, when executed by a processor, performs the above method for establishing a relational network model or the following method for constructing a knowledge base.
The application provides a method for constructing a knowledge base, which comprises the following steps:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
Optionally, the attribute words are obtained by expanding the basic attribute words of the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumeration words of the basic attribute words corresponding to the knowledge base.
With the method for establishing a relational network model provided by the application, structured data to be processed is first acquired; entity information and the relationships between entities are then obtained according to the structured data and the corresponding knowledge base; finally, a relational network model for the corresponding scene is established according to the entity information and the relationships between the entities. In this way, starting from structured data obtained in the preset scene to be analyzed, semi-automatic or even fully automatic modeling can be achieved using the established knowledge base, which solves the low modeling efficiency caused in the prior art by manual labeling of the structured data.
Drawings
FIG. 1 is a flowchart of a method for building a relational network model according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a relational network model to which a first embodiment of the present application relates;
FIG. 3 is a diagram illustrating an apparatus for building a relational network model according to a second embodiment of the present disclosure;
FIG. 4 is a flowchart of a method for building a knowledge base according to a fifth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described here, and those skilled in the art can make similar extensions without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.
A first embodiment of the present application provides a method for establishing a relational network model. Please refer to FIG. 1, which is a flowchart of the first embodiment. The method is described in detail below with reference to FIG. 1.
Step S101: acquire structured data to be processed.
In this step, structured data to be processed is acquired; the structured data corresponds to a preset scene.
Structured data refers to data that is stored in a database and can be logically represented as a two-dimensional table, for example, data stored in an Excel table or in a database table.
The structured data can be derived from existing data obtained in a field. A field here refers to a specific industry such as public security, finance, or telecommunications; each industry has its own distinguishing characteristics. For example, in the field of telecommunications, the data stored in a database to record information such as call duration and call parties is typical structured data.
The structured data is structured data of a corresponding scene. The scenario refers to the use environment of structured data in the field, for example, in the financial field, network payment is a scenario. The related structured data for the network payment, such as the payer information and the payee information of the network payment, corresponds to the network payment scenario.
Generally, the structured data here is various business related data that has been collected in the domain. For example, in the field of telecommunications, as the amount of information grows, a very large pool of information resources is formed. In this information repository, a lot of structured data is kept. The structured data can be stored in a local database or a cloud database.
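As a minimal illustration of structured data in the sense used here, the following Python sketch loads a small CSV table into a list of rows keyed by field name. The column names and values are hypothetical call-record data invented for the example, not data from the application.

```python
import csv
import io

# Hypothetical telecom call records; the columns are assumptions for illustration.
RAW_TABLE = (
    "caller,callee,duration_seconds\n"
    "13800000000,13900000000,62\n"
    "13700000000,13800000000,15\n"
)


def load_table(text):
    """Parse CSV text into (field names, list of row dictionaries)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return list(reader.fieldnames), rows


field_names, rows = load_table(RAW_TABLE)
```

The field names are what the later attribute-identification step would inspect first; the row values are available for content-based checks.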
Step S102: obtain entity information and the relationships between entities according to the structured data and the corresponding knowledge base.
In this step, entity information and the relationships between entities are obtained according to the structured data and the corresponding knowledge base.
Obtaining the entity information and the relationships between entities according to the structured data and the corresponding knowledge base includes:
identifying, according to the attribute information of the structured data and on the basis of the knowledge base, the entity information contained in the structured data;
and determining, according to the attribute information, the identified entity information, and the knowledge base, the relationships between the entities corresponding to the entity information.
The attribute information of the structured data is obtained by the following step:
identifying the structured data according to the knowledge provided by the pre-prepared knowledge base to obtain the attribute information of the structured data.
After identifying the attribute information of the structured data, it is possible to further identify entities and relationships between entities on this basis.
In general, a knowledge base is the set of rules used in the design of an expert system, together with the facts and data associated with those rules. In this application, a knowledge base refers to a collection of entities, relationships, attributes, and attribute synonyms related to a domain. In this embodiment, when few samples are available, the relevant personnel can manually construct a knowledge base for a domain using domain vocabulary and industry characteristics. The constructed knowledge base can be stored in a local database or a cloud database.
The construction steps of the knowledge base are explained in detail below.
The pre-prepared knowledge base is constructed by the following steps:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
First, domain knowledge corresponding to the knowledge base is collected to obtain the attribute words.
Domain knowledge can come from several sources. For example, attribute words likely to occur in a specific field can be obtained from common knowledge about that field: common knowledge about online shopping indicates that attribute words such as buyer ID and seller ID are needed in the online shopping field. Relying on common knowledge alone, however, may leave many omissions, so structured data already generated in the field can also be collected, and attribute words that may appear in the field can be obtained from it. The structured data can be Excel files, CSV files, or structured database tables. For example, if the structured data is stored as an Excel table, the field names in the table can serve as the main source of attribute words; field values, labels, and even the name of the Excel table can also be sources.
Second, the entity words of the field are obtained according to the attribute words of the field.
Attribute words often reflect entities existing in the real world. For example, a unified social credit code corresponds to a legal person entity and, in combination with other attributes of the structured data, can reflect specific information about that legal person in a specific field. For example, if an Excel table includes attributes such as unified social credit code, vehicle brand, vehicle time of use, vehicle use status, and vehicle license plate, the table reflects an entity "legal person" and an entity "vehicle". This embodiment assumes that the specific field is the business field.
The entities "legal person" and "vehicle" can be obtained by the expert constructing the knowledge base, who summarizes the table content using their own knowledge, or can be derived by a machine from an initial knowledge base. It is also known that "legal person" has the attribute "unified social credit code", and "vehicle" has attributes such as "brand", "time of use", "license plate", and "use status".
Third, the relation words of the field are determined according to the attribute words and the entity words of the field.
For example, from an Excel table that includes attributes such as unified social credit code, vehicle license plate, and vehicle time of use, it can be inferred that "legal person" and "vehicle" have the relationship "own".
In the above example, two entities, "legal person" and "vehicle", are obtained from a table that includes attributes such as unified social credit code, vehicle license plate, and vehicle time of use. "Legal person" has the attribute "unified social credit code"; because this attribute corresponds one-to-one with a legal person entity, it can serve as the primary key of the "legal person" entity. Similarly, "vehicle" has attributes such as license plate, time of use, and use status; the license plate corresponds one-to-one with a specific vehicle and can therefore serve as the primary key of the "vehicle" entity. This conclusion presupposes an understanding of the specific field, here the business field. The relationship "own" holds between "legal person" and "vehicle", and "own" has the attribute "time of possession".
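The legal person and vehicle example can be sketched as a small in-memory knowledge base. The dictionary layout and the helper names `entities_for_attributes` and `relations_between` are assumptions made for illustration; the application does not specify how the knowledge base is stored or queried.

```python
# A toy knowledge base for the business field (layout is hypothetical).
KNOWLEDGE_BASE = {
    "entities": {
        "legal person": {
            "primary_key": "unified social credit code",
            "attributes": ["unified social credit code"],
        },
        "vehicle": {
            "primary_key": "license plate",
            "attributes": ["license plate", "brand", "time of use", "use status"],
        },
    },
    "relations": [
        {"name": "own", "source": "legal person", "target": "vehicle",
         "attributes": ["time of possession"]},
    ],
}


def entities_for_attributes(kb, attributes):
    """Return the entities whose primary-key attribute appears in the table."""
    present = set(attributes)
    return [name for name, spec in kb["entities"].items()
            if spec["primary_key"] in present]


def relations_between(kb, entity_names):
    """Return the relations whose two endpoints were both identified."""
    names = set(entity_names)
    return [rel["name"] for rel in kb["relations"]
            if rel["source"] in names and rel["target"] in names]


table_attributes = ["unified social credit code", "license plate", "time of use"]
found_entities = entities_for_attributes(KNOWLEDGE_BASE, table_attributes)
found_relations = relations_between(KNOWLEDGE_BASE, found_entities)
```

Looking entities up by primary key follows the text's observation that a primary-key attribute corresponds one-to-one with an entity; candidate relations are then limited to those connecting entities already identified.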
On the basis of some basic attribute words, the attribute words can be expanded in several ways to complete the knowledge in the knowledge base; some common expansion modes are given below.
The attribute words are obtained by expanding the basic attribute words of the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumeration words of the basic attribute words corresponding to the knowledge base.
Here, a basic attribute word is the normalized representation of an attribute. For example, the basic attribute "mobile phone number" may appear in the unstructured data as phone number, cell phone, contact, number 1, number 2, string 1, or string 2.
For a mobile phone number, the synonyms of the basic attribute word may include: mobile, phone number, shouji.
Regular expressions are a concept from computer science. They are typically used to retrieve or replace text that conforms to a certain pattern (rule).
For a mobile phone number, the regular expression of the basic attribute word may be: String regex = "^(((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])|(17[0135678])|(18[0-9])|(19[89]))\\d{8})$";
[1] indicates that the first character of the string is 1;
[3, 4, 5, 7, 8] indicates that the second character of the string can be any one of 3, 4, 5, 7, 8;
[0-9] indicates that the third through eleventh characters of the string are nine digits, each in the range 0-9.
With such a regular expression, whether unstructured field content is a mobile phone number can be judged from the characteristics of mobile phone numbers.
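A minimal Python sketch of this check follows. The pattern is a reconstruction of the expression quoted above, ported to Python's `re` module (a raw string needs only a single backslash before `d`); the helper names are hypothetical.

```python
import re

# Reconstructed from the pattern in the text; three-digit prefix plus eight digits.
MOBILE_PATTERN = re.compile(
    r"^(((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])"
    r"|(17[0135678])|(18[0-9])|(19[89]))\d{8})$"
)


def looks_like_mobile_number(value):
    """Return True if the whole value matches the mobile-number pattern."""
    return MOBILE_PATTERN.fullmatch(value.strip()) is not None


def column_match_ratio(values):
    """Fraction of values in a column that match the pattern."""
    if not values:
        return 0.0
    return sum(looks_like_mobile_number(v) for v in values) / len(values)
```

Computing a ratio over a column, rather than testing a single value, is one possible way to tolerate occasional dirty values; the text itself does not say how many values must match.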
The enumeration words of a basic attribute word list its possible values. For example, for the provincial-level administrative regions of China, the enumeration words include Zhejiang, Beijing, Shanghai, Henan, Hubei, Xinjiang, Shandong, Hebei, and Hunan.
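An enumeration lookup of this kind could be sketched as follows. The set contains only the provinces named in the text, the comparison is case-insensitive so that differently capitalized values still match, and the function name `enumeration_hit_ratio` is hypothetical.

```python
# Enumeration words for the "province" attribute, limited to those named in the text.
PROVINCES = {
    "zhejiang", "beijing", "shanghai", "henan", "hubei",
    "xinjiang", "shandong", "hebei", "hunan",
}


def enumeration_hit_ratio(values, enumeration):
    """Fraction of field values that appear in the enumeration (case-insensitive)."""
    if not values:
        return 0.0
    hits = sum(1 for value in values if value.strip().casefold() in enumeration)
    return hits / len(values)


ratio = enumeration_hit_ratio(["Zhejiang", "beijing ", "Paris", "Hunan"], PROVINCES)
```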
The method for establishing the relational network model further comprises the following steps:
setting, for the expansion mode of each dimension, a weight value range corresponding to that dimension; and setting, for each specific attribute word, its specific weight within the weight value range of the corresponding expansion mode.
For the three dimensions of a basic attribute word (synonyms, regular expressions, and enumeration words), a weight value range can be set for each dimension. For example, the weight value range of the synonyms may be [0.6, 0.8], that of the regular expressions [0.1, 0.5], and that of the enumeration words [0.2, 0.4]. The specific weight of each attribute word is then set within the weight value range of the corresponding expansion mode. For example, for the attribute word "mobile phone number", the weight of the synonyms may be set to 0.8, the weight of the regular expression to 0.1, and the weight of the enumeration words to 0.2.
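The constraint that each attribute word's weight must lie inside the range of its dimension can be checked mechanically. The sketch below uses the example ranges and weights from the text; the function name `validate_weights` and the dictionary keys are assumptions for illustration.

```python
# Example weight ranges per expansion dimension, as given in the text.
WEIGHT_RANGES = {
    "synonym": (0.6, 0.8),
    "regex": (0.1, 0.5),
    "enumeration": (0.2, 0.4),
}


def validate_weights(weights, ranges=WEIGHT_RANGES):
    """Raise ValueError if any weight falls outside its dimension's range."""
    for dimension, weight in weights.items():
        low, high = ranges[dimension]
        if not low <= weight <= high:
            raise ValueError(
                f"{dimension} weight {weight} is outside [{low}, {high}]"
            )
    return weights


# Weights for the attribute word "mobile phone number" from the example.
phone_weights = validate_weights(
    {"synonym": 0.8, "regex": 0.1, "enumeration": 0.2}
)
```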
The knowledge base described above is the basis for implementing the method of this embodiment. It provides knowledge of a specific field, and that knowledge already includes the entities that may exist in the field, the attributes of those entities, the relationships between the entities, and the attributes of those relationships. With this knowledge available, the method of this embodiment establishes a relational network model that fits the scene, describing the entities, the relationships between entities, the attributes of the entities, and the attributes of the relationships in that scene. The knowledge base is broad in content, whereas the relational network model of a preset scene is a specific model serving that scene, and can therefore provide a cognitive model that describes the scene accurately.
The knowledge base in this application may also be referred to as a domain knowledge base.
Identifying the attribute information of the structured data according to the knowledge provided by the pre-prepared knowledge base can be implemented by the following steps:
searching, for the structured data, among the synonyms, the regular expressions, and the enumeration words of the basic attribute words to obtain corresponding attribute information;
and identifying the attribute information contained in the structured data according to the corresponding attribute information.
For example, consider a structured data table in which the field name is "shouji" and the field comment is "mobile phone number". The regular expression for the content of the field is String regex = "^(((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])|(17[0135678])|(18[0-9])|(19[89]))\\d{8})$", the synonyms for the field are "mobile phone number", "mobile phone", "shouji", "contact manner", and "mobile", and the enumeration words for the field are China Mobile, China Unicom, and China Telecom. The structured data table is searched among the synonyms, the regular expressions, and the enumeration words of the basic attribute words to obtain attribute information in each dimension. The attribute information contained in the structured data is then identified from that attribute information; for example, it is determined that the structured data table contains the normalized attribute "mobile phone number".
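The three-dimension search in this example can be sketched as follows. This is an illustrative sketch, not the application's implementation: the rule layout, the function name `search_dimensions`, and the decisions to require every value to match the pattern and any value to hit the enumeration are all assumptions.

```python
import re

# Rules for the basic attribute word "mobile phone number" (layout is hypothetical).
PHONE_RULES = {
    "attribute": "mobile phone number",
    "synonyms": {"mobile phone number", "mobile phone", "shouji",
                 "contact manner", "mobile"},
    "regex": re.compile(
        r"^(((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])"
        r"|(17[0135678])|(18[0-9])|(19[89]))\d{8})$"
    ),
    "enumeration": {"china mobile", "china unicom", "china telecom"},
}


def search_dimensions(field_name, comment, values, rules):
    """Report, per dimension, whether the field matches the attribute's rules."""
    names = {field_name.strip().lower(), comment.strip().lower()}
    synonym_hit = bool(names & rules["synonyms"])
    # Assumption: the regex dimension matches only if every value conforms.
    regex_hit = bool(values) and all(
        rules["regex"].fullmatch(v.strip()) is not None for v in values
    )
    # Assumption: the enumeration dimension matches if any value is enumerated.
    enum_hit = any(v.strip().lower() in rules["enumeration"] for v in values)
    return {"synonym": synonym_hit, "regex": regex_hit, "enumeration": enum_hit}


hits = search_dimensions(
    "shouji", "mobile phone number", ["13812345678", "15912345678"], PHONE_RULES
)
```

Each dimension is evaluated independently, so that the results can later be combined using the per-dimension weights.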
Searching, for the structured data, among the synonyms, the regular expressions, and the enumeration words of the basic attribute words to obtain corresponding attribute information includes the following steps:
searching among the synonyms of the basic attribute words to obtain first candidate attribute information corresponding to the structured data;
searching among the regular expressions of the basic attribute words to obtain second candidate attribute information corresponding to the structured data;
searching among the enumeration words of the basic attribute words to obtain third candidate attribute information corresponding to the structured data;
and obtaining the attribute information corresponding to the structured data according to the first candidate attribute information, the second candidate attribute information, the third candidate attribute information, and the corresponding weight values.
For example, consider a structured data table in which the field name is "shouji" and the field comment is "mobile phone number". The regular expression for the content of the field is String regex = "^(((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])|(17[0135678])|(18[0-9])|(19[89]))\\d{8})$", the synonyms for the field are "mobile phone number", "shouji", "contact manner", and "mobile", and the enumeration words for the field are China Mobile, China Unicom, and China Telecom.
For the structured data table, if the field of the structured data table contains the mobile phone, first candidate attribute information corresponding to the structured data is obtained by searching in the synonym of the basic attribute word, and in this example, the first candidate attribute information is the mobile phone number.
For the structured data table, if the field content of the structured data table includes a digital character String, second candidate attribute information corresponding to the structured data is obtained by searching in a regular expression of the basic attribute word, in this example, the regular expression is String regex ^ ((((13 [0-9]) | (14[579]) | (15([0-3] | [5-9 ])))) | (16[6]) | (17[0135678]) | (18[0-9]) | (19[89 ]))) \\ \ d {8}) $ ", if the digital character String conforms to the content specified by the regular expression, the second candidate attribute information is a mobile phone number, and if the digital character String does not conform to the content specified by the regular expression, for example, 19900001111, the second candidate attribute information is determined to be a non-mobile phone number.
For the structured data table, if the field content of the structured data table includes operator information, searching is performed in the enumerated words of the basic attribute words to obtain third candidate attribute information corresponding to the structured data, in this example, the enumerated words are china mobile, china unicom and china telecom, and if the field content includes the enumerated word content, it is determined that the third candidate attribute information of the structured data is a mobile phone number.
And acquiring attribute information corresponding to the structured data according to the first candidate attribute information, the second candidate attribute information, the third candidate attribute information and the corresponding weight values. Here, the corresponding weight value is a weight value of each dimension of the basic attribute word. For example, for the attribute word of the mobile phone number, the weight value of the synonym may be set to 0.8, the weight value of the regular expression may be set to 0.1, and the weight value of the enumeration word may be set to 0.2.
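The three-dimension lookup and weighting described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `score_mobile_attribute`, the choice to sum the weights of the matching dimensions, and the rule that every sampled value must match the regular expression are assumptions. The regular expression, synonyms, enumerated words, and weight values are taken from the example.

```python
import re

# Regular expression from the example, written as a Python raw string.
MOBILE_REGEX = re.compile(
    r"^((13[0-9])|(14[579])|(15([0-3]|[5-9]))|(16[6])|(17[0135678])"
    r"|(18[0-9])|(19[89]))\d{8}$"
)
SYNONYMS = {"mobile phone number", "mobile phone", "shouji",
            "contact information", "mobile"}
ENUM_WORDS = {"China Mobile", "China Unicom", "China Telecom"}
# Per-dimension weights from the example (0.8, 0.1, 0.2).
WEIGHTS = {"synonym": 0.8, "regex": 0.1, "enum": 0.2}


def score_mobile_attribute(field_name, comment, values):
    """Sum the weights of the dimensions that point to 'mobile phone number'.

    Summation is an assumed combination rule; the text only says that the
    three candidates are combined using their weight values.
    """
    score = 0.0
    # Dimension 1: field name or comment equals a known synonym.
    if {field_name.lower(), comment.lower()} & SYNONYMS:
        score += WEIGHTS["synonym"]
    # Dimension 2: every sampled value matches the regular expression.
    if values and all(MOBILE_REGEX.match(v) for v in values):
        score += WEIGHTS["regex"]
    # Dimension 3: some sampled value is an enumerated operator name.
    if any(v in ENUM_WORDS for v in values):
        score += WEIGHTS["enum"]
    return score
```

A caller could compare the returned score against a threshold, or against the scores of other attribute words, to pick the normalized attribute for a column.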
Obtaining the entity information and the relationships between the entities according to the structured data and the corresponding knowledge base comprises:
identifying, according to the attribute information of the structured data and on the basis of the knowledge base, the entity information contained in the structured data;
and determining the relationships between the entities corresponding to the entity information according to the attribute information, the identified entity information, and the knowledge base.
Identifying the entity information contained in the structured data according to the attribute information of the structured data and on the basis of the knowledge base comprises:
searching the knowledge base according to the primary key attribute in the attribute information of the structured data to obtain the entity information of the structured data.
A primary key attribute in the attribute information of the structured data is an attribute that corresponds uniquely to an entity, so the entity information contained in the structured data can be obtained through the knowledge provided by the knowledge base.
For example, the structured data is a company vehicle information table that records information about a large number of companies and includes field attributes such as "unified social credit code", "vehicle license plate", and "vehicle use time". Through the knowledge base, it can be determined that two entities, "company" and "vehicle", exist in this scene. The knowledge base contains entities such as companies and vehicles, the attributes related to these two entities, and the various descriptors of those attributes. Therefore, although the specific names of the field attributes in the tables may differ, the knowledge base can infer the two entities in the scene from similar attribute words.
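The primary-key lookup in the vehicle example can be sketched as below. The dictionary `ATTRIBUTE_TO_ENTITY` is a hypothetical knowledge-base fragment, and treating "vehicle license plate" as a primary key attribute is an assumption made for illustration.

```python
# Hypothetical knowledge-base fragment:
# normalized attribute -> (owning entity, is_primary_key)
ATTRIBUTE_TO_ENTITY = {
    "unified social credit code": ("company", True),
    "vehicle license plate": ("vehicle", True),   # assumed primary key
    "vehicle use time": ("vehicle", False),
}


def identify_entities(attributes):
    """Return the entities whose primary key attribute appears in the table."""
    entities = []
    for attr in attributes:
        entity, is_primary_key = ATTRIBUTE_TO_ENTITY.get(attr, (None, False))
        # Only primary key attributes are allowed to introduce an entity.
        if is_primary_key and entity not in entities:
            entities.append(entity)
    return entities
```

For the table in the example, the call `identify_entities(["unified social credit code", "vehicle license plate", "vehicle use time"])` yields the two entities "company" and "vehicle".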
For particularly complex situations, a decision strategy based on the knowledge provided by the knowledge base may be needed. For example, the field names, the field contents, and the remarks (comments) of the input table are identified separately, each against the three dimensions defined by the attribute library in the knowledge base. This determines how many attributes the input table matches and which of them are primary key attributes, and assigns a weight value to each identification result. Finally, the entity corresponding to the table is determined comprehensively from the attributes with high weight values.
Determining the relationships between the entities corresponding to the entity information according to the attribute information, the identified entity information, and the knowledge base comprises:
searching the knowledge base according to the attribute information of the structured data and the entity information of the structured data to determine candidate relationships between the entities corresponding to the entity information;
and determining the relationship between the entities corresponding to the entity information from the candidate relationships.
Searching the knowledge base according to the attribute information and the entity information of the structured data may return several candidate relationships between the entities. The most probable relationship is then selected from these candidates according to the attribute information and the entity information of the structured data.
When determining the relationships between entities in an input table (the primary form of structured data), it may also be necessary to select the most appropriate relationship from a plurality of candidate relationships, for example by using weights in the same way as described above for determining entities.
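Selecting one relationship from several candidates by weight can be sketched as follows. The candidate list, the relationship name "leases", and the weight values are hypothetical; only the "owns" relationship between a company and a vehicle comes from the description.

```python
# Hypothetical knowledge-base relations keyed by (subject entity, object entity).
# Each candidate is (relation word, weight); the weights are illustrative only.
CANDIDATE_RELATIONS = {
    ("company", "vehicle"): [("owns", 0.7), ("leases", 0.3)],
}


def pick_relation(subject, obj):
    """Return the highest-weighted candidate relation, or None if none exists."""
    candidates = CANDIDATE_RELATIONS.get((subject, obj), [])
    if not candidates:
        return None
    # max() with a key picks the candidate with the largest weight.
    return max(candidates, key=lambda candidate: candidate[1])[0]
```

In practice the weights would presumably be derived from the matched attributes of the table rather than fixed in the knowledge base; that derivation is not specified in the text.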
After the relationships between the entities are determined, they need to be stored in a local memory or a nonvolatile storage medium, or they may be stored in a relevant storage area in the cloud.
Step S103: establishing a relation network model corresponding to the scene according to the entity information and the relationships between the entities.
This step establishes the relation network model corresponding to the scene from the entity information and the relationships between the entities.
Establishing the relation network model corresponding to the scene according to the entity information and the relationships between the entities comprises:
establishing a mapping between the structured data and the entities and a mapping between the structured data and the relationships according to the attribute information of the structured data, the entity information of the structured data, and the relationships between the entities corresponding to the entity information;
and establishing the relation network model corresponding to the scene according to the mapping between the structured data and the entities and the mapping between the structured data and the relationships.
Here, the established relationship network model may be stored in a memory of the electronic device or a nonvolatile storage medium, or may be stored in a related storage area of the cloud.
The relational network model here is an OLP-based model. OLP is an abbreviation of Object-Link-Property. Object represents an entity in a business scenario, which can be a "person" (e.g., member, retailer) or a "thing" (e.g., commodity, store). Link represents a relationship in a business scenario; relationships are divided into behavior relationships and fact relationships. For example, in "a user purchases a commodity", "purchase" is a behavior relationship, and in "a shelf belongs to a store", "belongs to" is a fact relationship. Property represents the properties of entities and relationships; for example, "store" (an object) has the properties address and city, and "purchase" (a relationship) has the properties payment time and payment form.
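One possible in-memory representation of the Object-Link-Property structure is sketched below. The class names `OlpObject` and `OlpLink` and their fields are assumptions chosen for illustration; the example instances mirror the store, purchase, and shelf examples in the description.

```python
from dataclasses import dataclass, field


@dataclass
class OlpObject:
    """An entity in the business scenario (a 'person' or a 'thing')."""
    name: str
    properties: list = field(default_factory=list)


@dataclass
class OlpLink:
    """A relationship between two objects; kind is 'behavior' or 'fact'."""
    name: str
    source: str
    target: str
    kind: str
    properties: list = field(default_factory=list)


# Objects and links taken from the examples in the description.
store = OlpObject("store", ["address", "city"])
purchase = OlpLink("purchase", "user", "commodity", "behavior",
                   ["payment time", "payment form"])
belongs_to = OlpLink("belongs to", "shelf", "store", "fact")
```

Keeping properties on both objects and links reflects the statement that Property describes entities and relationships alike.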
Please refer to fig. 2, which is a schematic diagram of a relational network model. Fig. 2 shows an OLP relationship model between two entities, the unified social credit code and the vehicle. The relationship between the unified social credit code and the vehicle may be that a certain unified social credit code owns a certain vehicle.
The method for establishing the relation network model further comprises the following steps:
receiving a request from a client to query a network relationship;
obtaining, in response to the request, the relation network model from the electronic device that stores it;
obtaining network relationship data from the relation network model according to the request;
providing the network relationship data to the client.
The above steps describe how the client obtains the relation network model and its related data.
Taking fig. 2 as an example, suppose a traffic management department needs to query the vehicle information of a certain company. The department enters the corresponding unified social credit code at a client. First, the relation network model is obtained from the electronic device that stores it according to the request to query the network relationship; then, network relationship data is obtained from the relation network model according to the request; finally, the queried vehicle information of the company is provided to the client.
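The query flow in the traffic management example can be sketched as follows. The storage layout `RELATION_MODEL`, the helper `load_model`, and the placeholder identifiers such as `CODE-001` and `PLATE-A` are all hypothetical; a real deployment would fetch the model from the device or cloud storage that holds it.

```python
# Hypothetical stored model: (subject entity, relation, object entity)
# -> {primary key of subject: [primary keys of related objects]}.
RELATION_MODEL = {
    ("company", "owns", "vehicle"): {
        "CODE-001": ["PLATE-A", "PLATE-B"],   # placeholder identifiers
    },
}


def load_model():
    """Stand-in for fetching the model from the device that stores it."""
    return RELATION_MODEL


def query_vehicles(credit_code):
    """Handle a client request: return the vehicles owned by one company."""
    model = load_model()                        # step 2: obtain the model
    owned = model[("company", "owns", "vehicle")]
    return owned.get(credit_code, [])           # step 3: read relation data
```

The returned list corresponds to the network relationship data that is finally provided to the client.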
In use, the relation network model provides the information that can be obtained through it in a preset scene; related information can be queried once the primary key information of a specific entity is provided. Because the relation network model has been established, various structured data can be identified, and specific structured data (such as a certain record table) can be assigned to an associated entity or to a certain relationship, so that the basic path and the related data for data analysis can be obtained quickly when analysis is needed.
For example, for the specific usage scenario "hotel management", a large amount of obtained structured data is first analyzed to find which entities exist and what relationships hold between them, and a relation network model is established; for example, there are two entities, "hotel" and "hotel supplies supplier". The obtained structured data can then be assigned to different entities or relationships. For example, structured data recording hotel supplies supplier information is assigned to the entity "hotel supplies supplier", structured data recording hotel information is assigned to the entity "hotel", a "supply" relationship can be established between the two entities, and a hotel supply record table can be classified as a table describing the "supply" relationship. Thus, when a large amount of data in the scene is analyzed and used, a basic analysis framework and a basic analysis view are available for all the structured data.
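Assigning a table to an entity or to a relationship in the hotel scenario can be sketched as below. The column names "hotel id" and "supplier id", and the rule that a table holding the primary keys of two entities describes the relationship between them, are assumptions; the description only states the resulting classification.

```python
# Hypothetical primary key columns and the entity each one identifies.
PRIMARY_KEYS = {
    "hotel id": "hotel",
    "supplier id": "hotel supplies supplier",
}
# Relationship known to the knowledge base for a pair of entities.
RELATIONS = {frozenset({"hotel", "hotel supplies supplier"}): "supply"}


def classify_table(columns):
    """Classify a table as describing one entity or one relationship."""
    entities = sorted({PRIMARY_KEYS[c] for c in columns if c in PRIMARY_KEYS})
    if len(entities) == 1:
        return ("entity", entities[0])
    if len(entities) == 2:
        # Assumed rule: two primary keys in one table indicate a relation table.
        relation = RELATIONS.get(frozenset(entities))
        if relation:
            return ("relation", relation)
    return ("unknown", None)
```

Under these assumptions a hotel supply record table that carries both identifiers is classified as a table describing the "supply" relationship.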
The foregoing embodiment provides a method for building a relationship network model; correspondingly, the present application also provides a device for building a relationship network model. Please refer to fig. 3, which is a flowchart illustrating an embodiment of an apparatus for building a relationship network model according to the present application. Since this embodiment, i.e., the second embodiment, is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts of the method embodiment may be consulted for details. The device embodiments described below are merely illustrative.
The device for establishing the relation network model comprises:
a data acquisition unit 301, configured to acquire structured data to be processed;
a relation determining unit 302, configured to obtain, according to the structured data and a corresponding knowledge base, entity information and relationships between entities;
a model establishing unit 303, configured to establish a relationship network model of the corresponding scene according to the entity information and the relationships between the entities.
In this embodiment, the structured data is structured data of a corresponding scene.
In this embodiment, the relationship determining unit is specifically configured to:
according to the attribute information of the structured data, identifying entity information contained in the structured data by taking the knowledge base as a basis;
and determining the relationship between the entities corresponding to the entity information according to the attribute information and the entity information obtained by identification and the knowledge base.
In this embodiment, the attribute information of the structured data is obtained by the following steps, including:
and identifying the structured data according to the knowledge provided by the pre-prepared knowledge base to obtain the attribute information of the structured data.
In this embodiment, the pre-prepared knowledge base is constructed by the following steps, including:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
In this embodiment, the attribute words are obtained by expanding the basic attribute words corresponding to the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumerated words of the basic attribute words corresponding to the knowledge base.
In this embodiment, the apparatus for building a relational network model further includes a weight setting unit, configured to:
set, for the expansion mode of each dimension, a weight value range corresponding to that dimension; and set, for each specific attribute word, its specific weight within the weight value range corresponding to the expansion mode.
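The two-level weight setting (a range per dimension, then a specific weight per attribute word inside that range) can be sketched as follows. The concrete ranges in `WEIGHT_RANGES` are hypothetical, chosen only so that the example weights 0.8, 0.1, and 0.2 fall inside them; the function name `set_weight` is likewise an assumption.

```python
# Hypothetical weight value ranges for each expansion dimension.
WEIGHT_RANGES = {
    "synonym": (0.5, 1.0),
    "regex": (0.0, 0.5),
    "enum": (0.0, 0.5),
}


def set_weight(dimension, weight):
    """Accept a specific weight only if it lies in the dimension's range."""
    low, high = WEIGHT_RANGES[dimension]
    if not low <= weight <= high:
        raise ValueError(
            f"weight {weight} outside range [{low}, {high}] for {dimension}"
        )
    return weight


# Specific weights for the attribute word "mobile phone number".
mobile_weights = {
    "synonym": set_weight("synonym", 0.8),
    "regex": set_weight("regex", 0.1),
    "enum": set_weight("enum", 0.2),
}
```

Rejecting out-of-range weights is one possible design; clamping to the nearest bound would be an alternative.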
In this embodiment, the relationship determining unit is further configured to:
searching synonyms of basic attribute words corresponding to the knowledge base, regular expressions of the basic attribute words corresponding to the knowledge base and enumeration words of the basic attribute words corresponding to the knowledge base aiming at the structured data to obtain corresponding attribute information;
and identifying the attribute information contained in the structured data according to the corresponding attribute information.
In this embodiment, the relationship determining unit is further configured to:
searching in synonyms of the basic attribute words aiming at the structured data to obtain first candidate attribute information corresponding to the structured data;
searching the regular expression of the basic attribute words aiming at the structured data to obtain second candidate attribute information corresponding to the structured data;
searching in the enumerated words of the basic attribute words aiming at the structured data to obtain third candidate attribute information corresponding to the structured data;
and acquiring attribute information corresponding to the structured data according to the first candidate attribute information, the second candidate attribute information, the third candidate attribute information and the corresponding weight values.
In this embodiment, the relationship determining unit is further configured to:
and searching in the knowledge base according to the primary key attribute in the attribute information of the structured data to obtain the entity information of the structured data.
In this embodiment, the relationship determining unit is further configured to:
searching in the knowledge base according to the attribute information of the structured data and the entity information of the structured data, and determining a candidate relationship between entities corresponding to the entity information;
and determining the relation between the entities corresponding to the entity information from the candidate relation between the entities corresponding to the entity information.
In this embodiment, the model establishing unit is specifically configured to:
establishing a mapping relation between the structured data and the entity and a mapping relation between the structured data and the relation according to the attribute information of the structured data, the entity information of the structured data and the relation between the entities corresponding to the entity information;
and establishing a relation network model corresponding to the scene according to the mapping relation between the structured data and the entity and the mapping relation between the structured data and the relation.
In this embodiment, the apparatus for establishing a relationship network model further includes a network relationship query unit, configured to:
receive a request from a client to query a network relationship;
obtain, in response to the request, the relation network model from the electronic device that stores it;
obtain network relationship data from the relation network model according to the request; and provide the network relationship data to the client.
A third embodiment of the present application provides an electronic apparatus, including:
a processor;
and
a memory, configured to store a computer program, where the apparatus executes the computer program through the processor to perform the method for building a relational network model provided in the first embodiment of the present application, or the method for building a knowledge base provided in the fifth embodiment of the present application.
A fourth embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, where the computer program is executed by a processor to perform the method for building a relational network model provided in the first embodiment of the present application, or to perform the method for building a knowledge base provided in the fifth embodiment of the present application.
A fifth embodiment of the present application provides a method for constructing a knowledge base, including:
step S401: acquiring attribute words corresponding to the knowledge base;
step S402: acquiring entity words corresponding to the knowledge base according to the attribute words;
step S403: and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
In this embodiment, the attribute words are obtained by expanding the basic attribute words corresponding to the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumerated words of the basic attribute words corresponding to the knowledge base.
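Steps S401 to S403 can be sketched as a small construction routine. The input shapes (a mapping from attribute word to owning entity, and a mapping from entity pair to relation word) and the function name `build_knowledge_base` are assumptions made for illustration; the description does not specify how entity words are derived from attribute words.

```python
def build_knowledge_base(attribute_to_entity, entity_pair_relations):
    """Assemble attribute, entity, and relation words in the order S401-S403."""
    # S401: collect the attribute words.
    attribute_words = sorted(attribute_to_entity)
    # S402: derive the entity words from the attribute words.
    entity_words = sorted(set(attribute_to_entity.values()))
    # S403: keep relation words whose two endpoints are both known entities.
    relation_words = sorted(
        relation
        for (left, right), relation in entity_pair_relations.items()
        if left in entity_words and right in entity_words
    )
    return {
        "attributes": attribute_words,
        "entities": entity_words,
        "relations": relation_words,
    }


kb = build_knowledge_base(
    {"unified social credit code": "company", "vehicle license plate": "vehicle"},
    {("company", "vehicle"): "owns", ("hotel", "supplier"): "supply"},
)
```

Here the "supply" relation is dropped because neither of its endpoints was derived from the attribute words supplied.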
Since this embodiment has already been described in detail in the first embodiment, it is not described again here. Please refer to the related description of the first embodiment.
Although the present application has been described with reference to the preferred embodiments, it is not intended to limit the present application, and those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application, therefore, the scope of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media does not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (18)

1. A method of building a relational network model, comprising:
acquiring structured data to be processed;
obtaining entity information and relationships between entities according to the structured data and the corresponding knowledge base;
and establishing a relation network model corresponding to the scene according to the entity information and the relationships between the entities.
2. The method of building a relational network model according to claim 1, wherein the structured data is structured data corresponding to a scene.
3. The method of building a relational network model according to claim 1, wherein the obtaining entity information and relationships between entities from the structured data and corresponding knowledge bases comprises:
according to the attribute information of the structured data, identifying entity information contained in the structured data by taking the knowledge base as a basis;
and determining the relationship between the entities corresponding to the entity information according to the attribute information and the entity information obtained by identification and the knowledge base.
4. The method for building the relational network model according to claim 3, wherein the attribute information of the structured data is obtained by the following steps, comprising:
and identifying the structured data according to the knowledge provided by the pre-prepared knowledge base to obtain the attribute information of the structured data.
5. The method for building a relational network model according to claim 4, wherein the knowledge base prepared in advance is constructed by the steps comprising:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
6. The method for establishing a relational network model according to claim 5, wherein the attribute words are obtained by expanding the basic attribute words corresponding to the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumerated words of the basic attribute words corresponding to the knowledge base.
7. The method of building a relational network model according to claim 6, further comprising:
setting a weight value range corresponding to each dimension aiming at the expansion mode of each dimension; and setting the specific weight of each attribute word in the weight value range corresponding to the expansion mode according to the specific attribute word.
8. The method for building the relational network model according to claim 6, wherein the identifying the attribute information of the structured data according to the structured data and the knowledge provided by the knowledge base prepared in advance comprises:
searching in synonyms of the basic attribute words, regular expressions of the basic attribute words and enumeration words of the basic attribute words aiming at the structured data to obtain corresponding attribute information;
and identifying the attribute information contained in the structured data according to the corresponding attribute information.
9. The method according to claim 8, wherein the searching for the structured data in the synonym of the basic attribute word, the regular expression of the basic attribute word, and the enumeration word of the basic attribute word to obtain corresponding attribute information includes:
searching in synonyms of the basic attribute words aiming at the structured data to obtain first candidate attribute information corresponding to the structured data;
searching in the regular expression of the basic attribute words aiming at the structured data to obtain second candidate attribute information corresponding to the structured data;
searching in the enumeration words of the basic attribute words aiming at the structured data to obtain third candidate attribute information corresponding to the structured data;
and acquiring attribute information corresponding to the structured data according to the first candidate attribute information, the second candidate attribute information, the third candidate attribute information and the corresponding weight values.
10. The method according to claim 3, wherein the identifying entity information contained in the structured data based on the knowledge base according to the attribute information of the structured data comprises:
and searching in the knowledge base according to the primary key attribute in the attribute information of the structured data to obtain the entity information of the structured data.
11. The method of claim 3, wherein the determining the relationship between the entities corresponding to the entity information according to the attribute information, the entity information obtained by the identification and the knowledge base comprises:
searching in the knowledge base according to the attribute information of the structured data and the entity information of the structured data, and determining a candidate relationship between entities corresponding to the entity information;
and determining the relation between the entities corresponding to the entity information from the candidate relation between the entities corresponding to the entity information.
12. The method according to claim 1, wherein the establishing a relational network model corresponding to a scene according to the relationship between the entity information and each entity comprises:
establishing a mapping relation between the structured data and the entity and a mapping relation between the structured data and the relation according to the attribute information of the structured data, the entity information of the structured data and the relation between the entities corresponding to the entity information;
and establishing a relation network model corresponding to the scene according to the mapping relation between the structured data and the entity and the mapping relation between the structured data and the relation.
13. The method of building a relational network model according to claim 1, further comprising:
receiving a request from a client to query a network relationship;
acquiring a relation network model from the electronic equipment storing the relation network model aiming at the request for inquiring the network relation;
acquiring network relationship data from a relationship network model according to the request for inquiring the network relationship;
providing the network relationship data to the client.
14. An apparatus for building a relational network model, comprising:
the data acquisition unit is used for acquiring structured data to be processed;
the relation determining unit is used for obtaining the relation between the entity information and each entity according to the structured data and the corresponding knowledge base;
and the model establishing unit is used for establishing a relation network model of the corresponding scene according to the relation between the entity information and each entity.
15. An electronic device, comprising:
a processor;
and
a memory for storing a computer program, which when executed by the processor, performs the method of any one of claims 1-13, 17, 18.
16. A computer storage medium, characterized in that it stores a computer program which is run by a processor for performing the method according to any of claims 1-13, 17, 18.
17. A method of building a knowledge base, comprising:
acquiring attribute words corresponding to the knowledge base;
acquiring entity words corresponding to the knowledge base according to the attribute words;
and determining the relation words corresponding to the knowledge base according to the attribute words and the entity words.
18. The method for constructing a knowledge base according to claim 17, wherein the attribute words are obtained by expanding the basic attribute words corresponding to the knowledge base along the following dimensions:
synonyms of the basic attribute words corresponding to the knowledge base;
regular expressions of the basic attribute words corresponding to the knowledge base;
and enumerated words of the basic attribute words corresponding to the knowledge base.
CN201910246823.5A 2019-03-28 2019-03-28 Method and device for establishing relational network model Pending CN111753020A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910246823.5A CN111753020A (en) 2019-03-28 2019-03-28 Method and device for establishing relational network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910246823.5A CN111753020A (en) 2019-03-28 2019-03-28 Method and device for establishing relational network model

Publications (1)

Publication Number Publication Date
CN111753020A true CN111753020A (en) 2020-10-09

Family

ID=72671928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910246823.5A Pending CN111753020A (en) 2019-03-28 2019-03-28 Method and device for establishing relational network model

Country Status (1)

Country Link
CN (1) CN111753020A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559773A (en) * 2021-02-24 2021-03-26 北京通付盾人工智能技术有限公司 Knowledge graph system building method and device

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020058533A (en) * 2000-12-30 2002-07-12 오길록 Construction of Knowledge Base for Question/Answering on Internet
CN103425714A (en) * 2012-05-25 2013-12-04 北京搜狗信息服务有限公司 Query method and system
CN104462460A (en) * 2014-12-16 2015-03-25 武汉理工大学 Method of constructing REST (representational state transfer) styled ontology annotation visualization system
CN104866593A (en) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 Database searching method based on knowledge graph
CN107169078A (en) * 2017-05-10 2017-09-15 京东方科技集团股份有限公司 Traditional Chinese medicine (TCM) knowledge graph, method for building it, and computer system
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 Construction method and device for a medical knowledge graph
US20180089576A1 (en) * 2016-09-23 2018-03-29 International Business Machines Corporation Identifying and analyzing impact of an event on relationships
US9965726B1 (en) * 2015-04-24 2018-05-08 Amazon Technologies, Inc. Adding to a knowledge base using an ontological analysis of unstructured text
CN108804408A (en) * 2017-04-27 2018-11-13 安徽富驰信息技术有限公司 Information extraction system based on domain-specialist knowledge system and information extraction method
CN109192321A (en) * 2018-09-26 2019-01-11 北京理工大学 Construction method and computing storage device for a drug knowledge graph
CN109509556A (en) * 2018-11-09 2019-03-22 天津开心生活科技有限公司 Knowledge graph generation method, device, electronic equipment and computer-readable medium

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020058533A (en) * 2000-12-30 2002-07-12 오길록 Construction of Knowledge Base for Question/Answering on Internet
CN103425714A (en) * 2012-05-25 2013-12-04 北京搜狗信息服务有限公司 Query method and system
CN104462460A (en) * 2014-12-16 2015-03-25 武汉理工大学 Method of constructing REST (representational state transfer) styled ontology annotation visualization system
US9965726B1 (en) * 2015-04-24 2018-05-08 Amazon Technologies, Inc. Adding to a knowledge base using an ontological analysis of unstructured text
CN104866593A (en) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 Database searching method based on knowledge graph
US20180089576A1 (en) * 2016-09-23 2018-03-29 International Business Machines Corporation Identifying and analyzing impact of an event on relationships
CN108804408A (en) * 2017-04-27 2018-11-13 安徽富驰信息技术有限公司 Information extraction system based on domain-specialist knowledge system and information extraction method
CN107169078A (en) * 2017-05-10 2017-09-15 京东方科技集团股份有限公司 Traditional Chinese medicine (TCM) knowledge graph, method for building it, and computer system
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 Construction method and device for a medical knowledge graph
CN109192321A (en) * 2018-09-26 2019-01-11 北京理工大学 Construction method and computing storage device for a drug knowledge graph
CN109509556A (en) * 2018-11-09 2019-03-22 天津开心生活科技有限公司 Knowledge graph generation method, device, electronic equipment and computer-readable medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
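王汀; 冀付军; 徐天晟: "A knowledge acquisition method for unstructured information in Chinese online encyclopedias", Library and Information Service (图书情报工作), no. 13, 5 July 2016 (2016-07-05), pages 127 - 134 *
葛斌; 谭真; 张; 肖卫东: "Military knowledge graph construction technology", Journal of Command and Control (指挥与控制学报), no. 04, 15 December 2016 (2016-12-15), pages 42 - 48 *
鄂世嘉; 林培裕; 向阳: "An automatically constructed Chinese knowledge graph system", Journal of Computer Applications (计算机应用), no. 04, 10 April 2016 (2016-04-10), pages 116 - 120 *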

Similar Documents

Publication Publication Date Title
Kościelniak et al. BIG DATA in decision making processes of enterprises
US8874600B2 (en) System and method for building a cloud aware massive data analytics solution background
CN106126630B (en) A kind of collection of business object, searching method and device
CN111459985B (en) Identification information processing method and device
TW201600985A (en) Data query method and apparatus
US8892545B2 (en) Generating a compiler infrastructure
US8145619B2 (en) Method and system for identifying companies with specific business objectives
US20040122826A1 (en) Data model and applications
CN107220266B (en) Method and device for creating service database, storing service data and determining service data
CN111046237B (en) User behavior data processing method and device, electronic equipment and readable medium
CN109284323B (en) Management method and device for detection data
CN110929969A (en) Supplier evaluation method and device
US8799177B1 (en) Method and apparatus for building small business graph from electronic business data
US20110145005A1 (en) Method and system for automatic business content discovery
CN111382279A (en) Order examination method and device
CN104579909A (en) Method and equipment for classifying user information and acquiring user grouping information
CN103455335A (en) Multilevel classification Web implementation method
CN113360676A (en) Method and device for determining potential relation of enterprise based on knowledge graph
CN113205402A (en) Account checking method and device, electronic equipment and computer readable medium
CN107729330B (en) Method and apparatus for acquiring data set
CN101963993B (en) Method for fast searching database sheet table record
US20240127379A1 (en) Generating actionable information from documents
CN111753020A (en) Method and device for establishing relational network model
CN111144987A (en) Abnormal shopping behavior limiting method, limiting assembly and shopping system
CN106156904A (en) A kind of cross-platform fictitious assets source tracing method based on eID

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination