CN111104524A - Method for identifying television end user set - Google Patents


Info

Publication number
CN111104524A
Authority
CN
China
Prior art keywords
data
entity
relation
knowledge graph
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911355096.2A
Other languages
Chinese (zh)
Inventor
童奥
梁炬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Casicloud Co ltd
Original Assignee
Casicloud-Tech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Casicloud-Tech Co ltd
Priority to CN201911355096.2A
Publication of CN111104524A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

The invention discloses a method for identifying a television end user set, which comprises the following steps: defining an entity naming rule; constructing relationships among entities; establishing a mapping relation between entity attribute names and standard attribute names; and reading the original data from the database to construct a knowledge graph. With this method, knowledge graph construction rules are customized to express the semantic relationships among the data: the relationships among the data tables are stored in a database in JSON format, and a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.

Description

Method for identifying television end user set
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a method for identifying a television end user set.
Background
In recent years, knowledge graphs have been introduced into more and more application scenarios. A knowledge graph is essentially a large-scale semantic network comprising entities, concepts, and the various semantic relationships among them. It is one of the most important forms of knowledge representation in the big data era and a core technology for realizing cognitive intelligence. Meanwhile, with the rapid development of the Internet, the volume of network data has grown explosively. The knowledge graph is in effect a revival of knowledge engineering in the big data era, and it depends heavily on data; however, Internet content is large in scale, heterogeneous, diverse, and loosely organized, which makes knowledge graph construction challenging.
Traditional knowledge engineering has mostly succeeded only in closed application scenarios with clear rules and clear boundaries; the corresponding construction approach is called the top-down method. Although many papers and results on knowledge graph construction have appeared recently, applying their conclusions to one's own scenarios reveals various problems and poor transferability.
Disclosure of Invention
In view of the above technical problems in the related art, the present invention provides a method for identifying a TV end user set, which can overcome the above disadvantages of the prior art.
To achieve this technical purpose, the technical scheme of the invention is realized as follows:
A method for identifying a set of television end users, the method comprising the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping relation between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
Further, the step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
Further, the step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that denote the same attribute but are non-standard in different data tables to a uniform name.
Further, the step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: the graph database reading the data from Kafka;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
Further, the step S42 further includes the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
Further, in the step S43, the encapsulated data includes the entity data and the relationship data.
Further, the relationship data is placed in one-to-one correspondence with the entity data by labeling.
The invention has the following beneficial effects: knowledge graph construction rules are customized to express the semantic relationships among the data; the relationships among the data tables are stored in a database in JSON format, and a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for identifying a set of TV end users according to an embodiment of the present invention;
Fig. 2 is a block diagram of the knowledge graph construction rules of a method for identifying a set of TV end users according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
As shown in Fig. 1, a method for identifying a set of TV end users according to an embodiment of the present invention includes the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping relation between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
Step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
Step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that denote the same attribute but are non-standard in different data tables to a uniform name.
Step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: the graph database reading the data from Kafka;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
Step S42 further includes the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
In an embodiment of the invention, in the step S43, the encapsulated data includes the entity data and the relationship data.
In a specific embodiment of the present invention, the relationship data is placed in one-to-one correspondence with the entity data by labeling.
To facilitate understanding, the above technical solutions of the present invention are described in detail below by way of a specific usage example.
As shown in Fig. 2, the knowledge graph construction method based on custom rules consists of the following three parts:
Part one: entity naming rules.
Part two: rules for constructing the relationships among entities.
Part three: rules for mapping entity attribute names to standard attribute names.
Entity naming rules:
Constructing a knowledge graph network starts by establishing triples. Entity triples are built mainly on RDF (Resource Description Framework), a markup language for describing Web resources and a general way of describing information so that it can be read and understood by a computer. A Web resource is anything that can have a URI (Uniform Resource Identifier); each resource is located by its unique identifier.
The entity naming rule is customized to make each resource's URI both unique and legible. For example, when real objects are named, names may be duplicated in the database while the other attributes of the objects differ, which means they are two different entities; therefore, when they are named by URI, the id of the corresponding real object is appended, since the id is unique within a data table. If the entity is not a real object but something represented by a string of digits, such as an order, naming it by the order number alone is unique but hard to read. The legibility principle applies here: the order table is joined with an associated table that contains fields with Chinese-character names, and the relevant field is taken out and spliced with the order number; alternatively, the order number is spliced with the table name. Either way, both uniqueness and legibility are achieved.
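The naming idea can be illustrated with a minimal Python sketch. The function name `entity_name`, the `/` separator, and the sample table names are assumptions made for illustration; the patent does not prescribe a concrete name format.

```python
from typing import Optional


def entity_name(table: str, row_id: int, label: Optional[str] = None) -> str:
    """Build an entity name that is unique and, where possible, readable.

    The id is unique within one table, the table name separates entities
    that come from different tables, and the optional label adds a
    human-readable part. The "/" separator is an illustrative choice.
    """
    parts = [table, str(row_id)]
    if label:
        parts.append(label)
    return "/".join(parts)


# Two real objects that share a display name but have different ids.
pump_a = entity_name("product", 1, "pump")
pump_b = entity_name("product", 2, "pump")

# An order known only by its number: unique, made more readable by the table name.
order = entity_name("order_table", 20190001)
```

Here `pump_a` and `pump_b` stay distinct even though both objects are called "pump", because the id is part of the name.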
Rules for constructing the relationships among entities:
This rule is the core of knowledge graph construction and expresses the relationships between entities. The relationships between entities are predefined, and the query rules between tables and the relationships between the entities in the tables are constructed according to them. The relationships between tables are sorted out from the existing data; each relationship is a link between two tables, and together the links form the knowledge network graph. Relationships are directional: between the same two tables, a relationship pointing in the opposite direction is a different relationship.
Rules for mapping entity attribute names to standard attribute names:
Because the field names were not chosen according to a uniform standard when each entity table was defined, the same attribute is named differently in different entities, which makes direct processing difficult. A naming standard for attributes therefore needs to be predefined, and the field names that represent the same attribute but are non-standard in different data tables are mapped to the uniform name.
Rule content details:
Listed below is a JSON-format configuration for extracting entities from one data table. The first two keys, "entityConfigJson" and "entityTagConfigJson", correspond to the entity naming rule: "entityConfigJson" is the entity configuration JSON and "entityTagConfigJson" is the entity label JSON. The key "relationTableIdMapJson" corresponds to the rule for constructing the relationships among entities; it is the table relationship mapping JSON. The key "property_key_mapping" corresponds to the mapping between entity attribute names and standard attribute names; it is the attribute field mapping.
{
    "entityConfigJson": {
        "joinXIdColumn": ["relId"],
        "joinOtherTableName": ["industry_tenant_release"],
        "joinOtherIdColumn": ["id"],
        "joinOtherColumn": ["capAndproName"]
    },
    "entityTagConfigJson": {
        "JointTableNameFlag": true,
        "JointIdFlag": true,
        "descriptionColumns": ["id", "capAndproName"]
    },
    "relationTableIdMapJson": {
        "industry_buyer_inquiry": {
            "beRelatedKey": "id",
            "relatedKey": "inqId"
        },
        "industry_tenant_release": {
            "beRelatedKey": "id",
            "relatedKey": "relId"
        }
    },
    "property_key_mapping": {
        "deli_time": "deliTime",
        "order_status": "status",
        "amount": "amount"
    }
}
First, the entity naming rule. No field in this data table can be used directly to name the entities: all fields are numeric, so although uniqueness can be guaranteed, readability is poor. A field from another table is therefore associated for naming, and the two keys "entityConfigJson" and "entityTagConfigJson" define the naming convention for the entities in this table. "entityConfigJson" defines the field name in this data table used for the associated query, the data table to be queried, and the fields in the associated table that are available for naming. Specifically, "joinXIdColumn" indicates that the field used for the associated query in this data table is "relId"; "joinOtherTableName" indicates that the data table to be queried is "industry_tenant_release"; "joinOtherIdColumn" indicates that the associated field in that table is "id"; and "joinOtherColumn" indicates the "capAndproName" field that corresponds to the associated id in that table. That is, the relId of this data table is concatenated with the capAndproName field of the row in the industry_tenant_release table whose id equals relId, and the result forms the entity name. This alone is not intuitive enough, because a string of the form "number + capAndproName" could be mistaken for an entity in the "industry_tenant_release" table. The next key, "entityTagConfigJson", therefore serves to distinguish entities: "JointTableNameFlag" is a Boolean indicating whether the name of the entity's own data table is spliced into the entity name; "JointIdFlag" is also a Boolean, indicating whether the id is spliced in; and "descriptionColumns" lists the fields that need to be concatenated in the entity name.
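As an illustration, the two naming keys could be applied to a row roughly as follows. This is a sketch under stated assumptions: the in-memory `TABLES` dictionary stands in for the database, the table name `industry_order` and the sample rows are hypothetical, the `_` separator is arbitrary, and the way `JointIdFlag` interacts with `descriptionColumns` is one possible reading of the text, not something the patent specifies.

```python
import json

CONFIG = json.loads("""
{
  "entityConfigJson": {
    "joinXIdColumn": ["relId"],
    "joinOtherTableName": ["industry_tenant_release"],
    "joinOtherIdColumn": ["id"],
    "joinOtherColumn": ["capAndproName"]
  },
  "entityTagConfigJson": {
    "JointTableNameFlag": true,
    "JointIdFlag": true,
    "descriptionColumns": ["id", "capAndproName"]
  }
}
""")

# Hypothetical stand-in for the database: table name -> list of rows.
TABLES = {
    "industry_tenant_release": [{"id": 7, "capAndproName": "CNC machining"}],
}


def build_entity_name(table_name, row, config, tables):
    """Name one row of `table_name` using the two naming keys."""
    join = config["entityConfigJson"]
    tag = config["entityTagConfigJson"]

    # Copy the row, then pull the readable column(s) in from the joined table(s).
    enriched = dict(row)
    for local_col, other_table, other_id, other_col in zip(
        join["joinXIdColumn"],
        join["joinOtherTableName"],
        join["joinOtherIdColumn"],
        join["joinOtherColumn"],
    ):
        for other_row in tables[other_table]:
            if other_row[other_id] == row[local_col]:
                enriched[other_col] = other_row[other_col]
                break

    parts = []
    if tag["JointTableNameFlag"]:
        parts.append(table_name)  # keeps names from different tables apart
    for col in tag["descriptionColumns"]:
        if col == "id" and not tag["JointIdFlag"]:
            continue  # assumed meaning: the flag switches the id part on or off
        if col in enriched:
            parts.append(str(enriched[col]))
    return "_".join(parts)


order_row = {"id": 42, "relId": 7}
name = build_entity_name("industry_order", order_row, CONFIG, TABLES)
```

With both flags set, the resulting name carries the table name, the row id, and the readable `capAndproName` value, so it cannot be confused with an entity of the `industry_tenant_release` table.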
Next, "relationTableIdMapJson" defines the association relationships between tables.
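A possible reading of this key is sketched below in Python. It assumes that "relatedKey" names the column in the current table and "beRelatedKey" names the matching column in the related table; the patent does not spell this out. The `industry_order` table name, the sample rows, and the function `find_relations` are hypothetical.

```python
RELATION_MAP = {
    "industry_buyer_inquiry": {"beRelatedKey": "id", "relatedKey": "inqId"},
    "industry_tenant_release": {"beRelatedKey": "id", "relatedKey": "relId"},
}

# Hypothetical stand-in for the database: table name -> list of rows.
TABLES = {
    "industry_buyer_inquiry": [{"id": 3}, {"id": 4}],
    "industry_tenant_release": [{"id": 7}],
}


def find_relations(table_name, row, relation_map, tables):
    """Return (source table, source id, target table, target id) for one row.

    Each tuple is directed: it points from the current row to the row it
    references in the related table.
    """
    found = []
    for other_table, keys in relation_map.items():
        local_value = row.get(keys["relatedKey"])
        if local_value is None:
            continue  # this row does not reference the other table
        for other_row in tables.get(other_table, []):
            if other_row[keys["beRelatedKey"]] == local_value:
                found.append((table_name, row["id"], other_table, other_row["id"]))
    return found


order_row = {"id": 42, "inqId": 4, "relId": 7}
relations = find_relations("industry_order", order_row, RELATION_MAP, TABLES)
```

Each resulting tuple corresponds to one directed link between two tables, which is the unit from which the relationship data is later labeled.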
Finally, the mapping between actual entity attribute names and standard attribute names is completed by "property_key_mapping". In this example, in the entry "deli_time": "deliTime", the key represents the standard name of the attribute and the value represents the corresponding field name in the data table. Unifying the attribute names makes subsequent processing easier.
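A minimal sketch of applying such a mapping to one row follows. It takes the description at face value (key is the standard name, value is the field name in the data table), so the mapping is inverted before use. The function name `normalize_row`, the sample row, and the choice to pass unmapped fields through unchanged are assumptions.

```python
PROPERTY_KEY_MAPPING = {
    "deli_time": "deliTime",
    "order_status": "status",
    "amount": "amount",
}


def normalize_row(row, mapping):
    """Rename the fields of `row` to their standard attribute names.

    `mapping` goes from standard name to table field name, so it is
    inverted first. Fields without a mapping keep their original name.
    """
    field_to_standard = {field: standard for standard, field in mapping.items()}
    return {field_to_standard.get(key, key): value for key, value in row.items()}


raw = {"deliTime": "2019-12-25", "status": "paid", "amount": 100, "remark": "n/a"}
normalized = normalize_row(raw, PROPERTY_KEY_MAPPING)
```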
Once this series of rules has been defined, the entity extraction work is complete, and the entity data can then be processed further to construct the knowledge graph.
Constructing a knowledge graph:
Using the above rules to read the original data from the database and extract it as entity data is only the first step of constructing the knowledge graph. The subsequent steps are to label the data, encapsulate it as a Domain Event, and store it in Kafka; finally, the graph database reads the data from Kafka and generates the knowledge graph.
Both entity data and relationship data are labeled and encapsulated. Labeling describes each piece of read data in detail and packages it as a whole into JSON or dictionary-format data. The labels of each piece of entity data comprise id, source, class, relation, objects, entity, and handleType. All labels other than handleType are encapsulated in dataInfo. The JSON structure of the output labeled entity data is as follows:
{
    "handleType": "category: entity extraction or relationship extraction",
    "dataInfo": {
        "id": "data ID",
        "source": "source, typically a table name",
        "entity": "abstract name of the data, i.e. the entity name output according to the naming rules",
        "class": "the category to which the data belongs",
        "relation": "relationship",
        "objects": "JSON array of associated objects"
    }
}
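The structure above could be produced by a small helper such as the following Python sketch. The function name `tag_entity`, the literal `"entity"` used for handleType, and the sample values are illustrative assumptions; only the key names come from the structure above.

```python
import json


def tag_entity(row_id, source, entity, cls, relation=None, objects=None):
    """Wrap one extracted row in the labeled entity structure.

    Everything except handleType goes under dataInfo. The handleType
    value "entity" is a placeholder for the entity-extraction category.
    """
    return {
        "handleType": "entity",
        "dataInfo": {
            "id": row_id,
            "source": source,
            "entity": entity,
            "class": cls,
            "relation": relation,
            "objects": list(objects) if objects else [],
        },
    }


tagged = tag_entity(
    row_id=42,
    source="industry_order",
    entity="industry_order_42_CNC machining",
    cls="order",
)
# The result is plain dictionary data, so it serializes directly to JSON.
as_json = json.dumps(tagged, ensure_ascii=False)
```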
The relationship data represents preset relationships and is placed in one-to-one correspondence with the entity data through labeling. A relationship label consists of two parts: the first is tag and the second is property. The tag part indicates the relationship between two data tables and itself comprises two parts, handleType and dataInfo. The property part lists, in pairs, the data that are related across the two data tables; each relationship label lists only one pair. property comprises three parts: subject, object, and relation. The direction of the relationship is distinguished by "subject" and "object": "subject" denotes the subject of the relationship and "object" denotes its object. The label details are as follows:
{
    "tag": {
        "handleType": "category: entity extraction or relationship extraction",
        "dataInfo": {
            "relation": "relationship name",
            "subjectSource": "Table A",
            "objectSource": "Table B"
        }
    },
    "property": {
        "subject": {
            "id": 0,
            "entity": "description about an entity",
            "class": "Categories"
        },
        "object": {
            "id": 0,
            "entity": "description about an entity",
            "class": "Categories"
        },
        "relation": {
            "name": "relationship name",
            "description": "description of relationship"
        }
    }
}
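A relationship label of this shape could be assembled as in the sketch below. The function name `tag_relation`, the handleType literal `"relation"`, the relationship name `belongs_to`, and the sample entities are assumptions for illustration.

```python
def tag_relation(name, description, subject, subject_source, obj, object_source):
    """Build one relationship label holding exactly one subject/object pair.

    `subject` and `obj` are dictionaries with the keys id, entity and class.
    The direction of the relationship runs from subject to object.
    """
    return {
        "tag": {
            "handleType": "relation",  # placeholder for the relationship-extraction category
            "dataInfo": {
                "relation": name,
                "subjectSource": subject_source,
                "objectSource": object_source,
            },
        },
        "property": {
            "subject": subject,
            "object": obj,
            "relation": {"name": name, "description": description},
        },
    }


order = {"id": 42, "entity": "industry_order_42", "class": "order"}
release = {"id": 7, "entity": "industry_tenant_release_7", "class": "release"}

forward = tag_relation(
    "belongs_to", "order placed against a release",
    order, "industry_order", release, "industry_tenant_release",
)
# Swapping subject and object yields a different, oppositely directed label.
backward = tag_relation(
    "belongs_to", "order placed against a release",
    release, "industry_tenant_release", order, "industry_order",
)
```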
Data labeling first labels the entity data and then labels the relationship data. After the data is labeled, it is encapsulated: the labeled data, a predefined input "topic", and a timestamp are packaged together to form a Domain Event object. Topics are named by class, and data of the same class share a topic; that is, entity data of the same category share one message queue.
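The encapsulation step might look like the following Python sketch. The `DomainEvent` class, its field names, and the helper `make_event` are hypothetical; the text only states that labeled data, a topic, and a timestamp are packaged together and that the topic is named by class.

```python
import json
import time
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class DomainEvent:
    """Labeled data plus the topic it belongs to and a creation timestamp."""
    topic: str
    payload: dict
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        return json.dumps(asdict(self), ensure_ascii=False)


def make_event(tagged_entity: dict) -> DomainEvent:
    # The topic is taken from the class label, so entities of the same
    # class end up on the same topic (one message queue per class).
    return DomainEvent(topic=tagged_entity["dataInfo"]["class"], payload=tagged_entity)


first = make_event({"handleType": "entity", "dataInfo": {"id": 42, "class": "order"}})
second = make_event({"handleType": "entity", "dataInfo": {"id": 43, "class": "order"}})
```

Serializing with `to_json()` gives a string that a message producer could send as the message value.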
Finally, the Domain Event is sent to Kafka; the graph database then consumes the data from Kafka and forms the graph according to the labels of the entity data and the labels of the relationship data.
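The produce-and-consume flow can be sketched without a running Kafka cluster. In the Python example below, `InMemoryBroker` is a simple stand-in with one FIFO queue per topic; it is not a Kafka client, and `build_graph` is a hypothetical consumer that only shows how entity labels become nodes and relationship labels become directed edges. All topic names and sample messages are assumptions.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Stand-in for a message broker: one FIFO queue per topic."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def send(self, topic, message):
        self.queues[topic].append(message)

    def drain(self):
        """Yield all queued messages, topic by topic, in arrival order."""
        for topic in list(self.queues):
            while self.queues[topic]:
                yield self.queues[topic].popleft()


def build_graph(messages):
    """Turn labeled messages into nodes (entity -> class) and directed edges."""
    nodes, edges = {}, []
    for msg in messages:
        if "tag" in msg:  # relationship label
            prop = msg["property"]
            edges.append(
                (prop["subject"]["entity"], prop["relation"]["name"], prop["object"]["entity"])
            )
        else:  # entity label
            info = msg["dataInfo"]
            nodes[info["entity"]] = info["class"]
    return nodes, edges


broker = InMemoryBroker()
broker.send("order", {"handleType": "entity",
                      "dataInfo": {"id": 42, "entity": "order_42", "class": "order"}})
broker.send("release", {"handleType": "entity",
                        "dataInfo": {"id": 7, "entity": "release_7", "class": "release"}})
broker.send("relation", {
    "tag": {"handleType": "relation",
            "dataInfo": {"relation": "belongs_to",
                         "subjectSource": "order", "objectSource": "release"}},
    "property": {
        "subject": {"id": 42, "entity": "order_42", "class": "order"},
        "object": {"id": 7, "entity": "release_7", "class": "release"},
        "relation": {"name": "belongs_to", "description": "order placed against a release"},
    },
})

nodes, edges = build_graph(broker.drain())
```

In a real deployment the broker would be replaced by Kafka producers and consumers, and the nodes and edges would be written to the graph database instead of being held in Python structures.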
In summary, with the technical scheme of the invention, knowledge graph construction rules are customized to express the semantic relationships among the data; the relationships among the data tables are stored in a database in JSON format, and a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (7)

1. A method for identifying a set of television end users, comprising the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping relation between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
2. The method of claim 1, wherein the step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
3. The method of claim 1, wherein the step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that denote the same attribute but are non-standard in different data tables to a uniform name.
4. The method of claim 1, wherein the step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: the graph database reading the data from Kafka;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
5. The method of claim 4, wherein the step S42 further comprises the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
6. The method for identifying a set of TV end users as claimed in claim 4, wherein in step S43, the encapsulated data includes the entity data and the relationship data.
7. The method of claim 5, wherein the relationship data is placed in one-to-one correspondence with the entity data by labeling.
CN201911355096.2A 2019-12-25 2019-12-25 Method for identifying television end user set Pending CN111104524A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911355096.2A CN111104524A (en) 2019-12-25 2019-12-25 Method for identifying television end user set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911355096.2A CN111104524A (en) 2019-12-25 2019-12-25 Method for identifying television end user set

Publications (1)

Publication Number Publication Date
CN111104524A true CN111104524A (en) 2020-05-05

Family

ID=70425204

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911355096.2A Pending CN111104524A (en) 2019-12-25 2019-12-25 Method for identifying television end user set

Country Status (1)

Country Link
CN (1) CN111104524A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955531A (en) * 2014-05-12 2014-07-30 南京提坦信息科技有限公司 Online knowledge map based on named entity library
US10496678B1 (en) * 2016-05-12 2019-12-03 Federal Home Loan Mortgage Corporation (Freddie Mac) Systems and methods for generating and implementing knowledge graphs for knowledge representation and analysis
CN106776711A (en) * 2016-11-14 2017-05-31 浙江大学 A kind of Chinese medical knowledge mapping construction method based on deep learning
CN110019560A (en) * 2017-12-28 2019-07-16 中国移动通信集团上海有限公司 A kind of querying method and device of knowledge based map
CN108446368A (en) * 2018-03-15 2018-08-24 湖南工业大学 A kind of construction method and equipment of Packaging Industry big data knowledge mapping

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
万倩;朱里越;欧阳峰;: "基于人工智能的广电舆情分析系统", 广播与电视技术, no. 12 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112035676A (en) * 2020-09-02 2020-12-04 中国银行股份有限公司 User operation behavior knowledge graph construction method and device
CN112035676B (en) * 2020-09-02 2024-02-23 中国银行股份有限公司 User operation behavior knowledge graph construction method and device
CN112507108A (en) * 2020-11-25 2021-03-16 北京明略软件系统有限公司 Knowledge extraction method and system based on json rule file and rule analysis engine
CN113157866A (en) * 2021-04-27 2021-07-23 平安科技(深圳)有限公司 Data analysis method and device, computer equipment and storage medium
CN113157866B (en) * 2021-04-27 2024-05-14 平安科技(深圳)有限公司 Data analysis method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221226

Address after: 100144 1206, Floor 12, Building 7, Yard 49, Badachu Road, Shijingshan District, Beijing

Applicant after: BEIJING CASICLOUD CO.,LTD.

Address before: 100039 1201-3, 12 / F, building 7, yard 16, West Fourth Ring Middle Road, Haidian District, Beijing

Applicant before: CASICLOUD-TECH CO.,LTD.