CN111104524A - Method for identifying television end user set - Google Patents
- Publication number
- CN111104524A
- Authority
- CN
- China
- Prior art keywords
- data
- entity
- relation
- knowledge graph
- steps
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Abstract
The invention discloses a method for identifying a television end user set, comprising the following steps: defining an entity naming rule; constructing relationships among entities; establishing a mapping between entity attribute names and standard attribute names; and reading the original data from a database to construct a knowledge graph. In this method, knowledge graph construction rules are customized to express the semantic relationships within the data. The relationships between the data tables are stored in a database in JSON format; a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.
Description
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a method for identifying a television end user set.
Background
In recent years, knowledge graphs have been introduced into more and more application scenarios. A knowledge graph is essentially a large-scale semantic network comprising entities, concepts, and the various semantic relationships among them. It is one of the most important forms of knowledge representation in the big data era and a core technology for realizing cognitive intelligence. Meanwhile, with the rapid development of the internet, the volume of network data has grown explosively. The knowledge graph is in effect a revival of knowledge engineering in the big data era, one that emphasizes its dependence on data; however, internet content is large in scale, heterogeneous, diverse, and loosely organized, which poses challenges for knowledge graph construction.
Traditional knowledge engineering is limited in its applications: it has mostly succeeded in closed scenarios with clear rules and clear boundaries, and its construction approach is called the top-down method. Although many papers and results on knowledge graph construction have appeared recently, applying their conclusions to an in-house scenario reveals various problems and poor transferability.
Disclosure of Invention
In view of the above technical problems in the related art, the present invention provides a method for identifying a television end user set that overcomes the above disadvantages of the prior art.
In order to achieve the technical purpose, the technical scheme of the invention is realized as follows:
A method for identifying a television end user set, the method comprising the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
Further, step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
Further, step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that represent the same attribute but are non-standard in different data tables to a uniform name.
Further, step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: reading the data from Kafka into the graph database;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
Further, step S42 includes the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
Further, in step S43, the encapsulated data comprises the entity data and the relationship data.
Further, the relationship data is placed in one-to-one correspondence with the entity data by labeling.
The beneficial effects of the invention are as follows: knowledge graph construction rules are customized to express the semantic relationships within the data; the relationships between the data tables are stored in a database in JSON format; a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for identifying a television end user set according to an embodiment of the present invention;
Fig. 2 is a block diagram of the knowledge graph construction rules of a method for identifying a television end user set according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
As shown in Fig. 1, a method for identifying a television end user set according to an embodiment of the present invention includes the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
Step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
Step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that represent the same attribute but are non-standard in different data tables to a uniform name.
Step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: reading the data from Kafka into the graph database;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
Step S42 further includes the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
In an embodiment of the invention, in step S43, the encapsulated data comprises the entity data and the relationship data.
In a specific embodiment of the present invention, the relationship data is placed in one-to-one correspondence with the entity data by labeling.
To facilitate understanding, the above technical solution is described in detail below in terms of a specific usage.
As shown in Fig. 2, the knowledge graph construction method based on custom rules consists of the following three parts:
Part one: entity naming rules.
Part two: rules for constructing the relationships among entities.
Part three: rules for mapping entity attribute names to standard attribute names.
Entity naming rules:
Constructing the knowledge graph network begins with establishing triples. Entity triples are mainly built on RDF (Resource Description Framework), a framework for describing Web resources and a general way of describing information so that it can be read and understood by a computer. A Web resource is anything that can have a URI (Uniform Resource Identifier); each resource is located by a unique uniform resource identifier.
The entity naming rule is customized so that each resource's URI is both unique and readable. For example, when naming real objects, the same name may appear more than once in the database while the other attributes differ, which means the records describe two different entities. When a URI is used to name them, the id of the corresponding real object is therefore appended, since the id is unique within a data table. If the entity is not a real object but something represented by a string of digits, such as an order, naming it by order number alone is unique but hard to read. In that case the readability principle applies: the order table is joined with an associated table whose fields hold Chinese-character names, and the relevant field is extracted and concatenated with the order number; alternatively, the order number is concatenated with the table name. Either way the name is both unique and readable.
Rules for constructing the relationships among entities:
This rule is the core of knowledge graph construction and expresses the relationships between entities. The relationships between entities are predefined, and the query rules between tables and the relationships between the entities in the tables are constructed from them. The relationships between tables are sorted out from the existing data; each relationship is a link between two tables, and together the links form the knowledge network graph. Relationships are directional: between the same two tables, a relationship pointing in the opposite direction is a different relationship.
Rules for mapping entity attribute names to standard attribute names:
Because the fields were not named to a uniform standard when the entity tables were defined, the same attribute is named differently across entities and is hard to process directly. A standard for attribute naming therefore needs to be predefined, and field names that represent the same attribute but are non-standard in different data tables are mapped to the uniform name.
Rule content details:
Listed below is the JSON-format configuration used to extract entities from one data table. The first two keys, "entityConfigJson" and "entityTagConfigJson", correspond to the entity naming rule: "entityConfigJson" is the entity configuration JSON and "entityTagConfigJson" is the entity label JSON. The key "relationTableIdMapJson" corresponds to the rule for constructing relationships among entities and holds the table relationship mapping JSON. The key "property_key_mapping" corresponds to the mapping between entity attribute names and standard attribute names and holds the attribute field mapping.
{
  "entityConfigJson": {
    "joinXIdColumn": ["relId"],
    "joinOtherTableName": ["industry_tenant_release"],
    "joinOtherIdColumn": ["id"],
    "joinOtherColumn": ["capAndproName"]
  },
  "entityTagConfigJson": {
    "JointTableNameFlag": true,
    "JointIdFlag": true,
    "descriptionColumns": ["id", "capAndproName"]
  },
  "relationTableIdMapJson": {
    "industry_buyer_inquiry": {
      "beRelatedKey": "id",
      "relatedKey": "inqId"
    },
    "industry_tenant_release": {
      "beRelatedKey": "id",
      "relatedKey": "relId"
    }
  },
  "property_key_mapping": {
    "deli_time": "deliTime",
    "order_status": "status",
    "amount": "amount"
  }
}
First, the entity naming rule. No field in this data table can be used directly to name an entity: all fields are numeric, so they guarantee uniqueness but are not readable. A field from another table is therefore associated for naming. The two keys "entityConfigJson" and "entityTagConfigJson" define the naming rule for the entities in this table. "entityConfigJson" defines the field in this data table that is used for the associated query, the data table to be queried, and the fields in that table that are matched and used for naming. Specifically, "joinXIdColumn" gives the field name used for the associated query in this data table, "relId"; "joinOtherTableName" gives the name of the associated data table, "industry_tenant_release"; "joinOtherIdColumn" gives the name of the matching field in the associated table, "id"; and "joinOtherColumn" gives the field in the associated table that is used for naming, "capAndproName". That is, the relId in this data table is concatenated with the capAndproName field of the row in the industry_tenant_release table whose id equals that relId, and the result forms the entity name. This is still not intuitive enough, because a string of the form "number + capAndproName" could be mistaken for an entity in the "industry_tenant_release" table. The next key, "entityTagConfigJson", therefore serves to distinguish entities: "JointTableNameFlag" is a Boolean indicating whether the name of the entity's own data table is spliced into the entity name; "JointIdFlag", also a Boolean, indicates whether the id is spliced in; and "descriptionColumns" lists the fields to be concatenated in the entity name.
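The naming rule described above can be sketched in Python. This is a minimal illustration, not the patented implementation: the function name `build_entity_name`, the `"_"` separator, the in-memory `tables` dictionary, and the table name `industry_order` are all assumptions introduced here, as is the reading that `"id"` in "descriptionColumns" refers to the row's own id and is gated by "JointIdFlag".

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_entity_name(
    row: Row,
    table_name: str,
    entity_cfg: Dict[str, List[str]],
    tag_cfg: Dict[str, Any],
    tables: Dict[str, List[Row]],
    sep: str = "_",  # assumed separator; the source does not specify one
) -> str:
    """Build a readable, unique entity name for one row (illustrative only)."""
    # Resolve the naming fields by joining against the associated tables.
    joined: Row = {}
    for x_col, other_table, other_id_col, other_col in zip(
        entity_cfg["joinXIdColumn"],
        entity_cfg["joinOtherTableName"],
        entity_cfg["joinOtherIdColumn"],
        entity_cfg["joinOtherColumn"],
    ):
        for other_row in tables.get(other_table, []):
            if other_row.get(other_id_col) == row.get(x_col):
                joined[other_col] = other_row.get(other_col)
                break

    parts: List[str] = []
    if tag_cfg.get("JointTableNameFlag"):
        parts.append(table_name)  # distinguishes entities of different tables
    for col in tag_cfg.get("descriptionColumns", []):
        if col == "id" and not tag_cfg.get("JointIdFlag"):
            continue  # id splicing is switched off
        value = joined.get(col, row.get(col))
        if value is not None:
            parts.append(str(value))
    return sep.join(parts)


entity_cfg = {
    "joinXIdColumn": ["relId"],
    "joinOtherTableName": ["industry_tenant_release"],
    "joinOtherIdColumn": ["id"],
    "joinOtherColumn": ["capAndproName"],
}
tag_cfg = {
    "JointTableNameFlag": True,
    "JointIdFlag": True,
    "descriptionColumns": ["id", "capAndproName"],
}
# Hypothetical sample data standing in for database tables.
tables = {"industry_tenant_release": [{"id": 3, "capAndproName": "CNC milling"}]}
name = build_entity_name({"id": 7, "relId": 3}, "industry_order", entity_cfg, tag_cfg, tables)
```

Prefixing the table name and the row id keeps the name unique, while the joined capAndproName value keeps it readable.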
Next, "relationTableIdMapJson" defines the association relationships between this table and other tables.
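One plausible reading of a "relationTableIdMapJson" entry is sketched below. It assumes that "relatedKey" names the column in the current table and "beRelatedKey" names the matching column in the related table; the source does not state this explicitly, although it is consistent with the "relId"/"id" pairing in "entityConfigJson". The function name `find_related_pairs` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def find_related_pairs(
    rows: List[Row],
    related_rows: List[Row],
    relation_cfg: Dict[str, str],
) -> List[Tuple[Row, Row]]:
    """Pair each row with the rows of the related table that it points to."""
    related_key = relation_cfg["relatedKey"]       # assumed: column in this table
    be_related_key = relation_cfg["beRelatedKey"]  # assumed: column in the related table

    # Index the related table once so each lookup is a dictionary access.
    index: Dict[Any, List[Row]] = {}
    for other in related_rows:
        index.setdefault(other.get(be_related_key), []).append(other)

    pairs: List[Tuple[Row, Row]] = []
    for row in rows:
        for other in index.get(row.get(related_key), []):
            pairs.append((row, other))  # direction: this table -> related table
    return pairs


# Hypothetical sample rows.
orders = [{"id": 1, "relId": 3}, {"id": 2, "relId": 4}]
releases = [{"id": 3, "capAndproName": "CNC milling"}]
pairs = find_related_pairs(orders, releases, {"beRelatedKey": "id", "relatedKey": "relId"})
```

Each returned pair is ordered, which preserves the direction of the relationship between the two tables.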
Finally, the mapping between actual entity attribute names and standard attribute names is completed by "property_key_mapping". In this example, in "deli_time": "deliTime", the key represents the standard name of the attribute and the value represents the field name corresponding to that attribute in the data table. Unifying the attribute names in this way simplifies subsequent processing.
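As a small sketch of how such a mapping might be applied, the function below renames the fields of one row. It follows the text's statement that the key is the standard name and the value is the table's field name; the function name `normalize_properties` and the sample row are hypothetical.

```python
from typing import Any, Dict


def normalize_properties(row: Dict[str, Any], property_key_mapping: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of the mapped fields of `row` under their standard names."""
    normalized: Dict[str, Any] = {}
    for standard_name, field_name in property_key_mapping.items():
        if field_name in row:  # skip attributes this table does not carry
            normalized[standard_name] = row[field_name]
    return normalized


property_key_mapping = {"deli_time": "deliTime", "order_status": "status", "amount": "amount"}
# Hypothetical row; "internalFlag" is not mapped and is therefore dropped.
row = {"deliTime": "2019-12-25", "status": "paid", "amount": 100, "internalFlag": 1}
normalized = normalize_properties(row, property_key_mapping)
```

Dropping unmapped fields is a design choice made for this sketch; a real pipeline might instead pass them through unchanged.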
Once this series of rules has been defined, the extraction of entities is complete; the entity data is then processed further to construct the knowledge graph.
Constructing a knowledge graph:
Using the above rules to read the original data from the database and extract it as entity data is only the first step in constructing the knowledge graph. The subsequent steps are to label the data, encapsulate it as a Domain Event, and store it in Kafka; finally, the graph database reads the data from Kafka and generates the knowledge graph.
Both entity data and relationship data are labeled and encapsulated. Labeling describes each piece of data that has been read in detail and packages it as a single JSON (dictionary-format) object. The label of each piece of entity data comprises: id, source, class, relation, object, entity, and handleType. All labels other than handleType are encapsulated in dataInfo. The JSON structure of the labeled entity data that is output is as follows:
{
  "handleType": "category: entity extraction or relationship extraction",
  "dataInfo": {
    "id": "data ID",
    "source": "source, typically a table name",
    "entity": "abstract name of the data: the entity name output according to the naming rules",
    "class": "the category to which the data belongs",
    "relation": "relationship",
    "objects": "JSON array of associated objects"
  }
}
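The entity label structure above could be produced by a helper such as the following. This is a sketch under assumptions: the function name `label_entity`, the handleType value `"entity"`, and the sample values are not taken from the source, which only describes the field names.

```python
import json
from typing import Any, Dict, List, Optional


def label_entity(
    data_id: Any,
    source: str,
    entity: str,
    cls: str,
    relation: Optional[str] = None,
    objects: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Wrap one extracted row in the entity label structure."""
    return {
        "handleType": "entity",  # assumed value for "entity extraction"
        "dataInfo": {
            "id": data_id,
            "source": source,    # typically the table name
            "entity": entity,    # name produced by the naming rule
            "class": cls,
            "relation": relation,
            "objects": objects if objects is not None else [],
        },
    }


# Hypothetical sample values.
label = label_entity(7, "industry_order", "industry_order_7_CNC milling", "order")
serialized = json.dumps(label, ensure_ascii=False)
```

Serializing with `ensure_ascii=False` keeps any Chinese-character entity names readable in the output.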
The relationship data consists of preset relationships and is placed in one-to-one correspondence with the entity data by labeling. A relationship label has two parts: tag and property. The tag part indicates the relationship between two data tables and itself comprises two parts, handleType and dataInfo. The property part lists, in pairs, the related data from the two data tables; each relationship label lists exactly one pair. property comprises three parts: subject, object, and relation. The direction of the relationship is distinguished by "subject" and "object": "subject" denotes the subject of the relationship and "object" denotes its object. The label details are as follows:
{
  "tag": {
    "handleType": "category: entity extraction or relationship extraction",
    "dataInfo": {
      "relation": "relationship name",
      "subjectSource": "Table A",
      "objectSource": "Table B"
    }
  },
  "property": {
    "subject": {
      "id": 0,
      "entity": "description about an entity",
      "class": "Categories"
    },
    "object": {
      "id": 0,
      "entity": "description about an entity",
      "class": "Categories"
    },
    "relation": {
      "name": "relationship name",
      "description": "description of relationship"
    }
  }
}
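A relationship label of this shape might be assembled as sketched below. The function name `label_relation`, the handleType value `"relation"`, the key spelling `subjectSource`, and the relationship name `orders_from` are assumptions made for illustration.

```python
from typing import Any, Dict


def label_relation(
    name: str,
    description: str,
    subject_source: str,
    object_source: str,
    subject: Dict[str, Any],
    obj: Dict[str, Any],
) -> Dict[str, Any]:
    """Build one relationship label holding exactly one subject/object pair."""
    return {
        "tag": {
            "handleType": "relation",  # assumed value for "relationship extraction"
            "dataInfo": {
                "relation": name,
                "subjectSource": subject_source,  # table of the subject
                "objectSource": object_source,    # table of the object
            },
        },
        "property": {
            "subject": subject,
            "object": obj,
            "relation": {"name": name, "description": description},
        },
    }


# Hypothetical entities taken from two tables.
order = {"id": 7, "entity": "industry_order_7_CNC milling", "class": "order"}
release = {"id": 3, "entity": "industry_tenant_release_3_CNC milling", "class": "release"}

forward = label_relation("orders_from", "order placed against a release",
                         "industry_order", "industry_tenant_release", order, release)
# Swapping subject and object yields a different, oppositely directed label.
backward = label_relation("orders_from", "order placed against a release",
                          "industry_tenant_release", "industry_order", release, order)
```

Because direction is carried only by which entity sits under "subject" and which under "object", the two labels above describe two distinct relationships.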
Data labeling first labels the entity data and then labels the relationship data. After labeling, the data is encapsulated: the labeled data, a predefined "topic" supplied as input, and a timestamp are packaged together into a Domain Event object. Topics are distinguished by class name, and entity data of the same class share a topic, that is, a message queue.
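The encapsulation step can be illustrated with a small dataclass. The class layout, the field names `payload` and `timestamp`, the helper `wrap_entity_label`, and the `"kg."` topic prefix are assumptions of this sketch; the source states only that labeled data, a topic, and a timestamp are packaged together and that topics are distinguished by class.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class DomainEvent:
    """Labeled data plus the topic and timestamp it is published with."""
    topic: str
    payload: Dict[str, Any]
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # Serialized form that a producer could send as the message value.
        return json.dumps(asdict(self), ensure_ascii=False)


def wrap_entity_label(label: Dict[str, Any], topic_prefix: str = "kg.") -> DomainEvent:
    """Derive the topic from the entity class so one class shares one queue."""
    topic = topic_prefix + str(label["dataInfo"]["class"])
    return DomainEvent(topic=topic, payload=label)


# Two hypothetical entity labels of the same class.
label_a = {"handleType": "entity", "dataInfo": {"id": 7, "class": "order"}}
label_b = {"handleType": "entity", "dataInfo": {"id": 8, "class": "order"}}
event_a = wrap_entity_label(label_a)
event_b = wrap_entity_label(label_b)
```

Deriving the topic from the class name means that a consumer interested in one kind of entity only needs to subscribe to one topic.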
Finally, the Domain Event is sent to Kafka; the graph database then consumes the data from Kafka and forms the graph according to the labels of the entity data and the labels of the relationship data.
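The final assembly step can be sketched without any Kafka client or graph database. In the hedged example below, a plain Python list stands in for the messages consumed from Kafka, and a dictionary of nodes plus a list of edges stands in for the graph database; the function name `build_graph`, the event layout, and the choice of `(source, id)` as the node key are assumptions of this sketch.

```python
from typing import Any, Dict, List, Tuple

NodeKey = Tuple[str, Any]          # (source table, id)
Edge = Tuple[NodeKey, str, NodeKey]  # (subject, relation name, object)


def build_graph(events: List[Dict[str, Any]]) -> Tuple[Dict[NodeKey, Dict[str, Any]], List[Edge]]:
    """Turn consumed events into nodes (entity labels) and edges (relationship labels)."""
    nodes: Dict[NodeKey, Dict[str, Any]] = {}
    edges: List[Edge] = []
    for event in events:
        payload = event["payload"]
        if "tag" in payload:
            # Relationship label: one directed edge from subject to object.
            info = payload["tag"]["dataInfo"]
            prop = payload["property"]
            subject_key = (info["subjectSource"], prop["subject"]["id"])
            object_key = (info["objectSource"], prop["object"]["id"])
            edges.append((subject_key, prop["relation"]["name"], object_key))
        else:
            # Entity label: one node keyed by its source table and id.
            info = payload["dataInfo"]
            nodes[(info["source"], info["id"])] = {"entity": info["entity"], "class": info["class"]}
    return nodes, edges


# Hypothetical events, in the order the labeling step emits them.
events = [
    {"topic": "kg.order", "payload": {"handleType": "entity", "dataInfo": {
        "id": 7, "source": "industry_order", "entity": "industry_order_7", "class": "order"}}},
    {"topic": "kg.release", "payload": {"handleType": "entity", "dataInfo": {
        "id": 3, "source": "industry_tenant_release", "entity": "release_3", "class": "release"}}},
    {"topic": "kg.relation", "payload": {
        "tag": {"handleType": "relation", "dataInfo": {
            "relation": "orders_from", "subjectSource": "industry_order",
            "objectSource": "industry_tenant_release"}},
        "property": {"subject": {"id": 7}, "object": {"id": 3},
                     "relation": {"name": "orders_from", "description": ""}}}},
]
nodes, edges = build_graph(events)
```

Keying nodes by source table and id lets relationship labels be resolved to nodes regardless of the order in which events arrive.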
In summary, with the technical solution of the invention, knowledge graph construction rules are customized to express the semantic relationships within the data; the relationships between the data tables are stored in a database in JSON format; a program then reads the data tables and constructs a graph that describes the data model. The method can readily be applied to projects that build a knowledge graph from multiple associated data tables.
The above describes only preferred embodiments of the present invention and is not to be construed as limiting the invention; any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention is intended to fall within its scope of protection.
Claims (7)
1. A method for identifying a television end user set, comprising the following steps:
S1: defining an entity naming rule;
S2: constructing relationships among entities;
S3: establishing a mapping between entity attribute names and standard attribute names;
S4: reading the original data from the database to construct a knowledge graph.
2. The method of claim 1, wherein step S2 includes the following steps:
S21: predefining entity relationships;
S22: establishing query rules between tables;
S23: constructing the entity relationships in the tables.
3. The method of claim 1, wherein step S3 includes the following steps:
S31: predefining a standard for attribute naming;
S32: mapping field names that represent the same attribute but are non-standard in different data tables to a uniform name.
4. The method of claim 1, wherein step S4 includes the following steps:
S41: acquiring the original data;
S42: labeling the data;
S43: encapsulating the data as a Domain Event;
S44: sending the Domain Event to Kafka;
S45: reading the data from Kafka into the graph database;
S46: synthesizing the knowledge graph according to the labels of the entity data and the labels of the relationship data.
5. The method of claim 4, wherein step S42 further comprises the following steps:
S421: labeling the entity data;
S422: labeling the relationship data.
6. The method for identifying a television end user set of claim 4, wherein in step S43 the encapsulated data comprises the entity data and the relationship data.
7. The method of claim 5, wherein the relationship data is placed in one-to-one correspondence with the entity data by labeling.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911355096.2A CN111104524A (en) | 2019-12-25 | 2019-12-25 | Method for identifying television end user set |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911355096.2A CN111104524A (en) | 2019-12-25 | 2019-12-25 | Method for identifying television end user set |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111104524A (en) | 2020-05-05 |
Family
ID=70425204
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911355096.2A Pending CN111104524A (en) | 2019-12-25 | 2019-12-25 | Method for identifying television end user set |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111104524A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | ||
Effective date of registration: 20221226 Address after: 100144 1206, Floor 12, Building 7, Yard 49, Badachu Road, Shijingshan District, Beijing Applicant after: BEIJING CASICLOUD CO.,LTD. Address before: 100039 1201-3, 12 / F, building 7, yard 16, West Fourth Ring Middle Road, Haidian District, Beijing Applicant before: CASICLOUD-TECH CO.,LTD. |