CN109635171B - Fusion reasoning system and method for news program intelligent tags - Google Patents

Info

Publication number
CN109635171B
Authority
CN
China
Prior art keywords
label
entity
library
program
reasoning
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811528577.4A
Other languages
Chinese (zh)
Other versions
CN109635171A (en)
Inventor
温序铭
谢超平
王炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Sobey Digital Technology Co Ltd
Original Assignee
Chengdu Sobey Digital Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques

Abstract

The invention discloses a fusion reasoning system and method for intelligent tags of news programs, relating to the technical field of news program tagging. The system comprises an intelligent recognition executor, a historical tag library, an internal knowledge base, an internal case base, and an analysis reasoner. The intelligent recognition executor performs recognition tasks on the various kinds of news program material and extracts basic tags from video images, speech, and text; the historical tag library stores materials, metadata, and tags; the internal knowledge base supplements the recognition results and supplies additional information for subsequent analysis and reasoning; the internal case base is a set of cases built from the historical tag library; and the analysis reasoner, which comprises a rule-based reasoner and a deep-learning-based reasoner, performs fusion reasoning over the intelligent tags.

Description

Fusion reasoning system and method for news program intelligent tags
Technical Field
The invention relates to the technical field of news program tags, in particular to a fusion reasoning system and method of news program intelligent tags.
Background
Tags on media content are now ubiquitous, and many practical services, such as content retrieval and review, depend on them. Traditionally, media content has been identified and tagged by hand, but as media data volumes grow and service requirements expand, inefficient manual tagging can no longer keep up with the demand for fast, efficient service. Meanwhile, the continuous development of intelligent recognition technology and the gradual maturing of technologies such as deep learning make it increasingly practical to generate content tags for multimedia materials automatically and accurately.
A news item usually combines video, audio, and text. These media describe the same news event from different dimensions, and the information they carry generally overlaps or is complementary. By browsing a news item, a person can obtain keywords such as the five news elements of time, place, person, event, and cause. Intelligent recognition methods for media content are maturing: for example, face recognition and OCR can be applied to video, and speech recognition to audio. How to integrate information from these different dimensions and reason over it to produce intelligent tags is an important topic in media technology.
In the past, tags were generally extracted from media content by hand. Manual extraction does not scale to massive volumes of media content, is inefficient, and is error-prone. Moreover, the multiple kinds of recognition results from video, audio, and text have not been used together in a reasonable, comprehensive way, and doing so is difficult.
Disclosure of Invention
Purpose of the invention: to address the problem that tags for news program media content are currently extracted manually, which is inefficient and error-prone, the invention provides a fusion reasoning system and method for intelligent tags of news programs. It makes comprehensive use of intelligent recognition methods, builds an internal knowledge base and an internal case base from a historical tag library, and performs automatic fusion reasoning over news program tags, with accurate and efficient classification.
To achieve this purpose, the invention adopts the following technical solution:
A fusion reasoning system for intelligent tags of news programs comprises an intelligent recognition executor, a historical tag library, an internal knowledge base, an internal case base, and an analysis reasoner:
Intelligent recognition executor: performs recognition tasks on the various kinds of news program material and extracts basic tags from video images, speech, and text;
Historical tag library: stores materials, metadata, and tags;
Internal knowledge base: an internal knowledge graph built from the historical tag library, whose entities include persons, countries, current affairs, events, and the like; it supplements the intelligent recognition results and supplies additional information for subsequent analysis and reasoning;
Internal case base: a set of cases built from the historical tag library, consisting of representative cases with complete metadata and tags, used for deep learning training;
Analysis reasoner: performs fusion reasoning over the intelligent tags and comprises a rule-based reasoner and a deep-learning-based reasoner. The rule-based reasoner reasons over the intelligent recognition results together with the internal knowledge base; the deep-learning-based reasoner trains a deep learning model on the internal case base and then reasons with the trained model.
Further, the intelligent recognition executor uses a face recognition executor, an OCR executor, a speech recognition executor, and an NLP executor to recognize the various kinds of news program material.
Further, the materials in the historical tag library include text, speech, and video images; the metadata include the title, keywords, creation time, place, and type of each material; and the tags include the domain, sentiment, time of occurrence, persons, place of occurrence, and the like that the material involves.
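For illustration, one record in the historical tag library could be modelled as below. This is a minimal sketch only: the class names, field names, the ISO 8601 time encoding, and the `material_uri` pointer are assumptions made for the example and are not specified by the patent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metadata:
    title: str
    keywords: List[str]
    creation_time: str   # ISO 8601 string (assumed encoding)
    place: str
    material_type: str   # "text", "speech", or "video"


@dataclass
class Tags:
    domain: str                  # e.g. "sports", "entertainment"
    sentiment: str               # "positive" or "negative"
    occurrence_time: str
    persons: List[str] = field(default_factory=list)
    occurrence_place: str = ""


@dataclass
class TagLibraryRecord:
    material_uri: str   # hypothetical pointer to the stored material
    metadata: Metadata
    tags: Tags


# Example record with made-up values.
record = TagLibraryRecord(
    material_uri="materials/0001.mp4",
    metadata=Metadata(
        title="City marathon opens",
        keywords=["marathon", "sports"],
        creation_time="2018-12-01T08:00:00",
        place="Chengdu",
        material_type="video",
    ),
    tags=Tags(
        domain="sports",
        sentiment="positive",
        occurrence_time="2018-12-01",
        persons=["Runner A"],
        occurrence_place="Chengdu",
    ),
)
```

Keeping metadata and tags in separate structures mirrors the distinction the text draws between information supplied with the material and information inferred about it.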
A fusion reasoning method for intelligent tags of news programs comprises the following steps:
S1, construct a news knowledge base: build the news knowledge base from the historical tag library, the historical material library, the Internet, and other knowledge bases;
S2, construct an internal case base: extract text materials with domain tags and pictures with scene tags from the historical tag library to form the internal case base, and convert the case texts in the internal case base into numerical form;
S3, train a deep learning model: train a deep neural network on the numerical cases, where training comprises a text classification training process and a scene recognition training process;
S4, perform fusion reasoning: according to the type of the input program, perform content recognition using the corresponding intelligent recognition strategy of the intelligent recognition executor; then apply rule-based reasoning to the recognition results, using the internal knowledge base, to determine the program category and obtain a first candidate new tag set; perform text classification and scene recognition with the trained deep neural network to obtain a second candidate new tag set; the user then selects from and corrects the first and second candidate new tag sets to output the final program tags.
Further, the entities of the news knowledge base in S1 include time, place, event, person, and the like, and constructing the news knowledge base comprises the following steps:
S1.1, construct a global ontology library: build ontology libraries for each domain according to news service categories such as entertainment, sports, finance, people's livelihood, current affairs, tourism, and military affairs; the construction covers concepts, concept hierarchies, attributes, attribute value types, attribute value ranges, relationships, relationship domain concept sets, and relationship value ranges;
S1.2, acquire entities: acquire entities from the historical tag library; if the entity information in the historical tag library is incomplete, supplement it from the historical material library and the Internet;
S1.3, evaluate entities: quantify the reliability of each acquired entity and discard entities with low reliability;
S1.4, fuse knowledge: link the entities that remain after evaluation to the current knowledge base through entity disambiguation and coreference resolution, and merge related entities from third-party knowledge bases into the current knowledge base;
S1.5, reason over knowledge: infer relationships among entities in the current knowledge base, attributes of entities, and hierarchical relationships among ontologies, and add the manually reviewed inference results to the current knowledge base to obtain the news knowledge base.
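The entity evaluation step (S1.3) can be sketched as a score-and-threshold filter. The patent does not say how reliability is quantified, so the scoring formula (source weight multiplied by extractor confidence), the weights, and the threshold below are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical per-source trust weights; the patent does not specify any.
SOURCE_WEIGHT: Dict[str, float] = {
    "historical_tag_library": 1.0,
    "historical_material_library": 0.8,
    "internet": 0.5,
}


@dataclass
class CandidateEntity:
    name: str
    source: str
    extraction_confidence: float  # in [0, 1], as reported by the extractor


def reliability(entity: CandidateEntity) -> float:
    """Assumed formula: source weight times extractor confidence."""
    return SOURCE_WEIGHT.get(entity.source, 0.0) * entity.extraction_confidence


def evaluate_entities(entities: List[CandidateEntity],
                      threshold: float = 0.6) -> List[CandidateEntity]:
    """Keep only the entities whose reliability reaches the threshold."""
    return [e for e in entities if reliability(e) >= threshold]


candidates = [
    CandidateEntity("Chengdu", "historical_tag_library", 0.9),
    CandidateEntity("Chengdu Marathon", "internet", 0.75),
    CandidateEntity("Sichuan", "historical_material_library", 0.8),
]
kept = evaluate_entities(candidates)
```

With these made-up numbers the Internet-sourced entity falls below the threshold and is discarded, while the other two proceed to knowledge fusion (S1.4).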
Further, constructing the global ontology library in S1.1 comprises the following steps:
S1.1.1: divide the tags in the historical tag library into several domains according to news classification;
S1.1.2: generate an ontology library for each domain by induction from the historical tags, then manually add to, delete from, and adjust each domain ontology library according to news business rules;
S1.1.3: fuse the domain ontology libraries with the existing knowledge graph ontology library using rules such as similarity detection and conflict resolution to obtain the global ontology library.
Further, the entity acquisition in S1.2 comprises the following steps:
S1.2.1: parse the table names and field information of the historical tag library, and extract the entities that its tags represent, together with their attributes and the relationships among them;
S1.2.2: convert the video and audio data in the historical materials into text by methods such as speech recognition, subtitle recognition, and image object recognition; then extract news scripts and identify the entities, relationships, and attributes in the text by methods such as natural language processing and data mining;
S1.2.3: collect entity information from the Internet, searching with entity names as keywords to obtain entity attributes and relationship information, such as a person's birthday, nationality, and spouse, or a place's administrative division type, aliases, and climate.
Further, the deep-neural-network-based text classification training process in S3 comprises the following steps:
S3.1.1: segment the text into words, serialize it, and remove words that carry no meaning for classification to obtain a word sequence;
S3.1.2: convert the word sequence into a sequence of word indices;
S3.1.3: convert each word index into an n-dimensional word vector;
S3.1.4: stack the word vectors in word order to form a text matrix, in which each row is the word vector of one word;
S3.1.5: use the text matrix to train the deep neural network.
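Steps S3.1.1 to S3.1.4 can be sketched as follows. The sketch makes several assumptions not found in the patent: whitespace tokenization (Chinese text would need a proper word segmenter), a tiny illustrative stop-word list, index 0 reserved for unknown words, and randomly initialised embeddings standing in for learned or pretrained word vectors. Training itself (S3.1.5) is not shown.

```python
import random
from typing import Dict, List

STOP_WORDS = {"the", "a", "of", "in", "on"}  # illustrative list only


def tokenize(text: str) -> List[str]:
    """S3.1.1: split into words and drop words useless for classification."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def build_vocab(corpus: List[List[str]]) -> Dict[str, int]:
    """Give each distinct word an index starting at 1 (0 means unknown)."""
    vocab: Dict[str, int] = {}
    for words in corpus:
        for w in words:
            vocab.setdefault(w, len(vocab) + 1)
    return vocab


def to_indices(words: List[str], vocab: Dict[str, int]) -> List[int]:
    """S3.1.2: word sequence -> word index sequence."""
    return [vocab.get(w, 0) for w in words]


def make_embeddings(vocab_size: int, n: int, seed: int = 0) -> List[List[float]]:
    """One random n-dimensional vector per index (placeholder for real vectors)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)]
            for _ in range(vocab_size + 1)]


def text_matrix(indices: List[int],
                embeddings: List[List[float]]) -> List[List[float]]:
    """S3.1.3 and S3.1.4: look up each index; one row per word, in word order."""
    return [embeddings[i] for i in indices]


words = tokenize("The marathon opens in Chengdu")
vocab = build_vocab([words])
indices = to_indices(words, vocab)
embeddings = make_embeddings(len(vocab), n=4)
matrix = text_matrix(indices, embeddings)
```

The resulting matrix has one row per retained word and n columns, which is the shape a convolutional or recurrent text classifier would typically consume.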
Further, the deep-neural-network-based scene recognition training process in S3 comprises the following steps:
S3.2.1: augment the scene recognition image samples through operations such as cropping, rotation, and scaling;
S3.2.2: resize the scene recognition images and apply other preprocessing, then use them to train the deep neural network.
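The augmentation and resizing steps can be illustrated on a toy grayscale image stored as a list of pixel rows. This is a sketch under simplifying assumptions: a real pipeline would use an image library, random crop positions, and arbitrary rotation angles, whereas here the crop is fixed, the rotation is 90 degrees clockwise, and resizing uses nearest-neighbour sampling.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def resize_nearest(img: Image, new_h: int, new_w: int) -> Image:
    """Nearest-neighbour resize to new_h x new_w."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def augment(img: Image, size: int) -> List[Image]:
    """S3.2.1 then S3.2.2: build variants, then bring all to size x size."""
    h, w = len(img), len(img[0])
    variants = [img, crop(img, 0, 0, h - 1, w - 1), rotate90(img)]
    return [resize_nearest(v, size, size) for v in variants]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
samples = augment(image, size=2)
```

Each original sample yields three training samples of identical size, so the network input shape stays fixed regardless of which augmentation was applied.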
Further, in S4, content recognition with the intelligent recognition executor follows these strategies:
Strategy A: when the input program type is a picture, feed the picture to the face recognizer to obtain person tags; also feed the picture to the OCR recognizer and pass the recognition result to the NLP executor for entity extraction to obtain tags such as time, place, and event;
Strategy B: when the input program type is speech, feed the speech to the speech recognizer and pass the recognition result to the NLP executor for entity extraction to obtain tags such as time, place, and event;
Strategy C: when the input program type is video, process the image frames with Strategy A and the video's audio track with Strategy B;
Strategy D: when the input program type is text, feed the text to the NLP executor for entity extraction to obtain tags such as time, place, and event.
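Strategies A to D amount to a dispatch on the input type, with Strategy C composed from A and B. The sketch below shows that structure only. The recognizers are passed in as plain callables and replaced by stubs in the usage example; the `Recognizers` container, the function names, and the assumption that a video arrives already split into frames and an audio track are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Tags = Dict[str, List[str]]  # e.g. {"person": [...], "place": [...]}


@dataclass
class Recognizers:
    face: Callable[[bytes], List[str]]   # picture -> person names
    ocr: Callable[[bytes], str]          # picture -> recognized text
    speech: Callable[[bytes], str]       # audio -> transcript
    nlp: Callable[[str], Tags]           # text -> extracted entity tags


def merge(*tag_sets: Tags) -> Tags:
    """Combine tag dictionaries, dropping duplicate values per key."""
    merged: Tags = {}
    for tags in tag_sets:
        for key, values in tags.items():
            bucket = merged.setdefault(key, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged


def strategy_a(picture: bytes, r: Recognizers) -> Tags:
    return merge({"person": r.face(picture)}, r.nlp(r.ocr(picture)))


def strategy_b(audio: bytes, r: Recognizers) -> Tags:
    return r.nlp(r.speech(audio))


def strategy_c(frames: Sequence[bytes], audio: bytes, r: Recognizers) -> Tags:
    return merge(*[strategy_a(f, r) for f in frames], strategy_b(audio, r))


def strategy_d(text: str, r: Recognizers) -> Tags:
    return r.nlp(text)


def recognize(program_type: str, r: Recognizers, *, picture: bytes = b"",
              audio: bytes = b"", frames: Sequence[bytes] = (),
              text: str = "") -> Tags:
    """Pick the strategy that matches the input program type."""
    if program_type == "picture":
        return strategy_a(picture, r)
    if program_type == "speech":
        return strategy_b(audio, r)
    if program_type == "video":
        return strategy_c(frames, audio, r)
    if program_type == "text":
        return strategy_d(text, r)
    raise ValueError(f"unsupported program type: {program_type}")


# Stub recognizers standing in for real face/OCR/speech/NLP services.
stubs = Recognizers(
    face=lambda picture: ["Anchor A"],
    ocr=lambda picture: "Chengdu",
    speech=lambda audio: "marathon",
    nlp=lambda t: {"place": [t]} if t == "Chengdu" else {"event": [t]},
)
video_tags = recognize("video", stubs, frames=[b"frame1", b"frame2"], audio=b"pcm")
text_tags = recognize("text", stubs, text="flood")
```

Because the same person or place usually appears in many frames, deduplicating while merging keeps the basic tag set small before it is handed to the reasoners.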
Further, the rule-based reasoning in S4 comprises the following steps:
S4.1.1: treat the basic tags extracted by the intelligent recognition executor and the program's initial associated metadata each as entities;
S4.1.2: locate each entity in the internal knowledge base using methods such as entity disambiguation and coreference resolution;
S4.1.3: with each entity as the central node, extract its adjacent nodes and relationships as a subgraph;
S4.1.4: input the subgraph data of each entity into the trained GCN (graph convolutional network), compute using distributed graph computing, and infer the program domain;
S4.1.5: take the basic tags and the inferred program domain as the first candidate new tag set.
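Steps S4.1.3 and S4.1.4 can be sketched on a toy knowledge graph stored as (head, relation, tail) triples. The sketch shows one-hop subgraph extraction and a single graph-convolution layer in the commonly used form ReLU(D^-1/2 (A + I) D^-1/2 X W). It is an illustration under stated assumptions, not the patent's implementation: the graph is treated as undirected, relation types are ignored, node features are one-hot, the weight matrix is an identity placeholder rather than a trained one, and distributed computation is omitted.

```python
import math
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def one_hop_subgraph(center: str, triples: List[Triple]) -> List[Triple]:
    """S4.1.3: keep the triples that touch the central entity."""
    return [t for t in triples if center in (t[0], t[2])]


def gcn_layer(nodes: List[str], edges: List[Triple],
              features: List[List[float]],
              weight: List[List[float]]) -> List[List[float]]:
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    # Adjacency with self-loops (A + I), undirected.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, _, tail in edges:
        adj[index[head]][index[tail]] = 1.0
        adj[index[tail]][index[head]] = 1.0
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weight), len(weight[0])
    # Aggregate neighbour features, then apply the weights and ReLU.
    agg = [[sum(norm[i][k] * features[k][f] for k in range(n))
            for f in range(n_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]


# Toy knowledge graph with made-up entities.
triples = [
    ("athlete_a", "plays", "tennis"),
    ("tennis", "is_a", "sports"),
    ("athlete_a", "born_in", "city_b"),
]
sub = one_hop_subgraph("athlete_a", triples)
nodes = sorted({name for head, _, tail in sub for name in (head, tail)})
identity = [[1.0 if i == j else 0.0 for j in range(len(nodes))]
            for i in range(len(nodes))]
hidden = gcn_layer(nodes, sub, features=identity, weight=identity)
```

In a full model, several such layers would be stacked and followed by a classifier over program domains; here the single layer only shows how each node's representation mixes in its neighbours' features.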
Further, the deep-neural-network-based text classification and scene recognition in S4 comprise the following steps:
S4.2.1: input the preprocessed text data into the trained deep neural network to obtain a domain tag;
S4.2.2: input the preprocessed scene recognition image data into the trained deep neural network to obtain a scene tag;
S4.2.3: take the resulting domain tag and scene tag as the second candidate new tag set.
Further, after the final program tags are obtained in S4, the program, its final tags, and its metadata are stored in the historical tag library, which is thereby updated; the internal knowledge base and the internal case base are updated at the same time.
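The last stage of S4, in which the two candidate sets are presented to the user and the confirmed result is written back, can be sketched as follows. The ordering rule (tags proposed by both reasoners listed first), the `review` callback standing in for the user's selection and correction, and the in-memory list standing in for the historical tag library are assumptions for illustration; refreshing the internal knowledge base and case base is not shown.

```python
from typing import Callable, Dict, List, Set

# In-memory stand-in for the historical tag library (assumption).
history_tag_library: List[Dict] = []


def fuse_candidates(first: Set[str], second: Set[str]) -> List[str]:
    """Union of both candidate sets; tags both reasoners agree on come first."""
    agreed = sorted(first & second)
    rest = sorted((first | second) - (first & second))
    return agreed + rest


def finalize(candidates: List[str],
             review: Callable[[List[str]], List[str]]) -> List[str]:
    """Let the user select from and correct the candidates."""
    return review(candidates)


def store(program_id: str, final_tags: List[str], metadata: Dict) -> None:
    """Write the program, its final tags, and its metadata back to the library."""
    history_tag_library.append(
        {"program": program_id, "tags": final_tags, "metadata": metadata}
    )


first_set = {"sports", "Chengdu", "marathon"}   # from rule-based reasoning
second_set = {"sports", "stadium"}              # from the deep neural network
candidates = fuse_candidates(first_set, second_set)

# Simulated user review: drop "stadium", add "road race".
final_tags = finalize(
    candidates, lambda c: [t for t in c if t != "stadium"] + ["road race"]
)
store("program-0001", final_tags, {"title": "City marathon opens"})
```

Writing each confirmed result back is what closes the loop described in the text: later knowledge base and case base rebuilds see the corrected tags rather than the raw candidates.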
The invention has the following beneficial effects:
1. The fusion reasoning system builds an internal case base and an internal knowledge base from the historical tag library, giving the fusion reasoning process a real and effective basis. It provides both a rule-based reasoner and a deep-learning-based reasoner, so tag fusion reasoning draws on both the knowledge graph and deep learning. The historical tag library is updated whenever a reasoning result is output, which closes the reasoning loop and continuously improves the accuracy of the system.
2. The fusion reasoning method builds on intelligent recognition: mature recognition technology is incorporated into the fusion reasoning process, which reduces the dimensionality of the unstructured data and lowers the difficulty of subsequent reasoning. In addition, a news program classification model is built using distributed graph computing together with a graph neural network, and the classification model is accurate and efficient.
Drawings
FIG. 1 is an overall schematic diagram of the fusion reasoning system of the invention.
FIG. 2 is a schematic diagram of the construction process of the news knowledge base of the invention.
FIG. 3 is a schematic diagram of the fusion reasoning process of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the following detailed description is given with reference to the accompanying drawings and examples.
Example 1
As shown in FIG. 1, this embodiment provides a fusion reasoning system for intelligent tags of news programs, comprising an intelligent recognition executor, a historical tag library, an internal knowledge base, an internal case base, and an analysis reasoner:
Intelligent recognition executor: performs recognition tasks on the various kinds of news program material, using a face recognition executor, an OCR executor, a speech recognition executor, and an NLP executor to extract basic tags from video images, speech, and text;
Historical tag library: stores materials, metadata, and tags. The materials include text, speech, and video images; the metadata include the title, keywords, creation time, place, and type of each material; the tags include the domain, sentiment, time of occurrence, persons, place of occurrence, and the like that the material involves, where domains include entertainment, politics, people's livelihood, and so on, and sentiment is either positive or negative;
Internal knowledge base: an internal knowledge graph built from the historical tag library, whose entities include persons, countries, current affairs, events, and the like; it supplements the intelligent recognition results and supplies additional information for subsequent analysis and reasoning;
Internal case base: a set of cases built from the historical tag library, consisting of representative cases with complete metadata and tags, used for deep learning training;
Analysis reasoner: performs fusion reasoning over the intelligent tags and comprises a rule-based reasoner and a deep-learning-based reasoner. The rule-based reasoner reasons over the intelligent recognition results together with the internal knowledge base; the deep-learning-based reasoner trains a deep learning model on the internal case base and then reasons with the trained model.
Based on this fusion reasoning system, the embodiment further provides a fusion reasoning method for intelligent tags of news programs, comprising the following steps:
S1, construct a news knowledge base: build the news knowledge base from the historical tag library, the historical material library, the Internet, and other knowledge bases. The entities of the news knowledge base include time, place, event, person, and the like, and constructing it comprises the following steps:
S1.1, construct a global ontology library: build ontology libraries for each domain according to news service categories such as entertainment, sports, finance, people's livelihood, current affairs, tourism, and military affairs; the construction covers concepts, concept hierarchies, attributes, attribute value types, attribute value ranges, relationships, relationship domain concept sets, and relationship value ranges. Specifically:
S1.1.1: divide the tags in the historical tag library into several domains according to news classification;
S1.1.2: generate an ontology library for each domain by induction from the historical tags, then manually add to, delete from, and adjust each domain ontology library according to news business rules;
S1.1.3: fuse the domain ontology libraries with the existing knowledge graph ontology library using rules such as similarity detection and conflict resolution to obtain the global ontology library;
S1.2, acquire entities: acquire entities from the historical tag library, where an entity is a vertex in the knowledge graph; if the entity information in the historical tag library is incomplete, supplement it from the historical material library and the Internet. Specifically:
S1.2.1: parse the table names and field information of the historical tag library, and extract the entities that its tags represent, together with their attributes and the relationships among them;
S1.2.2: convert the video and audio data in the historical materials into text by methods such as speech recognition, subtitle recognition, and image object recognition; then extract news scripts and identify the entities, relationships, and attributes in the text by methods such as natural language processing and data mining;
S1.2.3: collect entity information from the Internet, searching with entity names as keywords to obtain entity attributes and relationship information, such as a person's birthday, nationality, and spouse, or a place's administrative division type, aliases, and climate;
S1.3, evaluate entities: quantify the reliability of each acquired entity and discard entities with low reliability;
S1.4, fuse knowledge: link the entities that remain after evaluation to the current knowledge base through entity disambiguation and coreference resolution, and merge related entities from third-party knowledge bases into the current knowledge base;
S1.5, reason over knowledge: infer relationships among entities in the current knowledge base, attributes of entities, and hierarchical relationships among ontologies, and add the manually reviewed inference results to the current knowledge base to obtain the news knowledge base;
S2, construct an internal case base: extract text materials with domain tags and pictures with scene tags from the historical tag library to form the internal case base, and convert the case texts in the internal case base into numerical form, where the domain tags include science and technology, sports, entertainment, current affairs, and the like;
S3, train a deep learning model: train a deep neural network on the numerical cases, where training comprises a text classification training process and a scene recognition training process;
The deep-neural-network-based text classification training process comprises the following steps:
S3.1.1: segment the text into words, serialize it, and remove words that carry no meaning for classification to obtain a word sequence;
S3.1.2: convert the word sequence into a sequence of word indices;
S3.1.3: convert each word index into an n-dimensional word vector;
S3.1.4: stack the word vectors in word order to form a text matrix, in which each row is the word vector of one word;
S3.1.5: use the text matrix to train the deep neural network.
The deep-neural-network-based scene recognition training process comprises the following steps:
S3.2.1: augment the scene recognition image samples through operations such as cropping, rotation, and scaling;
S3.2.2: resize the scene recognition images and apply other preprocessing, then use them to train the deep neural network;
S4, perform fusion reasoning: according to the type of the input program, perform content recognition using the corresponding intelligent recognition strategy of the intelligent recognition executor; then apply rule-based reasoning to the recognition results, using the internal knowledge base, to determine the program category and obtain a first candidate new tag set; perform text classification and scene recognition with the trained deep neural network to obtain a second candidate new tag set; the user then selects from and corrects the first and second candidate new tag sets to output the final program tags;
The intelligent recognition strategies are as follows:
Strategy A: when the input program type is a picture, feed the picture to the face recognizer to obtain person tags; also feed the picture to the OCR recognizer and pass the recognition result to the NLP executor for entity extraction to obtain tags such as time, place, and event;
Strategy B: when the input program type is speech, feed the speech to the speech recognizer and pass the recognition result to the NLP executor for entity extraction to obtain tags such as time, place, and event;
Strategy C: when the input program type is video, process the image frames with Strategy A and the video's audio track with Strategy B;
Strategy D: when the input program type is text, feed the text to the NLP executor for entity extraction to obtain tags such as time, place, and event;
The rule-based reasoning comprises the following steps:
S4.1.1: treat the basic tags extracted by the intelligent recognition executor and the program's initial associated metadata each as entities;
S4.1.2: locate each entity in the internal knowledge base using methods such as entity disambiguation and coreference resolution;
S4.1.3: with each entity as the central node, extract its adjacent nodes and relationships as a subgraph;
S4.1.4: input the subgraph data of each entity into the trained GCN (graph convolutional network), compute using distributed graph computing, and infer the program domain;
S4.1.5: take the basic tags and the inferred program domain as the first candidate new tag set;
The deep-neural-network-based text classification and scene recognition comprise the following steps:
S4.2.1: input the preprocessed text data into the trained deep neural network to obtain a domain tag;
S4.2.2: input the preprocessed scene recognition image data into the trained deep neural network to obtain a scene tag;
S4.2.3: take the resulting domain tag and scene tag as the second candidate new tag set;
After the final program tags are obtained in S4, the program, its final tags, and its metadata are stored in the historical tag library, which is thereby updated; the internal knowledge base and the internal case base are updated at the same time.
The above is only a preferred embodiment of the invention and is not intended to limit it. The scope of protection is defined by the appended claims, and all equivalent structural changes made using the contents of the specification and drawings of the invention fall within that scope.

Claims (9)

1. A fusion reasoning system of news program intelligent tags is characterized in that: comprises an intelligent recognition actuator, a historical label library, an internal knowledge library, an internal case library and an analysis inference device,
the intelligent recognition actuator: the system is used for executing the identification tasks of various news program materials and extracting basic labels of video images, voice and text information;
a historical label library: used for storing materials, metadata and labels, wherein the materials in the historical label library comprise text, speech and video images; the metadata comprise the title, keywords, creation time, place and type information of each material; and the labels comprise the field, emotional tendency, occurrence time, people and place of occurrence related to each material;
an internal knowledge base: an internal knowledge graph established based on the historical label library, wherein the entities of the internal knowledge graph comprise people, countries, current political affairs and events, and the internal knowledge base is used for supplementing the intelligent recognition results and providing more information for subsequent analysis and reasoning;
an internal case library: a case set established based on the historical label library, comprising representative cases with complete metadata and labels, and used for the deep learning training process;
an analysis reasoner: used for fusion reasoning of the intelligent labels, comprising a rule-based reasoner and a deep-learning-based reasoner; the rule-based reasoner performs reasoning by jointly using the intelligent recognition results and the internal knowledge base; the deep-learning-based reasoner trains a deep learning model using the internal case library and then performs reasoning with the trained model;
wherein the rule-based reasoner performs reasoning through the following steps:
S4.1.1: taking the basic labels extracted by the intelligent recognition executor and the initial related metadata of the program as entities, respectively;
S4.1.2: obtaining the position of each entity in the internal knowledge base by methods such as entity disambiguation and coreference resolution;
S4.1.3: extracting, with each entity as a central node, its adjacent nodes and relations as a subgraph;
S4.1.4: inputting the subgraph data of each entity into the trained GCN, performing the computation with distributed graph computing technology, and inferring the program field;
S4.1.5: taking the basic labels and the inferred program field as a first candidate new label set.
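The subgraph extraction of step S4.1.3 can be illustrated with a short Python sketch. It assumes the internal knowledge base is available as a list of (head, relation, tail) triples; the `Triple` type, the function names and the toy entities are hypothetical and are not taken from the patent.

```python
from collections import namedtuple

# Hypothetical triple representation of the internal knowledge graph.
Triple = namedtuple("Triple", ["head", "relation", "tail"])

# Toy knowledge graph; every entity and relation here is invented for illustration.
KG = [
    Triple("Player X", "plays_for", "Team Y"),
    Triple("Team Y", "competes_in", "League Z"),
    Triple("Player X", "born_in", "City W"),
    Triple("League Z", "belongs_to", "sports"),
]


def extract_subgraph(triples, center):
    """Return the one-hop neighbourhood of `center`: adjacent nodes plus connecting relations."""
    edges = [t for t in triples if t.head == center or t.tail == center]
    nodes = {center}
    for t in edges:
        nodes.update((t.head, t.tail))
    return {"center": center, "nodes": sorted(nodes), "edges": edges}


def subgraphs_for_entities(triples, entities):
    """Build one subgraph per entity that was located in the knowledge graph."""
    known = {t.head for t in triples} | {t.tail for t in triples}
    return {e: extract_subgraph(triples, e) for e in entities if e in known}
```

Entities that cannot be located in the graph are simply skipped in this sketch; how the actual system handles unlinked entities is not stated in the claims.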
2. The fusion reasoning system for news program intelligent tags as claimed in claim 1, wherein the intelligent recognition executor recognizes the various news program materials, using a face recognition executor, an OCR executor, a speech recognition executor and an NLP executor in the recognition process.
3. A fusion reasoning method for news program intelligent tags, characterized by comprising the following steps:
S1, constructing a news knowledge base: constructing the news knowledge base using the historical label library, a historical material library, the Internet and other knowledge bases;
S2, constructing an internal case library: extracting text materials with field labels and pictures with scene labels from the historical label library to form the internal case library, and digitizing the case texts in the internal case library;
S3, training a deep learning model: training a deep neural network with the digitized cases, wherein the training of the deep neural network comprises a text classification training process and a scene recognition training process;
S4, performing fusion reasoning: according to the type of the input program, performing content recognition with the intelligent recognition executor and extracting basic labels; performing rule-based reasoning on the basic labels with the internal knowledge base to determine the program category, thereby obtaining a first candidate new label set; performing text classification or scene recognition with the trained deep neural network, thereby obtaining a second candidate new label set; and outputting the final program labels after a user selects from and corrects the first and second candidate new label sets;
wherein the rule-based reasoner performs reasoning through the following steps:
S4.1.1: taking the basic labels extracted by the intelligent recognition executor and the initial related metadata of the program as entities, respectively;
S4.1.2: obtaining the position of each entity in the internal knowledge base by methods such as entity disambiguation and coreference resolution;
S4.1.3: extracting, with each entity as a central node, its adjacent nodes and relations as a subgraph;
S4.1.4: inputting the subgraph data of each entity into the trained GCN, performing the computation with distributed graph computing technology, and inferring the program field;
S4.1.5: taking the basic labels and the inferred program field as a first candidate new label set.
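The claim does not specify the GCN architecture used in step S4.1.4. The following single-machine Python sketch assumes the commonly used graph-convolution propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), followed by mean pooling over nodes and an argmax over candidate fields. The readout, the hand-set weights and all function names are assumptions for illustration; a trained model and the distributed graph computation mentioned in the claim are not shown.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj_norm, features, weights):
    """One graph-convolution layer with ReLU activation."""
    z = matmul(matmul(adj_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


def predict_field(adj, features, weights, fields):
    """Mean-pool node embeddings of the subgraph and pick the highest-scoring field."""
    h = gcn_layer(normalize_adjacency(adj), features, weights)
    pooled = [sum(col) / len(h) for col in zip(*h)]
    return fields[pooled.index(max(pooled))]


# Toy subgraph: node 0 is the central entity, nodes 1 and 2 are its neighbours.
ADJ = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
FEATURES = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # placeholder node features
WEIGHTS = [[1.0, 0.0], [0.0, 1.0]]                # placeholder, untrained weights
FIELDS = ["sports", "finance"]                    # hypothetical program fields
```

In practice the weights would come from training on labelled subgraphs, and a graph learning library would replace the hand-written matrix code.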
4. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein the entities of the news knowledge base in S1 include time, place, event and people, and the construction of the news knowledge base comprises the following steps:
S1.1, constructing a global ontology library: constructing ontology libraries for the respective fields according to the news service categories of entertainment, sports, finance, people's livelihood, fashion, tourism and military affairs, wherein the construction scope comprises concepts, concept hierarchies, attributes, attribute value types, attribute value ranges, relations, relation domain concept sets and relation value ranges;
S1.2, acquiring entities: acquiring entities from the historical label library, and, if the entity information in the historical label library is incomplete, supplementing the entity information from the historical material library and the Internet;
S1.3, entity evaluation: quantifying the reliability of the acquired entities and discarding entities with low reliability;
S1.4, knowledge fusion: linking the entities remaining after entity evaluation to the current knowledge base by entity disambiguation and coreference resolution, and merging related entities from third-party knowledge bases into the current knowledge base;
S1.5, knowledge reasoning: inferring the relations among the entities in the current knowledge base, the attributes of the entities and the hierarchical relations among the ontologies, and adding the manually verified reasoning results to the current knowledge base to obtain the news knowledge base.
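Steps S1.3 and S1.4 can be sketched as a filter followed by a merge. The claim does not say how reliability is quantified or how disambiguation is performed, so the sketch below assumes each candidate entity already carries a numeric `reliability` score and that disambiguation reduces to a lookup in an alias table. The threshold value, field names and function names are all hypothetical.

```python
def evaluate_entities(candidates, threshold=0.6):
    """S1.3 (assumed form): keep only entities whose reliability reaches the threshold."""
    return [c for c in candidates if c["reliability"] >= threshold]


def fuse(knowledge_base, aliases, entities):
    """S1.4 (assumed form): link each entity to its canonical name and merge it into the base."""
    for ent in entities:
        # The alias table stands in for real entity disambiguation / coreference resolution.
        canonical = aliases.get(ent["name"], ent["name"])
        record = knowledge_base.setdefault(
            canonical, {"type": ent["type"], "mentions": set()}
        )
        record["mentions"].add(ent["name"])
    return knowledge_base
```

A usage example: two surface forms of the same place collapse into one record, while a low-reliability candidate is dropped before fusion.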
5. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein the deep-neural-network-based text classification training process in S3 comprises the following steps:
S3.1.1: performing word segmentation and serialization on the text, and removing words that are meaningless for classification, to obtain a word sequence;
S3.1.2: converting the word sequence into a sequence of word index numbers;
S3.1.3: converting each word index number into an n-dimensional word vector;
S3.1.4: assembling the word vectors into a text matrix in word order;
S3.1.5: training the deep neural network with the text matrix.
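Steps S3.1.1 to S3.1.4 amount to a standard text-to-matrix pipeline, sketched below in plain Python. Whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated one), the stop-word list is a tiny placeholder, and the word vectors are random rather than learned; all names are hypothetical.

```python
import random

# Placeholder list of words that carry no meaning for classification.
STOPWORDS = {"the", "a", "of", "in", "and"}


def tokenize(text):
    """S3.1.1: segment the text and drop stop words (whitespace split as a stand-in)."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def build_vocab(token_lists):
    """Assign each distinct word an index number; index 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for tokens in token_lists:
        for w in tokens:
            vocab.setdefault(w, len(vocab))
    return vocab


def to_indices(tokens, vocab):
    """S3.1.2: convert the word sequence into a sequence of index numbers."""
    return [vocab.get(w, 0) for w in tokens]


def make_embeddings(vocab_size, n, seed=0):
    """S3.1.3: one n-dimensional vector per index (random here, learned in practice)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(vocab_size)]


def text_matrix(text, vocab, embeddings):
    """S3.1.4: stack the word vectors in word order to form the text matrix."""
    return [embeddings[i] for i in to_indices(tokenize(text), vocab)]
```

The resulting matrix has one row per retained word and n columns; texts of different lengths would still need padding or truncation before batching, which the claim does not describe.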
6. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein the deep-neural-network-based scene recognition training process in S3 comprises the following steps:
S3.2.1: augmenting the scene recognition image samples through image shearing, rotation and scaling operations;
S3.2.2: resizing the scene recognition images and performing other preprocessing, and then training the deep neural network.
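The sample expansion and resizing of steps S3.2.1 and S3.2.2 can be sketched on images stored as nested lists of pixel values. The sketch reads the translated term "shearing" as cropping, which is an assumption, and uses 90-degree rotation and nearest-neighbour scaling as the simplest instances of the named operations. Function names and parameters are hypothetical.

```python
def crop(img, top, left, height, width):
    """Cut a height x width window out of the image."""
    return [row[left:left + width] for row in img[top:top + height]]


def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def zoom(img, factor):
    """Nearest-neighbour upscaling by an integer factor."""
    return [[px for px in row for _ in range(factor)] for row in img for _ in range(factor)]


def resize(img, height, width):
    """Nearest-neighbour resize to a fixed network input size."""
    src_h, src_w = len(img), len(img[0])
    return [
        [img[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def augment(img, size):
    """Return the original plus cropped, rotated and zoomed variants, all resized to `size`."""
    variants = [
        img,
        crop(img, 0, 0, len(img) - 1, len(img[0]) - 1),
        rotate90(img),
        zoom(img, 2),
    ]
    return [resize(v, size[0], size[1]) for v in variants]
```

An image library would normally provide these operations with interpolation and arbitrary angles; the pure-Python versions only show the data flow from one sample to several training samples of equal size.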
7. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein in S4 the content is recognized with the intelligent recognition executor according to the following strategies:
strategy A: when the input program type is a picture, inputting the picture into the face recognizer to obtain people labels, and inputting the picture into the OCR recognizer and then the recognition result into the NLP executor for entity extraction, to obtain time, place and event labels;
strategy B: when the input program type is speech, inputting the speech into the speech recognizer and then the recognition result into the NLP executor for entity extraction, to obtain time, place and event labels;
strategy C: when the input program type is video, processing the image frames with strategy A and the speech of the video with strategy B;
strategy D: when the input program type is text, inputting the text into the NLP executor for entity extraction, to obtain time, place and event labels.
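Strategies A to D describe a dispatch on the input program type. The sketch below shows that control flow only: the face recognizer, OCR recognizer, speech recognizer and NLP executor are replaced by stub functions that read pre-filled fields from toy inputs, so every function name and data layout here is hypothetical.

```python
def recognize_faces(picture):
    """Stub face recognizer: returns the names stored in the toy picture."""
    return set(picture.get("faces", []))


def run_ocr(picture):
    """Stub OCR recognizer: returns the on-screen text stored in the toy picture."""
    return picture.get("overlay_text", "")


def transcribe(speech):
    """Stub speech recognizer: returns the stored transcript."""
    return speech.get("transcript", "")


def extract_entities(text):
    """Stub NLP executor: picks up 'time=...', 'place=...' and 'event=...' tokens."""
    labels = set()
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key in {"time", "place", "event"}:
            labels.add((key, value))
    return labels


def strategy_a(picture):
    people = {("person", name) for name in recognize_faces(picture)}
    return people | extract_entities(run_ocr(picture))


def strategy_b(speech):
    return extract_entities(transcribe(speech))


def strategy_c(video):
    labels = set()
    for frame in video["frames"]:          # image frames go through strategy A
        labels |= strategy_a(frame)
    return labels | strategy_b(video["audio"])  # the audio track goes through strategy B


def strategy_d(text):
    return extract_entities(text)


STRATEGIES = {"picture": strategy_a, "speech": strategy_b, "video": strategy_c, "text": strategy_d}


def extract_basic_labels(program_type, payload):
    """Select the strategy for the input program type and return the basic labels."""
    if program_type not in STRATEGIES:
        raise ValueError(f"unsupported program type: {program_type}")
    return STRATEGIES[program_type](payload)
```

Keeping the strategies in a lookup table makes the video case a plain composition of the picture and speech cases, which mirrors how strategy C is defined in the claim.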
8. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein the deep-neural-network-based text classification and scene recognition in S4 comprise the following steps:
S4.2.1: inputting the preprocessed text data into the trained deep neural network to obtain field labels;
S4.2.2: inputting the preprocessed scene recognition image data into the trained deep neural network to obtain scene labels;
S4.2.3: taking the obtained field labels and scene labels as a second candidate new label set.
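The way the second candidate set is formed, and how it is later combined with the first candidate set under user review in S4, can be sketched with plain set operations. Representing a label as a (kind, value) pair and modelling the user's selection and correction as a set of rejected labels plus a set of manually added labels are assumptions made for this illustration.

```python
def second_candidate_set(field_labels, scene_labels):
    """S4.2.3: merge the classifier outputs into one candidate set."""
    return {("field", f) for f in field_labels} | {("scene", s) for s in scene_labels}


def final_labels(first, second, rejected=(), added=()):
    """Fuse both candidate sets, then apply the user's rejections and additions."""
    candidates = set(first) | set(second)
    return (candidates - set(rejected)) | set(added)
```

Because the two sets are unioned, a field proposed by both the rule-based reasoner and the neural network appears only once in the list shown to the user.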
9. The fusion reasoning method for news program intelligent tags as claimed in claim 3, wherein, after the final program labels are obtained in S4, the program, the final program labels and the metadata information are stored in the historical label library to update it, and the internal knowledge base and the internal case library are updated at the same time.
CN201811528577.4A 2018-12-13 2018-12-13 Fusion reasoning system and method for news program intelligent tags Active CN109635171B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811528577.4A CN109635171B (en) 2018-12-13 2018-12-13 Fusion reasoning system and method for news program intelligent tags

Publications (2)

Publication Number Publication Date
CN109635171A CN109635171A (en) 2019-04-16
CN109635171B CN109635171B (en) 2022-11-29

Family

ID=66073756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811528577.4A Active CN109635171B (en) 2018-12-13 2018-12-13 Fusion reasoning system and method for news program intelligent tags

Country Status (1)

Country Link
CN (1) CN109635171B (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059201A (en) * 2019-04-19 2019-07-26 杭州联汇科技股份有限公司 A kind of across media program feature extracting method based on deep learning
CN111857551B (en) * 2019-04-29 2023-04-07 杭州海康威视数字技术股份有限公司 Video data aging method and device
CN110245259B (en) * 2019-05-21 2021-09-21 北京百度网讯科技有限公司 Video labeling method and device based on knowledge graph and computer readable medium
CN110188241B (en) * 2019-06-04 2023-07-25 成都索贝数码科技股份有限公司 Intelligent manufacturing system and manufacturing method for events
CN110222779B (en) * 2019-06-11 2023-08-01 腾讯科技(深圳)有限公司 Distributed data processing method and system
CN110784759B (en) * 2019-08-12 2022-08-12 腾讯科技(深圳)有限公司 Bullet screen information processing method and device, electronic equipment and storage medium
CN110704637B (en) * 2019-09-29 2023-05-12 出门问问信息科技有限公司 Method and device for constructing multi-modal knowledge base and computer readable medium
CN111177399B (en) * 2019-12-04 2023-06-16 华瑞新智科技(北京)有限公司 Knowledge graph construction method and device
CN113597618B (en) * 2019-12-20 2024-09-17 京东方科技集团股份有限公司 Inference calculation device, model training device, and inference calculation system
CN111126373A (en) * 2019-12-23 2020-05-08 北京中科神探科技有限公司 Internet short video violation judgment device and method based on cross-modal identification technology
CN110827351B (en) * 2020-01-09 2020-04-14 西南交通大学 Automatic generation method of voice tag of new target for robot audio-visual collaborative learning
CN111274960A (en) * 2020-01-20 2020-06-12 央视国际网络有限公司 Video processing method and device, storage medium and processor
CN113254683B (en) * 2020-02-07 2024-04-16 阿里巴巴集团控股有限公司 Data processing method and device, and tag identification method and device
CN111563382A (en) * 2020-03-18 2020-08-21 大箴(杭州)科技有限公司 Text information acquisition method and device, storage medium and computer equipment
CN111444344B (en) * 2020-03-27 2022-10-25 腾讯科技(深圳)有限公司 Entity classification method, entity classification device, computer equipment and storage medium
CN111813940B (en) * 2020-07-14 2023-01-17 科大讯飞股份有限公司 Text field classification method, device, equipment and storage medium
CN111598239B (en) * 2020-07-27 2020-11-06 江苏联著实业股份有限公司 Method and device for extracting process system of article based on graph neural network
CN112507691A (en) * 2020-12-07 2021-03-16 数地科技(北京)有限公司 Interpretable financial subject matter generating method and device fusing emotion, industrial chain and case logic
CN112699248B (en) * 2020-12-24 2022-09-16 厦门市美亚柏科信息股份有限公司 Knowledge ontology construction method, terminal equipment and storage medium
CN112766506A (en) * 2021-01-19 2021-05-07 澜途集思生态科技集团有限公司 Knowledge base construction method based on architecture
CN112836110B (en) * 2021-02-07 2022-09-16 四川封面传媒有限责任公司 Hotspot information mining method and device, computer equipment and storage medium
CN113515522B (en) * 2021-07-19 2024-05-24 南京信息职业技术学院 Automatic label classification method based on data mining technology
CN113473182B (en) * 2021-09-06 2021-12-07 腾讯科技(深圳)有限公司 Video generation method and device, computer equipment and storage medium
CN114282545A (en) * 2021-09-23 2022-04-05 中国银联股份有限公司 Method and system for generating object abbreviation, storage medium and program product
CN114707005B (en) * 2022-06-02 2022-10-25 浙江建木智能系统有限公司 Knowledge graph construction method and system for ship equipment
CN115309941B (en) * 2022-08-19 2023-03-10 联通沃音乐文化有限公司 AI-based intelligent tag retrieval method and system
CN116226761A (en) * 2022-12-27 2023-06-06 北京关键科技股份有限公司 Training data classification cataloging method and system based on deep neural network
CN116402062B (en) * 2023-06-08 2023-09-15 之江实验室 Text generation method and device based on multi-mode perception data
CN117609432A (en) * 2023-12-21 2024-02-27 中国疾病预防控制中心慢性非传染性疾病预防控制中心 Method for realizing intelligent policy retrieval through label extraction strategy
CN117828082B (en) * 2024-01-03 2024-08-06 文华智典(武汉)科技有限公司 File security identification method and system based on semantic learning

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1711773A (en) * 2002-11-18 2005-12-21 皇家飞利浦电子股份有限公司 Creation of a stereotypical profile via program feature based clustering
CN102141997A (en) * 2010-02-02 2011-08-03 三星电子(中国)研发中心 Intelligent decision support system and intelligent decision method thereof
CN102622451A (en) * 2012-04-16 2012-08-01 上海交通大学 System for automatically generating television program labels
CN106547880A (en) * 2016-10-26 2017-03-29 重庆邮电大学 A kind of various dimensions geographic scenes recognition methodss of fusion geographic area knowledge
CN107193895A (en) * 2017-05-09 2017-09-22 四川师范大学 Extract the new method that language acknowledging model hides knowledge
CN107231570A (en) * 2017-06-13 2017-10-03 中国传媒大学 News data content characteristic obtains system and application system
CN107302726A (en) * 2017-06-30 2017-10-27 环球智达科技(北京)有限公司 The label generating method of programme information
CN107333149A (en) * 2017-06-30 2017-11-07 环球智达科技(北京)有限公司 The aggregation processing method of programme information
CN107392423A (en) * 2017-06-13 2017-11-24 中国传媒大学 Drama evaluation system and evaluation method based on intelligent label
CN107633075A (en) * 2017-09-22 2018-01-26 吉林大学 A kind of multi-source heterogeneous data fusion platform and fusion method
CN107832287A (en) * 2017-09-26 2018-03-23 晶赞广告(上海)有限公司 A kind of label identification method and device, storage medium, terminal
CN107862239A (en) * 2017-09-15 2018-03-30 广州唯品会研究院有限公司 A kind of combination text carries out the method and its device of picture recognition with picture
CN108806668A (en) * 2018-06-08 2018-11-13 国家计算机网络与信息安全管理中心 A kind of audio and video various dimensions mark and model optimization method

Also Published As

Publication number Publication date
CN109635171A (en) 2019-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant