CN109255035B - Method and device for constructing knowledge graph

Info

Publication number: CN109255035B
Application number: CN201811011403.0A
Authority: CN
Inventors: 陈大伟; 刘宝
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2024-03-26
Anticipated expiration: 2038-08-31
Also published as: CN109255035A

Abstract

The embodiments of the present application disclose a method and a device for constructing a knowledge graph. One embodiment of the method comprises: selecting target identification information from a preset target identification information set; and executing the following construction steps: determining the target identification information as a target entity; acquiring attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to an initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; and, in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph. This embodiment helps improve the comprehensiveness and flexibility with which association relationships between pieces of video information are characterized.

Description

Method and device for constructing knowledge graph

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to a method and a device for constructing a knowledge graph.

Background

A knowledge graph is a knowledge base also known as a semantic network, i.e., a knowledge base with a directed graph structure, in which the nodes of the graph represent entities or concepts and the edges represent the various semantic relationships between those entities or concepts. Knowledge graphs can be applied in various fields, such as information search and information recommendation. Using a knowledge graph, the other entities associated with an entity that represents a piece of information can be obtained, so that other information associated with that information can be obtained more accurately.

Disclosure of Invention

The embodiment of the application provides a method and a device for constructing a knowledge graph.

In a first aspect, an embodiment of the present application provides a method for constructing a knowledge graph, the method including: selecting target identification information from a preset target identification information set, wherein the target identification information is used to characterize target video information in a preset video information base, and the target video information includes attribute information used to characterize video attributes; and, based on the selected target identification information, performing the following construction steps: determining the target identification information as a target entity; acquiring the attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to an initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; and, in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph.

In some embodiments, the method further includes: in response to determining that unselected target identification information exists in the target identification information set, reselecting target identification information from the unselected target identification information; and continuing to perform the construction steps using the reselected target identification information and the initial knowledge graph to which the entity and attribute information were most recently added.

In some embodiments, determining attribute information of the target entity based on the acquired attribute information includes: in response to determining that the target identification information characterizes at least two pieces of video information, merging the attribute information included in the at least two acquired pieces of video information into new attribute information; and determining the resulting new attribute information as the attribute information of the target entity.

In some embodiments, determining attribute information of the target entity based on the acquired attribute information includes: in response to determining that the target identification information characterizes one piece of video information, taking the acquired attribute information as the attribute information of the target entity.

In some embodiments, the target identification information set is included in a preset identification information set; and before determining whether unselected target identification information exists in the target identification information set, the construction steps further include: selecting, from the identification information set, identification information having a preset association relationship with the target identification information as associated identification information; and adding to the initial knowledge graph the associated identification information, as an associated entity associated with the target entity, and the attribute information included in the video information characterized by the associated identification information, as attribute information of the associated entity.

In some embodiments, the attribute information included in the video information characterized by the associated identification information matches the attribute information included in the video information characterized by the target identification information.

In some embodiments, the video information characterized by the target identification information is video information labeled as long video, and the video information characterized by the associated identification information is video information labeled as short video.

In a second aspect, an embodiment of the present application provides an apparatus for constructing a knowledge graph, the apparatus including: a first selection unit configured to select target identification information from a preset target identification information set, wherein the target identification information is used to characterize target video information in a preset video information base, and the target video information includes attribute information used to characterize video attributes; and a construction unit configured to perform, based on the selected target identification information, the following construction steps: determining the target identification information as a target entity; acquiring the attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to an initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; and, in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph.

In some embodiments, the apparatus further includes: a second selection unit configured to, in response to determining that unselected target identification information exists in the target identification information set, reselect target identification information from the unselected target identification information, and continue to perform the construction steps using the reselected target identification information and the initial knowledge graph to which the entity and attribute information were most recently added.

In some embodiments, the construction unit includes: a merging module configured to, in response to determining that the target identification information characterizes at least two pieces of video information, merge the attribute information included in the at least two acquired pieces of video information into new attribute information; and a first determining module configured to determine the resulting new attribute information as the attribute information of the target entity.

In some embodiments, the construction unit includes: a second determining module configured to, in response to determining that the target identification information characterizes one piece of video information, take the acquired attribute information as the attribute information of the target entity.

In some embodiments, the target identification information set is included in a preset identification information set; and the construction unit includes: a selection module configured to select, from the identification information set, identification information having a preset association relationship with the target identification information as associated identification information; and an adding module configured to add to the initial knowledge graph the associated identification information, as an associated entity associated with the target entity, and the attribute information included in the video information characterized by the associated identification information, as attribute information of the associated entity.

In a third aspect, embodiments of the present application provide a server, including: one or more processors; a storage device having one or more programs stored thereon; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.

In a fourth aspect, embodiments of the present application provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the implementations of the first aspect.

The method and device for constructing a knowledge graph provided by the embodiments of the present application select target identification information from a preset target identification information set and then perform the following construction steps: determining the target identification information as a target entity; acquiring the attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to an initial knowledge graph; and, in response to determining that no unselected target identification information exists in the target identification information set, determining the initial knowledge graph as the final knowledge graph. In this way, a knowledge graph that includes entities characterizing video information can be constructed, and the constructed knowledge graph can be used to improve the comprehensiveness and flexibility with which association relationships between pieces of video information are characterized.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:

FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present application may be applied;

FIG. 2 is a flow chart of one embodiment of a method for constructing a knowledge-graph, in accordance with an embodiment of the present application;

FIG. 3 is a schematic diagram of an application scenario of a method for constructing a knowledge-graph, in accordance with an embodiment of the present application;

FIG. 4 is a flow chart of yet another embodiment of a method for constructing a knowledge-graph, in accordance with an embodiment of the present application;

FIG. 5 is a schematic structural view of one embodiment of an apparatus for constructing a knowledge-graph, in accordance with an embodiment of the present application;

FIG. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present application.

Detailed Description

The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.

It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 illustrates an exemplary system architecture 100 to which the methods for constructing a knowledge-graph or the apparatuses for constructing a knowledge-graph of the embodiments of the present application may be applied.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various applications, such as a video play class application, a web browser application, a search class application, an instant messaging tool, social platform software, etc., may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services) or as a single piece of software or software module. No specific limitation is imposed here.

The server 105 may be a server that provides various services, such as a background information processing server that processes video information uploaded by the terminal devices 101, 102, 103. The background information processing server may analyze the identification information and attribute information of the video information, etc., and obtain a processing result (for example, construct a knowledge graph including an entity representing the video information).

It should be noted that, the method for building a knowledge graph provided in the embodiments of the present application is generally performed by the server 105, and accordingly, the device for building a knowledge graph is generally disposed in the server 105.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules for providing distributed services), or as a single piece of software or software module. No specific limitation is imposed here.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. In the case where the identification information and the attribute information of the video information do not need to be acquired from a remote place, the above-described system architecture may not include the terminal device.

With continued reference to fig. 2, a flow 200 of one embodiment of a method for constructing a knowledge graph according to the present application is shown. The method for constructing a knowledge graph comprises the following steps:

Step 201, selecting target identification information from a preset target identification information set.

In this embodiment, the execution subject of the method for constructing a knowledge graph (e.g., the server shown in fig. 1) may select target identification information from a preset target identification information set in various manners (e.g., randomly, or according to a preset arrangement order). The target identification information set may be stored in advance in the execution subject, or in another electronic device communicatively connected to the execution subject. In this embodiment, the target identification information is used to characterize target video information in a preset video information base, and the target video information includes attribute information for characterizing video attributes. The form of the target identification information may include, but is not limited to, at least one of the following: numbers, letters, symbols, pictures, etc. The target video information may be information used to characterize the target video, including but not limited to at least one of the following: the name of the target video file, the author, the category, description information (e.g., profile information, rating information, etc.), the presentation time, etc. The target video may be a video specified by a technician, or a video extracted by the execution subject from a preset video set.
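As a rough illustration of the two selection manners mentioned above (random selection and selection in a preset order), the following Python sketch picks one not-yet-selected piece of identification information. The function name `select_target_id`, the `mode` parameter, and the use of strings as identification information are assumptions made for illustration only; the application does not prescribe them.

```python
import random
from typing import Optional, Sequence, Set


def select_target_id(
    target_ids: Sequence[str],
    already_selected: Set[str],
    mode: str = "ordered",
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick one unselected ID, or return None if every ID has been selected.

    All names here are hypothetical; the source only says that selection may
    be random or follow a preset arrangement order.
    """
    remaining = [t for t in target_ids if t not in already_selected]
    if not remaining:
        return None
    if mode == "ordered":
        # The "preset arrangement order" is taken to be the order of target_ids.
        return remaining[0]
    if mode == "random":
        chooser = rng if rng is not None else random
        return chooser.choice(remaining)
    raise ValueError(f"unknown selection mode: {mode!r}")


first = select_target_id(["id_a", "id_b", "id_c"], already_selected=set())
print(first)  # id_a
```

Returning `None` when nothing is left gives the caller a simple way to detect that no unselected target identification information remains.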

It should be appreciated that the video characterized by the target video information may be various types of video, such as movies, television shows, small videos uploaded by users, and the like. The target video information may or may not include the video that the target video information characterizes (e.g., only the title of the video that the video information characterizes and the user's comments on the video).

The attribute information may be information related to the target video, and may include, but is not limited to, at least one of: information about a person (e.g., video producer, actor, director, etc.) associated with the target video represented by the target video information, information about a time (e.g., presentation time, shooting time, etc.) associated with the target video represented by the target video information, information about a description (e.g., introduction, snapshot, poster picture, etc.) associated with the target video represented by the target video information, and so forth.

The video information base may be provided in the execution subject in advance, or in another electronic device communicatively connected to the execution subject. It should be appreciated that there may be a single video information base, for example one that includes the video information characterizing the videos offered by a video website; there may also be several video information bases, for example with each video information base including the video information provided by one video website.
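The following sketch shows one way the video information characterized by a given piece of identification information might be collected from one or several video information bases. It assumes, purely for illustration, that each base is an in-memory dictionary keyed by identification information and that the same identification information may appear in more than one base; the names `VideoInfoBase` and `find_video_info` are hypothetical.

```python
from typing import Dict, List

VideoInfo = Dict[str, str]             # attribute name -> attribute value
VideoInfoBase = Dict[str, VideoInfo]   # identification information -> video information


def find_video_info(target_id: str, bases: List[VideoInfoBase]) -> List[VideoInfo]:
    """Collect every piece of video information that target_id characterizes.

    Works for a single base (a one-element list) or for several bases,
    e.g. one per video website.
    """
    return [base[target_id] for base in bases if target_id in base]


# Hypothetical data: two websites that both carry the movie "abc123".
site_one = {"abc123": {"name": "XXX"}}
site_two = {"abc123": {"introduction": "An introduction to XXX"}, "def456": {"name": "YYY"}}

found = find_video_info("abc123", [site_one, site_two])
print(len(found))  # 2
```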

Step 202, based on the selected target identification information, performing the following construction steps: determining target identification information as a target entity; acquiring attribute information included in video information characterized by target identification information; determining attribute information of a target entity based on the acquired attribute information; adding the attribute information of the target entity and the target entity into the initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; in response to determining that no exists, determining the initial knowledge-graph as the final knowledge-graph.

In this embodiment, based on the target identification information selected in step 201, the execution subject may execute the following construction steps:

Step 2021, determining the target identification information as the target entity.

An entity is an element of a knowledge graph that characterizes certain information (e.g., information characterizing a person, place, time, thing, etc.). In this embodiment, the entity may be identification information characterizing the video information.

Step 2022, acquiring attribute information included in the video information characterized by the target identification information.

Specifically, the execution subject may acquire attribute information included in the video information characterized by the target identification information from a remote location or from a local location.

Step 2023, determining attribute information of the target entity based on the acquired attribute information.

Specifically, the execution subject may determine the attribute information of the target entity in various ways. For example, target attribute information (e.g., text information) may be extracted from the acquired attribute information as attribute information of the target entity.

In some optional implementations of this embodiment, the execution subject may determine the attribute information of the target entity based on the acquired attribute information according to the following steps:

First, in response to determining that the target identification information characterizes at least two pieces of video information, the attribute information included in the acquired pieces of video information is merged into new attribute information. In particular, the at least two pieces of video information characterized by the target identification information may each characterize the same or similar videos.

For example, suppose the target identification information characterizes two pieces of video information, A and B, where A represents the Chinese-dubbed version of the movie XXX and B represents the English-dubbed version of the movie XXX. The attribute information of A includes the Chinese name, the Chinese introduction, and the link address of the Chinese-dubbed video of the movie XXX; the attribute information of B includes the English name, the English introduction, and the link address of the English-dubbed video of the movie XXX. The merged attribute information may then include the Chinese name, Chinese introduction, link address of the Chinese-dubbed video, English name, English introduction, and link address of the English-dubbed video of the movie XXX.

As another example, if the attribute information of the two pieces of video information A and B has identical content (e.g., both include the same profile information), only one copy of the identical content (e.g., one of the two identical pieces of profile information) may be retained.

Then, the resulting new attribute information is determined as the attribute information of the target entity. Specifically, in the knowledge graph, an entity may correspond to attribute information, and the attribute information may be used to describe the corresponding entity. In general, the correspondence between an entity and its attribute information may be represented in the knowledge graph by a triplet data structure, that is, "entity-attribute-attribute value", where the attribute information of the entity may include the attribute and the attribute value. For example, a triplet may be "abc123-name-XXX", where "abc123" is the entity used to characterize the movie XXX, "name" is the attribute, and "XXX" is the attribute value. With this implementation, the same or similar videos can correspond to one entity included in the knowledge graph, so that the association relationships among different videos can be reflected in the form of a knowledge graph.
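The triplet form described above can be sketched in Python as a plain tuple of entity, attribute, and attribute value. The helper name `to_triples` and the choice of a dictionary for the attribute information are assumptions for illustration; the entity "abc123", the attribute "name", and the value "XXX" come from the example in the text.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (entity, attribute, attribute value)


def to_triples(entity: str, attributes: Dict[str, str]) -> List[Triple]:
    """Expand an entity's attribute information into one triple per attribute."""
    return [(entity, attribute, value) for attribute, value in attributes.items()]


triples = to_triples("abc123", {"name": "XXX"})
print(triples)  # [('abc123', 'name', 'XXX')]
```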

In response to determining that the target identification information characterizes one piece of video information, the acquired attribute information is taken as the attribute information of the target entity. In this case, the target entity characterizes a single piece of video information.
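The two cases above (merging when the identification information characterizes at least two pieces of video information, and passing the attribute information through unchanged when it characterizes only one) can be sketched as follows. This is a minimal sketch that assumes attribute information is a list of (attribute, attribute value) pairs and that "identical content" means an identical pair; the function name, the attribute names, and the example values (including the example.com links) are hypothetical.

```python
from typing import List, Sequence, Tuple

Attribute = Tuple[str, str]  # (attribute, attribute value)


def determine_entity_attributes(infos: Sequence[Sequence[Attribute]]) -> List[Attribute]:
    """Derive the target entity's attribute information from the acquired information.

    One piece of video information: its attributes are used as they are.
    Two or more: attributes are merged in order, keeping one copy of identical pairs.
    """
    if len(infos) == 1:
        return list(infos[0])
    merged: List[Attribute] = []
    for info in infos:
        for pair in info:
            if pair not in merged:  # retain only one copy of identical content
                merged.append(pair)
    return merged


# Hypothetical attribute information for the Chinese- and English-dubbed versions.
info_a = [("name_zh", "XXX (Chinese name)"), ("intro_zh", "Chinese introduction"),
          ("link_zh", "https://example.com/xxx-zh"), ("profile", "shared profile text")]
info_b = [("name_en", "XXX (English name)"), ("intro_en", "English introduction"),
          ("link_en", "https://example.com/xxx-en"), ("profile", "shared profile text")]

merged = determine_entity_attributes([info_a, info_b])
print(len(merged))  # 7: six language-specific attributes plus one shared profile
```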

Step 2024, adding the attribute information of the target entity and the target entity to the initial knowledge-graph.

Specifically, the initial knowledge graph may be a knowledge graph that is pre-established and does not include any information, or may be a knowledge graph that is pre-established and includes initial entity and initial attribute information. Moreover, the initial entity may characterize the video information, or may characterize other types of information, without limitation.

Step 2025, determining whether unselected target identification information exists in the target identification information set.

Step 2026, in response to determining that no unselected target identification information exists in the target identification information set, determining the initial knowledge graph as the final knowledge graph.

In some optional implementations of this embodiment, in response to determining that unselected target identification information exists in the target identification information set, the execution subject may first reselect target identification information from the unselected target identification information, and then continue to perform the construction steps using the reselected target identification information and the initial knowledge graph to which the entity and attribute information were most recently added. As an example, assume that the target identification information set includes target identification information A and target identification information B, that the target identification information used in the first execution of the construction steps is A, and that the initial knowledge graph is C. After the first execution of the construction steps, the execution subject has added the entity and attribute information to knowledge graph C, so that initial knowledge graph C becomes initial knowledge graph C'. The execution subject then continues to perform the construction steps using target identification information B and initial knowledge graph C'.
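Putting steps 2021 to 2026 and the reselection described above together, the overall loop might look like the sketch below. It is an illustration under several assumptions that are not in the source: the knowledge graph is modeled as a dictionary from entity to a list of (attribute, attribute value) pairs, selection follows the order of the input, and `fetch_infos` is a hypothetical callback that returns the attribute information of every piece of video information characterized by an ID. The source checks for unselected IDs after each addition; the `while` loop here is equivalent for a non-empty set and simply returns the initial graph for an empty one.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Attribute = Tuple[str, str]                  # (attribute, attribute value)
KnowledgeGraph = Dict[str, List[Attribute]]  # entity -> attribute information


def build_knowledge_graph(
    target_ids: Sequence[str],
    fetch_infos: Callable[[str], List[List[Attribute]]],
    initial_graph: Optional[KnowledgeGraph] = None,
) -> KnowledgeGraph:
    # Work on a copy so the caller's initial graph is left untouched; `graph`
    # plays the role of C, C', ... as entities are added.
    graph: KnowledgeGraph = {e: list(a) for e, a in (initial_graph or {}).items()}
    unselected = list(dict.fromkeys(target_ids))  # drop duplicate IDs, keep order

    while unselected:                        # step 2025: any unselected ID left?
        target_id = unselected.pop(0)        # step 201, or reselection
        entity = target_id                   # step 2021: the ID becomes the entity
        infos = fetch_infos(target_id)       # step 2022: acquire attribute information
        attributes: List[Attribute] = []     # step 2023: merge, one copy of duplicates
        for info in infos:
            for pair in info:
                if pair not in attributes:
                    attributes.append(pair)
        existing = graph.setdefault(entity, [])  # step 2024: add entity and attributes
        existing.extend([p for p in attributes if p not in existing])
    return graph                             # step 2026: final knowledge graph


# Hypothetical data mirroring the A / B example above.
video_db = {"A": [[("name", "Video A")]], "B": [[("name", "Video B")]]}
graph_c: KnowledgeGraph = {}
final_graph = build_knowledge_graph(["A", "B"], lambda t: video_db[t], graph_c)
print(final_graph)  # {'A': [('name', 'Video A')], 'B': [('name', 'Video B')]}
```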

By repeatedly performing the construction steps, the association relationships among the pieces of video information characterized by the respective target identification information can be represented by the knowledge graph. The constructed knowledge graph can be applied in fields such as video recommendation and video search, where it can improve the accuracy and comprehensiveness of the recommendations and search results. Using the knowledge graph can also facilitate interoperation between different technical fields. For example, a video review website and a video playback website that both use the knowledge graph can share each other's video information, making video search results and video recommendation information more comprehensive.

With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for constructing a knowledge graph according to this embodiment. In the application scenario of fig. 3, a target identification information set 302 is stored in advance in the server 301, and the target identification information set 302 includes target identification information A, B, and C. The server 301 generates entity A and attribute information D of entity A based on target identification information A and the corresponding attribute information D, and adds entity A and attribute information D to the initial knowledge graph 303; it then generates entity B and attribute information E of entity B based on target identification information B and the corresponding attribute information E, and adds entity B and attribute information E to the initial knowledge graph 303; finally, it generates entity C and attribute information F of entity C based on target identification information C and the corresponding attribute information F, and adds entity C and attribute information F to the initial knowledge graph 303, thereby generating the final knowledge graph 304.

The method provided in the above embodiment of the present application selects target identification information from a preset target identification information set and then performs the following construction steps: determining the target identification information as a target entity; acquiring the attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to an initial knowledge graph; and, in response to determining that no unselected target identification information exists in the target identification information set, determining the initial knowledge graph as the final knowledge graph. In this way, a knowledge graph that includes entities characterizing video information can be constructed, and the constructed knowledge graph can be used to improve the comprehensiveness and flexibility with which association relationships between pieces of video information are characterized.

With further reference to fig. 4, a flow 400 of yet another embodiment of a method for constructing a knowledge graph is shown. The flow 400 of the method for constructing a knowledge graph includes the following steps:

Step 401, selecting target identification information from a preset target identification information set.

In this embodiment, step 401 is substantially identical to step 201 in the corresponding embodiment of fig. 2, and will not be described herein.

In this embodiment, based on the selected target identification information, an execution subject (e.g., a server shown in fig. 1) of the method for constructing a knowledge-graph may perform the constructing step (i.e., steps 402-409).

Step 402, determining the target identification information as a target entity.

In this embodiment, step 402 is substantially identical to step 2021 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 403, obtaining attribute information included in the video information characterized by the target identification information.

In this embodiment, step 403 is substantially identical to step 2022 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 404, determining attribute information of the target entity based on the acquired attribute information.

In this embodiment, step 404 is substantially identical to step 2023 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 405, adding the target entity and the attribute information of the target entity to the initial knowledge graph.

In this embodiment, step 405 is substantially identical to step 2024 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 406, selecting the identification information with the preset association relation with the target identification information from the identification information set as the association identification information.

In this embodiment, the executing body may select, from the set of identification information, identification information having a preset association relationship with the target identification information as the association identification information. The identification information set may be stored in the execution body in advance, or may be stored in another electronic device communicatively connected to the execution body in advance. The identification information in the identification information set is used for representing video information in a preset video information base, and the video information comprises attribute information for representing video attributes. The set of target identification information may be included in the set of identification information. The target identification information may be identification information of the identification information set, where the characterized video has certain preset characteristics (for example, the playing duration of the characterized video is within a preset range, and the source of the characterized video is a preset source). The execution subject may extract the identification information from the identification information set as target identification information according to the specification of the technician; or at least one identification information is extracted from the above identification information set as a target identification information set based on some characteristic (such as play time length) of the video characterized by the identification information. For example, from the identification information set, the identification information of which the playing time length of the characterized video is in the preset time length range is extracted as the target identification information.

In this embodiment, the association relationship between the identification information and the target identification information may be characterized by a preset correspondence table. In the correspondence table, target identification information and corresponding association identification information may be stored, and the execution subject may search the correspondence table for the target identification information and the corresponding association identification information. It should be noted that, the method for characterizing the association relationship between the identification information and the target identification information is not limited to the method of the correspondence table, and other methods (e.g., through a data structure such as a linked list) may be used to characterize the association relationship between the identification information.
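The correspondence-table lookup described above can be sketched as follows. This is a minimal illustration only: the table contents, the identifiers, and the function name `select_associated_ids` are hypothetical and do not come from the application.

```python
from typing import Dict, List

# Hypothetical correspondence table: each target identification information
# maps to the identification information preset as associated with it.
CORRESPONDENCE_TABLE: Dict[str, List[str]] = {
    "abc123": ["abc456", "abc789"],
    "def001": [],
}


def select_associated_ids(target_id: str, table: Dict[str, List[str]]) -> List[str]:
    """Return the identifiers recorded as associated with target_id.

    An identifier with no entry in the table has no associated identifiers,
    so an empty list is returned in that case.
    """
    return list(table.get(target_id, []))
```

A dictionary is used here only because it makes the lookup direct; as the text notes, a linked list or another data structure could characterize the same relationship.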

In some optional implementations of this embodiment, the video information characterized by the target identification information may be video information characterizing a long video, and the video information characterized by the associated identification information may be video information characterizing a short video. Specifically, a long video may be a video whose playing duration is greater than or equal to a preset duration threshold; alternatively, a long video may be a video that includes a number of image frames greater than or equal to a preset number threshold. Accordingly, a short video may be a video whose playing duration is less than the preset duration threshold; alternatively, a short video may be a video that includes fewer image frames than the preset number threshold. It should be appreciated that the short video indicated by the associated identification information may be a video clip taken from the long video indicated by the target identification information; alternatively, it may be a video created by a short-video creator based on that long video (for example, a video commenting on the long video, or a video obtained by editing the long video).
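As an illustration of the threshold-based distinction between long and short videos, a minimal sketch is given below. The `VideoInfo` type, its field names, and the 600-second threshold are assumptions chosen for the example; the application does not specify any particular value.

```python
from dataclasses import dataclass


@dataclass
class VideoInfo:
    """Hypothetical record of the video facts needed for classification."""
    video_id: str
    duration_seconds: float


# Assumed preset duration threshold; the application leaves the value open.
DURATION_THRESHOLD_SECONDS = 600.0


def is_long_video(video: VideoInfo,
                  threshold: float = DURATION_THRESHOLD_SECONDS) -> bool:
    """A video is long if its playing duration is at or above the threshold."""
    return video.duration_seconds >= threshold


def is_short_video(video: VideoInfo,
                   threshold: float = DURATION_THRESHOLD_SECONDS) -> bool:
    """A video is short if its playing duration is below the threshold."""
    return video.duration_seconds < threshold
```

The frame-count variant described in the text would follow the same pattern, comparing a frame count against a preset number threshold instead of a duration.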

In some optional implementations of this embodiment, the attribute information included in the video information characterized by the associated identification information matches the attribute information included in the video information characterized by the target identification information. Specifically, the executing body or other electronic devices may determine, in advance, according to various methods, whether attribute information included in the video information represented by the association identification information matches attribute information included in the video information represented by the target identification information. For example, the attribute information may include text information (such as names of respective actors, descriptions of video contents, etc.), the executing body may calculate the similarity of the text information by using an existing method for calculating the similarity of the text information, and if the calculated similarity is greater than or equal to a preset similarity threshold, it is determined that the two attribute information are matched. For another example, the attribute information included in the video information may include an image, which may be an image taken from the video represented by the video information, or may be a poster image of the video, or the like. The executing body or other electronic devices may determine the similarity of the images according to the existing algorithm (such as a histogram distance algorithm, an average hash algorithm, a perceptual hash algorithm, etc.) for determining the similarity between the images, and if the determined similarity is greater than or equal to a preset similarity threshold, determine that the two attribute information are matched. 
Further, the execution subject or other electronic device may determine that the identification information of the video information to which the attribute information that matches each other belongs has an association relationship.
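The text-based matching described above can be sketched with Python's standard `difflib.SequenceMatcher`. This is only one possible similarity measure, substituted here for illustration; the application refers generally to existing text-similarity methods. The threshold value of 0.8 and the function names are assumptions.

```python
from difflib import SequenceMatcher

# Assumed preset similarity threshold; the application leaves the value open.
SIMILARITY_THRESHOLD = 0.8


def text_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two text attributes."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def attributes_match(target_text: str, candidate_text: str,
                     threshold: float = SIMILARITY_THRESHOLD) -> bool:
    """Two attributes match when their similarity reaches the threshold."""
    return text_similarity(target_text, candidate_text) >= threshold
```

For image attributes, the same threshold comparison would apply to a score produced by an image-similarity algorithm such as those named in the text.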

Step 407, taking the association identification information as an association entity associated with the target entity, and taking attribute information included in the video information characterized by the association identification information as attribute information of the association entity, and adding the attribute information into the initial knowledge graph.

In this embodiment, the executing body may add the associated identification information to the initial knowledge graph as an associated entity associated with the target entity, and may add the attribute information included in the video information characterized by the associated identification information as the attribute information of the associated entity. In the knowledge graph, the association relationship between the target entity and the associated entity can be characterized by a data structure in the form of a triplet, namely "target entity-relationship-associated entity". As an example, assuming that the target entity characterizes a certain movie "XXX" and the associated entity characterizes a fragment of the movie "XXX", the triplet characterizing the association between the two may be "abc123-fragment-abc456", where "abc123" is the entity characterizing the movie "XXX", "abc456" is the entity characterizing the fragment of the movie "XXX", and "fragment" indicates that the relationship between the two is that of a complete video and a fragment taken from it.
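The triplet form can be illustrated with a small in-memory structure. The `Triple` and `KnowledgeGraph` classes below are hypothetical helpers written for this example, and the attribute keys are invented; only the identifiers "abc123" and "abc456" and the relation "fragment" come from the example in the text.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A (target entity, relationship, associated entity) triplet."""
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Minimal illustrative graph: entities with attributes, plus triplets."""

    def __init__(self) -> None:
        self.entities: Dict[str, Dict[str, object]] = {}
        self.triples: List[Triple] = []

    def add_entity(self, entity_id: str, attributes: Dict[str, object]) -> None:
        # Create the entity if needed, then record or update its attributes.
        self.entities.setdefault(entity_id, {}).update(attributes)

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        # Store each triplet once, even if it is added repeatedly.
        triple = Triple(head, relation, tail)
        if triple not in self.triples:
            self.triples.append(triple)


# Example from the text: movie "XXX" (abc123) and a fragment of it (abc456).
graph = KnowledgeGraph()
graph.add_entity("abc123", {"title": "XXX"})
graph.add_entity("abc456", {"title": "fragment of XXX"})
graph.add_relation("abc123", "fragment", "abc456")
```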

Step 408, determining whether unselected target identification information exists in the target identification information set.

In this embodiment, step 408 is substantially identical to step 2025 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 409, in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph.

In this embodiment, step 409 is substantially identical to step 2026 in the corresponding embodiment of fig. 2, and will not be described herein.

Step 410, in response to determining that unselected target identification information exists, reselecting target identification information from the unselected target identification information, and continuing to perform steps 402-409 using the reselected target identification information and the initial knowledge graph to which entities and attribute information were most recently added.

In this embodiment, in response to determining that unselected target identification information exists in the target identification information set, the executing body may first reselect target identification information from the unselected target identification information, and then continue to execute the construction steps using the reselected target identification information and the initial knowledge graph to which entities and attribute information were most recently added. As an example, assume that the target identification information set includes target identification information A and target identification information B, that the target identification information used in the first execution of the construction steps is A, and that the initial knowledge graph is C. After the first execution of the construction steps, the executing body has added entities and attribute information to knowledge graph C, so the initial knowledge graph C becomes an initial knowledge graph C'. The executing body then uses target identification information B and the initial knowledge graph C' to continue executing the construction steps (i.e., steps 402-409).
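Flow 400 as a whole can be sketched as a loop over the target identification information set. The sketch below is a simplified illustration under several assumptions: identifiers are selected sequentially, each identifier characterizes exactly one piece of video information, the graph is a plain dictionary, and the relation label "associated" is a placeholder. The names `build_knowledge_graph`, `video_store`, and `association_table` are hypothetical.

```python
from typing import Dict, List, Optional


def build_knowledge_graph(target_ids: List[str],
                          video_store: Dict[str, Dict[str, str]],
                          association_table: Dict[str, List[str]],
                          initial_graph: Optional[dict] = None) -> dict:
    """Repeat the construction steps until no target identifier is unselected."""
    graph = initial_graph if initial_graph is not None else {
        "entities": {}, "triples": []}
    unselected = list(target_ids)
    while unselected:                            # steps 408 and 410
        target_id = unselected.pop(0)            # steps 401/410: select next
        attributes = dict(video_store[target_id])    # steps 403 and 404
        graph["entities"][target_id] = attributes    # steps 402 and 405
        for assoc_id in association_table.get(target_id, []):  # step 406
            # Step 407: add the associated entity, its attributes, and a triplet.
            graph["entities"][assoc_id] = dict(video_store[assoc_id])
            graph["triples"].append((target_id, "associated", assoc_id))
    return graph                                 # step 409: final graph


# Example mirroring the text: targets A and B, where A has one associated clip.
video_store = {
    "A": {"title": "Film A"},
    "B": {"title": "Film B"},
    "A-clip": {"title": "Clip of Film A"},
}
association_table = {"A": ["A-clip"]}
final_graph = build_knowledge_graph(["A", "B"], video_store, association_table)
```

After the first pass the graph holds A and its clip (the graph C' of the example); the second pass adds B to that same graph, which is then returned as the final knowledge graph.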

As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for constructing a knowledge graph in this embodiment highlights the steps of selecting the associated identification information corresponding to the target identification information, and adding the target entity and the associated entity to the knowledge graph. Therefore, the scheme described in the embodiment can further improve the comprehensiveness of the video information represented by the knowledge graph, and is beneficial to improving the comprehensiveness and accuracy of video searching and recommendation through the knowledge graph.

With further reference to fig. 5, as an implementation of the method shown in the foregoing drawings, the present application provides an embodiment of an apparatus for constructing a knowledge-graph, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.

As shown in fig. 5, the apparatus 500 for constructing a knowledge graph according to the present embodiment includes: a first selection unit 501 configured to select target identification information from a preset set of target identification information, where the target identification information is used to characterize target video information in a preset video information base, and the target video information includes attribute information used to characterize a video attribute; a construction unit 502 configured to perform, based on the selected target identification information, the following construction steps: determining target identification information as a target entity; acquiring attribute information included in video information characterized by target identification information; determining attribute information of a target entity based on the acquired attribute information; adding the attribute information of the target entity and the target entity into the initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; in response to determining that no exists, determining the initial knowledge-graph as the final knowledge-graph.

In this embodiment, the first selection unit 501 may select the target identification information from a preset target identification information set in various manners (for example, a random selection manner, a sequential selection manner, or the like). The target identification information set may be stored in the apparatus 500 in advance, or may be stored in another electronic device communicatively connected to the apparatus 500 in advance. In this embodiment, the target identification information is used to characterize target video information in a preset video information base, and the target video information includes attribute information for characterizing a video attribute. Wherein, the form of the identification information can include, but is not limited to, at least one of the following: numbers, letters, symbols, pictures, etc. The target video information may be information used to characterize the target video, including but not limited to at least one of the following: the name of the target video file, the author, the category, description information (e.g., profile information, rating information, etc.), the time of the presentation, etc. The target video may be a video specified by a technician or extracted from a preset video set by the apparatus 500.

The attribute information may be information related to the target video information, and may include, but is not limited to, at least one of: information about a person (e.g., video producer, actor, director, etc.) associated with the target video represented by the target video information, information about a time (e.g., presentation time, shooting time, etc.) associated with the target video represented by the target video information, information about a description (e.g., introduction, snapshot, poster picture, etc.) associated with the target video represented by the target video information, and so forth.

The video information base may be provided in the apparatus 500 in advance, or may be provided in another electronic device communicatively connected to the apparatus 500. It should be appreciated that there may be a single video information base, for example, one that includes video information characterizing the videos offered by a certain video website; there may also be a plurality of video information bases, for example, each including the video information provided by one video website.

In the present embodiment, the above-described construction unit 502 may perform the construction step based on the target identification information selected by the first selection unit 501. The construction steps may refer to steps 2021 to 2026 in the corresponding embodiment of fig. 2, which are not described herein.

In some optional implementations of the present embodiment, the apparatus 500 may further include a second selection unit 503 (not shown in the figure) configured to: in response to determining that unselected target identification information exists in the target identification information set, reselect target identification information from the unselected target identification information; and continue to execute the construction steps using the reselected target identification information and the initial knowledge graph to which entities and attribute information were most recently added. As an example, assume that the target identification information set includes target identification information A and target identification information B, that the target identification information used in the first execution of the construction steps is A, and that the initial knowledge graph is C. After the first execution of the construction steps, the apparatus 500 has added entities and attribute information to knowledge graph C, so the initial knowledge graph C becomes an initial knowledge graph C'. The apparatus 500 then uses target identification information B and the initial knowledge graph C' to continue executing the construction steps.

In some alternative implementations of the present embodiment, the constructing unit 502 may include: a merging module (not shown in the figure) configured to merge attribute information included in the acquired two video information into new attribute information in response to determining that the target identification information characterizes at least two video information; a first determining module (not shown in the figure) configured to determine the obtained new attribute information as the attribute information of the target entity.

In some alternative implementations of the present embodiment, the constructing unit 502 may include: a second determining module (not shown in the figure) configured to, in response to determining that the target identification information characterizes one piece of video information, take the acquired attribute information as the attribute information of the target entity.
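The two cases handled by the merging module and the determining modules can be sketched together. The merge strategy shown (a per-key union that keeps first-seen order and drops duplicates) is an assumption made for illustration; the text here does not prescribe how the attribute information is combined. The function name and the attribute layout are likewise hypothetical.

```python
from typing import Dict, List


def determine_entity_attributes(
        video_infos: List[Dict[str, List[str]]]) -> Dict[str, List[str]]:
    """Derive a target entity's attributes from the video information
    characterized by one target identifier."""
    if not video_infos:
        raise ValueError("at least one piece of video information is required")
    if len(video_infos) == 1:
        # One piece of video information: use its attributes directly.
        return {key: list(values) for key, values in video_infos[0].items()}
    # Two or more pieces: merge into new attribute information (assumed union).
    merged: Dict[str, List[str]] = {}
    for info in video_infos:
        for key, values in info.items():
            bucket = merged.setdefault(key, [])
            for value in values:
                if value not in bucket:
                    bucket.append(value)
    return merged
```

For instance, merging `{"actors": ["X", "Y"]}` with `{"actors": ["Y", "Z"], "director": ["D"]}` under this assumed strategy yields a single attribute set listing actors X, Y, Z and director D.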

In some optional implementations of the present embodiment, the target set of identification information may be included in a preset set of identification information; the construction unit 502 may include: a selection module (not shown in the figure) configured to select, from the set of identification information, identification information having a preset association relationship with the target identification information as associated identification information; an adding module (not shown in the figure) is configured to add the association identification information as an association entity associated with the target entity, and attribute information included in the video information characterized by the association identification information as attribute information of the association entity into the initial knowledge graph.

In some optional implementations of this embodiment, the attribute information included in the video information characterized by the associated identification information matches the attribute information included in the video information characterized by the target identification information.

In some optional implementations of this embodiment, the video information characterized by the target identification information is video information labeled long video, and the video information characterized by the associated identification information is video information labeled short video.

The apparatus provided by the embodiment of the present application selects target identification information from a preset target identification information set and then performs the following construction steps: determining the target identification information as a target entity; acquiring attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and its attribute information to an initial knowledge graph; and, in response to determining that no unselected target identification information remains in the target identification information set, determining the initial knowledge graph as the final knowledge graph. A knowledge graph whose entities characterize video information can thus be constructed, and the constructed knowledge graph helps improve the comprehensiveness and flexibility with which association relationships between pieces of video information are characterized.

Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use in implementing a server of an embodiment of the present application. The server illustrated in fig. 6 is merely an example, and should not be construed as limiting the functionality and scope of use of the embodiments herein.

As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.

The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 601.

It should be noted that the computer readable medium described in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units involved in the embodiments of the present application may be implemented by software, or may be implemented by hardware. The described units may also be provided in a processor, for example, described as: a processor includes a first selection unit and a construction unit. The names of these units do not constitute a limitation on the unit itself in some cases, and for example, the first selection unit may also be described as "a unit that selects target identification information from a preset set of target identification information".

As another aspect, the present application also provides a computer-readable medium that may be contained in the server described in the above embodiment; or may exist alone without being assembled into the server. The computer readable medium carries one or more programs which, when executed by the server, cause the server to: selecting target identification information from a preset target identification information set, wherein the target identification information is used for representing target video information in a preset video information base, and the target video information comprises attribute information used for representing video attributes; based on the selected target identification information, the following construction steps are performed: determining target identification information as a target entity; acquiring attribute information included in video information characterized by target identification information; determining attribute information of a target entity based on the acquired attribute information; adding the attribute information of the target entity and the target entity into the initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; in response to determining that no exists, determining the initial knowledge-graph as the final knowledge-graph.

The foregoing description covers only the preferred embodiments of the present application and explains the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the invention referred to in this application is not limited to technical solutions formed by the specific combinations of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the invention, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims

1. A method for constructing a knowledge-graph, comprising:

selecting target identification information from a preset target identification information set, wherein the target identification information is used for representing target video information in a preset video information base, the target video information comprises attribute information used for representing video attributes, and the target identification information set is contained in the preset identification information set;

based on the selected target identification information, the following construction steps are performed: determining target identification information as a target entity; acquiring attribute information included in video information characterized by target identification information; determining attribute information of a target entity based on the acquired attribute information; adding the attribute information of the target entity and the target entity into the initial knowledge graph; selecting identification information with a preset association relation with target identification information from the identification information set as association identification information; the associated identification information is used as an associated entity associated with the target entity, and attribute information contained in video information characterized by the associated identification information is used as attribute information of the associated entity and added into the initial knowledge graph; determining whether unselected target identification information exists in the target identification information set; in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph; the attribute information included in the video information characterized by the associated identification information is matched with the attribute information included in the video information characterized by the target identification information.

2. The method of claim 1, wherein the method further comprises:

in response to determining that unselected target identification information exists in the target identification information set, reselecting target identification information from the unselected target identification information; and continuing to execute the construction step by using the re-selected target identification information and the initial knowledge graph of the last added entity and attribute information.

3. The method of claim 1, wherein the determining attribute information of the target entity based on the acquired attribute information comprises:

in response to determining that the target identification information characterizes at least two video information, merging attribute information included in the two acquired video information into new attribute information;

and determining the obtained new attribute information as the attribute information of the target entity.

4. The method of claim 1, wherein the determining attribute information of the target entity based on the acquired attribute information comprises:

in response to determining that the target identification information characterizes a piece of video information, the acquired attribute information is taken as attribute information of the target entity.

5. The method of claim 4, wherein the video information characterized by the target identification information is video information labeled as long video, and the video information characterized by the associated identification information is video information labeled as short video.

6. An apparatus for constructing a knowledge-graph, comprising:

a first selection unit configured to select target identification information from a preset target identification information set, wherein the target identification information is used for representing target video information in a preset video information base, the target video information comprises attribute information used for representing video attributes, and the target identification information set is contained in the preset identification information set;

a construction unit configured to perform, based on the selected target identification information, the following construction steps: determining the target identification information as a target entity; acquiring attribute information included in the video information characterized by the target identification information; determining attribute information of the target entity based on the acquired attribute information; adding the target entity and the attribute information of the target entity to the initial knowledge graph; selecting, from the identification information set, identification information having a preset association relation with the target identification information as associated identification information; adding the associated identification information to the initial knowledge graph as an associated entity associated with the target entity, with the attribute information included in the video information characterized by the associated identification information as the attribute information of the associated entity; determining whether unselected target identification information exists in the target identification information set; and, in response to determining that none exists, determining the initial knowledge graph as the final knowledge graph; wherein the attribute information included in the video information characterized by the associated identification information matches the attribute information included in the video information characterized by the target identification information.

7. The apparatus of claim 6, wherein the apparatus further comprises:

a second selection unit configured to, in response to determining that unselected target identification information exists in the target identification information set, reselect target identification information from the unselected target identification information, and to continue executing the construction steps using the reselected target identification information and the initial knowledge graph to which entities and attribute information were most recently added.

8. The apparatus of claim 6, wherein the construction unit comprises:

a merging module configured to, in response to determining that the target identification information characterizes at least two pieces of video information, merge the attribute information included in the at least two acquired pieces of video information into new attribute information;

and a first determining module configured to determine the obtained new attribute information as the attribute information of the target entity.

9. The apparatus of claim 6, wherein the construction unit comprises:

a second determining module configured to, in response to determining that the target identification information characterizes one piece of video information, take the acquired attribute information as the attribute information of the target entity;

and an adding module configured to add the associated identification information to the initial knowledge graph as an associated entity associated with the target entity, with the attribute information included in the video information characterized by the associated identification information as the attribute information of the associated entity.

10. The apparatus of claim 9, wherein the video information characterized by the target identification information is video information labeled as long video, and the video information characterized by the associated identification information is video information labeled as short video.

11. A server, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.

12. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method of any one of claims 1-5.
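
For illustration only, the construction steps recited in claim 1, together with the attribute handling of claims 3 and 4, might be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the patented implementation: the names `build_knowledge_graph`, `entity_attributes`, `merge_attributes`, `video_base` and `is_associated` are hypothetical, the graph is modeled as a plain dictionary, attribute information is modeled as a dictionary per video, and the merge rule (keep every distinct value seen for an attribute) is assumed, since the claims do not specify one.

```python
def merge_attributes(records):
    """Merge attribute dicts from several videos into one.

    Assumed rule (not taken from the claims): keep every distinct value
    seen for each attribute name.
    """
    merged = {}
    for record in records:
        for name, value in record.items():
            values = merged.setdefault(name, [])
            if value not in values:
                values.append(value)
    # Attributes with a single distinct value collapse back to a scalar.
    return {n: v[0] if len(v) == 1 else v for n, v in merged.items()}


def entity_attributes(records):
    """One video: use its attributes as-is; two or more: merge them."""
    if len(records) == 1:
        return dict(records[0])
    return merge_attributes(records)


def build_knowledge_graph(target_ids, all_ids, video_base, is_associated):
    """Hypothetical sketch of the construction loop.

    target_ids    -- the target identification information set
    all_ids       -- the full identification information set
    video_base    -- maps each identifier to a list of attribute dicts,
                     one per video it characterizes
    is_associated -- callable(target_id, other_id) -> bool standing in for
                     the preset association relation
    """
    graph = {"entities": {}, "edges": set()}  # the initial knowledge graph
    unselected = list(target_ids)
    while unselected:  # repeat while unselected target identifiers remain
        target_id = unselected.pop(0)
        # Target entity and its attribute information.
        graph["entities"][target_id] = entity_attributes(video_base[target_id])
        # Associated entities and their attribute information.
        for other_id in all_ids:
            if other_id != target_id and is_associated(target_id, other_id):
                graph["entities"][other_id] = entity_attributes(video_base[other_id])
                graph["edges"].add((target_id, other_id))
    return graph  # no unselected identifier is left: the final knowledge graph
```

In this sketch, a caller might pass an `is_associated` function that compares a shared attribute such as a title, so that a short-video identifier is attached to the long-video identifier whose attribute information it matches. That matching criterion is likewise an assumption made for the example.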