CN111274379A - SPO selection method and device, electronic equipment and storage medium

Info

Publication number: CN111274379A
Application number: CN202010042671.XA
Authority: CN
Inventors: 贺薇; 李双婕; 史亚冰; 蒋烨; 张扬; 朱勇
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-01-15
Filing date: 2020-01-15
Publication date: 2020-06-12
Anticipated expiration: 2040-01-15
Also published as: CN111274379B

Abstract

The application discloses an SPO selection method and device, electronic equipment and a storage medium, and relates to the fields of artificial intelligence and knowledge graphs. The specific implementation scheme is as follows: filtering a plurality of O values, extracted by upstream equipment and corresponding to a current SP, to obtain the SPOs that correspond to the current SP and meet constraint conditions; and selecting N SPOs from the SPOs meeting the constraint conditions according to those SPOs and the predetermined scores of the O values in them, wherein N is a natural number greater than or equal to 1. When the upstream model has errors, the embodiment of the application can filter out SPOs that do not meet the constraint conditions, so that the accuracy of SPO selection can be improved.

Description

SPO selection method and device, electronic equipment and storage medium

Technical Field

The application relates to the technical field of computer processing, and further relates to an artificial intelligence technology, in particular to a triple SPO selection method, a triple SPO selection device, electronic equipment and a storage medium.

Background

A knowledge graph is a large-scale knowledge base that represents real-world knowledge in a structured form from a semantic perspective. It is a directed graph comprising elements such as entities (nodes) and relations (edges). A triple SPO is formed by an entity pair (subject S and object O) and the relationship (predicate P) between them. SPO triple data in a knowledge graph is widely used in search and recommendation products: it directly meets users' needs for entity association, makes searching and browsing entities more efficient, and improves the user experience.

On the one hand, open SPO extraction draws on multiple sources, and different sources may conflict; on the other hand, extraction algorithms have limited accuracy and data sources vary in quality. A fusion and selection scheme is therefore needed to select correct SPOs from sources of uneven quality and to ensure the accuracy of the output factual knowledge.

In the prior art, SPO selection is usually achieved by one of the following two schemes: (1) selection based on the number of SPO occurrences: the number of times each SPO is extracted across the texts is counted as votes, and the SPO with the most occurrences, or every SPO whose count exceeds a preset threshold, is taken as the final selection result; (2) selection based on the SPO confidence score: different weights are given to texts from different sources or to different extraction models, the final confidence score of each SPO is obtained by weighted summation, and the SPO with the highest score, or every SPO whose score exceeds a score threshold, is taken as the final selection result.
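The two prior-art schemes can be illustrated with a minimal sketch. The tuple layout, the function names `select_by_count` and `select_by_confidence`, the sample data and the threshold values are all hypothetical and are not taken from the application.

```python
from collections import Counter, defaultdict

# Hypothetical extractions: (subject, predicate, object, source) tuples.
extractions = [
    ("Film A", "lead actor", "Person X", "encyclopedia"),
    ("Film A", "lead actor", "Person X", "news"),
    ("Film A", "lead actor", "City Y", "forum"),
]


def select_by_count(extractions, min_count=2):
    """Scheme (1): keep SPOs extracted at least `min_count` times."""
    counts = Counter((s, p, o) for s, p, o, _ in extractions)
    return [spo for spo, n in counts.items() if n >= min_count]


def select_by_confidence(extractions, source_weight, threshold=1.0):
    """Scheme (2): sum per-source weights for each SPO, then apply a threshold."""
    scores = defaultdict(float)
    for s, p, o, source in extractions:
        scores[(s, p, o)] += source_weight.get(source, 0.0)
    return [spo for spo, score in scores.items() if score >= threshold]
```

With assumed weights such as `{"encyclopedia": 0.9, "news": 0.6, "forum": 1.2}`, the second scheme also returns the triple whose object is a city, which shows how an upstream error can pass through when only summation is used.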

In the process of implementing the invention, the inventors found that at least the following problem exists in the prior art:

Schemes (1) and (2) both depend too heavily on the mining quality of the upstream equipment. If the upstream equipment makes errors, counting or summation alone cannot correct them, and the accuracy of the finally selected SPOs is reduced.

Disclosure of Invention

In view of this, embodiments of the present application provide a method and an apparatus for selecting an SPO, an electronic device, and a storage medium, so that when there is an error in an upstream model, SPOs that do not meet constraint conditions can be filtered out, and the accuracy of SPO selection can thereby be improved.

In a first aspect, an embodiment of the present application provides a method for selecting an SPO, where the method includes:

filtering a plurality of object (O) values, extracted by upstream equipment and corresponding to a current subject-predicate pair (SP), to obtain the SPOs that correspond to the current SP and meet constraint conditions;

selecting N SPOs from the SPOs meeting the constraint conditions according to those SPOs and the predetermined scores of the O values in them; wherein N is a natural number greater than or equal to 1.

The above embodiment has the following advantages or beneficial effects: by filtering the plurality of O values corresponding to the current SP extracted by the upstream device, SPOs that do not meet the constraint conditions are filtered out and only SPOs that meet them are retained. SPOs are then selected from the SPOs meeting the constraint conditions rather than from all SPOs; narrowing the selection range in this way improves the accuracy of SPO selection.

In the above embodiment, before filtering the plurality of O values corresponding to the current SP extracted by the upstream device, the method further includes:

judging whether the current SP is a valid SP;

and if the current SP is judged to be a valid SP, executing the operation of filtering the plurality of O values corresponding to the current SP extracted by the upstream equipment.

The above embodiment has the following advantages or beneficial effects: the embodiment can perform the operation of selecting the SPO only for the valid SPs and not for the invalid SPs by judging the validity of the SPs, thereby saving time and improving efficiency.

In the above embodiment, the filtering the plurality of O values corresponding to the current SP extracted by the upstream device includes:

obtaining the category of each O value corresponding to the current SP;

matching the category of each O value corresponding to the current SP with the predetermined category of the current P value; wherein the current P value is the P value in the current SP;

if the category of an O value corresponding to the current SP is successfully matched with the category of the current P value, determining the successfully matched O value as an effective O value corresponding to the current SP;

and determining the SPO meeting the constraint condition according to the current SP and the effective O value.

The above embodiment has the following advantages or beneficial effects: in the above embodiment, the category of each O value is matched with the category of the current P value, so that an effective O value corresponding to the current SP and an invalid O value corresponding to the current SP can be obtained, where only the effective O value corresponding to the current SP is retained, and then the SPO meeting the constraint condition is determined according to the current SP and the effective O value.
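The category-matching filter described above can be sketched as follows. This is an illustration only: the lookup tables `P_CATEGORY` and `O_CATEGORY`, their contents and the function name `filter_o_values` are assumptions, since the application only states that the categories are predetermined or obtained.

```python
# Hypothetical lookup tables; the application does not specify their contents.
P_CATEGORY = {"lead actor": "person", "birthplace": "place"}
O_CATEGORY = {"Person X": "person", "City Y": "place"}


def filter_o_values(sp, o_values, p_category=P_CATEGORY, o_category=O_CATEGORY):
    """Return the SPOs for `sp` whose O category matches the category required by P."""
    subject, predicate = sp
    required = p_category.get(predicate)
    if required is None:
        # Assumption: a P value with no known category constraint keeps nothing.
        return []
    # Keep only the effective O values, then rebuild full triples.
    effective = [o for o in o_values if o_category.get(o) == required]
    return [(subject, predicate, o) for o in effective]
```

For example, `filter_o_values(("Film A", "lead actor"), ["Person X", "City Y"])` keeps only the triple whose O value is in the "person" category.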

In the above embodiment, the selecting N SPOs from the SPOs meeting the constraint condition includes:

accumulating the scores of all O values corresponding to the current SP, and sorting all O values corresponding to the current SP according to the accumulated scores of all O values;

extracting a current P value from the current SP, and determining the attribute of the current P value according to the current P value and the corresponding relation between the predetermined P value and the attribute; wherein the attributes of the current P value include: a single-value attribute or a multi-value attribute;

if the attribute of the current P value is the single-value attribute, selecting an SPO from the SPOs meeting the constraint condition according to the sorted accumulated scores of all O values;

and if the attribute of the current P value is the multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O value corresponding to each P value.

The above embodiment has the following advantages or beneficial effects: the above embodiment can divide the current P value into a single-value attribute and a multi-value attribute by distinguishing the attributes of the current P value, and different measures are taken for different attributes, that is: if the attribute of the current P value is a single-value attribute, selecting an SPO from SPOs meeting the constraint condition according to the sorted accumulated scores of all O values; and if the attribute of the current P value is a multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O value corresponding to each P value. This allows for a more rapid selection of SPOs that meet the constraints.

In the above embodiment, the selecting, according to the sorted accumulated scores of the respective O values and the predetermined probability distribution of the O value corresponding to each P value, a plurality of SPOs from the SPOs that meet the constraint condition includes:

if the score of the Mth O value multiplied by the prior probability of the Mth O value is greater than the score of the (M+1)th O value multiplied by the prior probability of the (M+1)th O value, selecting the first M O values from the sorted O values, and determining M SPOs according to the first M O values; wherein M is a natural number greater than 1.

The above embodiment has the following advantages or beneficial effects: according to the scores of the O values and the prior probabilities of the O values, the score of the Mth O value multiplied by the prior probability of the Mth O value is compared with the score of the (M+1)th O value multiplied by the prior probability of the (M+1)th O value. If the former is greater than the latter, the first M O values are selected from the sorted O values. In this way M O values can be quickly and accurately selected from the plurality of O values, and M SPOs can be determined according to the selected M O values.
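The cut-off condition can be written compactly. The notation is introduced here for readability and is not taken from the application: $s_i$ denotes the accumulated score of the $i$-th O value after sorting, $p_i$ denotes the prior probability associated with the $i$-th O value under the predetermined distribution for the current P value, and $K$ is the number of O values that meet the constraint conditions. Sorting in descending order of score is an assumption.

```latex
s_1 \ge s_2 \ge \dots \ge s_K, \qquad
s_M \, p_M > s_{M+1} \, p_{M+1}
\;\Longrightarrow\;
\text{select } \{O_1, \dots, O_M\}, \quad 1 < M < K .
```

The application does not say which M to take when several positions satisfy the inequality, so any implementation has to fix that choice separately.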

In a second aspect, the present application further provides an apparatus for selecting an SPO, the apparatus including: an SPO filtering module and an O value determination module; wherein,

the SPO filtering module is used for filtering a plurality of object (O) values, extracted by upstream equipment and corresponding to the current subject-predicate pair (SP), to obtain the SPOs that correspond to the current SP and meet the constraint conditions;

the O value determination module is used for selecting N SPOs from the SPOs meeting the constraint conditions according to those SPOs and the predetermined scores of the O values in them; wherein N is a natural number greater than or equal to 1.

In the above embodiment, the apparatus further includes an SP-NIL determination module, which is used for judging whether the current SP is a valid SP; if the current SP is judged to be a valid SP, the operation of filtering the plurality of O values corresponding to the current SP extracted by the upstream equipment is executed through the SPO filtering module.

In the above embodiment, the SPO filtering module includes: an obtaining submodule, a matching submodule and a determining submodule; wherein,

the obtaining submodule is used for obtaining the category of each O value corresponding to the current SP;

the matching submodule is used for matching the category of each O value corresponding to the current SP with the category of the predetermined current P value; wherein the current P value is a P value in the current SP;

the determining submodule is configured to determine, if the category of each O value corresponding to the current SP is successfully matched with the category of the current P value, the successfully matched O value as an effective O value corresponding to the current SP; and determining the SPO meeting the constraint condition according to the current SP and the effective O value.

In the above embodiment, the O value determination module includes: a sorting submodule and a selecting submodule; wherein,

the sorting submodule is used for accumulating the scores of all O values corresponding to the current SP and sorting all O values corresponding to the current SP according to the accumulated scores of all O values;

the selection submodule is used for extracting a current P value from the current SP and determining the attribute of the current P value according to the current P value and the corresponding relation between the predetermined P value and the attribute; wherein the attributes of the current P value include: a single-value attribute or a multi-value attribute; if the attribute of the current P value is the single-value attribute, selecting an SPO from the SPOs meeting the constraint condition according to the sorted accumulated scores of all O values; and if the attribute of the current P value is the multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O value corresponding to each P value.

In a third aspect, an embodiment of the present application provides an electronic device, including:

one or more processors;

a memory for storing one or more programs,

when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the selection method of the SPO according to any embodiment of the present application.

In a fourth aspect, the present application provides a storage medium, on which a computer program is stored, where the computer program is executed by a processor to implement the selection method of the SPO according to any embodiment of the present application.

One embodiment of the above application has the following advantages or benefits: according to the SPO selection method and device, the electronic equipment and the storage medium, a plurality of O values corresponding to the current SP extracted by upstream equipment are first filtered to obtain the SPOs that correspond to the current SP and meet constraint conditions; N SPOs are then selected from the SPOs meeting the constraint conditions according to those SPOs and the predetermined scores of the O values in them, wherein N is a natural number greater than or equal to 1. That is, the present application can filter out SPOs that do not meet the constraint conditions and thereby improve the accuracy of SPO selection. By contrast, existing SPO selection methods depend too heavily on the mining quality of the upstream equipment: if the upstream equipment makes errors, counting or summation alone cannot correct them, and the accuracy of the finally selected SPOs is reduced. Because the present application filters the plurality of O values and selects SPOs only from those meeting the constraint conditions, it overcomes the technical problem of low SPO selection accuracy in the prior art; moreover, the technical scheme of the embodiments of the present application is simple and convenient to implement, easy to popularize, and widely applicable.

Other effects of the above alternatives are described below with reference to specific embodiments.

Drawings

The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:

FIG. 1 is a schematic flowchart of a selection method of an SPO according to the first embodiment of the present application;

FIG. 2 is a schematic flow chart of a selection method of an SPO according to the second embodiment of the present application;

FIG. 3 is a schematic structural diagram of an SPO selection system provided in the second embodiment of the present application;

FIG. 4 is a schematic structural diagram of a selection device of an SPO according to a third embodiment of the present application;

FIG. 5 is a schematic structural diagram of an SPO filter module provided in the third embodiment of the present application;

FIG. 6 is a schematic structural diagram of an O value determination module according to the third embodiment of the present application;

FIG. 7 is a block diagram of an electronic device for implementing the selection method of the SPO according to the embodiment of the present application.

Detailed Description

The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Embodiment One

Fig. 1 is a flowchart illustrating a selection method of an SPO according to an embodiment of the present application, where the method may be performed by an apparatus or an electronic device for selecting an SPO, where the apparatus or the electronic device may be implemented by software and/or hardware, and the apparatus or the electronic device may be integrated in any intelligent device with a network communication function. As shown in fig. 1, the selection method of the SPO may include the following steps:

S101, filtering a plurality of O values corresponding to the current SP extracted by the upstream equipment to obtain the SPOs that correspond to the current SP and meet the constraint conditions.

In a specific embodiment of the present application, the electronic device may input the current SP and the plurality of O values corresponding to the current SP extracted by the upstream device into the SPO filtering module, and output, through the SPO filtering module, the SPOs that correspond to the current SP and meet the constraint conditions. Specifically, the electronic device may first obtain the category of each O value corresponding to the current SP through the SPO filtering module; then match the category of each O value against the predetermined category of the P value in the current SP; if the category of an O value is successfully matched with the predetermined category of the P value, determine that O value as an effective O value corresponding to the current SP; and determine the SPOs meeting the constraint conditions according to the current SP and the effective O values. For example, assume that the P value in the current SP is "lead actor", whose predetermined O value category is "person", and that the category of a certain O value corresponding to the current SP is "person". Since the category of the O value matches the category required by the P value, the O value can be determined as an effective O value corresponding to the current SP, and an SPO meeting the constraint conditions can be determined according to the current SP and this O value.

S102, selecting N SPOs from the SPOs meeting the constraint conditions according to the SPOs meeting the constraint conditions corresponding to the current SP and the scores of all O values in the SPOs meeting the constraint conditions which are determined in advance; wherein N is a natural number of 1 or more.

In a specific embodiment of the present application, the electronic device may input the SPOs that correspond to the current SP and meet the constraint conditions, together with the predetermined scores of the O values in those SPOs, into the O value determination module, and select N SPOs from the SPOs meeting the constraint conditions through the O value determination module; wherein N is a natural number greater than or equal to 1. Specifically, the electronic device may accumulate the scores of the O values corresponding to the current SP through the O value determination module, and sort all the O values corresponding to the current SP according to their accumulated scores; then extract the current P value from the current SP, and determine the attribute of the current P value according to the current P value and the predetermined correspondence between P values and attributes, wherein the attribute of the current P value is either a single-value attribute or a multi-value attribute. If the attribute of the current P value is a single-value attribute, the electronic device may select one SPO from the SPOs meeting the constraint conditions according to the sorted accumulated scores of the O values; if the attribute of the current P value is a multi-value attribute, the electronic device may select a plurality of SPOs from the SPOs meeting the constraint conditions according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O values corresponding to each P value.

The SPO selection method provided by the embodiment of the application first filters a plurality of O values corresponding to the current SP extracted by upstream equipment to obtain the SPOs that correspond to the current SP and meet constraint conditions; it then selects N SPOs from the SPOs meeting the constraint conditions according to those SPOs and the predetermined scores of the O values in them, wherein N is a natural number greater than or equal to 1. That is, the present application can filter out SPOs that do not meet the constraint conditions and thereby improve the accuracy of SPO selection. By contrast, existing SPO selection methods depend too heavily on the mining quality of the upstream equipment: if the upstream equipment makes errors, counting or summation alone cannot correct them, and the accuracy of the finally selected SPOs is reduced. Because the present application filters the plurality of O values and selects SPOs only from those meeting the constraint conditions, it overcomes the technical problem of low SPO selection accuracy in the prior art; moreover, the technical scheme of the embodiments of the present application is simple and convenient to implement, easy to popularize, and widely applicable.

Embodiment Two

Fig. 2 is a schematic flow chart of a selection method of an SPO according to the second embodiment of the present application. As shown in fig. 2, the selection method of the SPO may include the following steps:

S201, judging whether the current SP is a valid SP; if yes, executing S202; otherwise, executing S206.

In a specific embodiment of the present application, the electronic device may input the current SP into the SP-NIL determination module, and determine whether the current SP is a valid SP through the SP-NIL determination module; if the current SP is a valid SP, S202 is executed; if the current SP is an invalid SP, S206 is executed. Specifically, the electronic device may determine whether the current SP is a valid SP according to the probability distribution of the current P value. For example, if the prior probability of the current P value is higher than or equal to a preset probability threshold, the electronic device may determine that the current SP is a valid SP; if the prior probability of the current P value is lower than the preset probability threshold, the electronic device may determine that the current SP is an invalid SP. The electronic device may also judge whether the current SP is a valid SP according to the heat (popularity) features of the current SP. For example, if the current SP includes at least one heat feature, the electronic device may determine that the current SP is a valid SP; if the current SP does not include any heat feature, the electronic device may determine that the current SP is an invalid SP. The electronic device may further determine whether the current SP is a valid SP according to the scores of the O values corresponding to the current SP. For example, if the average of the scores of the O values corresponding to the current SP is higher than or equal to a preset score threshold, the electronic device may determine that the current SP is a valid SP; if the average is lower than the preset score threshold, the electronic device may determine that the current SP is an invalid SP.
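The three validity checks described for S201 can be sketched as separate predicates. The function names, parameter names and the treatment of an empty score list are assumptions; the application does not state how the three checks are combined, so no combination is shown.

```python
def valid_by_prior(p_prior, prob_threshold):
    """Check 1: the prior probability of the current P value reaches the threshold."""
    return p_prior >= prob_threshold


def valid_by_heat(heat_features):
    """Check 2: the current SP has at least one heat (popularity) feature."""
    return len(heat_features) >= 1


def valid_by_scores(o_scores, score_threshold):
    """Check 3: the average score of the O values reaches the threshold."""
    if not o_scores:
        # Assumption: an SP with no scored O values is treated as invalid.
        return False
    return sum(o_scores) / len(o_scores) >= score_threshold
```

Keeping the checks separate leaves the caller free to require any one of them or all of them, which the text leaves open.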

S202, filtering a plurality of O values corresponding to the current SP extracted by the upstream equipment to obtain the SPO corresponding to the current SP and meeting the constraint condition.

In an embodiment of the present application, if the SP-NIL determination module determines that the current SP is a valid SP, it may input the current SP and the plurality of O values corresponding to the current SP extracted by the upstream device into the SPO filtering module, and the SPOs that correspond to the current SP and meet the constraint conditions are output through the SPO filtering module. Specifically, the electronic device may first obtain the category of each O value corresponding to the current SP through the SPO filtering module; then match the category of each O value against the predetermined category of the current P value, wherein the current P value is the P value in the current SP; if the category of an O value is successfully matched with the category of the current P value, the successfully matched O value may be determined as an effective O value corresponding to the current SP; and the SPOs meeting the constraint conditions are determined according to the current SP and the effective O values.

S203, accumulating the scores of all O values corresponding to the current SP, and sorting all O values corresponding to the current SP according to the accumulated scores of all O values.

In a specific embodiment of the present application, the electronic device may accumulate the scores of the O values corresponding to the current SP through the O value determining module, and sort all the O values corresponding to the current SP according to the accumulated scores of the O values. Specifically, the electronic device may obtain scores of the O values according to the O values corresponding to the current SP, then accumulate the scores of the O values, and sort all the O values corresponding to the current SP according to the accumulated scores of the O values.
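Step S203 can be sketched as below. The input layout (one `(o_value, score)` pair per extraction) and the function name `accumulate_and_sort` are hypothetical, and sorting in descending order is an assumption consistent with the later selection of the highest-scoring O value.

```python
from collections import defaultdict


def accumulate_and_sort(scored_mentions):
    """Sum the scores of each O value, then rank O values by accumulated score.

    `scored_mentions` is an iterable of (o_value, score) pairs.
    """
    totals = defaultdict(float)
    for o_value, score in scored_mentions:
        totals[o_value] += score
    # Highest accumulated score first (descending order is an assumption).
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

An O value extracted several times thus ranks by the sum of its scores rather than by any single score.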

S204, extracting a current P value from the current SP, and determining the attribute of the current P value according to the current P value and the corresponding relation between the predetermined P value and the attribute; wherein, the attributes of the current P value include: a single-value attribute or a multi-value attribute.

In a specific embodiment of the application, the electronic device may extract a current P value from the current SP through the O value determining module, and determine an attribute of the current P value according to the current P value and a predetermined correspondence between the P value and the attribute; wherein, the attributes of the current P value include: a single-value attribute or a multi-value attribute. Specifically, the electronic device may determine a corresponding relationship between the P value and the attribute in advance, and in this step, the electronic device may determine the attribute of the current P value according to the extracted current P value and the predetermined corresponding relationship between the P value and the attribute.

S205, if the attribute of the current P value is a single-value attribute, selecting an SPO from SPOs meeting the constraint condition according to the sorted accumulated scores of all O values; and if the attribute of the current P value is a multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O value corresponding to each P value.

In a specific embodiment of the present application, if the attribute of the current P value is a single-value attribute, the electronic device may select, through the O value determination module, one SPO from the SPOs meeting the constraint conditions according to the sorted accumulated scores of the O values; if the attribute of the current P value is a multi-value attribute, the electronic device may select, through the O value determination module, a plurality of SPOs from the SPOs meeting the constraint conditions according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O values corresponding to each P value. Specifically, if the attribute of the current P value is a single-value attribute, the electronic device may select the O value with the highest accumulated score from the sorted O values, and then combine that O value and the current SP into one SPO. If the attribute of the current P value is a multi-value attribute, the electronic device may obtain the score and the prior probability of each O value; if the score of the Mth O value multiplied by the prior probability of the Mth O value is greater than the score of the (M+1)th O value multiplied by the prior probability of the (M+1)th O value, the first M O values are selected from the sorted O values, and M SPOs are determined according to the first M O values; wherein M is a natural number greater than 1.
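The single-value and multi-value branches of S205 can be sketched as follows. Several points are assumptions rather than statements from the application: the function name `select_o_values`, the representation of the prior probabilities as a list aligned with the ranking, taking the smallest M that satisfies the inequality, and keeping every O value when no cut-off is found.

```python
def select_o_values(ranked, is_single_valued, prior=()):
    """Pick O values from `ranked`, a list of (o_value, accumulated_score), best first.

    `prior[i]` is the assumed prior probability of the (i+1)-th ranked O value.
    """
    if not ranked:
        return []
    if is_single_valued:
        # Single-value attribute: keep only the top-scoring O value.
        return [ranked[0][0]]
    if len(prior) < len(ranked):
        raise ValueError("a prior probability is needed for every ranked O value")
    # Multi-value attribute: smallest M > 1 with s_M * p_M > s_(M+1) * p_(M+1).
    for m in range(2, len(ranked)):
        left = ranked[m - 1][1] * prior[m - 1]
        right = ranked[m][1] * prior[m]
        if left > right:
            return [o for o, _ in ranked[:m]]
    # Fallback (assumption): no cut-off position found, keep all ranked O values.
    return [o for o, _ in ranked]
```

Each returned O value is then combined with the current SP to form one selected SPO.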

S206, ending the SPO selection flow.

In an embodiment of the application, if the SP-NIL module determines that the current SP is an invalid SP, the electronic device may end the selection process of the SPO.

Fig. 3 is a schematic structural diagram of an SPO selection system provided in the second embodiment of the present application. As shown in fig. 3, the SPO selection system includes: an input module, an SP-NIL module, an SPO filtering module, an O value determination module, an external dependence module, a prior database and an output module; the function of each module is as follows:

the input module is used for receiving a plurality of O values corresponding to the current SP sent by the upstream equipment and inputting the current SP and each O value corresponding to the current SP into the SP-NIL module;

the SP-NIL module is used for judging whether the current SP is a valid SP; if the current SP is judged to be a valid SP, the current SP and the plurality of O values corresponding to it are sent to the SPO filtering module. Specifically, some input SPs are themselves erroneous (e.g., "beijing-birth place") or have no O value (e.g., "deer break-wife"), and outputting SPOs for such SPs lowers the final accuracy. The SP-NIL module may determine whether the current SP is valid through a prior confidence and a posterior confidence: if the prior confidence or the posterior confidence of the SP is relatively low, the SP-NIL module may determine that the O value corresponding to the current SP should be null (NIL). Here, the prior confidence may be determined according to the probability distribution of the current P value or the heat characteristic of the current SP; the posterior confidence may be determined according to the scores of the respective O values corresponding to the current SP.

The SPO filtering module is used for filtering the plurality of O values corresponding to the current SP extracted by the upstream equipment to obtain the SPOs corresponding to the current SP that meet the constraint condition. Specifically, the input of the module is a plurality of SPOs, namely Para-S-P-O; the module judges whether the category of each input O value conforms to the category constraint that the current P value places on the entity, and the output of the module is the filtered valid Para-S-P-O. To obtain the correct O value entity category for the current P value, the categories of the O values corresponding to that P value in the existing knowledge base can be counted; for example, for the P value 'lead actor', the resulting constraint on the O value category is 'figure'. To obtain the category of an O value, the part of speech of each word in the mined text of the O value and the category of each named entity can be obtained by means of part-of-speech recognition and subgraph association tools, with the part of speech used to supplement the category; for example, a word whose part of speech is recognized as nr is assigned the category 'person'. According to the O value constraint corresponding to the current P value and the category of each O value input by the upstream equipment, SPOs that do not conform to the category constraint can be filtered out.

The O value judging module is used for accumulating the scores of all O values corresponding to the current SP and sorting the O values by their accumulated scores; extracting the current P value from the current SP, and determining the attribute of the current P value according to the current P value and the predetermined correspondence between P values and attributes, wherein the attribute of the current P value is either a single-value attribute or a multi-value attribute; if the attribute of the current P value is a single-value attribute, selecting one SPO from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values; and if the attribute of the current P value is a multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O values corresponding to each P value. Specifically, the input of this module is all valid O values corresponding to the current SP together with their scores, that is, <Para, O_score>, and the output is the current SP with a list of O values, that is, <S, P, O_list>. The module first accumulates the scores of each input O value to obtain a final score for each O value, and sorts the O values by these scores.
Then, to determine how many O values to output, the application may use the prior knowledge base to compute, from its existing SPOs, the distribution of the number of O values corresponding to each P value. For example, for "director" the distribution of the number of O values is "1": 0.68, "2": 0.25, …, which indicates that in the knowledge base 68% of "director" relationships have 1 O value, 25% have 2 O values, and so on. The attribute of the current P value can also be determined here: if it is a single-value attribute, only the O value with the highest score is retained; if it is a multi-value attribute, the number of O values to output is calculated according to the prior distribution and the posterior distribution of the O values, specifically: if the score of the M-th O value multiplied by the prior probability of the M-th O value is greater than the score of the (M+1)-th O value multiplied by the prior probability of the (M+1)-th O value, the first M O values are selected from the sorted O values, and M SPOs are determined from the first M O values.

An external dependency module for providing tool support for the SPO filtering module, which may include at least the following deep learning tools: a subgraph association tool (egls) and a part-of-speech recognition tool.

The prior database is used for providing data support for the SPO filtering module and the O value judging module, and the module at least comprises the following data: probability distribution of P values, heat characteristics of SPs, class of P values, and class of O values.

The output module is used for outputting the N SPOs selected from the SPOs meeting the constraint condition.
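
The prior statistic used by the O value judging module, that is, the distribution of the number of O values per P value, can be illustrated with a short Python sketch. It assumes the prior knowledge base is available as an iterable of (S, P, O) string triples; the function name `o_count_distribution` and this in-memory representation are hypothetical and not taken from the application.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def o_count_distribution(
    triples: Iterable[Tuple[str, str, str]],
) -> Dict[str, Dict[int, float]]:
    """For each P value, estimate the distribution of the number of O values per SP."""
    # Collect the distinct O values seen for every (S, P) pair.
    objects_per_sp = defaultdict(set)
    for s, p, o in triples:
        objects_per_sp[(s, p)].add(o)

    # Count, per P value, how many SP pairs have 1, 2, ... O values.
    counts_per_p = defaultdict(Counter)
    for (_, p), objects in objects_per_sp.items():
        counts_per_p[p][len(objects)] += 1

    # Normalise the counts into relative frequencies.
    distribution = {}
    for p, counter in counts_per_p.items():
        total = sum(counter.values())
        distribution[p] = {n: c / total for n, c in counter.items()}
    return distribution
```

A result such as `{"director": {1: 0.68, 2: 0.25, ...}}` would correspond to the "director" example above; how such frequencies are smoothed or stored in the prior database is not specified in the text.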

EXAMPLE III

Fig. 4 is a schematic structural diagram of an SPO selection apparatus according to a third embodiment of the present application. As shown in fig. 4, the apparatus 400 includes: an SPO filtering module 401 and an O value judging module 402; wherein:

the SPO filtering module 401 is configured to filter a plurality of O values corresponding to a current SP extracted by an upstream device to obtain the SPOs corresponding to the current SP that meet the constraint condition;

the O value determining module 402 is configured to select N SPOs from the SPOs meeting the constraint condition according to the SPOs meeting the constraint condition corresponding to the current SP and the predetermined scores of the O values in the SPOs meeting the constraint condition; wherein N is a natural number of 1 or more.

Further, the apparatus further comprises: an SP-NIL determination module (not shown in the figure) for determining whether the current SP is a valid SP; and if the current SP is judged to be the effective SP, executing the operation of filtering a plurality of O values corresponding to the current SP extracted by the upstream equipment through the SPO filtering module.
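
The validity check performed by the SP-NIL module can be sketched in Python as follows. The text only states that an SP is treated as NIL when its prior or posterior confidence is relatively low; the concrete thresholds, the use of the maximum O score as the posterior confidence, and the function name `is_valid_sp` are assumptions made for illustration.

```python
from typing import List


def is_valid_sp(
    prior_confidence: float,
    o_scores: List[float],
    prior_threshold: float = 0.1,      # assumed value, not from the application
    posterior_threshold: float = 0.3,  # assumed value, not from the application
) -> bool:
    """Return False when the current SP should be treated as NIL."""
    if not o_scores:
        # No candidate O value at all: nothing to select.
        return False
    # Assumption: the best O score stands in for the posterior confidence.
    posterior_confidence = max(o_scores)
    # The SP is rejected if either confidence is low.
    return (
        prior_confidence >= prior_threshold
        and posterior_confidence >= posterior_threshold
    )
```

When the function returns False, the selection flow would end without passing the O values to the SPO filtering module.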

Fig. 5 is a schematic structural diagram of an SPO filtering module provided in the third embodiment of the present application. As shown in fig. 5, the SPO filtering module 401 includes: an obtaining sub-module 4011, a matching sub-module 4012 and a determining sub-module 4013; wherein:

the obtaining submodule 4011 is configured to obtain a category of each O value corresponding to the current SP;

the matching sub-module 4012 is configured to match the category of each O value corresponding to the current SP with a predetermined category of the current P value; wherein the current P value is a P value in the current SP;

the determining submodule 4013 is configured to, if the category of an O value corresponding to the current SP is successfully matched with the category of the current P value, determine the successfully matched O value as a valid O value corresponding to the current SP; and determine the SPOs meeting the constraint condition according to the current SP and the valid O values.
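
The obtain / match / determine steps of the three submodules can be sketched together in Python. The dictionaries `o_categories` and `p_constraints` stand in for the category lookups, the function name `filter_by_category` is hypothetical, and keeping all candidates when no constraint is known for a P value is an assumption rather than something the application states.

```python
from typing import Dict, List, Set, Tuple


def filter_by_category(
    s: str,
    p: str,
    candidates: List[str],
    o_categories: Dict[str, Set[str]],
    p_constraints: Dict[str, Set[str]],
) -> List[Tuple[str, str, str]]:
    """Keep only the SPOs whose O category matches the category required by P."""
    allowed = p_constraints.get(p)
    if allowed is None:
        # Assumption: with no known constraint for P, nothing is filtered.
        return [(s, p, o) for o in candidates]
    valid = []
    for o in candidates:
        # Obtain the categories of this O value (empty set if unknown).
        categories = o_categories.get(o, set())
        # Match: at least one category must satisfy the constraint of P.
        if categories & allowed:
            valid.append((s, p, o))
    return valid
```

For instance, with `p_constraints = {"lead actor": {"figure"}}`, a candidate O value whose only category is a date would be dropped, while a candidate categorised as 'figure' would be kept.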

Fig. 6 is a schematic structural diagram of an O value determining module according to a third embodiment of the present application. As shown in fig. 6, the O value determination module 402 includes: a sorting submodule 4021 and a selecting submodule 4022; wherein:

the sorting submodule 4021 is configured to accumulate scores of all O values corresponding to the current SP, and sort all O values corresponding to the current SP according to the accumulated scores of all O values;

the selecting submodule 4022 is configured to extract a current P value from the current SP, and determine an attribute of the current P value according to the current P value and a predetermined correspondence between the P value and the attribute; wherein the attributes of the current P value include: a single-value attribute or a multi-value attribute; if the attribute of the current P value is the single-value attribute, selecting an SPO from the SPOs meeting the constraint condition according to the sorted accumulated scores of all O values; and if the attribute of the current P value is the multi-value attribute, selecting a plurality of SPOs from the SPOs meeting the constraint condition according to the sorted accumulated scores of the O values and the predetermined probability distribution of the O value corresponding to each P value.

Further, the selecting sub-module 4022 is specifically configured to select the first M O values from the sorted O values and determine M SPOs according to the first M O values if the score of the M-th O value multiplied by the prior probability of the M-th O value is greater than the score of the (M+1)-th O value multiplied by the prior probability of the (M+1)-th O value; wherein M is a natural number greater than 1.
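
The accumulate-and-sort step of the sorting submodule 4021, whose output the selecting submodule 4022 consumes, can be sketched in Python. The input is assumed to be a sequence of (O value, score) pairs, one per paragraph in which the O value was extracted; the function name `accumulate_and_rank` and the alphabetical tie-break are illustrative choices, not part of the application.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def accumulate_and_rank(
    mentions: Iterable[Tuple[str, float]],
) -> List[Tuple[str, float]]:
    """Sum the scores of identical O values and sort them by total, descending."""
    totals = defaultdict(float)
    for o_value, score in mentions:
        totals[o_value] += score
    # Sort by accumulated score (highest first); ties are broken by O value
    # so that the ordering is deterministic (an illustrative choice).
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

The resulting ranked list is the form of input on which the single-value or multi-value selection rule operates.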

The SPO selection apparatus can execute the method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method. For technical details not described in this embodiment, reference may be made to the SPO selection method provided in any embodiment of the present application.

Example four

According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.

Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.

As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 7, one processor 701 is taken as an example.

The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the SPO selection method provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the SPO selection method provided herein.

The memory 702, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the selection method of the SPO in the embodiment of the present application (for example, the SPO filtering module 401 and the O value judging module 402 shown in fig. 4). The processor 701 executes various functional applications of the server and data processing, i.e., implements the selection method of the SPO in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 702.

The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function; the storage data area may store data created by use of the electronic device for the SPO selection method, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include memory located remotely from the processor 701, which may be connected over a network to the electronic device for the SPO selection method. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device for the SPO selection method may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.

The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for the SPO selection method; examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or other input devices. The output devices 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.

The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

According to the technical scheme of the embodiment of the application, a plurality of O values corresponding to the current SP extracted by upstream equipment are first filtered to obtain the SPOs corresponding to the current SP that meet the constraint condition; N SPOs are then selected from the SPOs meeting the constraint condition according to those SPOs and the predetermined scores of the O values in them; wherein N is a natural number of 1 or more. That is to say, the application can filter out SPOs that do not meet the constraint condition, thereby improving the accuracy of SPO selection. By contrast, the existing SPO selection method depends excessively on the mining effect of the upstream equipment and selects SPOs only by counting or summation, so if the upstream equipment makes errors, the accuracy of the finally selected SPOs is reduced. Because the application filters the plurality of O values and selects SPOs from those meeting the constraint condition, the technical problem of low SPO selection accuracy in the prior art is solved; moreover, the technical scheme of the embodiment of the application is simple to implement, easy to popularize and widely applicable.

It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.

The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. A method for selecting a triple SPO, the method comprising:

2. The method as claimed in claim 1, wherein before said filtering the plurality of O values corresponding to the current SP extracted by the upstream device, the method further comprises:

judging whether the current SP is a valid SP;

3. The method as claimed in claim 1, wherein said filtering the plurality of O values corresponding to the current SP extracted by the upstream device comprises:

obtaining the category of each O value corresponding to the current SP;

4. The method of claim 1, wherein the selecting N SPOs from the constrained SPOs comprises:

5. The method of claim 4, wherein selecting a plurality of SPOs from the SPOs that meet the constraint condition according to the sorted accumulated scores of the O values and a predetermined probability distribution of the O value corresponding to each P value comprises:

6. An apparatus for selecting a triple SPO, the apparatus comprising: an SPO filtering module and an O value judging module; wherein

7. The apparatus of claim 6, further comprising: the SP-NIL judging module is used for judging whether the current SP is a valid SP or not; and if the current SP is judged to be the effective SP, executing the operation of filtering a plurality of O values corresponding to the current SP extracted by the upstream equipment through the SPO filtering module.

8. The apparatus of claim 6, wherein the SPO filtering module comprises: an obtaining submodule, a matching submodule and a determining submodule; wherein

9. The apparatus of claim 6, wherein the O value determining module comprises: a sorting submodule and a selecting submodule; wherein

10. The apparatus of claim 9, wherein:

the selecting submodule is specifically configured to select the first M O values from the sorted O values and determine M SPOs according to the first M O values if the score of the M-th O value multiplied by the prior probability of the M-th O value is greater than the score of the (M+1)-th O value multiplied by the prior probability of the (M+1)-th O value; wherein M is a natural number greater than 1.

11. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.

12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-5.