CN113297332A - Entity object determination method, device, server and storage medium - Google Patents

Entity object determination method, device, server and storage medium

Info

Publication number
CN113297332A
CN113297332A CN202011041598.0A CN202011041598A CN113297332A CN 113297332 A CN113297332 A CN 113297332A CN 202011041598 A CN202011041598 A CN 202011041598A CN 113297332 A CN113297332 A CN 113297332A
Authority
CN
China
Prior art keywords
service scene
entity
data object
entity object
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011041598.0A
Other languages
Chinese (zh)
Inventor
付宇新
梁童鹿
贾易东
罗惠玲
韩呈豪
江仙高
肖国锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202011041598.0A
Publication of CN113297332A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a method, a device, a server and a storage medium for determining an entity object, wherein the method comprises the following steps: acquiring a target data object associated with a service scene based on a data object in an object class associated with the service scene; determining the correlation between the entity object and the service scene according to the historical interactive behavior data of the entity object and the target data object; and determining the entity objects of which the correlation with the service scene is not less than a preset correlation threshold value as the entity objects associated with the service scene, wherein the entity objects associated with the service scene form an entity object group of the service scene. The embodiment of the application can determine the group of the entity object associated with the business scene and improve the comprehensiveness of group determination.

Description

Entity object determination method, device, server and storage medium
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a method and a device for determining an entity object, a server and a storage medium.
Background
A massive number of data objects exist on the internet, and in different service scenes, data objects related to both the entity objects and the service scene need to be pushed.
To provide the data objects related to a service scene to the entity objects that need them, it is necessary to determine the group of entity objects associated with that service scene. How to provide an entity object determination scheme that determines the entity object group associated with a service scene and improves the comprehensiveness of group determination has therefore become a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of this, embodiments of the present application provide an entity object determination method, apparatus, server, and storage medium, so as to determine the entity object group associated with a service scene and improve the comprehensiveness of group determination.
In order to achieve the above purpose, the embodiments of the present application provide the following technical solutions:
an entity object determination method, comprising:
acquiring a target data object associated with a service scene based on a data object in an object class associated with the service scene;
determining the correlation between the entity object and the service scene according to the historical interactive behavior data of the entity object and the target data object;
and determining the entity objects of which the correlation with the service scene is not less than a preset correlation threshold value as the entity objects associated with the service scene, wherein the entity objects associated with the service scene form an entity object group of the service scene.
An embodiment of the present application further provides an entity object determining apparatus, including:
the target data object acquisition module is used for acquiring a target data object associated with a service scene based on a data object in an object category associated with the service scene;
the correlation determination module is used for determining the correlation between the entity object and the service scene according to the historical interactive behavior data of the entity object and the target data object;
and the group determination module is used for determining the entity objects of which the correlation with the service scene is not less than a preset correlation threshold value as the entity objects associated with the service scene, wherein the entity objects associated with the service scene form an entity object group of the service scene.
An embodiment of the present application further provides a server, including: at least one memory and at least one processor; the memory stores one or more computer-executable instructions that are invoked by the processor to perform the entity object determination method as described above.
An embodiment of the present application further provides a storage medium, where the storage medium stores one or more computer-executable instructions, and the one or more computer-executable instructions are configured to execute the entity object determination method as described above.
The entity object determination method provided by the embodiment of the application acquires the target data object associated with the service scene based on the data objects under the object category associated with the service scene, and thus avoids determining the crowd of the service scene directly from the data objects associated with the service scene as expressed in the cognitive map. On the basis of this expanded set of target data objects, the embodiment of the application determines the correlation between the entity object and the service scene according to the historical interaction behavior data of the entity object and the target data object, determines the entity objects whose correlation with the service scene is not less than the preset correlation threshold as the entity objects associated with the service scene, and forms the entity object group of the service scene from those entity objects. Because the entity object group is determined from the historical interaction behavior data of entity objects for the expanded target data objects, the group can be expanded as far as possible while the accuracy of its determination is preserved, which improves the comprehensiveness of group determination.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application; those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is an example diagram of the number of data objects associated with each service scene in a cognitive map;
FIG. 2 is a flowchart of an entity object determination method according to an embodiment of the present application;
FIG. 3 is a flowchart of determining a target data object associated with a service scene according to an embodiment of the present application;
FIG. 4 is an example diagram of determining a matching rank position according to an embodiment of the present application;
FIG. 5 is a flowchart of vectorizing a service scene and a data object according to an embodiment of the present application;
FIG. 6 is an example diagram of building an association map according to an embodiment of the present application;
FIG. 7 is an example diagram of a sampling path according to an embodiment of the present application;
FIG. 8 is an example diagram of training using a skip-gram algorithm according to an embodiment of the present application;
FIG. 9 is an example diagram of a process for vectorizing a service scene and a data object according to an embodiment of the present application;
FIG. 10 is a flowchart of determining the correlation between an entity object and a service scene according to an embodiment of the present application;
FIG. 11 is an example diagram comparing the number of data objects associated with a service scene;
FIG. 12 is another example diagram comparing the number of data objects associated with a service scene;
FIG. 13 is a block diagram of an entity object determination apparatus according to an embodiment of the present application;
FIG. 14 is another block diagram of an entity object determination apparatus according to an embodiment of the present application;
FIG. 15 is a block diagram of a server according to an embodiment of the present application.
Detailed Description
An internet platform such as an e-commerce platform can determine the group of entity objects associated with a service scene based on a cognitive map. In one example, the entity objects are users and the entity object group is a user crowd. The cognitive map explicitly expresses the requirements of entity objects as nodes (called E-commerce Concepts) and associates these nodes with data objects (such as commodities), object categories (the categories of data objects), external general domain knowledge and the like, providing a uniform data basis for data object cognition, entity object cognition and knowledge cognition. In the cognitive map, a requirement of an entity object can be defined as a phrase that conforms to common sense, is semantically complete and reads fluently (for example, "child loss prevention" or "Mid-Autumn Festival gift giving"). Service scenes can be added on top of the cognitive map, thereby associating service scenes with data objects. The association between a service scene and entity objects can then be obtained indirectly from the association between the service scene and data objects in the cognitive map together with the behavior of entity objects toward those data objects over a historical period, so that the entity object group associated with the service scene can be determined.
Specifically, based on the association between the service scene and data objects in the cognitive map, the data objects under the service scene can be determined. For each such data object, a score between the entity object and the data object is determined from the historical behavior data of the entity object for that data object. The scores between the entity object and each data object under the service scene are then combined into a score between the entity object and the service scene, and this score is used to decide whether the entity object belongs to the service scene, which determines the entity object group of the service scene.
However, this way of determining the entity object group relies on the association between service scenes and data objects expressed by the cognitive map. Although that association is highly accurate, its recall is insufficient: if few data objects are associated with a service scene, there are few data objects on which entity objects can exhibit behavior under that scene, so the entity object group determined for the scene is small and the entity objects associated with the scene cannot be determined comprehensively.
For example, as shown in FIG. 1, the number of data objects associated with a service scene in the cognitive map varies from scene to scene, and some service scenes have very few associated data objects. One reason is that the concepts in the cognitive map are mainly used for recommending data objects (for example, recommending commodities), and the number of data objects recalled under the main concepts is sufficient for that recommendation purpose. However, when few data objects are associated with a service scene, the entity object group that can be determined for the scene is small, and a small group cannot meet goals such as acquiring new entity objects or delivering and recommending data objects at scale in that scene.
Based on this, the embodiment of the application provides an entity object determination method, an entity object determination device, a server and a storage medium, so as to determine an entity object group associated with a business scene and improve the comprehensiveness of the entity object group determination.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In an alternative implementation, FIG. 2 shows an alternative flow of the entity object determination method provided by the embodiment of the present application. The method flow may be applied to a server, for example a server of an internet platform, and specifically a server of an e-commerce platform. For any service scene set by an internet platform such as an e-commerce platform, the entity object group associated with the service scene (for example, the user crowd associated with the service scene) may be determined based on the flow shown in FIG. 2. As shown in FIG. 2, the method flow may include:
Step S100, acquiring a target data object associated with a service scene based on a data object in an object category associated with the service scene.
In order to expand the entity object group associated with the service scene as much as possible and improve the comprehensiveness of the entity object group determination, the embodiment of the application can re-determine the data object associated with the service scene, so that the expansion of the entity object group associated with the service scene is realized on the basis of the expansion of the data object associated with the service scene; in order to expand the data objects associated with the service scene, in the embodiments of the present application, the data objects associated with the service scene may be obtained based on the data objects in the object class associated with the service scene, and the data objects associated with the service scene expressed in the cognitive map are not directly used, so as to expand the number of the data objects associated with the service scene.
In a more specific optional implementation, in the embodiment of the present application, based on an object category (category of a data object) associated with a service scene expressed in a cognitive map and a category attribute of the data object, a data object under the object category associated with the service scene is determined, and then a similarity between the service scene and the data object (i.e., a data object under the object category associated with the service scene) is determined, and according to the similarity between the service scene and the data object, a target data object associated with the service scene is selected from the data object.
Step S110, determining the correlation between the entity object and the service scene according to the historical interactive behavior data of the entity object and the target data object.
Based on the target data object associated with the service scene, the embodiment of the application can acquire historical interaction behavior data of the entity object and the target data object, and determine the correlation between the entity object and the service scene from that data. For example, there may be multiple target data objects, and the embodiment of the application may acquire the behavior data of the entity object for each target data object within a historical time period, where the behavior data may describe information such as the behavior type and behavior frequency of the entity object for the target data object. Based on this behavior data, the embodiment of the application can determine the correlation between the entity object and the service scene. The correlation can be represented by a score: the higher the score, the higher the correlation between the entity object and the service scene.
In alternative implementations, the embodiments of the present application may define the following assumptions:
if an entity object has acted on a data object of a certain service scene, the entity object is correlated with that service scene, and different behavior types represent different degrees of correlation; for example, if entity object A has purchased a data object of service scene 1 and entity object B has only browsed a data object of service scene 1, the correlation between entity object A and service scene 1 is higher than that between entity object B and service scene 1;
the more times an entity object has acted on the data objects of a certain service scene, the higher the correlation between the entity object and that service scene.
Based on the above assumptions, in an optional implementation of determining the correlation between the entity object and the service scene, the embodiment of the present application may determine the correlation (which may be defined as a score) between the entity object and each target data object based on the behavior data of the entity object for each target data object in a historical time period, and then determine the correlation between the entity object and the service scene from the correlations between the entity object and each target data object.
Optionally, the historical time period belongs to a past time period, which may be set according to an actual situation, and the embodiment of the present application is not limited.
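The two assumptions above can be illustrated with a minimal Python sketch. The behavior weights, the record layout `(entity_id, object_id, behavior_type, count)` and the choice to sum weighted counts are illustrative assumptions; the source only states that behavior type and behavior frequency both influence the correlation.

```python
from collections import defaultdict

# Hypothetical weights: the source only says that different behavior types
# imply different degrees of correlation (e.g. purchase > browse).
BEHAVIOR_WEIGHTS = {"browse": 1.0, "favorite": 2.0, "purchase": 5.0}


def scene_correlation(behaviors, target_objects):
    """Score each entity object against one service scene.

    behaviors: iterable of (entity_id, object_id, behavior_type, count)
               records from the historical time period.
    target_objects: set of target data object ids of the service scene.
    Summing weighted counts is an assumed way of combining the
    per-object correlations into a per-scene correlation.
    """
    scores = defaultdict(float)
    for entity_id, object_id, behavior_type, count in behaviors:
        if object_id not in target_objects:
            continue  # behavior on objects outside the scene is ignored
        scores[entity_id] += BEHAVIOR_WEIGHTS.get(behavior_type, 0.0) * count
    return dict(scores)


behaviors = [
    ("A", "obj1", "purchase", 1),
    ("B", "obj1", "browse", 1),
    ("B", "obj2", "browse", 2),
    ("C", "obj9", "purchase", 3),  # obj9 is not a target data object
]
scores = scene_correlation(behaviors, {"obj1", "obj2"})
```

With these assumed weights, entity object A (one purchase) scores higher than entity object B (three browses), and entity object C receives no score because it never acted on a target data object.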
Step S120, determining the entity object whose correlation with the service scene is not less than a preset correlation threshold as the entity object associated with the service scene, where the entity object associated with the service scene forms an entity object group of the service scene.
According to the embodiment of the application, a correlation threshold value can be set or designated, after the correlation between the entity object and the service scene is determined, the entity object of which the correlation with the service scene is not less than the correlation threshold value can be selected through the correlation threshold value, the entity object associated with the service scene is obtained, and then the entity object associated with the service scene forms an entity object group of the service scene, so that the entity object group associated with the service scene is determined.
Optionally, a specific value of the correlation threshold may be set or specified according to an actual situation, and the embodiment of the present application is not limited.
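Step S120 amounts to a threshold filter over the scene-level correlations. A minimal sketch, assuming the correlations are held in a dictionary keyed by a hypothetical entity id and that the threshold value is supplied by the caller:

```python
def entity_group(scene_scores, correlation_threshold):
    """Return the entity objects whose correlation with the service scene
    is not less than the preset correlation threshold."""
    return {
        entity_id
        for entity_id, score in scene_scores.items()
        if score >= correlation_threshold
    }


# Illustrative scores and threshold, not values from the source.
group = entity_group({"A": 5.0, "B": 3.0, "C": 0.5}, correlation_threshold=3.0)
```

The comparison is `>=` because the text specifies "not less than" the threshold, so an entity object whose correlation equals the threshold is included in the group.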
The entity object determination method provided by the embodiment of the application acquires the target data object associated with the service scene based on the data objects under the object category associated with the service scene, and thus avoids determining the crowd of the service scene directly from the data objects associated with the service scene as expressed in the cognitive map. On the basis of this expanded set of target data objects, the embodiment of the application determines the correlation between the entity object and the service scene according to the historical interaction behavior data of the entity object and the target data object, determines the entity objects whose correlation with the service scene is not less than the preset correlation threshold as the entity objects associated with the service scene, and forms the entity object group of the service scene from those entity objects. Because the entity object group is determined from the historical interaction behavior data of entity objects for the expanded target data objects, the group can be expanded as far as possible while the accuracy of its determination is preserved, which improves the comprehensiveness of group determination.
In an optional implementation, FIG. 3 shows an optional flow for determining the target data object associated with a service scene, provided by the embodiment of the present application. Based on the object categories associated with the service scene as expressed by the cognitive map and on the category attributes of data objects, this flow expands the target data objects associated with the service scene, which addresses the problem that some service scenes in the cognitive map have few associated data objects. Optionally, the method flow may be applied to a server, for example a server of an internet platform such as an e-commerce platform. For any service scene set by the internet platform, the target data objects associated with the service scene may be determined based on the flow shown in FIG. 3. As shown in FIG. 3, the method flow may include:
Step S200, determining candidate data objects under the object categories associated with the service scene based on the object categories associated with the service scene as expressed in the cognitive map and the category attributes of the data objects.
Optionally, in the embodiment of the present application, based on the association relationship between the service scene and the object category in the cognitive map and the category attribute of the data object, all data objects whose category attribute matches the object category associated with the service scene may be determined, where all data objects may be used as candidate data objects in the object category associated with the service scene, and a target data object may be selected from the candidate data objects; candidate data objects under the object category associated with the service scene can form a candidate data object set of the service scene, and one service scene can correspond to one candidate data object set; for example, taking an e-commerce platform as an example, if the object category associated with the business scene of the beauty makeup is, for example, makeup or perfume, all data objects under the object category of the makeup or perfume may be uniformly determined as candidate data objects of the business scene of the beauty makeup, so as to obtain a candidate data object set of the business scene of the beauty makeup.
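The candidate selection of step S200 can be sketched as a category match. The data layout (a mapping from a hypothetical data object id to its category attribute) and the sample ids are assumptions made for illustration only:

```python
def candidate_objects(scene_categories, object_categories):
    """Collect every data object whose category attribute matches one of
    the object categories associated with the service scene."""
    return {
        object_id
        for object_id, category in object_categories.items()
        if category in scene_categories
    }


# Hypothetical catalogue: data object id -> category attribute.
object_categories = {
    "lipstick_01": "makeup",
    "perfume_07": "perfume",
    "kettle_03": "kitchen",
}
# The beauty makeup service scene is associated with makeup and perfume.
candidates = candidate_objects({"makeup", "perfume"}, object_categories)
```

Every data object under a matching category becomes a candidate, which is what makes the candidate set larger than the set of data objects directly associated with the scene in the cognitive map.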
Step S210, determining the similarity between the service scene and the candidate data object.
For each candidate data object, the embodiment of the present application may respectively calculate the similarity between the service scene and each candidate data object, and in an optional implementation, the embodiment of the present application may obtain a service scene vector corresponding to the service scene and a data object vector corresponding to each candidate data object, thereby respectively calculating the similarity between the service scene vector of the service scene and the data object vector of each candidate data object, and obtaining the similarity between the service scene and each candidate data object.
In a further optional implementation, the similarity between the service scene vector and the data object vector of a candidate data object may be determined by calculating the cosine distance, the Euclidean distance, or the Pearson correlation coefficient between the two vectors.
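The three measures named above can be written out in plain Python as follows. This is a sketch on plain lists of floats; the vectors are made-up values, and the cosine measure is shown in its similarity form (cosine distance is commonly taken as one minus this value).

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pearson_correlation(u, v):
    """Pearson coefficient, computed as the cosine similarity of the
    mean-centered vectors; 0.0 if either vector is constant."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])


# Illustrative vectors, not values from the source.
scene_vec = [1.0, 2.0, 3.0]
object_vec = [2.0, 4.0, 6.0]
cos = cosine_similarity(scene_vec, object_vec)
dist = euclidean_distance(scene_vec, object_vec)
corr = pearson_correlation(scene_vec, object_vec)
```

Note that the measures have opposite orientations: a larger cosine similarity or Pearson coefficient means greater similarity, while a larger distance means less. A distance therefore has to be converted or negated before it is compared against a "not less than" similarity threshold.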
Step S220, based on the similarity between the service scene and the candidate data object, determining a similarity threshold corresponding to the service scene.
The similarity between the service scene and the candidate data objects can represent the correlation between the service scene and the candidate data objects, and in the embodiment of the application, the similarity threshold corresponding to the service scene can be determined based on the similarity between the service scene and each candidate data object, so that the object associated with the service scene can be selected from the candidate data objects through the similarity threshold in the following process.
In an optional implementation, the embodiment of the present application may use a heuristic method to determine the similarity threshold corresponding to the service scene. Optionally, the candidate data objects are sorted by similarity, for example in descending order of similarity; a rank position whose value at least matches the first proportion number is determined from the sorting, and the similarity of the candidate data object at that rank position is used as the similarity threshold corresponding to the service scene.
Optionally, the first proportion number is the number obtained by taking a first proportion of the number of data objects associated with the service scene as expressed in the cognitive map. For a clearer explanation, as shown in the example of FIG. 4, let the number of candidate data objects of a service scene be M. After the candidate data objects are sorted in descending order of similarity, the candidate data object at rank position j is the j-th candidate data object in the sorting, and the value of j is the number of candidate data objects ranked from the first position through position j. Based on the association between the service scene and data objects expressed in the cognitive map, the number of data objects associated with the service scene in the cognitive map can be determined, and taking the first proportion of that number gives the first proportion number. If the value of rank position j at least matches the first proportion number (for example, j is not less than the first proportion number, or j is the rank position closest to the first proportion number in the sorting), the similarity of the candidate data object at rank position j may be used as the similarity threshold corresponding to the service scene.
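The heuristic can be sketched as follows, taking the reading in which j is the smallest position in the sorting that is not less than the first proportion number. The function name, the sample similarities and the proportion value 1.5 are hypothetical; the source does not state what value the first proportion takes.

```python
import math


def similarity_threshold(similarities, n_map_objects, first_proportion):
    """Pick the similarity threshold for one service scene.

    similarities: similarity of the scene to each of its M candidates.
    n_map_objects: number of data objects associated with the scene in
                   the cognitive map.
    first_proportion: proportion applied to n_map_objects (assumed value).
    """
    if not similarities:
        raise ValueError("the service scene has no candidate data objects")
    ranked = sorted(similarities, reverse=True)  # descending similarity
    first_proportion_number = first_proportion * n_map_objects
    # Smallest position j (1-based) that is not less than the first
    # proportion number, clamped to the valid range 1..M.
    j = min(max(1, math.ceil(first_proportion_number)), len(ranked))
    return ranked[j - 1]


sims = [0.91, 0.40, 0.75, 0.88, 0.62, 0.55]
threshold = similarity_threshold(sims, n_map_objects=2, first_proportion=1.5)
```

Here the first proportion number is 3, so the threshold is the similarity of the third-ranked candidate, 0.75, and the three candidates with similarity of at least 0.75 would be kept as target data objects.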
Step S230, based on the similarity threshold, determining a target data object associated with the service scene, of which the similarity is not less than the similarity threshold, from the candidate data objects.
After the similarity threshold of the service scene is obtained, the data object with the similarity not less than the similarity threshold can be selected from the candidate data objects based on the similarity threshold, and the data object is used as the target data object associated with the service scene.
Based on the flow shown in fig. 3, in the embodiment of the present application, the target data object associated with the service scene may be selected from the candidate data objects in the object category associated with the service scene, so as to expand the number of the target data objects associated with the service scene, and provide a basis for expanding the entity object group associated with the service scene.
Optionally, as one implementation of determining the similarity between a service scene and a candidate data object in the flow illustrated in fig. 3, the embodiment of the present application may obtain vectorized representations of the service scene and the candidate data object, and then calculate the similarity between the service scene vector (obtained by vectorizing the service scene) and the data object vector (obtained by vectorizing the candidate data object) to obtain the similarity between the service scene and the candidate data object. Based on this, in an alternative implementation, fig. 5 shows an alternative flow of vectorizing service scenes and data objects; as shown in fig. 5, the flow may include:
Step S300, based on the association relationships between a plurality of service scenes and data objects expressed by the cognitive map, setting each service scene and each data object as a node, so as to construct an association graph of the service scenes and the data objects.
The cognitive map can express the association relationships between a plurality of service scenes and data objects; that is, the cognitive map can express the data objects respectively associated with each of a plurality of service scenes.
In an optional implementation, the embodiment of the present application may take each service scene expressed in the cognitive map as a top node and each data object expressed in the cognitive map as a child node, and connect the associated top nodes and child nodes based on the association relationships between service scenes and data objects expressed in the cognitive map, so as to obtain the association graph. For example, as shown in fig. 6, suppose the cognitive map expresses the data objects respectively associated with service scene 1 and service scene 2, where service scene 1 is associated with data objects 1 and 2, and service scene 2 is associated with data objects 2 and 3. In this embodiment of the present application, service scene 1 and service scene 2 are each top nodes, and data objects 1, 2 and 3 are each child nodes; based on the association relationship between service scene 1 and data objects 1 and 2, the top node of service scene 1 is connected to the child nodes of data objects 1 and 2, and based on the association relationship between service scene 2 and data objects 2 and 3, the top node of service scene 2 is connected to the child nodes of data objects 2 and 3, thereby constructing the association graph.
Optionally, the association graph may be in the form of a heterogeneous graph, a bipartite graph, or the like, and the embodiments of the present application are not limited.
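As an illustration of Step S300, the association graph of the fig. 6 example can be sketched as an undirected bipartite adjacency structure. The function name `build_association_graph` and the tuple-based node labels are hypothetical choices made for this sketch; the embodiment does not prescribe a data structure.

```python
from collections import defaultdict


def build_association_graph(scene_to_objects):
    """Build an undirected bipartite adjacency map from scene -> data objects.

    Nodes are labelled ("scene", id) or ("object", id) so that the two node
    types cannot collide. Scenes without any data object do not appear.
    """
    graph = defaultdict(set)
    for scene, objects in scene_to_objects.items():
        for obj in objects:
            # Connect the top node (scene) with the child node (data object)
            # in both directions, since the graph is undirected.
            graph[("scene", scene)].add(("object", obj))
            graph[("object", obj)].add(("scene", scene))
    # Sorted neighbour lists give a deterministic, indexable structure.
    return {node: sorted(neighbours) for node, neighbours in graph.items()}


# The fig. 6 example: scene 1 -> objects 1, 2; scene 2 -> objects 2, 3.
association_graph = build_association_graph({1: [1, 2], 2: [2, 3]})
for node, neighbours in sorted(association_graph.items()):
    print(node, "->", neighbours)
```

Data object 2 ends up connected to both service scenes, which is what later allows a random walk to cross from one service scene to the other.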
Step S310, performing random walks on the association graph with any node as a starting point, and sampling the paths of the random walks to obtain sampling documents, wherein one sampling document comprises the nodes through which one path passes, and a plurality of sampling documents form a sampling document set.
After the association graph is obtained, for any node in the association graph, the embodiment of the present application may take the node as the starting point of a path and perform a random walk, and sample the path of the random walk to obtain a sampling document. One sampling document may include the nodes through which one random-walk path passes; correspondingly, sampling a plurality of random-walk paths yields a plurality of sampling documents, which may form a sampling document set.
For example, as shown in fig. 7, a random walk starting from the node of data object 1 may sample the path: data object 1 → service scene 1 → data object 2; the sampling document of this path may include the nodes that the path passes through, i.e., the node of data object 1, the node of service scene 1, and the node of data object 2. A random walk starting from the node of service scene 2 may sample the path: service scene 2 → data object 2 → service scene 1; the sampling document of this path may include the node of service scene 2, the node of data object 2, and the node of service scene 1. A random walk starting from the node of data object 3 may sample the path: data object 3 → service scene 2 → data object 2; the sampling document of this path may include the node of data object 3, the node of service scene 2, and the node of data object 2. Of course, the sampled paths above are only exemplary; performing random walks with different nodes in the association graph as starting points can sample different paths, and the sampling documents of the different paths form the sampling document set.
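Step S310 can be sketched with a uniform random walk over the fig. 6 graph. The walk length, the number of walks per node, the fixed seed, and the string node names are all assumptions made for illustration; the embodiment does not specify them.

```python
import random

# Adjacency lists of the fig. 6 association graph (node names are illustrative).
GRAPH = {
    "scene_1": ["object_1", "object_2"],
    "scene_2": ["object_2", "object_3"],
    "object_1": ["scene_1"],
    "object_2": ["scene_1", "scene_2"],
    "object_3": ["scene_2"],
}


def sample_walks(graph, walk_length=3, walks_per_node=2, seed=0):
    """Return a sampling document set: one list of visited nodes per walk."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    documents = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            path = [start]
            while len(path) < walk_length:
                neighbours = graph[path[-1]]
                if not neighbours:
                    break  # dead end: stop this walk early
                # Uniformly choose the next node among the current neighbours.
                path.append(rng.choice(neighbours))
            documents.append(path)
    return documents


documents = sample_walks(GRAPH)
for doc in documents:
    print(" -> ".join(doc))
```

Because the graph is bipartite, every sampled path alternates between service scene nodes and data object nodes, as in the paths of fig. 7.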
Step S320, performing vectorization training on each service scene and each data object based on the sampling document set.
The sampling document set comprises sampling documents corresponding to a plurality of paths which are randomly walked, the sampling documents comprise nodes through which the paths pass, and service scenes and data objects in the association graph are all represented by the nodes, so that the sequence of the nodes included in the sampling documents can reflect the relation between the service scenes represented by the nodes and the data objects.
In an optional implementation, based on the sampling document set, the embodiment of the present application may use a skip-gram algorithm to perform vectorization training on each service scene and each data object. It should be noted that skip-gram is a word-embedding model algorithm that, given a word, predicts the words surrounding it in a text sequence, thereby vectorizing words. Since the textual descriptions of service scenes and data objects can be treated as words, the skip-gram algorithm can be used to perform vectorization training on each service scene and each data object.
For example, as shown in fig. 7 and fig. 8, building on fig. 7, the order of the nodes in a sampling document reflects the relationship between the service scenes and data objects that the nodes represent. For any node in the sampling document set, the embodiment of the present application may use the skip-gram algorithm to generate the nodes surrounding that node in the sampling document set, thereby vectorizing the node (a node may represent either a service scene or a data object). As shown in fig. 8, the nodes surrounding the node of service scene 1 in the sampling document set are data object 1 and data object 2, and the node surrounding the node of data object 2 in the sampling document set is service scene 1.
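The notion of "surrounding nodes" that skip-gram training consumes can be made concrete by enumerating (center, context) pairs from the sampling documents. The sketch below only generates these training pairs for the three example paths of fig. 7; the window size of 1 and the function name `skipgram_pairs` are assumptions, and the actual embedding training (for example with an off-the-shelf word2vec implementation) is deliberately left out.

```python
from collections import defaultdict


def skipgram_pairs(documents, window=1):
    """Enumerate (center, context) pairs within `window` positions of each node."""
    pairs = []
    for doc in documents:
        for i, center in enumerate(doc):
            lo = max(0, i - window)
            hi = min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a node is not its own context
                    pairs.append((center, doc[j]))
    return pairs


# The three example paths described for fig. 7.
fig7_documents = [
    ["object_1", "scene_1", "object_2"],
    ["scene_2", "object_2", "scene_1"],
    ["object_3", "scene_2", "object_2"],
]

# Group the context nodes observed around each center node.
contexts = defaultdict(set)
for center, context in skipgram_pairs(fig7_documents):
    contexts[center].add(context)

print(sorted(contexts["scene_1"]))
```

With this window, the contexts collected for service scene 1 are data object 1 and data object 2, in line with the fig. 8 description; a skip-gram model would then be trained to predict such context nodes from the center node.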
Step S330, obtaining, after the training converges, a service scene vector corresponding to each service scene and a data object vector corresponding to each data object.
Convergence conditions may be defined for the vectorization training of the service scenes and the data objects; after the vectorization training of the service scenes and the data objects satisfies the convergence conditions, the service scene vector corresponding to each service scene and the data object vector corresponding to each data object can be obtained.
In an alternative implementation, the process illustrated in fig. 5 may vectorize the service scenes and the data objects based on unsupervised graph representation learning, thereby providing a basis for recalling data objects for a service scene. In an example, the process of vectorizing service scenes and data objects may be as shown in fig. 9: first, the association relationships between a plurality of service scenes and data objects expressed in the cognitive map are represented by an association graph; then, random walks are performed over the nodes of the association graph, and the paths obtained by the walks are sampled to obtain a sampling document set; next, a skip-gram algorithm is used to perform vectorization training on each node in the sampling document set; after the training converges, the learned vectorized representation of each node is obtained, that is, the service scene vector corresponding to each service scene and the data object vector corresponding to each data object.
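Once the vectors are available, the similarity between a service scene vector and a data object vector used in the flow of fig. 3 can be computed with any vector similarity measure. The source does not name the measure; cosine similarity is used below purely as an assumed, commonly used choice, and the example vectors are made up.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_u * norm_v)


# Hypothetical learned vectors for one service scene and two candidate objects.
scene_vector = [1.0, 0.0]
candidate_vectors = {"obj_a": [2.0, 0.0], "obj_b": [0.0, 3.0]}

similarities = {
    name: cosine_similarity(scene_vector, vec)
    for name, vec in candidate_vectors.items()
}
print(similarities)
```

The resulting dictionary has the same shape as the per-candidate similarities that the threshold-based selection of Steps S220 and S230 operates on.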
In an optional implementation of determining the correlation between the entity object and the service scene in the process shown in fig. 2, the embodiment of the present application may first determine the correlation between the entity object and each target data object associated with the service scene, and then determine the correlation between the entity object and the service scene based on the correlation between the entity object and each target data object. In an alternative implementation, if there are multiple target data objects, fig. 10 illustrates an alternative process for determining the correlation between the entity object and the service scene provided in the embodiment of the present application; as shown in fig. 10, the process may include:
Step S400, determining the correlation between the entity object and each target data object based on the behavior data of the entity object for each target data object in a historical time period.
Optionally, for any target data object associated with the service scene, the embodiment of the present application may obtain, based on the behavior data of the entity object for the target data object in a historical time period, the behavior parameters corresponding to the various behavior types of the entity object for the target data object. It should be noted that the entity object may exhibit multiple types of behavior toward a target data object in the historical time period, and the behavior parameter corresponding to a behavior type represents the degree of the entity object's behavior of that type toward the target data object. For example, the behavior types may include browsing, purchasing, searching, and the like; the browsing behavior parameter may then be the number of browses, the browsing time, and the like, the purchasing behavior parameter may be the number of purchases, the purchase cost, and the like, and the searching behavior parameter may be the number of searches, the searching time, and the like. Therefore, for any target data object, the embodiment of the present application may combine each behavior parameter of the entity object for the target data object with the behavior weight of the corresponding behavior type, and then accumulate the combination results to obtain the correlation of the entity object to the target data object.
Optionally, the correlation between the entity object and one target data object may be calculated by the following formula:
$$\mathrm{score}(u,i)=\sum_{a\in A}\mathrm{type}^{(a)}_{(u,i)}\cdot\mathrm{weight}_{a}$$
wherein u represents entity object u, i represents target data object i, score(u, i) represents the correlation of entity object u to target data object i, a represents the a-th behavior type, A is the set of all behavior types, $\mathrm{type}^{(a)}_{(u,i)}$ represents the behavior parameter of the a-th behavior type of entity object u for target data object i, and $\mathrm{weight}_{a}$ represents the behavior weight of the a-th behavior type. It can be seen that, when determining the correlation of the entity object to any target data object, the embodiment of the present application may multiply the behavior parameter of each behavior type of the entity object for the target data object by the behavior weight of that behavior type to obtain the multiplication result for that behavior type, and then accumulate the multiplication results over the various behavior types to obtain the correlation of the entity object to the target data object.
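The weighted accumulation over behavior types described above can be sketched in a few lines. The behavior names, counts, and weights below are hypothetical values chosen for the example, as is the decision to give unlisted behavior types a weight of zero.

```python
def object_correlation(behavior_params, behavior_weights):
    """Correlation of one entity object to one target data object.

    behavior_params: dict behavior type -> behavior parameter (e.g. a count).
    behavior_weights: dict behavior type -> behavior weight.
    Behavior types without a configured weight contribute nothing
    (an assumption of this sketch).
    """
    return sum(
        value * behavior_weights.get(behavior, 0.0)
        for behavior, value in behavior_params.items()
    )


# Hypothetical behavior counts of one user toward one data object.
params = {"browse": 4, "purchase": 1, "search": 2}
# Hypothetical weights: a purchase counts more than a search or a browse.
weights = {"browse": 1.0, "purchase": 5.0, "search": 2.0}

score_u_i = object_correlation(params, weights)
print(score_u_i)
```

How the weights are chosen is not specified in the source; in practice they would presumably reflect how strongly each behavior type signals interest.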
Step S410, accumulating the correlations between the entity object and each target data object to obtain the correlation between the entity object and the service scene.
After the correlation between the entity object and each target data object associated with the service scene is obtained, the correlation between the entity object and each target data object associated with the service scene can be accumulated, so that the correlation between the entity object and the service scene is obtained.
Optionally, the following formula may be used to obtain the correlation between the entity object and the service scene:
$$\mathrm{score}(u,s)=\sum_{i\in s}\mathrm{score}(u,i)$$
where s denotes a service scenario s.
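Step S410, together with the subsequent comparison against a preset correlation threshold, can be sketched as follows. The entity identifiers, per-object scores, and threshold value are hypothetical, and the function names are illustrative only.

```python
def scene_correlation(object_scores):
    """Accumulate an entity's correlations over all target data objects of a scene."""
    return sum(object_scores.values())


def entity_group(scores_by_entity, correlation_threshold):
    """Entities whose scene correlation is not less than the threshold."""
    return sorted(
        entity
        for entity, object_scores in scores_by_entity.items()
        if scene_correlation(object_scores) >= correlation_threshold
    )


# Hypothetical per-object correlations of two users for one service scene.
scores_by_entity = {
    "user_1": {"obj_a": 13.0, "obj_b": 2.0},
    "user_2": {"obj_a": 1.0},
}

group = entity_group(scores_by_entity, correlation_threshold=10.0)
print(group)
```

Because the scene-level score is a plain sum, enlarging the set of target data objects can only keep or raise an entity's score when all per-object scores are non-negative, which is consistent with the group expansion effect the embodiment aims for.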
In the embodiment of the present application, the data objects that the cognitive map expresses as associated with the service scene are not used directly; instead, the target data objects associated with the service scene are obtained from the data objects under the object categories associated with the service scene. The data objects associated with the service scene are thereby re-determined, which can greatly increase the number of target data objects associated with the service scene.
In an experimental comparison, fig. 11 shows an example comparing the number of data objects associated with a service scene before and after the data objects associated with the service scene are re-determined. It can be seen that the number of data objects associated with the service scene as re-determined in the embodiment of the present application is nearly double the number of data objects associated with the service scene as expressed in the cognitive map. In a further experimental comparison, fig. 12 shows that the number of data objects increases in all service scenes, and the overall distribution shifts to the right.
In a further experimental comparison, taking the case where the entity object is a user as an example, Table 1 below compares, for the two service scenes "scene spring festival wine" and "winter body warming wine", the number of data objects originally expressed by the cognitive map with the number of data objects re-determined in the embodiment of the present application, together with the correspondingly determined user group sizes. It can be seen that the number of data objects associated with the service scene and the size of the user group associated with the service scene determined in the embodiment of the present application both roughly double.
Figure BDA0002706814810000151
TABLE 1
Therefore, the embodiment of the application has obvious effects on enlarging the group scale of the entity object associated with the service scene and improving the comprehensiveness of group determination.
Although the embodiments of the present application provide multiple implementations above, the alternatives described in the various implementations can, where no conflict arises, be combined with and cross-referenced to one another to extend into further possible implementations, all of which can be regarded as embodiments disclosed by the present application.
The entity object determination apparatus provided in the embodiment of the present application is introduced below. The apparatus described below may be regarded as the functional modules that a server needs to be provided with in order to implement the entity object determination method provided in the embodiment of the present application. The contents of the apparatus described below and the contents of the method described above may be cross-referenced.
In an alternative implementation, fig. 13 shows an alternative block diagram of an entity object determination apparatus provided in an embodiment of the present application, and as shown in fig. 13, the apparatus may include:
a target data object obtaining module 100, configured to obtain a target data object associated with a service scene based on a data object in an object category associated with the service scene;
a correlation determination module 110, configured to determine a correlation between an entity object and the service scene according to historical interaction behavior data of the entity object and the target data object;
a group determining module 120, configured to determine, as the entity object associated with the service scene, the entity object whose correlation with the service scene is not less than a preset correlation threshold, where the entity object associated with the service scene forms an entity object group of the service scene.
Optionally, there are a plurality of target data objects; the correlation determination module 110 being configured to determine, according to historical interaction behavior data of the entity object and the target data object, a correlation between the entity object and the service scene includes:
determining the correlation between the entity object and each target data object based on the behavior data of the entity object aiming at each target data object in the historical time period;
and determining the correlation between the entity object and the service scene according to the correlation between the entity object and each target data object.
Optionally, the correlation determination module 110 being configured to determine, based on the behavior data of the entity object for each target data object in the historical time period, the correlation between the entity object and each target data object includes:
aiming at any target data object, acquiring behavior parameters corresponding to various behavior types of the entity object aiming at the target data object based on behavior data of the entity object aiming at the target data object in a historical time period; and respectively combining the behavior parameters of the entity object corresponding to various behavior types of the target data object with the behavior weights corresponding to the behavior types, and accumulating the combination results to obtain the correlation of the entity object to the target data object.
Optionally, the correlation determination module 110 being configured to determine, according to the correlation between the entity object and each target data object, the correlation between the entity object and the service scene includes:
and accumulating the correlation between the entity object and each target data object to obtain the correlation between the entity object and the service scene.
Optionally, the target data object obtaining module 100 is configured to obtain, based on a data object in an object category associated with a service scene, a target data object associated with the service scene, where the target data object includes:
determining candidate data objects under the object categories related to the service scene based on the object categories related to the service scene and the category attributes of the data objects expressed in the cognitive map;
determining similarity between the service scene and the candidate data object;
and selecting a target data object associated with the service scene from the candidate data objects according to the similarity between the service scene and the candidate data objects.
Optionally, the target data object obtaining module 100, configured to determine the similarity between the service scene and the candidate data object, includes:
acquiring a service scene vector corresponding to the service scene and a data object vector corresponding to the candidate data object;
and calculating the similarity between the service scene vector and the data object vector to obtain the similarity between the service scene and the candidate data object.
Optionally, the target data object obtaining module 100 is configured to select, according to the similarity between the service scene and the candidate data objects, a target data object associated with the service scene from the candidate data objects, where the target data object includes:
determining a similarity threshold corresponding to the service scene based on the similarity between the service scene and the candidate data object;
and determining target data objects which are associated with the business scene and have the similarity not less than the similarity threshold value from the candidate data objects based on the similarity threshold value.
Optionally, the target data object obtaining module 100 is configured to determine, based on the similarity between the service scene and the candidate data object, that the similarity threshold corresponding to the service scene includes:
sorting the candidate data objects by similarity, determining from the ranking the ordinal whose value at least matches the first proportion number, and taking the similarity of the candidate data object corresponding to that ordinal as the similarity threshold corresponding to the service scene.
Optionally, the first proportion number corresponds to a first proportion of the number of data objects associated with the service scene as expressed in the cognitive map.
Optionally, further, fig. 14 shows another optional block diagram of the entity object determining apparatus provided in the embodiment of the present application, and as shown in fig. 13 and fig. 14, the apparatus may further include:
the vectorization module 130 is configured to set each service scene and each data object as a node respectively based on an association relationship between a plurality of service scenes and data objects expressed by the cognitive map, so as to construct an association map between the service scenes and the data objects; carrying out random walk on the association diagram by taking any node as a starting point, and sampling a path of the random walk to obtain sampling documents, wherein one sampling document comprises a node through which the path passes, and a plurality of sampling documents form a sampling document set; based on the sampling document set, performing vectorization training on each service scene and each data object; and after the training is converged, obtaining a service scene vector corresponding to each service scene and a data object vector corresponding to each data object.
Optionally, the vectorization module 130 is configured to set each service scene and each data object as a node based on an association relationship between a plurality of service scenes and data objects expressed by the cognitive map, so as to construct an association map between a service scene and a data object, where the association map includes:
and connecting the related top nodes and the sub-nodes based on the association relationship between the service scene and the data object expressed by the cognitive map to obtain the association map.
Optionally, the vectorization module 130 is configured to perform vectorization training on each service scene and each data object based on the sampling document set, where the vectorization training includes:
performing, based on the sampling document set, vectorization training on each service scene and each data object by using a skip-gram algorithm.
The embodiment of the present application further provides a server, where the server may be loaded with the entity object determining apparatus, so as to implement the entity object determining method provided in the embodiment of the present application. In an alternative implementation, fig. 15 shows an alternative block diagram of a server, which, as shown in fig. 15, may comprise: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the application, the number of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 is at least one, and the processor 1, the communication interface 2 and the memory 3 complete mutual communication through the communication bus 4;
optionally, the communication interface 2 may be an interface of a communication module for performing network communication;
Optionally, the processor 1 may be a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), an NPU (embedded neural network processor), an FPGA (Field Programmable Gate Array), a TPU (Tensor Processing Unit), an AI chip, an ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement the embodiments of the present application.
The memory 3 may comprise a high-speed RAM memory and may also comprise a non-volatile memory, such as at least one disk memory.
The memory 3 stores one or more computer-executable instructions, and the processor 1 calls the one or more computer-executable instructions to execute the entity object determination method provided in the embodiment of the present application.
The embodiment of the present application further provides a storage medium, where the storage medium stores one or more computer-executable instructions, and the one or more computer-executable instructions are used to execute the entity object determination method provided in the embodiment of the present application.
Although the embodiments of the present application are disclosed above, the present application is not limited thereto. Various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present disclosure, and it is intended that the scope of the present disclosure be defined by the appended claims.

Claims (15)

1. An entity object determination method, comprising:
acquiring a target data object associated with a service scene based on a data object in an object class associated with the service scene;
determining the correlation between the entity object and the service scene according to the historical interactive behavior data of the entity object and the target data object;
and determining the entity objects of which the correlation with the service scene is not less than a preset correlation threshold value as the entity objects associated with the service scene, wherein the entity objects associated with the service scene form an entity object group of the service scene.
2. The entity object determination method according to claim 1, wherein there are a plurality of target data objects; the determining the correlation between the entity object and the service scene according to the historical interaction behavior data of the entity object and the target data object comprises:
determining the correlation between the entity object and each target data object based on the behavior data of the entity object aiming at each target data object in the historical time period;
and determining the correlation between the entity object and the service scene according to the correlation between the entity object and each target data object.
3. The entity object determination method according to claim 2, wherein the determining of the correlation of the entity object with each target data object based on the behavior data of the entity object for each target data object in the historical time period comprises:
aiming at any target data object, acquiring behavior parameters corresponding to various behavior types of the entity object aiming at the target data object based on behavior data of the entity object aiming at the target data object in a historical time period; and respectively combining the behavior parameters of the entity object corresponding to various behavior types of the target data object with the behavior weights corresponding to the behavior types, and accumulating the combination results to obtain the correlation of the entity object to the target data object.
4. The entity object determination method according to claim 2, wherein the determining the correlation of the entity object with the service scenario according to the correlation of the entity object with each target data object comprises:
and accumulating the correlation between the entity object and each target data object to obtain the correlation between the entity object and the service scene.
5. The entity object determination method according to claim 1, wherein the obtaining the target data object associated with the service scenario based on the data object under the object category associated with the service scenario comprises:
determining candidate data objects under the object categories related to the service scene based on the object categories related to the service scene and the category attributes of the data objects expressed in the cognitive map;
determining similarity between the service scene and the candidate data object;
and selecting a target data object associated with the service scene from the candidate data objects according to the similarity between the service scene and the candidate data objects.
6. The entity object determination method of claim 5, wherein the determining the similarity of the business scenario and the candidate data object comprises:
acquiring a service scene vector corresponding to the service scene and a data object vector corresponding to the candidate data object;
and calculating the similarity between the service scene vector and the data object vector to obtain the similarity between the service scene and the candidate data object.
7. The entity object determination method according to claim 6, wherein the selecting, from the candidate data objects, the target data object associated with the business scenario according to the similarity between the business scenario and the candidate data objects comprises:
determining a similarity threshold corresponding to the service scene based on the similarity between the service scene and the candidate data object;
and determining target data objects which are associated with the business scene and have the similarity not less than the similarity threshold value from the candidate data objects based on the similarity threshold value.
8. The entity object determination method of claim 7, wherein the determining a similarity threshold corresponding to the business scenario based on the similarity of the business scenario and the candidate data object comprises:
sorting the candidate data objects by similarity, determining from the ranking the ordinal whose value at least matches the first proportion number, and taking the similarity of the candidate data object corresponding to that ordinal as the similarity threshold corresponding to the service scene.
9. The entity object determination method of claim 8, wherein the first proportion number corresponds to a first proportion of the number of data objects associated with the service scene as expressed in a cognitive map.
10. The entity object determination method according to claim 6, further comprising:
based on the association relationships between a plurality of service scenes and data objects expressed by the cognitive map, setting each service scene and each data object as a node, so as to construct an association graph of the service scenes and the data objects;
performing random walks on the association graph with any node as a starting point, and sampling the paths of the random walks to obtain sampling documents, wherein one sampling document comprises the nodes through which one path passes, and a plurality of sampling documents form a sampling document set;
based on the sampling document set, performing vectorization training on each service scene and each data object;
and after the training is converged, obtaining a service scene vector corresponding to each service scene and a data object vector corresponding to each data object.
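The graph construction and random-walk sampling steps of claim 10 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the walks are uniform over neighbours, every node is used as a starting point a fixed number of times, and the node labels, `walks_per_node`, and `walk_length` are hypothetical parameters chosen for the example.

```python
import random
from collections import defaultdict


def build_association_graph(associations):
    """Build an undirected adjacency map from (scenario, data_object) pairs."""
    graph = defaultdict(set)
    for scenario, data_object in associations:
        graph[scenario].add(data_object)
        graph[data_object].add(scenario)
    return graph


def sample_documents(graph, walks_per_node, walk_length, seed=0):
    """Random-walk from every node; each walk's node sequence is one document."""
    rng = random.Random(seed)
    documents = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            path = [start]
            while len(path) < walk_length:
                neighbors = sorted(graph[path[-1]])
                if not neighbors:
                    break
                # Uniform choice among neighbours (an assumption of this sketch).
                path.append(rng.choice(neighbors))
            documents.append(path)
    return documents


# Hypothetical associations as a cognitive map might express them.
associations = [
    ("scene:camping", "obj:tent"),
    ("scene:camping", "obj:stove"),
    ("scene:hiking", "obj:tent"),
    ("scene:hiking", "obj:boots"),
]
graph = build_association_graph(associations)
documents = sample_documents(graph, walks_per_node=2, walk_length=5)
```

Because edges only connect business scenarios to data objects, each sampled document alternates between the two node types, which is what lets a later embedding step place related scenarios and objects near each other.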
11. The entity object determination method according to claim 10, wherein the constructing an association graph of business scenarios and data objects by taking each business scenario and each data object as a node, based on the association relationships between a plurality of business scenarios and data objects expressed by the cognitive map, comprises:
connecting associated top nodes and sub-nodes based on the association relationships between business scenarios and data objects expressed by the cognitive map, to obtain the association graph.
12. The entity object determination method of claim 10, wherein the performing vectorization training on each business scenario and each data object based on the sampled document set comprises:
performing vectorization training on each business scenario and each data object based on the sampled document set by using a skip-gram model algorithm.
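As a rough illustration of what a skip-gram model consumes in claim 12, the sketch below turns sampled documents into (center, context) training pairs. It shows only the pair-generation step; the embedding training itself is omitted. The window size, the node labels, and the function name are assumptions made for the example.

```python
def skipgram_pairs(documents, window):
    """Generate (center, context) pairs from node sequences.

    Each node in a document is paired with every other node that lies
    within `window` positions of it, which is the standard skip-gram setup.
    """
    pairs = []
    for doc in documents:
        for i, center in enumerate(doc):
            lo = max(0, i - window)
            hi = min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, doc[j]))
    return pairs


# One hypothetical sampled document from a random walk.
docs = [["scene:camping", "obj:tent", "scene:hiking"]]
pairs = skipgram_pairs(docs, window=1)
```

A skip-gram trainer would then learn one vector per node by predicting the context node from the center node over all such pairs, yielding the business scenario vectors and data object vectors once training converges.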
13. An entity object determination apparatus, comprising:
a target data object acquisition module, configured to acquire a target data object associated with a business scenario based on data objects in an object category associated with the business scenario;
a correlation determination module, configured to determine the correlation between an entity object and the business scenario according to historical interaction behavior data between the entity object and the target data object; and
a group determination module, configured to determine entity objects whose correlation with the business scenario is not less than a preset correlation threshold as entity objects associated with the business scenario, wherein the entity objects associated with the business scenario form an entity object group of the business scenario.
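The correlation determination and group determination modules of claim 13 can be sketched as two small functions. The scoring rule here is an assumption made purely for illustration: correlation is taken as the share of an entity object's historical interactions that fall on target data objects. The claim does not specify how correlation is computed, and the names and sample data are hypothetical.

```python
def correlation(interactions, target_objects):
    """Share of an entity's interaction count that falls on target data objects.

    Assumed scoring rule; the claim leaves the actual computation open.
    """
    total = sum(interactions.values())
    if total == 0:
        return 0.0
    hits = sum(n for obj, n in interactions.items() if obj in target_objects)
    return hits / total


def entity_group(history, target_objects, correlation_threshold):
    """Entities whose correlation is not less than the preset threshold."""
    return {
        entity
        for entity, interactions in history.items()
        if correlation(interactions, target_objects) >= correlation_threshold
    }


# Hypothetical interaction counts per entity object (e.g. clicks or purchases).
history = {
    "user_a": {"tent": 3, "stove": 1},
    "user_b": {"lipstick": 4, "tent": 1},
    "user_c": {},
}
group = entity_group(history, {"tent", "stove"}, correlation_threshold=0.5)
```

In this example only `user_a` reaches the threshold, so the entity object group for the scenario contains that single entity.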
14. A server, comprising: at least one memory and at least one processor; the memory stores one or more computer-executable instructions that are invoked by the processor to perform the entity object determination method of any of claims 1-12.
15. A storage medium, wherein the storage medium stores one or more computer-executable instructions for performing the entity object determination method of any one of claims 1-12.
CN202011041598.0A 2020-09-28 2020-09-28 Entity object determination method, device, server and storage medium Pending CN113297332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011041598.0A CN113297332A (en) 2020-09-28 2020-09-28 Entity object determination method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN113297332A true CN113297332A (en) 2021-08-24

Family

ID=77318313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011041598.0A Pending CN113297332A (en) 2020-09-28 2020-09-28 Entity object determination method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN113297332A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113821710A (en) * 2021-11-22 2021-12-21 中国信息通信研究院 Global search method, device, electronic equipment and computer storage medium
CN117648387A (en) * 2024-01-29 2024-03-05 杭银消费金融股份有限公司 Construction method of logic data section based on data entity
CN117648387B (en) * 2024-01-29 2024-05-07 杭银消费金融股份有限公司 Construction method of logic data section based on data entity

Similar Documents

Publication Publication Date Title
WO2020048084A1 (en) Resource recommendation method and apparatus, computer device, and computer-readable storage medium
CN109858040B (en) Named entity identification method and device and computer equipment
US8001001B2 (en) System and method using sampling for allocating web page placements in online publishing of content
EP1029304A1 (en) System and method for dynamic profiling of users in one-to-one applications and for validating user rules
CN112749281B (en) Restful type Web service clustering method fusing service cooperation relationship
CN103678672A (en) Method for recommending information
CN105740268A (en) Information pushing method and apparatus
JP2019164402A (en) Information processing device, information processing method, and program
CN111539197A (en) Text matching method and device, computer system and readable storage medium
CN110955831B (en) Article recommendation method and device, computer equipment and storage medium
CN113297332A (en) Entity object determination method, device, server and storage medium
KR102412158B1 (en) Keyword extraction and analysis method to expand market share in the open market
CN112685635A (en) Item recommendation method, device, server and storage medium based on classification label
CN111870959A (en) Resource recommendation method and device in game
CN111538909A (en) Information recommendation method and device
CN112215629B (en) Multi-target advertisement generating system and method based on construction countermeasure sample
CN113204643B (en) Entity alignment method, device, equipment and medium
CN117495485A (en) Product recommendation method, device and readable storage medium
CN112560105B (en) Joint modeling method and device for protecting multi-party data privacy
CN113327132A (en) Multimedia recommendation method, device, equipment and storage medium
CN115641179A (en) Information pushing method and device and electronic equipment
CN116823410A (en) Data processing method, object processing method, recommending method and computing device
KR100878157B1 (en) Method of Semantic Web Service Discovery using Process-based Ontology
CN111651456B (en) Potential user determination method, service pushing method and device
CN115063858A (en) Video facial expression recognition model training method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination