CN115408620A - Method and device for determining data tag object - Google Patents


Info

Publication number
CN115408620A
Authority
CN
China
Prior art keywords
data tag
objects
data
user
user object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210961685.0A
Other languages
Chinese (zh)
Inventor
李宇
郑晓菊
方宇洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method and a device for determining a data tag object. The method comprises the following steps: acquiring target user object information of a target user object; acquiring a plurality of first data tag objects based on the target user object information; determining a plurality of second data tag objects with the highest degree of association with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects; acquiring a target user object feature vector of the target user object, and acquiring a plurality of fourth data tag objects based on the target user object feature vector; and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects. By adopting the method, the accuracy of recommending the data tag object can be improved.

Description

Method and device for determining data tag object
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a method and an apparatus for determining a data tag object.
Background
With the rapid development of information technology, data tag objects are increasingly managed in a unified way by data management platforms, which helps enterprises make effective decisions. However, different users have different requirements for data tag objects, and how to determine which data tag objects to recommend to a given user has become an urgent problem.
In the conventional technology, data tag objects are recommended by directly counting the launch date of each data tag object and its historical access popularity, and then recommending the newer and more popular data tag objects to all users. Because every user receives the same recommendations regardless of individual requirements, the accuracy of the conventional recommendation method is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method and an apparatus for determining a data tag object, which can improve the recommendation accuracy of the data tag object.
In a first aspect, the present application provides a method of determining a data tag object. The method comprises the following steps:
acquiring target user object information of a target user object;
based on the target user object information, inquiring and acquiring a plurality of similar user objects similar to the target user object, and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
determining a plurality of second data tag objects with the highest association degree with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
acquiring a target user object feature vector of the target user object, inquiring a plurality of data label content vectors with the highest similarity to the target user object feature vector, and acquiring a plurality of fourth data label objects corresponding to the plurality of data label content vectors, wherein the target user object feature vector is determined based on a word vector of a data label object corresponding to the target user object;
and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
In one embodiment, the querying obtains a plurality of similar user objects similar to the target user object, including:
acquiring a target data tag vector corresponding to the target user object, and inquiring and acquiring a plurality of similar data tag vectors similar to the target data tag vector in a user object similarity matrix;
determining the user objects corresponding to the plurality of similar data tag vectors, the plurality of similar user objects including the user objects corresponding to the plurality of similar data tag vectors.
In one embodiment, the obtaining method of the user object similarity matrix includes:
acquiring data tag objects related to the user objects based on the scoring relation of the user objects to the data tag objects;
vectorizing the data tag object associated with each user object to obtain a data tag vector corresponding to each user object;
and obtaining the similarity matrix of the user objects according to the similarity of the data label vectors corresponding to the user objects.
In one embodiment, the obtaining the first data tag objects with the highest association degree with the similar user objects includes:
and querying and determining a plurality of first data tag objects with the highest scoring relation with the plurality of similar user objects in a data tag object scoring table, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
In one embodiment, the method for establishing the data tag object score table includes:
obtaining batch behavior data, wherein the batch behavior data comprises: behavior data of each user object in a preset time range to each data label object;
summarizing the batch behavior data with the user object as the primary key to obtain a summarized behavior data table;
and performing principal component analysis on the summarized behavior data table to obtain a data tag object scoring table, wherein the data tag object scoring table comprises scoring results of the user objects on the data tag objects.
In one embodiment, the obtaining a plurality of third data tag objects similar to the plurality of second data tag objects includes:
acquiring user vectors corresponding to the second data tag objects respectively, and inquiring and acquiring a plurality of similar user vectors similar to the user vectors in a data tag object similarity matrix;
determining data tag objects corresponding to the plurality of similar user vectors, and obtaining the plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
In one embodiment, the manner for acquiring the similarity matrix of the data tag object includes:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
vectorizing user object information of each user object of each data tag object respectively aiming at each data tag object to obtain a user vector corresponding to each data tag object;
and obtaining a data label object similarity matrix according to the similarity of the user vectors corresponding to the data label objects.
In one embodiment, the querying a plurality of data tag content vectors with the highest similarity to the feature vector of the target user object includes:
and querying and determining a plurality of data label content vectors similar to the target user object feature vector in the data label content similarity matrix.
In one embodiment, the obtaining method of the data tag content similarity matrix includes:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining the data label content similarity matrix according to the similarity of the word vectors of the data label objects.
In a second aspect, the present application further provides an apparatus for determining a data tag object. The device comprises:
the information acquisition module is used for acquiring target user object information of a target user object;
the first object acquisition module is used for inquiring and acquiring a plurality of similar user objects similar to the target user object based on the target user object information and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
the third object acquisition module is used for determining a plurality of second data tag objects with the highest degree of association with the target user object and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
a fourth object obtaining module, configured to obtain a target user object feature vector of the target user object, query a plurality of data tag content vectors with a highest similarity to the target user object feature vector, and obtain a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, where the target user object feature vector is determined based on a word vector of a data tag object corresponding to the target user object;
and the recommended object determining module is used for determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
In a third aspect, the application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method in any of the above embodiments when the processor executes the computer program.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any of the above embodiments.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method of any of the above embodiments.
According to the method and the device for determining a data tag object, target user object information of a target user object is acquired, and a plurality of first data tag objects are acquired based on the target user object information, the first data tag objects being those with the highest degree of association with user objects similar to the target user object; a plurality of second data tag objects with the highest degree of association with the target user object are determined, and a plurality of third data tag objects similar to the plurality of second data tag objects are acquired; a target user object feature vector of the target user object is acquired, and a plurality of fourth data tag objects whose data tag content vectors are most similar to the target user object feature vector are acquired; and a target data tag object to be recommended to the target user object is determined based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects. The prior art recommends data tag objects to user objects according to popularity and freshness alone, which yields low accuracy. In contrast, in the embodiments the candidates are drawn from the target user object information, from the second data tag objects with the highest degree of association with the target user object, and from the target user object feature vector, so that data tag objects can be recommended to the target user object more accurately, which solves the problem of inaccurate data tag object recommendation in the prior art.
Drawings
FIG. 1 is a diagram of an application environment of a method for determining a data tag object provided in an embodiment of the present application;
FIG. 2 is a flow chart illustrating a method for determining a data tag object provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram illustrating the determination of a first data tag object in one embodiment;
FIG. 4 is a schematic flowchart illustrating a manner of obtaining a user object similarity matrix according to an embodiment;
FIG. 5 is a flowchart illustrating a manner in which a rating table for data tag objects may be built in one embodiment;
FIG. 6 is a schematic flow diagram illustrating the determination of a third data tag object in one embodiment;
FIG. 7 is a schematic flow chart illustrating a manner in which a data tag object similarity matrix is obtained according to an embodiment;
FIG. 8 is a schematic flow diagram illustrating the determination of a fourth data tag object in one embodiment;
FIG. 9 is a flowchart illustrating a manner in which a content similarity matrix of data tags is obtained according to an embodiment;
FIG. 10 is a flowchart illustrating a method for recommending data tag objects in one embodiment;
FIG. 11 is a block diagram of an apparatus for determining a data tag object provided in an embodiment of the present application;
FIG. 12 is a diagram showing an internal structure of a computer device provided in an embodiment of the present application;
FIG. 13 is an internal structural view of another computer device provided in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The method for determining a data tag object provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104, or may be located on the cloud or on another network server. When a user logs in to the server 104 through the terminal 102, the data tag object recommendation process is triggered. The terminal 102 sends a service request, which may be a login request or another service processing request, to the server 104, and the server 104 determines the data tag objects to recommend to the terminal 102 based on that request. The terminal 102 then accesses the service content corresponding to the data tag objects, and the user may like, comment on or otherwise interact with the data tag objects on the terminal 102. The terminal 102 transmits the resulting user behavior data over the network to a processing server, which may be the server 104 or another server. The terminal 102 may be, but is not limited to, a personal computer, a notebook computer, a smart phone, a tablet computer or a portable wearable device such as a smart watch or a smart band. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
FIG. 2 is a schematic flowchart of a method for determining a data tag object provided in this embodiment. The method is described by taking its application to a terminal device or a server as an example, and includes the following steps:
S201, obtaining the target user object information of the target user object.
The target user object is the user object to which data tag objects are to be recommended. The target user object may be determined according to the actual technical scenario; for example, when a service request is received, the user object carried in the service request is determined as the target user object. In some embodiments, the target user object may be a user identifier carried in the service request.
The target user object information is information related to the target user object, such as any one or a combination of the name of the user object, the department to which the user object belongs, and the post to which the user object belongs.
When the target user object information of the target user object is acquired, the target user object information can be acquired in various possible ways. For example, the service request sent by the target user object carries target user object information. For another example, user object information is stored in the database, the server extracts a target user object (e.g., a user identifier) from the service request, and then searches for user object information corresponding to the target user object from the database to obtain the target user object information.
S202, based on the target user object information, querying to obtain a plurality of similar user objects similar to the target user object, and obtaining a plurality of first data tag objects with the highest degree of association with the plurality of similar user objects.
The similar user objects may be obtained in one or more different manners, for example by querying a user object information table or by querying a user object similarity matrix.
A data tag object is an object that can be recommended to a user object. Taking a technical scenario in the financial field as an example, the data tag object may be a financial product, loan information, or any other object that can be recommended to the user object, and is not limited herein.
The degree of association between a user object and a data tag object represents how strongly the two are related. In some embodiments, the degree of association may be derived from the scoring relationship of the user object to the data tag object.
The number of first data tag objects is not limited in the embodiments of the present application. For example, it may be a predetermined number of the data tag objects with the highest degree of association with the plurality of similar user objects, or a predetermined proportion of the data tag objects associated with the plurality of similar user objects, taken in descending order of degree of association.
S203, determining a plurality of second data tag objects with the highest degree of association with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects.
The third data tag objects may be determined from the similarity between data tag objects. The specific numbers of second data tag objects and third data tag objects are not limited in the embodiments of the present application.
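As an illustration of S203, the following is a minimal sketch of picking third data tag objects from a data tag object similarity matrix. The function name `similar_tags`, the parameter `k`, the rule of keeping the highest similarity to any second data tag object, and the exclusion of the second data tag objects themselves are assumptions made for the example and are not specified by the application.

```python
def similar_tags(tag_sim, tag_ids, second_tags, k=2):
    """Return up to k tags most similar to any of the second tags.

    tag_sim is a square matrix where tag_sim[i][j] is the similarity
    between tag_ids[i] and tag_ids[j] (hypothetical layout).
    """
    seeds = set(second_tags)
    best = {}
    for seed in second_tags:
        row = tag_sim[tag_ids.index(seed)]
        for tag, sim in zip(tag_ids, row):
            if tag not in seeds:  # assumption: do not return the seeds themselves
                best[tag] = max(best.get(tag, sim), sim)
    # Highest similarity first; ties are broken by tag id for determinism.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]
```

Other aggregation rules, such as summing the similarities to all second data tag objects, would fit the description equally well.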
S204, obtaining a target user object feature vector of the target user object, querying a plurality of data tag content vectors with the highest similarity to the target user object feature vector, and obtaining a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, wherein the target user object feature vector is determined based on word vectors of the data tag objects corresponding to the target user object.
The target user object feature vector is a feature vector that represents the data tag objects related to the target user object.
The word vector of the data tag object refers to a vector obtained by vectorizing the content of the data tag object, and is related to the content of the data tag object.
The specific number of the fourth data tag objects is not limited in the embodiment of the present application.
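The content-based step S204 can be sketched as follows. The application only states that the target user object feature vector is determined based on the word vectors of the user's data tag objects; averaging those word vectors, using cosine similarity, and skipping tags the user already has are assumptions of this sketch, as are all function and parameter names.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def user_feature_vector(user_tag_ids, tag_word_vectors):
    """Assumption: the user feature vector is the mean of the tag word vectors."""
    vectors = [tag_word_vectors[t] for t in user_tag_ids]
    dim = len(vectors[0])
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]

def fourth_tags(user_tag_ids, tag_word_vectors, k=2):
    """Return the k tags whose content vectors are closest to the user vector."""
    user_vec = user_feature_vector(user_tag_ids, tag_word_vectors)
    owned = set(user_tag_ids)
    scored = [(cosine(user_vec, vec), tag)
              for tag, vec in tag_word_vectors.items() if tag not in owned]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored[:k]]
```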
S205, determining a target data tag object to be recommended to a target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
The target data tag object may be determined from the plurality of first, third and fourth data tag objects in one or more different manners. For example, collaborative filtering may be performed on the first, third and fourth data tag objects to obtain the target data tag object to be recommended to the target user object. Alternatively, the first, third and fourth data tag objects may be sorted by scoring result from high to low, and a predetermined number of the top-ranked data tag objects may be taken as the target data tag objects recommended to the target user object.
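The sort-and-truncate option for S205 might look like the sketch below. It assumes that each candidate list holds `(tag_id, score)` pairs on a comparable scale and that a tag appearing in several lists keeps its highest score; neither assumption comes from the application, and `fuse_candidates` and `top_n` are hypothetical names.

```python
def fuse_candidates(first, third, fourth, top_n=3):
    """Merge three candidate lists of (tag_id, score) and keep the top_n tags."""
    best = {}
    for candidates in (first, third, fourth):
        for tag_id, score in candidates:
            # Assumption: a duplicated tag keeps its highest score.
            best[tag_id] = max(best.get(tag_id, score), score)
    # Sort scores from high to low; ties are broken by tag id.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag_id for tag_id, _ in ranked[:top_n]]
```

For example, `fuse_candidates([("t1", 0.9), ("t2", 0.4)], [("t2", 0.7), ("t3", 0.5)], [("t4", 0.8)])` ranks `t1`, `t4` and `t2` ahead of `t3`.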
This embodiment provides a method for determining a data tag object, which includes: acquiring target user object information of a target user object, and acquiring, based on the target user object information, a plurality of first data tag objects with the highest degree of association with user objects similar to the target user object; determining a plurality of second data tag objects with the highest degree of association with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects; acquiring a target user object feature vector of the target user object, and acquiring a plurality of fourth data tag objects whose data tag content vectors are most similar to the target user object feature vector; and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects. The prior art recommends data tag objects to user objects according to popularity and freshness alone, which yields low accuracy. In contrast, this embodiment considers the similarity between user objects, the degree of association between user objects and data tag objects, the degree of association between data tag objects, and the similarity between the data tag content vectors and the target user object feature vector determined from the word vectors of data tag objects. The data tag objects to be recommended are thus determined from multiple dimensions, so that more accurate recommendations can be made for the target user object, which solves the problem of inaccurate data tag object recommendation in the prior art.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating a process of determining a first data tag object in an embodiment, and an implementation manner of how to determine the first data tag object is provided in this embodiment. On the basis of the above embodiment, the above S202 includes the following contents:
S2021, obtaining a target data tag vector corresponding to the target user object, and querying the user object similarity matrix to obtain a plurality of similar data tag vectors similar to the target data tag vector.
In this embodiment, the target data tag vector corresponding to the target user object is a data tag vector obtained by vectorizing each data tag object with the highest association degree with the target user object.
In some embodiments, obtaining the target data tag vector corresponding to the target user object may include: acquiring the names of the data tag objects with a high degree of association with the target user object, and vectorizing those names to obtain the target data tag vector.
The user object similarity matrix is a matrix of the similarities between the data tag vectors of the user objects, and thereby represents the similarity between the user objects.
In some embodiments, referring to fig. 4, the obtaining manner of the user object similarity matrix includes the following steps S2121 to S2321.
Step S2121: acquiring data tag objects related to the user objects based on the scoring relation of the user objects to the data tag objects; the level of the association degree between each user object and each data tag object can be judged according to the scoring relationship between each user object and each data tag object.
The scoring relation of the user object to each data label object is used for representing the evaluation degree of the user object to each data label object.
In some embodiments, the scoring relationship of the user object to each data tag object may be obtained in the following manner.
First, the interaction data of the user object with each data tag object is obtained, which may include, for example, browsing time, number of accesses, access depth, liking or cancelling a like, favoriting or cancelling a favorite, and the number of bytes of comment content or data.
Then, a value is assigned to each item of acquired data, for example 4 for a like and -3 for cancelling a like, and the values are summed per data tag object to obtain the user's score value for each data tag object. The score value represents the scoring relationship of the user to the data tag object.
In some embodiments, the obtained score value may be directly used as the degree of association, and in some embodiments, the degree of association may also be obtained by further processing the obtained score value, for example, by performing normalization processing on each score value, or by performing other processing, which is not specifically limited in the embodiment of the present application.
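The scoring procedure just described can be sketched as follows. Only the values 4 for a like and -3 for cancelling a like come from the text; the other weights, the event tuple layout and the choice of min-max normalization are assumptions made for illustration.

```python
# Hypothetical weights: only "like" = 4 and "cancel_like" = -3 are from the
# text; "favorite" and "view" are illustrative assumptions.
BEHAVIOR_WEIGHTS = {"like": 4, "cancel_like": -3, "favorite": 3, "view": 1}

def score_interactions(events):
    """Sum weighted (user_id, tag_id, behavior) events into per-pair scores."""
    scores = {}
    for user_id, tag_id, behavior in events:
        key = (user_id, tag_id)
        scores[key] = scores.get(key, 0) + BEHAVIOR_WEIGHTS[behavior]
    return scores

def min_max_normalize(scores):
    """Rescale scores to [0, 1]; one possible normalization among many."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}
```

Either the raw sums or the normalized values could then serve as the degree of association.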
Step S2221: and vectorizing the data tag objects associated with the user objects respectively aiming at the user objects to obtain data tag vectors corresponding to the user objects.
As described above, one user object may be associated with multiple data tag objects. For example, a user object has n data tag objects A1, ..., An, each of which has a corresponding name. When the data tag objects associated with the user object are vectorized, the names of the n data tag objects may be vectorized to obtain the data tag vector of the user object.
The vectorization may be performed in any vectorization manner, and the embodiment of the present application is not particularly limited.
Step S2321: and obtaining a user object similarity matrix according to the similarity of the data label vectors corresponding to the user objects.
The similarity represents the similarity between the data tag vectors, and in some embodiments, the similarity between the data tag vectors corresponding to the user objects may be calculated by cosine similarity.
After the similarities of the data tag vectors corresponding to the user objects are obtained, the user object similarity matrix can be constructed from them. Each element of the user object similarity matrix is the similarity value between two data tag vectors; for example, the element in the 2nd row and the 3rd column represents the similarity between the data tag vectors corresponding to the 2nd user object and the 3rd user object.
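A minimal sketch of building such a matrix with cosine similarity is given below. It assumes the data tag vectors are already numeric lists of equal length (for instance multi-hot encodings of tag names); the vectorization itself is left open by the application, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """Element [i][j] is the similarity between vectors[i] and vectors[j]."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
```

The matrix is symmetric, so a production version could compute only the upper triangle.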
S2022, determining the user objects corresponding to the plurality of similar data tag vectors, where the plurality of similar user objects includes the user objects corresponding to the plurality of similar data tag vectors.
As described above, one user object corresponds to one data tag vector, so that after several similar data tag vectors are obtained, the user objects corresponding to the data tag vectors can be obtained.
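Steps S2021 and S2022 amount to reading one row of the user object similarity matrix and mapping the best entries back to user objects. The sketch below assumes the matrix rows and columns follow the order of a `user_ids` list and that the target user is excluded from its own neighbors; both are assumptions, as are the names used.

```python
def most_similar_users(sim_matrix, user_ids, target_id, k=2):
    """Return the k user ids whose data tag vectors are closest to the target's."""
    row = sim_matrix[user_ids.index(target_id)]
    others = [(sim, uid) for uid, sim in zip(user_ids, row) if uid != target_id]
    # Highest similarity first; ties are broken by user id for determinism.
    others.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uid for _, uid in others[:k]]
```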
S2023, querying and determining a plurality of first data tag objects with the highest scoring relation with a plurality of similar user objects in a data tag object scoring table, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
The data tag object scoring table is used for representing scoring relations between the user objects and the data tag objects, and the scoring relations represent scoring results of the user objects on the data tag objects.
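Step S2023 can be illustrated with a small lookup over the scoring table. The sketch assumes the table is a list of `(user_id, item_id, score)` rows and that scores from several similar user objects are summed per data tag object; the aggregation rule is not fixed by the application.

```python
def top_tags_for_users(score_table, similar_users, k=2):
    """Return the k tags with the highest total score among similar users."""
    wanted = set(similar_users)
    totals = {}
    for user_id, item_id, score in score_table:
        if user_id in wanted:
            # Assumption: scores from different similar users are summed.
            totals[item_id] = totals.get(item_id, 0.0) + score
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:k]]
```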
In some embodiments, referring to fig. 5, the manner of establishing the data tag object score table may include the following steps S2123 to S2323.
Step S2123: obtaining batch behavior data, wherein the batch behavior data comprises: behavior data of each user object to each data tag object within a predetermined time range.
The behavior data refers to behavior score data corresponding to each user object's behavior of commenting on, liking, collecting, or sharing each data tag object.
The batch behavior data may be obtained from a database that stores the comment, like, collection, and sharing behavior data produced by each user object for each data tag object.
Step S2223: summarizing the batch behavior data by taking the user object as the primary key to obtain a summarized behavior data table.
That is, the batch behavior data for each data tag object is summarized with the user object as the primary key, so that the behavior data of each user object for each data tag object is collected into the summarized behavior data table.
Step S2323: performing principal component analysis on the summarized behavior data table to obtain a data tag object scoring table, wherein the data tag object scoring table comprises the scoring result of each user object for each data tag object.
The principal component analysis may be performed by extracting principal components from the plurality of behavior data in the summarized behavior data table using a principal component analysis (PCA) algorithm.
In this embodiment, the data tag object scoring table may be presented in table form, including a user object User_id, a data tag object Item_id, and a scoring result Score, or may be presented in graph form.
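The embodiment does not specify how the principal components are turned into scoring results. The following is one possible sketch under stated assumptions: each behavior record is a (User_id, Item_id, behavior) triple, the summarized table holds per-behavior counts for each user object and data tag object, and the projection onto the first principal component is used as the Score. The names `summarize`, `pca_scores`, and `build_score_table`, and the power-iteration approach, are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

BEHAVIORS = ("comment", "like", "collect", "share")

def summarize(records):
    """Group raw (user_id, item_id, behavior) records by user object, counting behaviors per tag."""
    table = defaultdict(lambda: defaultdict(lambda: [0.0] * len(BEHAVIORS)))
    for user_id, item_id, behavior in records:
        table[user_id][item_id][BEHAVIORS.index(behavior)] += 1.0
    return table

def pca_scores(rows, iterations=200):
    """Project each row onto the first principal component, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)] for a in range(d)]
    w = [1.0] * d  # all-ones start vector (an arbitrary choice)
    for _ in range(iterations):
        w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance at all: every row gets the same score
            return [0.0] * n
        w = [x / norm for x in w]
    return [sum(c[j] * w[j] for j in range(d)) for c in centered]

def build_score_table(records):
    """Return (User_id, Item_id, Score) rows from raw behavior records."""
    summary = summarize(records)
    keys = [(u, i) for u in sorted(summary) for i in sorted(summary[u])]
    rows = [summary[u][i] for u, i in keys]
    return [(u, i, s) for (u, i), s in zip(keys, pca_scores(rows))]

# Hypothetical batch behavior data.
records = [
    ("u1", "tagA", "like"), ("u1", "tagA", "like"), ("u1", "tagA", "like"),
    ("u1", "tagA", "comment"), ("u1", "tagA", "comment"), ("u1", "tagA", "share"),
    ("u1", "tagB", "like"),
    ("u2", "tagA", "like"), ("u2", "tagA", "like"), ("u2", "tagA", "comment"),
]
score_table = build_score_table(records)
```

Because the rows are centered before projection, the resulting scores can be negative; a real implementation would probably rescale them, which is not shown here.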
In some embodiments, the querying to obtain several similar user objects similar to the target user object may further include:
querying a user object information table to obtain user object information that contains the department field information and the post field information of the target user object information; the user objects corresponding to this user object information are a plurality of similar user objects similar to the target user object.
The user object information table is a table for representing information related to a user object, and the table includes user attribute information related to the user object, such as a user object ID, a user object name, a department to which the user object belongs, a post to which the user object belongs, user asset information, and user image information.
The department information of the user object comprises department field information of the target user object, and the post information of the user object comprises post field information of the target user object.
And then, in the data tag object scoring table, querying and determining a plurality of first data tag objects which have the highest scoring relation with a plurality of similar user objects.
Therefore, in the embodiment, the user objects in the same post and the same department are also used as a plurality of similar user objects similar to the target user object, so that the number of similar user objects similar to the target user object can be further increased, and the accuracy of finally determining the target data tag object to be recommended is improved.
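The department and post query can be sketched as a simple filter over the user object information table. The row layout (a dictionary with `user_id`, `department`, and `post` keys), the function name `similar_colleagues`, and the `require_both` switch are assumptions made for illustration; the switch exists only because the description mentions both "same post and same department" and "same post or same department".

```python
def similar_colleagues(info_table, target_id, require_both=True):
    """User IDs sharing the target's department and post (or either one, if require_both is False)."""
    target = next(row for row in info_table if row["user_id"] == target_id)
    result = []
    for row in info_table:
        if row["user_id"] == target_id:
            continue
        same_dept = row["department"] == target["department"]
        same_post = row["post"] == target["post"]
        matched = (same_dept and same_post) if require_both else (same_dept or same_post)
        if matched:
            result.append(row["user_id"])
    return result

# Hypothetical user object information table.
info_table = [
    {"user_id": "u1", "department": "risk", "post": "analyst"},
    {"user_id": "u2", "department": "risk", "post": "analyst"},
    {"user_id": "u3", "department": "risk", "post": "manager"},
    {"user_id": "u4", "department": "retail", "post": "teller"},
]
strict_matches = similar_colleagues(info_table, "u1")
loose_matches = similar_colleagues(info_table, "u1", require_both=False)
```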
Referring to fig. 6, fig. 6 is a schematic flowchart of determining a third data tag object in an embodiment, and an implementation manner of how to determine the third data tag object is provided in this embodiment. On the basis of the above embodiment, the above S203 includes the following contents:
S2031, in the data tag object scoring table, querying and determining a plurality of second data tag objects with the highest scoring relation with the target user object.
The data tag object scoring table may be the data tag object scoring table in the above embodiment. The plurality of second data tag objects is determined according to the scoring results of the target user object for each data tag object recorded in the data tag object scoring table; they may be a predetermined number of data tag objects with the highest degree of association with the target user object, or a predetermined top proportion, ranked by degree of association, of the data tag objects associated with the target user object.
S2032, user vectors corresponding to the second data label objects are obtained, and a plurality of similar user vectors similar to the user vectors are obtained through query in the similarity matrix of the data label objects.
In this embodiment, the user vector corresponding to each of the plurality of second data tag objects is a user vector obtained by vectorizing the user object information of each user object with the highest degree of association with the plurality of second data tag objects.
In some embodiments, obtaining the user vector corresponding to each of the plurality of second data tag objects may include: and acquiring user object names with high association degree with the plurality of second data tag objects, and vectorizing the user object names to obtain user vectors.
The data tag object similarity matrix is a matrix used for representing the similarity between user vectors of the data tag objects, and the similarity between the data tag objects is reflected.
In some embodiments, referring to fig. 7, the data tag object similarity matrix is obtained in a manner including the following steps S2132 to S2332.
Step S2132: acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object; the level of the association between each user object and each data tag object can be determined according to the scoring relationship between each user object and each data tag object.
The scoring relation of the user object to each data label object is used for representing the evaluation degree of the user object to each data label object.
In some embodiments, the scoring relationship of the user object to each data tag object may be obtained in the following manner.
First, the interaction data of the user object for each data tag object is obtained, which may include, for example, browsing time, number of visits, access depth, liking or collecting (and cancelling either), and the number of bytes of comment content/data.
Then, a value is assigned to each type of acquired data, for example +4 for a like and -3 for cancelling a like, and the values are summed per data tag object to obtain the score value of the user for each data tag object; the score value is used to represent the scoring relation of the user to the data tag object.
In some embodiments, the obtained score value may be directly used as the degree of association, and in some embodiments, the degree of association may also be obtained by further processing the obtained score value, for example, by performing normalization processing on each score value, or by performing other processing, which is not specifically limited in the embodiment of the present application.
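The assignment and summation described above can be sketched as follows. The weights for liking (+4) and cancelling a like (-3) follow the example in the text, and the weights for commenting (+2) and collecting (+5) follow the example given later in the description; the event layout, the function names, and the choice of min-max normalization are assumptions.

```python
# Weights per behavior; values taken from the examples in the description.
WEIGHTS = {"like": 4, "cancel_like": -3, "comment": 2, "collect": 5}

def score_values(events):
    """Sum the weights of all behaviors for each (user_id, item_id) pair."""
    totals = {}
    for user_id, item_id, action in events:
        key = (user_id, item_id)
        totals[key] = totals.get(key, 0) + WEIGHTS[action]
    return totals

def min_max_normalize(totals):
    """Rescale score values to [0, 1]; one possible normalization among many."""
    if not totals:
        return {}
    lo, hi = min(totals.values()), max(totals.values())
    if hi == lo:
        return {key: 0.0 for key in totals}
    return {key: (value - lo) / (hi - lo) for key, value in totals.items()}

# Hypothetical interaction events.
events = [
    ("u1", "tagA", "like"), ("u1", "tagA", "comment"),
    ("u1", "tagB", "like"), ("u1", "tagB", "cancel_like"),
    ("u2", "tagA", "collect"), ("u2", "tagA", "like"),
]
totals = score_values(events)
association = min_max_normalize(totals)
```

Either `totals` (the raw score values) or `association` (the normalized values) could serve as the degree of association, matching the two options described above.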
Step S2232: for each data tag object, vectorizing the user object information of each user object associated with the data tag object to obtain a user vector corresponding to each data tag object.
As described above, multiple user objects may be associated with one data tag object; for example, s user objects D1, …, Ds are associated with the data tag object, and each user object has a corresponding name. When vectorizing the user objects associated with the data tag object, the names of the s user objects associated with the data tag object may be vectorized to obtain the user vector of the data tag object.
The vectorization may be performed in any vectorization manner, and the embodiment of the present application is not particularly limited.
Step S2332: obtaining a data tag object similarity matrix according to the similarity of the user vectors corresponding to the data tag objects.
The similarity represents the degree of similarity between user vectors, and in some embodiments, the similarity between user vectors corresponding to data tag objects may be calculated by cosine similarity.
After the similarity of the user vectors corresponding to the data tag objects is obtained, a data tag object similarity matrix can be constructed based on the calculated similarities. In the data tag object similarity matrix, each element represents the similarity value between two corresponding user vectors; for example, the value in the 3rd row and the 4th column represents the similarity between the 3rd data tag object and the 4th data tag object.
S2033, determining data tag objects corresponding to the plurality of similar user vectors, and obtaining a plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
As described above, one data tag object corresponds to one user vector, so that after several similar user vectors are obtained, the data tag objects corresponding to the user vectors can be obtained.
In this embodiment, the data tag objects corresponding to several similar user vectors may be determined by querying in the data tag object scoring table.
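Steps S2031 to S2033 can be illustrated with a small item-based sketch. It assumes, purely for illustration, that the user vector of each data tag object is its column of scores across user objects (the embodiment instead vectorizes user object names and leaves the manner open), and that the second data tag objects themselves are excluded from the result. All names and values are hypothetical.

```python
import math

def user_vectors(score_table, user_ids):
    """One vector per data tag object: its score from each user object, in a fixed user order."""
    index = {u: k for k, u in enumerate(user_ids)}
    vectors = {}
    for user_id, item_id, score in score_table:
        vectors.setdefault(item_id, [0.0] * len(user_ids))[index[user_id]] = score
    return vectors

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similar_tags(vectors, second_tags, k):
    """Top-k tags most similar to any second tag, excluding the second tags themselves."""
    best = {}
    for tag, vec in vectors.items():
        if tag in second_tags:
            continue
        best[tag] = max(cosine(vec, vectors[s]) for s in second_tags)
    ranked = sorted(best.items(), key=lambda p: (-p[1], p[0]))
    return [tag for tag, _ in ranked[:k]]

# Hypothetical scoring table rows: (User_id, Item_id, Score).
score_table = [
    ("u1", "tagA", 5.0), ("u1", "tagB", 4.0), ("u1", "tagC", 1.0),
    ("u2", "tagA", 4.0), ("u2", "tagB", 5.0),
    ("u3", "tagC", 5.0),
]
vectors = user_vectors(score_table, ["u1", "u2", "u3"])
third_tags = similar_tags(vectors, {"tagA"}, 1)
```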
Referring to fig. 8, fig. 8 is a schematic flowchart illustrating a process of determining a fourth data tag object in an embodiment, and an implementation manner of how to determine the fourth data tag object is provided in this embodiment. On the basis of the above embodiment, the above S204 includes the following contents:
S2041, obtaining a target user object feature vector of the target user object.
The target user object feature vector is determined based on the word vector of the data tag object corresponding to the target user object.
The word vector of the data tag object refers to a vector used for representing metadata information of the data tag object.
The word vector of the data tag object may be determined in the following manner: acquiring metadata information of the data tag object, and performing vectorization processing on the metadata information to obtain the word vector of the data tag object.
S2042, inquiring and determining a plurality of data label content vectors similar to the target user object feature vector in the data label content similarity matrix.
The data tag content similarity matrix is a matrix used for representing the similarity between data tag content vectors of each data tag object, and the data tag content similarity matrix represents the similarity between the data tag content vectors. The data tag content vector may be a word vector of the data tag object.
In some embodiments, referring to fig. 9, the obtaining manner of the content similarity matrix of the data tag includes the following steps S2142 to S2342:
Step S2142: acquiring metadata information of each data tag object.
In some embodiments, taking an application in the field of computer financial technology as an example, the metadata information may include the business caliber (that is, the business definition) of the data tag object.
In some embodiments, the metadata information may be obtained by: for example, the metadata information may be obtained from a database, and the database stores the metadata information of each data tag object.
Step S2242: vectorizing the metadata information of each data tag object to obtain a word vector of each data tag object.
Taking a credit card as an example of a data tag object, the metadata information of the credit card includes an available amount and a repayment date; vectorizing this metadata information yields a 100-dimensional vector [0.792, -0.177, …], which represents the word vector of the credit card.
The vectorization may be performed in any vectorization manner, and the embodiment of the present application is not particularly limited.
Step S2342: obtaining a data tag content similarity matrix according to the similarity of the word vectors of the data tag objects.
The similarity represents the degree of similarity between the word vectors of the data tag objects, and in some embodiments, the similarity between the word vectors of the data tag objects may be calculated by cosine similarity.
S2043, a plurality of fourth data label objects corresponding to the plurality of data label content vectors are obtained.
When obtaining a plurality of fourth data tag objects corresponding to a plurality of data tag content vectors, the fourth data tag objects may be obtained in one or more different manners, for example, data tag object information is carried in the data tag content vectors. For another example, a data tag object is stored in the database, the similarity between the data tag content vector and the data tag object is calculated by using cosine similarity, and the data tag object with the highest similarity to the data tag content vector is extracted from the database and used as the fourth data tag object.
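The content-based branch (S2041 to S2043) can be sketched as below. It assumes that the target user object feature vector is the mean of the word vectors of the data tag objects corresponding to the target user object; averaging is an assumption, since the embodiment only states that the feature vector is determined based on those word vectors. The three-dimensional word vectors and the tag names are made up for brevity.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def fourth_tags(word_vectors, user_tags, k):
    """Top-k tags whose word vectors are most similar to the target user's feature vector."""
    feature = mean_vector([word_vectors[t] for t in user_tags])
    ranked = sorted(
        ((cosine(feature, vec), tag) for tag, vec in word_vectors.items() if tag not in user_tags),
        key=lambda p: (-p[0], p[1]),
    )
    return [tag for _, tag in ranked[:k]]

# Hypothetical word vectors of data tag objects (real ones may be 100-dimensional).
word_vectors = {
    "credit_card": [0.8, 0.1, 0.0],
    "loan": [0.7, 0.2, 0.1],
    "fx_rate": [0.0, 0.1, 0.9],
}
recommended = fourth_tags(word_vectors, {"credit_card"}, 1)
```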
In this embodiment, a process of recommending a data tag object to a target user object is explained.
Referring to fig. 10, in a specific implementation process, the embodiment of the present application relates to batch data loading, basic data determination, determination of a recommended object, and presentation of the recommended object.
Before recommending a data tag object to a target user object, firstly loading batch data and determining basic data, and then determining a recommended object by combining the determined basic data.
When batch data loading is performed, batch data information may be acquired from the database, where the batch data may be data within a certain time duration, and specifically may include: user object information, data tag objects, user behavior data; the user object information of each user object comprises a user object name, user object position information and user object department information, the data tag object comprises a data tag object name and metadata information of the data tag object, and the user behavior data is behavior data for the user object to like, comment, collect or share the data tag object.
Then, for the batch data, basic data thereof is determined, and the basic data comprises: the system comprises a user object information table, a data tag object scoring table, a user object similarity matrix, a data tag object similarity matrix and a data tag content similarity matrix.
When determining the basic data, the method specifically comprises the following steps: constructing a user object information table according to the user object information, wherein each row in the table can comprise a user object name, position information corresponding to the user object and department information corresponding to the user object;
constructing a data tag object scoring table according to the user behavior data of each user object for each data tag object. The user behavior data is determined from user behaviors such as liking, commenting, and collecting, each of which corresponds to a different score, for example +4 for a like, +2 for a comment, and +5 for a collection. Principal component analysis is performed on the user behavior data of each user object for each data tag object to obtain the main user behavior data, from which the scoring relation of each user object to each data tag object is obtained, yielding the data tag object scoring table; in the table, one user object corresponds to at least one data tag object and the scoring result of the user object for that data tag object;
querying a data tag object with a high user object scoring result in a data tag object scoring table, vectorizing the data tag object with the high user object scoring result to obtain a data tag vector corresponding to the user object, and calculating the similarity between the data tag vectors corresponding to the user objects by utilizing cosine similarity to obtain a user object similarity matrix;
querying a user object with a high data tag object scoring result in a data tag object scoring table, vectorizing the user object with the high data tag object scoring result to obtain a user vector corresponding to the data tag object, and calculating the similarity between the user vectors corresponding to the data tag objects by utilizing cosine similarity to obtain a data tag object similarity matrix;
obtaining metadata information of the data tag objects, carrying out vectorization processing on the metadata information of the data tag objects to obtain word vectors of the data tag objects, and calculating the similarity among the word vectors of the data tag objects to obtain a data tag content similarity matrix.
Once the basic data is obtained from the batch data, it can be stored for use in the subsequent process of determining the data tag object to be recommended.
The process of recommending the data tag object to the target user object is as follows: when the target user object logs in the server 104 through the terminal 102, a data tag object recommendation process is activated. The terminal 102 sends a service request to the server 104, where the service request carries target user object information of a target user object. Based on the target user object information, querying a similar user object similar to the target user object in a user object information table or a user object similarity matrix, for example, a user object in the same post or the same department as the target user object may be obtained in the user object information table as the similar user object, or a user object with a high similarity value with the target user object may be obtained in the user object similarity matrix as the similar user object; in the data tag object scoring table, querying a data tag object with the highest scoring result of the data tag objects corresponding to the similar user object as a first data tag object;
determining a data tag object with the highest scoring result of the data tag object corresponding to the target user object as a second data tag object in the data tag object scoring table; determining a data tag object with high similarity value with the second data tag object as a third data tag object in the data tag object similarity matrix;
acquiring a word vector of a second data label object as a target user object characteristic vector of a target user object; determining a data label content vector with a high similarity value with a target user object feature vector of a target user object in the data label content similarity matrix as a similar data label content vector, and obtaining a fourth data label object according to the similar data label content vector;
and sorting the first data tag object, the third data tag object, and the fourth data tag object by their similarity results, selecting the object with the highest similarity as the target data tag object, and finally recommending the target data tag object to the target user object.
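The final sorting step can be sketched as a merge of the three candidate sets. The sketch assumes that each candidate arrives as a (data tag object, similarity) pair and that the similarity values from the three branches are directly comparable, which the description does not state; the function name `pick_target` is hypothetical.

```python
def pick_target(first, third, fourth, top_n=1):
    """Merge three candidate lists of (tag, similarity) and return the top_n tags by similarity."""
    merged = {}
    for candidates in (first, third, fourth):
        for tag, similarity in candidates:
            # Keep the best similarity seen for a tag that appears in several lists.
            merged[tag] = max(similarity, merged.get(tag, float("-inf")))
    ranked = sorted(merged.items(), key=lambda p: (-p[1], p[0]))
    return [tag for tag, _ in ranked[:top_n]]

# Hypothetical candidates from the three branches.
first_candidates = [("tagA", 0.90)]
third_candidates = [("tagB", 0.95), ("tagA", 0.50)]
fourth_candidates = [("tagC", 0.70)]

target = pick_target(first_candidates, third_candidates, fourth_candidates)
```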
This embodiment recommends data tag objects while taking into account the similarity between user objects in the same department and the same post. Similar data tag objects are recommended to user objects with the same preferences, that is, user objects with a high user similarity, so the requirements and preferences of the user objects for data tag objects are fully considered, and data tag objects that suit a user object can be recommended whether the user object is new or old. User behavior data, namely the series of user behaviors that the user object performs on the data tag objects, is also considered; because these behaviors reflect the user object's evaluation of and preference for the data tag objects, the problem of a single data source is solved. Finally, a plurality of data tag objects are obtained in different ways, and the data tag object to be recommended is determined based on collaborative filtering. This improves the comprehensiveness of preference mining for the user object, effectively captures the user object's degree of preference for data tag objects, solves the cold start problem of the data tag object recommendation method, and provides a more accurate data tag object recommendation service for the user object.
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed sequentially as indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially either, and may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an apparatus for determining a data tag object, which is used for implementing the above-mentioned method for determining a data tag object. The implementation scheme for solving the problem provided by the apparatus is similar to the implementation scheme described in the above method, so that specific limitations in one or more embodiments of the apparatus for determining a data tag object provided below may refer to the limitations in the above method for determining a data tag object, and are not described herein again.
Referring to fig. 11, fig. 11 is a block diagram illustrating a structure of an apparatus for determining a data tag object according to an embodiment of the present application, where the apparatus 1100 includes: an information obtaining module 1101, a first object obtaining module 1102, a third object obtaining module 1103, a fourth object obtaining module 1104 and a recommended object determining module 1105, wherein:
an information acquisition module 1101 for acquiring target user object information of a target user object;
a first object obtaining module 1102, configured to query and obtain, based on the target user object information, a plurality of similar user objects similar to the target user object, and obtain a plurality of first data tag objects with highest association degrees with the plurality of similar user objects;
a third object obtaining module 1103, configured to determine a plurality of second data tag objects with the highest association degree with the target user object, and obtain a plurality of third data tag objects similar to the plurality of second data tag objects;
a fourth object obtaining module 1104, configured to obtain a target user object feature vector of the target user object, query a plurality of data tag content vectors with the highest similarity to the target user object feature vector, and obtain a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, where the target user object feature vector is determined based on a word vector of the data tag object corresponding to the target user object;
a recommended object determining module 1105, configured to determine a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects, and the plurality of fourth data tag objects.
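The division into modules 1101 to 1105 can be pictured with the following structural sketch, in which each module is represented by a callable supplied from outside. The class name `TagRecommender`, its field names, and the stub functions are hypothetical; the sketch shows only how the modules might be wired together, not how any of them is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Candidates = List[str]
UserInfo = Dict[str, str]

@dataclass
class TagRecommender:
    """Hypothetical wiring of modules 1101-1105; each field stands in for one module."""
    get_info: Callable[[str], UserInfo]                                 # module 1101
    first_objects: Callable[[UserInfo], Candidates]                     # module 1102
    third_objects: Callable[[UserInfo], Candidates]                     # module 1103
    fourth_objects: Callable[[UserInfo], Candidates]                    # module 1104
    choose: Callable[[Candidates, Candidates, Candidates], Candidates]  # module 1105

    def recommend(self, user_id: str) -> Candidates:
        """Run the three candidate branches and let module 1105 pick the targets."""
        info = self.get_info(user_id)
        return self.choose(
            self.first_objects(info),
            self.third_objects(info),
            self.fourth_objects(info),
        )

# Stub modules, used only to show the call flow.
recommender = TagRecommender(
    get_info=lambda user_id: {"user_id": user_id},
    first_objects=lambda info: ["tagA"],
    third_objects=lambda info: ["tagB"],
    fourth_objects=lambda info: ["tagA", "tagC"],
    choose=lambda first, third, fourth: sorted(set(first + third + fourth)),
)
result = recommender.recommend("u1")
```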
The apparatus for determining a data tag object provided in this embodiment obtains, by obtaining target user object information of a target user object, a plurality of first data tag objects similar to the target user object based on the target user object information; determining a plurality of second data tag objects with the highest association degree with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects; acquiring a target user object feature vector of a target user object, and acquiring a plurality of fourth data tag objects similar to the data tag content vector of the target user object based on the target user object feature vector; and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects. Compared with the prior art that the accuracy of recommending the data tag object to the user object according to the heat and the freshness of the data tag object is not high, in the embodiment, the target user object can be recommended more accurately through the target user object information, the plurality of second data tag objects with the highest association degree with the target user object and the data tag object determined by the target user object feature vector of the target user object, and the problem that the recommendation of the data tag object is inaccurate in the prior art is solved.
Optionally, the first object obtaining module 1102 includes:
the data tag vector acquiring unit is used for acquiring a target data tag vector corresponding to a target user object, and inquiring and acquiring a plurality of similar data tag vectors similar to the target data tag vector in a user object similarity matrix;
and the user object determining unit is used for determining the user objects corresponding to the plurality of similar data label vectors, and the plurality of similar user objects comprise the user objects corresponding to the plurality of similar data label vectors.
Optionally, the obtaining manner of the user object similarity matrix includes:
acquiring data label objects related to the user objects based on the scoring relation of the user objects to the data label objects;
vectorizing the data tag objects associated with the user objects respectively aiming at the user objects to obtain data tag vectors corresponding to the user objects;
and obtaining a user object similarity matrix according to the similarity of the data label vectors corresponding to the user objects.
Optionally, the first object obtaining module 1102 further includes:
the first object determining unit is used for inquiring and determining a plurality of first data label objects with the highest scoring relation with a plurality of similar user objects in a data label object scoring table, and the data label object scoring table comprises the scoring relation of each user object to each data label object.
Optionally, the method for establishing the data tag object scoring table includes:
obtaining batch behavior data, wherein the batch behavior data comprises: behavior data of each user object in a preset time range to each data label object;
summarizing the batch behavior data by taking the user object as the primary key to obtain a summarized behavior data table;
and performing principal component analysis on the summarized behavior data table to obtain a data tag object scoring table, wherein the data tag object scoring table comprises a scoring result of each user object for each data tag object.
Optionally, the third object obtaining module 1103 includes:
the user vector acquisition unit is used for acquiring user vectors corresponding to the second data tag objects respectively, and inquiring and acquiring a plurality of similar user vectors similar to the user vectors in the similarity matrix of the data tag objects;
and the third object determining unit is used for determining the data tag objects corresponding to the plurality of similar user vectors and obtaining a plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
Optionally, the obtaining method of the data tag object similarity matrix includes:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
vectorizing user object information of each user object of each data tag object respectively aiming at each data tag object to obtain a user vector corresponding to each data tag object;
and obtaining a data label object similarity matrix according to the similarity of the user vectors corresponding to the data label objects.
Optionally, the fourth object obtaining module 1104 includes:
and the content vector determining unit is used for inquiring and determining a plurality of data label content vectors similar to the target user object characteristic vector in the data label content similarity matrix.
Optionally, the obtaining method of the content similarity matrix of the data tag includes:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining a data label content similarity matrix according to the similarity of the word vectors of the data label objects.
The modules in the above apparatus for determining a data tag object may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded, in hardware form, in a processor of the computer device or be independent of it, or may be stored, in software form, in a memory of the computer device so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing target user object information of the target user object, the data tag object and target user object feature vector data of the target user object. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program when executed by a processor implements a method of determining a data tag object.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 13. The computer device comprises a processor, a memory, a communication interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program when executed by a processor implements a method of determining a data tag object. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configurations shown in fig. 12 and fig. 13 are only block diagrams of the parts of the structure relevant to the present disclosure and do not limit the computer device to which the present disclosure may be applied. A particular computer device may include more or fewer components than those shown in the figures, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor. The memory stores a computer program, and the processor, when executing the computer program, implements the steps of the method for determining a data tag object provided by the above embodiments:
acquiring target user object information of a target user object;
based on the target user object information, querying and acquiring a plurality of similar user objects similar to the target user object, and acquiring a plurality of first data tag objects having the highest degree of association with the plurality of similar user objects;
determining a plurality of second data tag objects having the highest degree of association with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
acquiring a target user object feature vector of the target user object, querying a plurality of data tag content vectors having the highest similarity to the target user object feature vector, and acquiring a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, wherein the target user object feature vector is determined based on the word vectors of the data tag objects corresponding to the target user object;
and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
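By way of non-limiting illustration, the final determining step can be sketched as follows. The merging rule is an assumption made only for this sketch (a tag proposed by more of the three candidate lists ranks higher, and ties keep first-seen order); this passage does not fix the rule, and the function name `merge_candidates` and the sample tag names are hypothetical.

```python
from collections import Counter


def merge_candidates(first_tags, third_tags, fourth_tags, top_n=3):
    """Combine the three candidate lists into one ranked recommendation list.

    Assumed rule (not specified by the source): more votes ranks higher,
    ties are broken by the order in which a tag was first seen.
    """
    votes = Counter()
    first_seen = {}
    for candidates in (first_tags, third_tags, fourth_tags):
        for tag in dict.fromkeys(candidates):  # de-duplicate within one list
            votes[tag] += 1
            first_seen.setdefault(tag, len(first_seen))
    ranked = sorted(votes, key=lambda t: (-votes[t], first_seen[t]))
    return ranked[:top_n]


recommended = merge_candidates(
    ["loan", "deposit"],          # first data tag objects
    ["deposit", "fund"],          # third data tag objects
    ["fund", "deposit", "card"],  # fourth data tag objects
)
print(recommended)
```

Other merging rules, for example weighting each candidate list differently, would fit the same step equally well.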
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring a target data tag vector corresponding to the target user object, and querying a user object similarity matrix to acquire a plurality of similar data tag vectors similar to the target data tag vector;
determining the user objects corresponding to the plurality of similar data tag vectors, the plurality of similar user objects including the user objects corresponding to the plurality of similar data tag vectors.
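As a minimal sketch of this lookup, assume the user object similarity matrix is held as a nested dictionary keyed by user identifier. That layout, the function name `top_k_similar`, and the sample values are illustrative assumptions rather than details taken from the embodiment.

```python
def top_k_similar(similarity, target, k=2):
    """Return the k keys most similar to `target`, excluding the target itself."""
    row = similarity[target]
    others = [(other, score) for other, score in row.items() if other != target]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in others[:k]]


# Hypothetical precomputed user object similarity matrix.
user_similarity = {
    "u1": {"u1": 1.0, "u2": 0.9, "u3": 0.1, "u4": 0.5},
    "u2": {"u1": 0.9, "u2": 1.0, "u3": 0.2, "u4": 0.4},
    "u3": {"u1": 0.1, "u2": 0.2, "u3": 1.0, "u4": 0.3},
    "u4": {"u1": 0.5, "u2": 0.4, "u3": 0.3, "u4": 1.0},
}
similar_users = top_k_similar(user_similarity, "u1", k=2)
print(similar_users)
```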
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring the data tag objects associated with each user object based on the scoring relation of each user object to each data tag object;
for each user object, vectorizing the data tag objects associated with that user object to obtain a data tag vector corresponding to each user object;
and obtaining the user object similarity matrix according to the similarity between the data tag vectors corresponding to the user objects.
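The construction above can be sketched as follows, assuming cosine similarity as the similarity measure and a vector layout with one dimension per data tag object holding the user's score (0 where no score exists). Both choices, along with the names `cosine` and `user_similarity_matrix`, are assumptions for illustration; the embodiment does not name a specific measure here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def user_similarity_matrix(scores):
    """Build a user-by-user similarity matrix from {user: {tag: score}}."""
    tags = sorted({tag for rated in scores.values() for tag in rated})
    # Data tag vector of a user: one score per tag, 0.0 for unscored tags.
    vectors = {u: [rated.get(t, 0.0) for t in tags] for u, rated in scores.items()}
    return {u: {v: cosine(vectors[u], vectors[v]) for v in vectors} for u in vectors}


# Hypothetical scoring relations of user objects to data tag objects.
scores = {
    "u1": {"t1": 5.0, "t2": 3.0},
    "u2": {"t1": 4.0, "t2": 2.0},
    "u3": {"t3": 5.0},
}
sim = user_similarity_matrix(scores)
```

In this sample, u1 and u2 score the same tags in similar proportions and so are highly similar, while u1 and u3 share no scored tag and have similarity 0.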
In one embodiment, the processor, when executing the computer program, further performs the steps of:
querying a data tag object scoring table to determine the plurality of first data tag objects having the highest scoring relation with the plurality of similar user objects, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
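A minimal sketch of this query is given below. It assumes the scoring table is a nested dictionary and that the scores of the similar user objects are summed per tag before ranking. The summation, the optional `exclude` parameter (for example, tags the target user object already holds), and the function name are assumptions not stated in the embodiment.

```python
def top_tags_for_users(score_table, users, exclude=(), top_n=2):
    """Rank tags by the summed scores given by `users`, skipping `exclude`."""
    totals = {}
    for user in users:
        for tag, score in score_table.get(user, {}).items():
            if tag not in exclude:
                totals[tag] = totals.get(tag, 0.0) + score
    # Highest total first; tag name breaks ties deterministically.
    return sorted(totals, key=lambda t: (-totals[t], t))[:top_n]


# Hypothetical data tag object scoring table: {user: {tag: score}}.
score_table = {
    "u2": {"t1": 4.0, "t3": 5.0},
    "u4": {"t3": 2.0, "t4": 4.5},
}
first_tags = top_tags_for_users(score_table, ["u2", "u4"], exclude={"t1"})
print(first_tags)
```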
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining batch behavior data, wherein the batch behavior data comprises behavior data of each user object with respect to each data tag object within a preset time range;
aggregating the batch behavior data with the user object as the primary key to obtain an aggregated behavior data table;
and performing principal component analysis on the aggregated behavior data table to obtain the data tag object scoring table, wherein the data tag object scoring table comprises the scoring result of each user object for each data tag object.
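One possible reading of these steps is sketched below. This passage does not state how the principal component analysis yields a score, so the sketch assumes that the loadings of the first principal component serve as weights over behavior counts. The behavior types, the power-iteration approximation, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict

BEHAVIORS = ("view", "favorite", "share")  # hypothetical behavior types


def aggregate(events):
    """Count behaviors per user (primary key) and tag within the batch."""
    table = defaultdict(lambda: defaultdict(lambda: [0.0] * len(BEHAVIORS)))
    for user, tag, behavior in events:
        table[user][tag][BEHAVIORS.index(behavior)] += 1.0
    return table


def first_component(rows, iterations=100):
    """Approximate the first principal component by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance in the data; keep equal weights
            break
        vec = [x / norm for x in nxt]
    # The sign of a principal component is arbitrary; orient it positively.
    return vec if sum(vec) >= 0 else [-x for x in vec]


def scoring_table(events):
    """Score each (user, tag) pair as its counts weighted by the component."""
    table = aggregate(events)
    rows = [counts for tags in table.values() for counts in tags.values()]
    weights = first_component(rows)
    return {user: {tag: sum(w * c for w, c in zip(weights, counts))
                   for tag, counts in tags.items()}
            for user, tags in table.items()}


events = (
    [("u1", "t1", "view")] * 3 + [("u1", "t1", "favorite")] * 2
    + [("u1", "t1", "share"), ("u1", "t2", "view")]
    + [("u2", "t1", "view")] * 2 + [("u2", "t1", "favorite")]
)
scores = scoring_table(events)
```

In this small sample, the pair with the most interactions (u1, t1) receives the highest score and the pair with a single view (u1, t2) the lowest. A production system would more likely rely on a numerical library than on hand-written power iteration.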
In one embodiment, the processor when executing the computer program further performs the steps of:
acquiring the user vectors respectively corresponding to the second data tag objects, and querying a data tag object similarity matrix to acquire a plurality of similar user vectors similar to the user vectors;
and determining data tag objects corresponding to the plurality of similar user vectors, and obtaining a plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
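The lookup for the third data tag objects can be sketched as follows. It assumes a precomputed data tag object similarity matrix stored as a nested dictionary and keeps, for each candidate tag, its best similarity to any of the second data tag objects. The max-based aggregation and the function name `similar_tags` are assumptions for illustration.

```python
def similar_tags(tag_similarity, second_tags, top_n=2):
    """Collect the tags most similar to any of `second_tags`, excluding them."""
    best = {}
    for tag in second_tags:
        for other, score in tag_similarity.get(tag, {}).items():
            if other not in second_tags:
                best[other] = max(best.get(other, 0.0), score)
    return sorted(best, key=lambda t: (-best[t], t))[:top_n]


# Hypothetical rows of the data tag object similarity matrix.
tag_similarity = {
    "t1": {"t1": 1.0, "t2": 0.2, "t3": 0.8, "t4": 0.4},
    "t2": {"t1": 0.2, "t2": 1.0, "t3": 0.3, "t4": 0.9},
}
third_tags = similar_tags(tag_similarity, ["t1", "t2"])
print(third_tags)
```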
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
for each data tag object, vectorizing the user object information of the user objects associated with that data tag object to obtain a user vector corresponding to each data tag object;
and obtaining the data tag object similarity matrix according to the similarity between the user vectors corresponding to the data tag objects.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
querying a data tag content similarity matrix to determine the plurality of data tag content vectors similar to the target user object feature vector.
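For illustration only, the content-based query can be sketched as below. The sketch assumes that the target user object feature vector is the mean of the word vectors of the data tag objects associated with the user, which is one way to satisfy "determined based on the word vectors". It also computes cosine similarity directly rather than reading a precomputed matrix. The toy two-dimensional vectors and function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_tags(user_vector, tag_vectors, exclude=(), top_n=1):
    """Return the tags whose content vectors are closest to `user_vector`."""
    scored = [(cosine(user_vector, vec), tag)
              for tag, vec in tag_vectors.items() if tag not in exclude]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored[:top_n]]


# Hypothetical word vectors of the data tag objects.
tag_vectors = {
    "t1": [1.0, 0.0],
    "t2": [0.9, 0.1],
    "t3": [0.0, 1.0],
    "t4": [0.8, 0.3],
}
user_tags = ["t1", "t2"]  # tags already associated with the target user
user_vector = mean_vector([tag_vectors[t] for t in user_tags])
fourth_tags = nearest_tags(user_vector, tag_vectors, exclude=user_tags)
print(fourth_tags)
```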
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining the data tag content similarity matrix according to the similarity between the word vectors of the data tag objects.
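The three steps above can be sketched as follows. A bag-of-words count vector is used here purely as a stand-in for the word vector of each data tag object, and cosine similarity as the similarity measure. Both are assumptions, as are the sample metadata strings and function names; a trained word-embedding model could be substituted without changing the structure.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Stand-in word vector: term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_similarity_matrix(metadata):
    """Build a tag-by-tag similarity matrix from {tag: metadata text}."""
    vocabulary = sorted({w for text in metadata.values()
                         for w in text.lower().split()})
    vectors = {tag: bag_of_words(text, vocabulary)
               for tag, text in metadata.items()}
    return {a: {b: cosine(vectors[a], vectors[b]) for b in vectors}
            for a in vectors}


# Hypothetical metadata information of three data tag objects.
metadata = {
    "t1": "monthly credit card spend",
    "t2": "credit card repayment",
    "t3": "mortgage balance",
}
content_sim = content_similarity_matrix(metadata)
```

Here t1 and t2 share the words "credit" and "card" and therefore score as similar, while t1 and t3 share no words and have similarity 0.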
The implementation principle and technical effect of the above embodiment are similar to those of the above method embodiment, and are not described herein again.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the method of determining a data tag object provided by the above embodiments:
acquiring target user object information of a target user object;
based on the target user object information, inquiring and acquiring a plurality of similar user objects similar to the target user object, and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
determining a plurality of second data tag objects with highest association degree with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
acquiring a target user object feature vector of a target user object, inquiring a plurality of data label content vectors with the highest similarity to the target user object feature vector, and acquiring a plurality of fourth data label objects corresponding to the plurality of data label content vectors, wherein the target user object feature vector is determined based on the word vector of the data label object corresponding to the target user object;
and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a target data tag vector corresponding to a target user object, and inquiring and acquiring a plurality of similar data tag vectors similar to the target data tag vector in a user object similarity matrix;
determining a user object corresponding to a number of similar data tag vectors, the number of similar user objects including the user object corresponding to the number of similar data tag vectors.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring data label objects related to the user objects based on the scoring relation of the user objects to the data label objects;
vectorizing the data tag objects associated with the user objects respectively aiming at the user objects to obtain data tag vectors corresponding to the user objects;
and obtaining a user object similarity matrix according to the similarity of the data label vectors corresponding to the user objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and querying and determining a plurality of first data tag objects with the highest scoring relation with a plurality of similar user objects in a data tag object scoring table, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining batch behavior data, wherein the batch behavior data comprises: behavior data of each user object to each data tag object within a preset time range;
aggregating the batch behavior data with the user object as the primary key to obtain an aggregated behavior data table;
and performing principal component analysis on the aggregated behavior data table to obtain the data tag object scoring table, wherein the data tag object scoring table comprises the scoring result of each user object for each data tag object.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring user vectors corresponding to the second data tag objects respectively, and inquiring and acquiring a plurality of similar user vectors similar to the user vectors in the similarity matrix of the data tag objects;
and determining data tag objects corresponding to the plurality of similar user vectors, and obtaining a plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
vectorizing user object information of each user object of each data tag object respectively aiming at each data tag object to obtain a user vector corresponding to each data tag object;
and obtaining a data label object similarity matrix according to the similarity of the user vectors corresponding to the data label objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
querying a data tag content similarity matrix to determine the plurality of data tag content vectors similar to the target user object feature vector.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining a data label content similarity matrix according to the similarity of the word vectors of the data label objects.
The implementation principle and technical effect of the above embodiment are similar to those of the above method embodiment, and are not described herein again.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of the method of determining a data tag object as provided by the above embodiments:
acquiring target user object information of a target user object;
based on the target user object information, inquiring and acquiring a plurality of similar user objects similar to the target user object, and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
determining a plurality of second data tag objects with highest association degree with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
acquiring a target user object feature vector of a target user object, inquiring a plurality of data tag content vectors with highest similarity to the target user object feature vector, and acquiring a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, wherein the target user object feature vector is determined based on word vectors of the data tag objects corresponding to the target user object;
and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a target data tag vector corresponding to a target user object, and inquiring and acquiring a plurality of similar data tag vectors similar to the target data tag vector in a user object similarity matrix;
determining a user object corresponding to a number of similar data tag vectors, the number of similar user objects including the user object corresponding to the number of similar data tag vectors.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring data tag objects related to the user objects based on the scoring relation of the user objects to the data tag objects;
vectorizing the data tag objects associated with the user objects respectively aiming at the user objects to obtain data tag vectors corresponding to the user objects;
and obtaining a user object similarity matrix according to the similarity of the data label vectors corresponding to the user objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and querying and determining a plurality of first data tag objects with the highest scoring relation with a plurality of similar user objects in a data tag object scoring table, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining batch behavior data, wherein the batch behavior data comprises behavior data of each user object with respect to each data tag object within a preset time range;
aggregating the batch behavior data with the user object as the primary key to obtain an aggregated behavior data table;
and performing principal component analysis on the aggregated behavior data table to obtain the data tag object scoring table, wherein the data tag object scoring table comprises the scoring result of each user object for each data tag object.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring user vectors corresponding to the second data tag objects respectively, and inquiring and acquiring a plurality of similar user vectors similar to the user vectors in the data tag object similarity matrix;
and determining data tag objects corresponding to the plurality of similar user vectors, and obtaining a plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
vectorizing user object information of each user object of each data tag object aiming at each data tag object to obtain a user vector corresponding to each data tag object;
and obtaining a data label object similarity matrix according to the similarity of the user vectors corresponding to the data label objects.
In one embodiment, the computer program when executed by the processor further performs the steps of:
and querying and determining a plurality of data label content vectors similar to the target user object feature vector in the data label content similarity matrix.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining a data label content similarity matrix according to the similarity of the word vectors of the data label objects.
The implementation principle and technical effect of the above embodiment are similar to those of the above method embodiment, and are not described herein again. It should be noted that the data referred to in the present application (including but not limited to data for analysis, stored data, presented data, etc.) are all information and data authorized by the user or fully authorized by all parties concerned.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. The volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database and the like. The processors referred to in the embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, or the like.
The technical features of the above embodiments may be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of the present specification.
The above embodiments express only several implementations of the present application, and although their description is specific and detailed, they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method of determining a data tag object, the method comprising:
acquiring target user object information of a target user object;
based on the target user object information, inquiring and acquiring a plurality of similar user objects similar to the target user object, and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
determining a plurality of second data tag objects with the highest association degree with the target user object, and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
acquiring a target user object feature vector of the target user object, inquiring a plurality of data tag content vectors with the highest similarity to the target user object feature vector, and acquiring a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, wherein the target user object feature vector is determined based on a word vector of a data tag object corresponding to the target user object;
and determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
2. The method of claim 1, wherein the querying obtains a number of similar user objects similar to the target user object, comprising:
acquiring a target data tag vector corresponding to the target user object, and querying a user object similarity matrix to acquire a plurality of similar data tag vectors similar to the target data tag vector;
determining user objects corresponding to the plurality of similar data tag vectors, the plurality of similar user objects including the user objects corresponding to the plurality of similar data tag vectors.
3. The method according to claim 2, wherein the obtaining of the user object similarity matrix comprises:
acquiring data label objects related to the user objects based on the scoring relation of the user objects to the data label objects;
vectorizing the data tag object associated with each user object to obtain a data tag vector corresponding to each user object;
and obtaining a user object similarity matrix according to the similarity of the data label vectors corresponding to the user objects.
4. The method according to any one of claims 1 to 3, wherein the obtaining of the first data tag objects with the highest association with the similar user objects comprises:
and querying and determining a plurality of first data tag objects with the highest scoring relation with the plurality of similar user objects in a data tag object scoring table, wherein the data tag object scoring table comprises the scoring relation of each user object to each data tag object.
5. The method of claim 4, wherein the data tag object score table is established in a manner that comprises:
obtaining batch behavior data, the batch behavior data comprising behavior data of each user object with respect to each data tag object within a preset time range;
aggregating the batch behavior data with the user object as the primary key to obtain an aggregated behavior data table;
and performing principal component analysis on the summarized behavior data table to obtain a data tag object scoring table, wherein the data tag object scoring table comprises scoring results of the user objects on the data tag objects.
6. The method of any of claims 1 to 3, wherein obtaining a plurality of third data tag objects that are similar to the plurality of second data tag objects comprises:
acquiring user vectors corresponding to the second data tag objects respectively, and inquiring and acquiring a plurality of similar user vectors similar to the user vectors in a data tag object similarity matrix;
determining data tag objects corresponding to the plurality of similar user vectors, and obtaining the plurality of third data tag objects based on the data tag objects corresponding to the plurality of similar user vectors.
7. The method of claim 6, wherein the manner of obtaining the data tag object similarity matrix comprises:
acquiring each user object associated with each data tag object based on the scoring relation of each user object to each data tag object;
vectorizing user object information of each user object of each data tag object respectively aiming at each data tag object to obtain a user vector corresponding to each data tag object;
and obtaining a data label object similarity matrix according to the similarity of the user vectors corresponding to the data label objects.
8. The method according to any one of claims 1-3, wherein the querying of the plurality of data tag content vectors with the highest similarity to the target user object feature vector comprises:
and querying and determining a plurality of data label content vectors similar to the target user object feature vector in the data label content similarity matrix.
9. The method of claim 8, wherein the obtaining of the data tag content similarity matrix comprises:
acquiring metadata information of each data tag object;
vectorizing metadata information of each data tag object to obtain a word vector of each data tag object;
and obtaining the data label content similarity matrix according to the similarity of the word vectors of the data label objects.
10. An apparatus for determining a data tag object, the apparatus comprising:
the information acquisition module is used for acquiring target user object information of a target user object;
the first object acquisition module is used for inquiring and acquiring a plurality of similar user objects similar to the target user object based on the target user object information and acquiring a plurality of first data tag objects with highest association degree with the plurality of similar user objects;
the third object acquisition module is used for determining a plurality of second data tag objects with the highest association degree with the target user object and acquiring a plurality of third data tag objects similar to the plurality of second data tag objects;
a fourth object obtaining module, configured to obtain a target user object feature vector of the target user object, query a plurality of data tag content vectors that have a highest similarity with the target user object feature vector, and obtain a plurality of fourth data tag objects corresponding to the plurality of data tag content vectors, where the target user object feature vector is determined based on a word vector of a data tag object corresponding to the target user object;
and the recommended object determining module is used for determining a target data tag object to be recommended to the target user object based on the plurality of first data tag objects, the plurality of third data tag objects and the plurality of fourth data tag objects.
CN202210961685.0A 2022-08-11 2022-08-11 Method and device for determining data tag object Pending CN115408620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210961685.0A CN115408620A (en) 2022-08-11 2022-08-11 Method and device for determining data tag object

Publications (1)

Publication Number Publication Date
CN115408620A true CN115408620A (en) 2022-11-29

Family

ID=84160216

Country Status (1)

Country Link
CN (1) CN115408620A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination