WO2021077989A1 - Method and device for making recommendation, computer device, and storage medium - Google Patents

Publication number
WO2021077989A1
Authority
WO
WIPO (PCT)
Prior art keywords
mapping
sample
vector
mapping vector
user
Prior art date
Application number
PCT/CN2020/118107
Other languages
French (fr)
Chinese (zh)
Inventor
丁子扬
马文晔
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2021077989A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Definitions

  • the embodiments of the present application relate to the field of computer technology, and in particular, to a recommendation method, device, computer equipment, and storage medium.
  • when recommending data to a user, data similar to the data that the user has previously processed is selected and recommended to the user. For example, products similar to those the user has previously purchased are recommended to the user.
  • the embodiments of the present application provide a recommendation method, device, computer equipment, and storage medium, which expand the scope of application.
  • the technical solution is as follows:
  • a recommendation method which is applied to a server, and the method includes:
  • the first feature information is mapped to a target space to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes a user mapping vector corresponding to a user identification and a data mapping vector corresponding to candidate data;
  • a recommendation is made based on the first object and a second object, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
  • a recommendation device including:
  • a first information acquisition module, configured to acquire first feature information of a first object, the first object being a user identification or candidate data;
  • a first mapping module, configured to map the first feature information to a target space based on a mapping model to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes a user mapping vector corresponding to a user identification and a data mapping vector corresponding to candidate data;
  • a recommendation module, configured to make a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
  • in another aspect, a computer device is provided. The computer device includes a processor and a memory, at least one piece of program code is stored in the memory, and the at least one piece of program code is loaded and executed by the processor to implement the operations performed in the recommendation method.
  • in another aspect, a computer-readable storage medium is provided. At least one piece of program code is stored in the computer-readable storage medium, and the at least one piece of program code is loaded and executed by a processor to implement the operations performed in the recommendation method.
  • in another aspect, a computer program is provided. At least one piece of program code is stored in the computer program, and the at least one piece of program code is loaded and executed by a processor to implement the operations performed in the recommendation method.
  • the method, device, computer equipment, and storage medium provided by the embodiments of the present application only need to acquire the first object, map the first object to the target space, obtain the second object for recommendation according to the distances between the mapping vectors included in the target space, and then make a recommendation based on the first object and the second object. No objects other than the first object and the second object are involved in the recommendation process; that is, there is no need to obtain other objects. The method is therefore not restricted by other objects, which expands the scope of application.
  • Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • Fig. 2 is a flowchart of a recommendation method provided by an embodiment of the present application.
  • Fig. 3 is a schematic diagram of a mapping vector distance provided by an embodiment of the present application.
  • Fig. 4 is a schematic diagram of another mapping vector distance provided by an embodiment of the present application.
  • Fig. 5 is a flowchart of another recommendation method provided by an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a recommendation interface provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of an autoencoder provided by an embodiment of the present application.
  • Fig. 8 is a schematic diagram of a vector mapping provided by an embodiment of the present application.
  • Fig. 9 is a schematic diagram of another vector mapping provided by an embodiment of the present application.
  • Fig. 10 is a schematic diagram of another vector mapping provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a target space vector distribution provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a manifold structure provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a decoding process of a mapping vector provided by an embodiment of the present application.
  • Fig. 14 is a schematic diagram of a dual autoencoder provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a reconstruction process of a manifold structure provided by an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a collaborative metric learning effect provided by an embodiment of the present application.
  • FIG. 17 is a schematic diagram of an in-depth model provided by an embodiment of the present application.
  • FIG. 18 is a schematic structural diagram of a recommendation device provided by an embodiment of the present application.
  • FIG. 19 is a schematic structural diagram of another recommending device provided by an embodiment of the present application.
  • FIG. 20 is a schematic structural diagram of another recommending device provided by an embodiment of the present application.
  • FIG. 21 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
  • FIG. 22 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • terms such as "first" and "second" used in this application can be used herein to describe various concepts, but unless otherwise specified, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another.
  • for example, the first object may be referred to as the second object, and the second object may be referred to as the first object.
  • "at least one" as used in the present application means one or more than one, and the number is an integer; for example, the at least one may be 1, 2, 3, etc.
  • "plurality" as used in the present application means two or more than two, and the number is an integer; for example, the plurality may be 2, 3, 4, etc.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • the implementation environment includes: at least one terminal 101 and a server 102.
  • At least one terminal 101 is connected to the server 102 and logs in to the server 102 based on a user identifier.
  • the server 102 stores multiple data, including video data, audio data, text data, or picture data. During the operation of any terminal 101, the server 102 recommends any data to the terminal 101 for display by the terminal 101.
  • the terminal 101 may be any of various types of devices, such as a mobile phone or a tablet computer.
  • the server 102 is a single server, a server cluster composed of several servers, or a cloud computing service center.
  • Fig. 2 is a flowchart of a recommendation method provided by an embodiment of the present application.
  • the execution subject of the embodiment of the present application is a server. Referring to Fig. 2, the method includes:
  • the characteristic information of the user ID is used to describe the user corresponding to the user ID, and the characteristic information of the user ID includes information such as the age and gender of the user.
  • the characteristic information of the user identification also includes the user's interest tag.
  • the user's interest tag is obtained according to the candidate data previously processed by the user identification. For example, the user's favorite commodity type is obtained from the product purchase record, the user's favorite article type is obtained from the article reading record, and the user's favorite video type is obtained from the video viewing record.
  • the feature information of the candidate data is used to describe the candidate data.
  • if the candidate data is a commodity, the feature information is information such as the price and type of the commodity; if the candidate data is an article, the feature information is information such as the type and word count of the article; if the candidate data is a video, the feature information is information such as the type and duration of the video.
  • the first object is a user identification or candidate data, and the second object is also a user identification or candidate data.
  • the first object and the second object belong to different categories; that is, if the first object is a user identification, the second object is candidate data, and if the first object is candidate data, the second object is a user identification.
  • based on the mapping model, map the first feature information and the second feature information to the target space respectively, to obtain the first mapping vector corresponding to the first feature information in the target space and the second mapping vector corresponding to the second feature information in the target space.
  • the mapping model is used to map the feature information, and the feature information is mapped to the target space through the mapping model to obtain the corresponding mapping vector.
  • compared with the original space of the feature information, the target space has a different dimension, and the target space is either a low-dimensional space or a high-dimensional space.
  • the feature information of different types of objects can be mapped to the target space based on the mapping model to obtain the mapping vector corresponding to the feature information.
  • the feature information of the user identification is mapped to the target space to obtain the corresponding user mapping vector, and the feature information of the candidate data is mapped to the target space to obtain the corresponding data mapping vector.
  • the first feature information is input to the mapping model to obtain the corresponding first mapping vector in the target space, and the second feature information is input to the mapping model to obtain the corresponding second mapping vector in the target space. Since the first feature information and the second feature information belong to different types of objects, the second mapping vector and the first mapping vector belong to different types of mapping vectors; that is, the first mapping vector is a user mapping vector and the second mapping vector is a data mapping vector, or the first mapping vector is a data mapping vector and the second mapping vector is a user mapping vector.
  • the mapping model is a one-to-one (injective) mapping model. There is a one-to-one correspondence between the feature information and the mapping vector obtained based on the mapping model: each piece of feature information has a unique corresponding mapping vector, and each mapping vector has a unique corresponding piece of feature information. That is, when any feature information is mapped based on the mapping model, the mapping vector corresponding to that feature information is obtained, and no other feature information corresponds to that mapping vector.
  • the mapping model includes multiple mapping sub-models, and the multiple mapping sub-models are used to map different types of feature information.
  • the mapping model includes a user mapping sub-model and a data mapping sub-model.
  • the user mapping sub-model is used to map the feature information of the user identification to obtain the user mapping vector, and the data mapping sub-model is used to map the feature information of the candidate data to obtain the data mapping vector.
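  • The split into a user mapping sub-model and a data mapping sub-model can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `MappingModel`, the use of plain linear maps as sub-models, and the toy weights are hypothetical and are not taken from the application, which does not fix the form of the sub-models at this point.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def linear_map(weights: Matrix, features: Sequence[float]) -> Vector:
    """Multiply a weight matrix (one row per target dimension) by a feature vector."""
    for row in weights:
        if len(row) != len(features):
            raise ValueError("feature length does not match the sub-model input size")
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


class MappingModel:
    """Hypothetical mapping model holding one sub-model per object category."""

    def __init__(self, user_weights: Matrix, data_weights: Matrix) -> None:
        # Both sub-models must produce vectors in the same target space.
        if len(user_weights) != len(data_weights):
            raise ValueError("both sub-models must map into the same target space")
        self.sub_models: Dict[str, Matrix] = {"user": user_weights, "data": data_weights}

    def map(self, category: str, features: Sequence[float]) -> Vector:
        """Map feature information of the given category into the target space."""
        return linear_map(self.sub_models[category], features)


# Toy weights (assumed): 3 user features and 2 data features, both mapped to 2-D.
model = MappingModel(
    user_weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
    data_weights=[[2.0, 0.0], [0.0, 0.5]],
)
user_vector = model.map("user", [1.0, 2.0, 3.0])
data_vector = model.map("data", [1.0, 4.0])
```

  • The point of the sketch is only that inputs of different original dimensions end up as vectors of the same dimension, so that a distance between a user mapping vector and a data mapping vector is defined.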
  • the feature information mapping process of the user identification and the feature information mapping process of the candidate data are performed simultaneously or sequentially.
  • the first feature information of the first object is mapped to the target space to obtain the first mapping vector corresponding to the first object in the target space, including the following two cases: in the case that the first object is a user identification, based on the user mapping sub-model, the feature information of the user identification is mapped to the target space, and the user mapping vector corresponding to the user identification in the target space is obtained; in the case that the first object is candidate data, based on the data mapping sub-model, the feature information of the candidate data is mapped to the target space, and the data mapping vector corresponding to the candidate data in the target space is obtained.
  • the second feature information of the second object is mapped to the target space to obtain the second mapping vector corresponding to the second object in the target space, including the following two cases: in the case that the second object is a user identification, based on the user mapping sub-model, the feature information of the user identification is mapped to the target space, and the user mapping vector corresponding to the user identification in the target space is obtained; in the case that the second object is candidate data, based on the data mapping sub-model, the feature information of the candidate data is mapped to the target space, and the data mapping vector corresponding to the candidate data in the target space is obtained.
  • the server stores the feature information together with the mapping vector obtained from it based on the mapping model: the feature information of the user identification is stored in correspondence with its user mapping vector, and the feature information of the candidate data is stored in correspondence with its data mapping vector. In this way, when two mapping vectors are later determined and a recommendation is made based on the objects corresponding to the two mapping vectors, the feature information corresponding to each mapping vector can be easily obtained.
  • the first feature information and the second feature information are mapped to the target space based on the mapping model either simultaneously or one after the other, for example by first mapping the second feature information to the target space and then mapping the first feature information to the target space. It is only necessary to ensure that the second mapping vector already exists in the target space when the distance between the first mapping vector and the second mapping vector is determined.
  • the distance between the first mapping vector and the second mapping vector needs to be measured. Therefore, it is necessary to define a metric that measures the distance between any two mapping vectors in the target space.
  • the metric needs to meet at least the following conditions:
  • the metric can be calculated by Euclidean metric.
  • on the premise of preserving the distance between the mapping vectors, it is possible to embed as many mapping vectors as possible.
  • a consistent metric is defined in the embodiments of the present application, and the consistent metric is used to measure the distance between any two mapping vectors. For any two mapping vectors x and y in the target space, the consistent metric is defined as:
  • $$d(x, y) = \min\left(\lVert x - y \rVert_\infty,\ a\right)$$
  • that is, the maximum coordinate-wise difference between the two mapping vectors is calculated as the L∞ distance, this value is compared with a, and the smaller of the two is selected.
  • there are multiple mapping vectors in the target space, and the multiple mapping vectors are regarded as multiple points.
  • the line in Fig. 3 is formed by connecting the points whose distance from a given mapping vector is a. The top view of Fig. 3 is shown in Fig. 4: the distance between the circle in Fig. 4 and the mapping vector is a, any mapping vector in the part indicated by diagonal lines is considered relatively close to the mapping vector, and any mapping vector in the area outside the circle is considered far from the mapping vector.
  • the consistent metric is a well-defined metric that meets the requirements of the metric definition.
  • in the metric topological space induced by this metric, any vector whose l∞ distance from a given mapping vector is not less than a is considered to be at distance a from that mapping vector, so even in a low-dimensional space, many vectors with equal distances can be mapped. To adjust the mapping ability of the space, only a needs to be adjusted: the smaller a is, the stronger the mapping ability; the larger a is, the weaker the mapping ability. When a is equal to 0, the space is completely equivalent to the metric topological space induced by l∞. Therefore, the above condition (3) is satisfied.
  • the distance between the first mapping vector and the second mapping vector is measured by the defined consistent metric, and the distance between the two mapping vectors is obtained.
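  • The verbal description of the consistent metric (take the L∞ distance, compare it with a, keep the smaller value) can be written as a short function. This is a sketch under that reading of the text; the function name `consistent_distance` and the representation of mapping vectors as sequences of floats are assumptions, not part of the application.

```python
from typing import Sequence


def consistent_distance(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """Return min(max_i |x_i - y_i|, a): the L-infinity distance capped at a."""
    if len(x) != len(y):
        raise ValueError("mapping vectors must have the same dimension")
    # Largest coordinate-wise difference between the two mapping vectors.
    linf = max((abs(xi - yi) for xi, yi in zip(x, y)), default=0.0)
    # Compare with a and keep the smaller value.
    return min(linf, a)


near = consistent_distance([0.0, 0.0], [0.3, -0.1], a=1.0)  # below the cap
far = consistent_distance([0.0, 0.0], [5.0, 2.0], a=1.0)    # capped at a
```

  • With a = 1.0, every vector whose L∞ distance from the origin is at least 1.0 is reported at distance 1.0, which matches the statement that all sufficiently distant vectors are treated as equally far.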
  • the recommendation is made based on the first object and the second object.
  • when the first object is the user identification and the second object is the candidate data, or the first object is the candidate data and the second object is the user identification, making a recommendation based on the first object and the second object includes: recommending the candidate data to the user ID.
  • recommending candidate data to the user ID includes: sending the candidate data by the server to the terminal logged in with the user ID, and displaying the candidate data by the terminal, which can be viewed by the user.
  • the recommendation interface displayed by the terminal is shown in FIG. 6, and the recommendation interface includes a user avatar, a follow option, and a recommendation option. Clicking the user's avatar shows user information such as the user ID, and clicking the follow option shows the articles published by other user IDs that the user ID follows.
  • after the recommendation option is clicked, the recommendation interface displays articles recommended to the user that may be of interest to the user, as well as some popular articles. The articles of interest are recommended based on the feature information of the user identification.
  • the preset distance is the minimum distance used to indicate that the user corresponding to the user identifier is interested in the candidate data.
  • the preset distance is randomly determined by the server or set according to needs. If a higher recommendation accuracy is required, so that the recommended candidate data better matches the user's interest, a smaller preset distance is set; if it is necessary to obtain as much recommended candidate data as possible, a larger preset distance is set.
  • for example, when a user and a product are known and it is to be determined whether to recommend the product to the user, the user feature information corresponding to the user and the product feature information corresponding to the product are obtained and input into the mapping model respectively, and the distance between the resulting user mapping vector and product mapping vector is obtained. If the distance is less than the preset distance, the user is interested in the product, and the product is recommended to the user. If the distance is not less than the preset distance, the user is not interested in the product, and there is no need to recommend the product to the user.
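  • The decision for a known user and product reduces to one distance computation and one comparison. The sketch below assumes the two mapping vectors have already been produced by the mapping model and that the distance is the L∞ distance capped at a, as read from the metric description; `should_recommend` and its parameters are hypothetical names.

```python
from typing import Sequence


def consistent_distance(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """L-infinity distance between two mapping vectors, capped at a (assumed form)."""
    if len(x) != len(y):
        raise ValueError("mapping vectors must have the same dimension")
    return min(max((abs(xi - yi) for xi, yi in zip(x, y)), default=0.0), a)


def should_recommend(
    user_vector: Sequence[float],
    product_vector: Sequence[float],
    preset_distance: float,
    a: float,
) -> bool:
    """Recommend only when the distance is strictly less than the preset distance."""
    return consistent_distance(user_vector, product_vector, a) < preset_distance


# Assumed toy vectors in a 2-D target space.
interested = should_recommend([0.2, 0.4], [0.3, 0.1], preset_distance=0.5, a=1.0)
not_interested = should_recommend([0.2, 0.4], [0.9, 0.1], preset_distance=0.5, a=1.0)
```

  • Under this reading, a preset distance larger than a would accept every pair, so in practice the preset distance would be chosen no larger than a.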
  • the feature information of the user identification can also be input into the mapping model, and similar users can be obtained through a method similar to that of the embodiment of this application before a recommendation is made; likewise, the feature information of the candidate data can be input into the mapping model, and similar data can be obtained through a method similar to that of the embodiment of the present application before a recommendation is made.
  • for example, the feature information of two users is input into the mapping model to obtain two corresponding user mapping vectors. If the distance between the two user mapping vectors is less than the preset distance, the two users are considered to be similar users, and one user can be recommended to the other user.
  • the recommendation process does not involve other objects except the first object and the second object; that is, there is no need to obtain other objects, and the method is not restricted by other objects during application, which expands the scope of application.
  • Fig. 5 is a flowchart of another recommendation method provided by an embodiment of the present application.
  • the execution subject of the embodiment of the present application is a server. Referring to FIG. 5, the method includes:
  • the implementation manner of this step is similar to that of step 201 in the foregoing embodiment, and will not be repeated here.
  • based on the mapping model, map the first feature information to the target space, and obtain a first mapping vector corresponding to the first feature information in the target space.
  • the target space includes a user mapping vector corresponding to the user identification and a data mapping vector corresponding to the candidate data.
  • the implementation manner in which the first feature information is mapped to obtain the corresponding first mapping vector in the embodiment of the present application is similar to the implementation manner of step 202 in the foregoing embodiment, and will not be repeated here.
  • mapping model in the embodiment of the present application is used for mapping, or other methods are used for mapping.
  • the third mapping vector and the first mapping vector belong to different categories. If the first mapping vector is the mapping vector of the user identification, the third mapping vector is the mapping vector of the candidate data; if the first mapping vector is the mapping vector of the candidate data, the third mapping vector is the mapping vector of the user identification.
  • the target space includes at least one third mapping vector, the position of each third mapping vector in the space is determined, and the distance between the first mapping vector and each third mapping vector is obtained according to the consistent metric defined in the target space .
  • the distance acquisition method in the embodiment of the present application is similar to the implementation method in step 203 described above, and will not be repeated here.
  • from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance is selected.
  • optionally, one or more second mapping vectors are selected, and the number of selected second mapping vectors is set as required.
  • the second object corresponding to the second mapping vector is determined, and recommendations are made based on the first object and the second object.
  • when the first object is the user identification and the second object is the candidate data, or the first object is the candidate data and the second object is the user identification, making a recommendation based on the first object and the second object includes: recommending the candidate data to the user ID.
  • recommending candidate data to the user ID includes: sending the candidate data by the server to the terminal logged in with the user ID, and displaying the candidate data by the terminal, which can be viewed by the user.
  • the server stores the corresponding relationship between each mapping vector and the corresponding object, and by querying the corresponding relationship, the object corresponding to each mapping vector is determined.
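  • Selecting second mapping vectors and resolving them to objects through the stored correspondence might look like the following sketch. The dictionary from object identifier to mapping vector stands in for the correspondence kept by the server; that representation, the function name `select_second_objects`, the nearest-first ordering, and the `limit` parameter are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


def consistent_distance(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """L-infinity distance between two mapping vectors, capped at a (assumed form)."""
    if len(x) != len(y):
        raise ValueError("mapping vectors must have the same dimension")
    return min(max((abs(xi - yi) for xi, yi in zip(x, y)), default=0.0), a)


def select_second_objects(
    first_vector: Sequence[float],
    third_vectors: Dict[str, Vector],
    preset_distance: float,
    a: float,
    limit: Optional[int] = None,
) -> List[str]:
    """Return identifiers of objects whose mapping vector lies within the preset distance."""
    scored = [
        (consistent_distance(first_vector, vector, a), object_id)
        for object_id, vector in third_vectors.items()
    ]
    # Keep only vectors strictly closer than the preset distance, nearest first.
    close = sorted(pair for pair in scored if pair[0] < preset_distance)
    selected = [object_id for _, object_id in close]
    return selected if limit is None else selected[:limit]


# Assumed toy data: one user mapping vector and three article mapping vectors.
user_vector = (0.0, 0.0)
articles = {
    "article_a": (0.1, 0.2),
    "article_b": (0.4, 0.0),
    "article_c": (2.0, 2.0),
}
all_close = select_second_objects(user_vector, articles, preset_distance=0.5, a=1.0)
top_one = select_second_objects(user_vector, articles, preset_distance=0.5, a=1.0, limit=1)
```

  • Raising `preset_distance` admits more candidates, and `limit` corresponds to setting the number of selected second mapping vectors as required.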
  • the second mapping vector is inversely mapped to obtain the second object corresponding to the second mapping vector.
  • the de-mapping model is used to de-map the mapping vector, and the mapping vector is de-mapped to the original space through the de-mapping model to obtain corresponding feature information.
  • the de-mapping model is a one-to-one (injective) mapping model; that is, each mapping vector has a one-to-one correspondence with the feature information obtained by de-mapping based on the de-mapping model: each mapping vector has a unique corresponding piece of feature information, and each piece of feature information also has a unique corresponding mapping vector.
  • alternatively, the de-mapping model is a non-injective mapping model; that is, each mapping vector has a unique corresponding piece of feature information, but one piece of feature information may correspond to multiple mapping vectors.
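  • if the second mapping vector is a user mapping vector, the de-mapping model is a user de-mapping model; if the second mapping vector is a data mapping vector, the de-mapping model is a data de-mapping model.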
  • in the case that the first object is the user identification and the second object is the candidate data, the feature information of the user ID is obtained and mapped to the target space based on the mapping model to obtain the user mapping vector corresponding to the feature information. The data mapping vector of at least one piece of candidate data in the target space is then determined, the distance between the user mapping vector and each data mapping vector is obtained, and the data mapping vector whose distance from the user mapping vector is less than the preset distance is selected from the at least one data mapping vector. The candidate data corresponding to the selected data mapping vector is determined, and the selected candidate data is recommended to the user identification.
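  • in the case that the first object is the candidate data and the second object is the user identification, the feature information of the candidate data is obtained and mapped to the target space based on the mapping model to obtain the data mapping vector corresponding to the feature information. The user mapping vector of at least one user identification in the target space is then determined, the distance between the data mapping vector and each user mapping vector is obtained, and the user mapping vector whose distance from the data mapping vector is less than the preset distance is selected from the at least one user mapping vector. The user identification corresponding to the selected user mapping vector is determined, and the candidate data is then recommended to the selected user identification.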
  • the feature information of the first user identification is obtained and mapped to the target space based on the mapping model to obtain the first user mapping vector corresponding to the feature information. The second user mapping vector of at least one second user identification in the target space is then determined, the distance between the first user mapping vector and each second user mapping vector is obtained, and the second user mapping vector whose distance from the first user mapping vector is less than the preset distance is selected from the at least one second user mapping vector. The second user identification corresponding to the selected second user mapping vector is determined, the user represented by the first user identification is considered to have interests similar to those of the user represented by the second user identification, and the selected second user ID is recommended to the first user ID.
  • in the case that the candidate data is a commodity and the first object and the second object are both commodities, the feature information of the first commodity is obtained and mapped to the target space based on the mapping model to obtain the first data mapping vector corresponding to the feature information. The second data mapping vector of at least one second commodity in the target space is then determined, the distance between the first data mapping vector and each second data mapping vector is obtained, and the second data mapping vector whose distance from the first data mapping vector is less than the preset distance is selected from the at least one second data mapping vector. The second commodity corresponding to the selected second data mapping vector is determined, the first commodity and the second commodity are considered to be similar, and the second commodity is recommended to users who have purchased the first commodity.
  • the recommendation process does not involve other objects except the first object and the second object; that is, there is no need to obtain other objects, and the method is not restricted by other objects during application, which expands the scope of application.
  • the first object is a user ID and the second object is candidate data
  • only the characteristic information of the user ID needs to be mapped to the target space, and the distance from the user mapping vector of the user ID is less than the preset value.
  • the data mapping vector of the distance is used to determine the candidate data that the user is interested in corresponding to the user ID, without the need to indirectly obtain the candidate data that the user is interested in corresponding to the user ID based on other user IDs or candidate data.
  • the first object is the candidate data and the second object is the user identification
  • only the characteristic information of the candidate data needs to be mapped to the target space, and the user mapping vector whose distance from the data mapping vector of the candidate data is less than the preset distance is used to determine the user identification interested in the candidate data, without the need to obtain that user identification indirectly based on other candidate data or user identifications, which expands the scope of application.
  • this method can also make the interest points of the user identification absolute, which makes the user's interests clearer and allows the characteristics of candidate data that the user likes to be inferred even when no candidate data exists.
  • the mapping model and the de-mapping model are involved.
  • an autoencoder can be used.
  • the autoencoder includes an encoding model and a decoding model.
  • the encoding model is used as the mapping model
  • the decoding model is used as the de-mapping model. The following describes the training process of the autoencoder.
  • the sample information includes the characteristic information of the sample user identification, the characteristic information of the sample data and the sample label, and the sample label is used to indicate whether the sample data is recommended to the sample user identification.
  • the sample label is 1 or -1.
  • 1 indicates that the sample user ID and the sample data have a positive relationship, which means that the sample data is recommended to the user ID;
  • -1 indicates that the sample user ID and the sample data have a negative relationship, which means that the sample data is not recommended to the user ID.
  • the feature information of the sample user identification is similar to the feature information of the above-mentioned user identification, and the feature information of the sample data is similar to the feature information of the above-mentioned candidate data, and will not be repeated here.
  • the feature information of the sample user identification and the feature information of the sample data are input to the autoencoder, which outputs the predicted feature information of the user identification or the predicted feature information of the sample data. Based on the loss value between the predicted feature information and the corresponding input feature information, the parameters of the autoencoder are adjusted so that the loss value between the predicted feature information output by the adjusted autoencoder and the corresponding input feature information decreases, thereby training the autoencoder.
  • the structure of the autoencoder is shown in Figure 7 and includes the encoding model and the decoding model: the feature vector is input to the encoding model to obtain the corresponding mapping vector, and the decoding model decodes the mapping vector into the corresponding predicted feature vector.
  • the encoding model and the decoding model further include multiple hidden layers.
  • after the mapping vectors are obtained based on the encoding model, whether the relationship between the sample user ID and the sample data is negative or positive is predicted from the distance between the mapping vector corresponding to the sample user identification and the mapping vector corresponding to the sample data. The predicted relationship is compared with the relationship represented by the input sample label, and the parameters of the autoencoder are adjusted so that the relationship predicted by the adjusted autoencoder is the same as the relationship represented by the sample label, thereby training the autoencoder.
  • the loss function for training the autoencoder includes the following:
  • the embodiment of the present application provides two loss functions, and the first loss function is:
  • L_neck1 is the loss value of the mapping model; the other symbols denote the sample label, whose value is 1 or -1, the mapping vector corresponding to the characteristic information of the sample user identification, and the mapping vector corresponding to the feature information of the sample data.
  • that is, the distance between the mapping vector corresponding to the user identifier and the mapping vector corresponding to the candidate data under the consistent metric is obtained, and the distance multiplied by the corresponding sample label is used as the first loss function.
  • for example, the sample user ID and the sample data in a training sample have a positive relationship. The sample user ID and the sample data are mapped to the metric space to obtain the mapping vector corresponding to the sample user ID and the mapping vector corresponding to the sample data, and the consistent metric distance between the two vectors is obtained; refer to Figure 8. When the distance is greater than the preset distance, the gradient is 0, and gradient descent cannot be used to continue training from this distance.
  • the second loss function is:
  • L_neck2 is the first loss value of the mapping model, the margin is the preset parameter, and the other symbols denote the sample label, whose value is 1 or -1, the mapping vector corresponding to the characteristic information of the sample user identification, and the mapping vector corresponding to the feature information of the sample data.
  • the second loss function is a hinge loss (a loss function).
  • the loss value of the loss function in the diagonal area is higher.
  • the loss value of the loss function in the blank area is large, and the arrow direction indicates the direction in which the vector is expected to move, so that the distance between the two mapping vectors is as large as possible.
  • the picture on the left shows the training using the first loss function
  • the dashed circle marks the distance a between the two mapping vectors
  • the figure on the right shows training using the second loss function.
  • the dashed circle is the target safety limit of the negative sample.
  • the target safety limit is the distance obtained by adding a certain value to the distance a.
  • the use of the target safety limit makes the relationship between the user identification obtained by training and the sample data more accurate. In this case, the results of training with the two loss functions are the same.
  • the loss value of the loss function in the oblique area is small
  • the loss value of the loss function in the blank area is larger
  • the arrow direction indicates the direction in which the vector is expected to move, so that the distance between the two mapping vectors is as small as possible.
  • the picture on the left shows the use of the first loss function for training
  • the picture on the right shows the use of the second loss function for training.
  • the left picture shows the situation where training cannot be performed as shown in FIG. 8, and the second loss function in the right picture can avoid the situation where training cannot be performed.
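The formulas of the two loss functions are not reproduced in this text, so the following is only a minimal sketch under stated assumptions. It takes the first loss as the sample label multiplied by the distance, as the text describes, and uses a generic hinge form, max(0, margin + label × distance), as a stand-in for the second loss. The function names, the Euclidean distance, and the exact hinge form are assumptions and may differ from the formulas of this application.

```python
import math

def distance(u, v):
    """Euclidean distance between two mapping vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def label_weighted_distance(label, u, v):
    """First loss as described in the text: label (1 or -1) times distance."""
    return label * distance(u, v)

def hinge_distance_loss(label, u, v, margin):
    """Generic hinge form (assumption): max(0, margin + label * distance)."""
    return max(0.0, margin + label * distance(u, v))

user_vec = [0.0, 0.0]
far_item = [3.0, 4.0]    # distance 5 from user_vec
near_item = [1.0, 0.0]   # distance 1 from user_vec

# Negative pair already farther apart than the margin: hinge loss is zero.
loss_far_negative = hinge_distance_loss(-1, user_vec, far_item, margin=2.0)
# Negative pair inside the margin: hinge loss is positive.
loss_near_negative = hinge_distance_loss(-1, user_vec, near_item, margin=2.0)
```

With a negative label, this hinge form stops pushing the two vectors apart once their distance exceeds the margin, whereas the label-weighted distance keeps decreasing as the vectors move apart.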
  • the second type of loss function is:
  • L cov is the second loss value of the mapping model
  • N is the number of sample data
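  • E is the matrix formed by the mapping vectors corresponding to the sample user identifications and the mapping vectors corresponding to the sample data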
  • Cov(E) is the covariance matrix of matrix E
  • f is the transposition function
  • diag( ⁇ ) is the diagonal element extraction function of the matrix.
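A covariance-based loss of this kind can be sketched as follows. The sketch assumes that each row of E is one mapping vector and that the loss penalizes the squared off-diagonal entries of Cov(E), divided by N. The penalty form and the function names are assumptions for illustration, since the formula itself is not reproduced here.

```python
def covariance_matrix(rows):
    """Covariance matrix of the columns of `rows` (each row is one mapping vector)."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for i in range(dim):
        for j in range(dim):
            cov[i][j] = sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
    return cov

def covariance_penalty(rows):
    """Sum of squared off-diagonal covariance entries, divided by N (assumed form)."""
    cov = covariance_matrix(rows)
    dim = len(cov)
    off_diag = sum(cov[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return off_diag / len(rows)

# Two toy sets of mapping vectors: one with correlated dimensions, one without.
correlated = [[1.0, 1.0], [-1.0, -1.0]]
uncorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
penalty_correlated = covariance_penalty(correlated)
penalty_uncorrelated = covariance_penalty(uncorrelated)
```

A penalty of this shape is zero when the dimensions of the target space are uncorrelated and grows as they become redundant.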
  • the third type of loss function is:
  • L_reconstruct is the loss value of the autoencoder; the other symbols denote the characteristic information of the sample user identification or of the sample data, and the feature information output for it after processing by the autoencoder.
  • the activation function of the last layer of the encoding model, which feeds the embedding layer, needs to be a bounded activation function, such as the Sigmoid function or the tanh (hyperbolic tangent) function.
  • because the output feature information includes numeric features and binary features, it needs to be standardized during processing so that the values obtained lie between 0 and 1. Therefore, in the decoding model, the value range of the activation function of the last layer, which feeds the output layer, needs to be between 0 and 1.
  • for example, the activation function is a Sigmoid function or another function.
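The activation constraints above can be illustrated with a minimal forward pass. This is a sketch only: it uses a single layer for each of the encoding and decoding models (the text mentions multiple hidden layers), fixed made-up weights, and mean squared error as the reconstruction loss, none of which is specified by this application.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def encode(features, enc_w, enc_b):
    # tanh is bounded, so every coordinate of the mapping vector lies in (-1, 1).
    return dense(features, enc_w, enc_b, math.tanh)

def decode(mapping_vector, dec_w, dec_b):
    # sigmoid keeps every reconstructed feature in (0, 1).
    return dense(mapping_vector, dec_w, dec_b, sigmoid)

def reconstruction_loss(features, reconstructed):
    """Mean squared error between input and output features (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(features, reconstructed)) / len(features)

# Hypothetical weights: 3 input features, 2-dimensional target space.
enc_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
enc_b = [0.0, 0.1]
dec_w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
dec_b = [0.0, 0.0, 0.0]

x = [0.2, 0.9, 0.4]            # standardized features in [0, 1]
z = encode(x, enc_w, enc_b)    # mapping vector in the target space
x_hat = decode(z, dec_w, dec_b)
loss = reconstruction_loss(x, x_hat)
```

Training would adjust the weights to reduce `loss`; the sketch only shows how the bounded activations constrain the mapping vector and the reconstructed features.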
  • the autoencoder trained by the above method is used to reconstruct the interest manifold structure of the user identifications or the candidate data in a low-dimensional space, as shown in Figure 12, where the triangles represent the mapping vectors of one category and the circles represent the mapping vectors of another category.
  • the interest manifold structure is formed by the mapping vectors corresponding to all user identifications or candidate data.
  • the distance between the two mapping vectors can directly indicate the "favoring" relationship of the two mapping vectors. The closer the distance, the stronger the favoring relationship. The farther the distance, the weaker the relationship.
  • the mapping vectors will show a clustering effect, that is, similar user identifications will be aggregated, similar candidate data will be aggregated, and a user identification and the candidate data recommended to it will also be aggregated.
  • the mapping vector included in the interest manifold structure can be decoded through the decoding model to obtain the feature information of the corresponding user identification or the feature information of the candidate data.
  • the circular area represents a part of the interest manifold structure after mapping
  • the mapping vector in this area obtains the decoded manifold structure through the decoding model.
  • the manifold structure obtained after decoding has continuity, so the interest manifold structure will not lose the mapping vector due to the decoding process.
  • the decoding model is not a one-to-one mapping, that is, after multiple mapping vectors of the same category are decoded by the decoding model, the same feature information may be obtained. Therefore, the structure obtained after decoding may contain intersecting regions.
  • the first point that needs to be explained is that the embodiment of the present application only takes the training process of an autoencoder as an example for description.
  • although a single autoencoder is used in the above embodiment, a dual autoencoder can also be used.
  • the structure of the dual autoencoder is shown in Figure 14.
  • one of the autoencoders is used to encode and decode the characteristic information of the user identification, and the other is used to encode and decode the feature information of the candidate data.
  • a corresponding number of multiple autoencoders can be used to respectively encode and decode the feature information of each category.
  • the target space is a high-dimensional Gaussian probability distribution space, in which the distance between distributions is defined by KLD (Kullback–Leibler divergence, KL divergence).
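For Gaussian distributions the KL divergence has a closed form, which the following sketch computes. It assumes, for illustration only, that each object is mapped to a Gaussian with a diagonal covariance described by a mean vector and a vector of standard deviations; the function name and this parameterization are assumptions.

```python
import math

def kl_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """KL(N(mu1, sigma1^2) || N(mu2, sigma2^2)) for diagonal Gaussians.

    Per dimension: log(s2 / s1) + (s1^2 + (m1 - m2)^2) / (2 * s2^2) - 1/2.
    """
    total = 0.0
    for m1, s1, m2, s2 in zip(mu1, sigma1, mu2, sigma2):
        total += math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5
    return total

same = kl_diag_gaussians([0.0, 1.0], [1.0, 2.0], [0.0, 1.0], [1.0, 2.0])
shifted = kl_diag_gaussians([0.0], [1.0], [1.0], [1.0])
```

Note that the KL divergence is not symmetric, so a system using it as a distance has to fix which distribution comes first or symmetrize the two directions.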
  • the embedding layer can be added to the original model, or other models such as wide&deep (a deep learning model) can be used to replace the basic MLP (Multilayer Perceptron, an artificial neural network), to better obtain the information in the sparse data.
  • the fourth point that needs to be explained is that if the input data is time-series data, a neural network that changes over time can be used, such as an RNN (Recurrent Neural Network) or an LSTM (Long Short-Term Memory) network; alternatively, Bayesian prior-to-posterior sequential updating or a Kalman filter can be used to learn from the time-series data.
  • CML (Collaborative Metric Learning)
  • this method moves the corresponding vectors in the target space based on the known relationship between the user identifiers and the candidate data in the original space, so as to obtain a distance relationship similar to that in the original space.
  • this method works only for fixed user identifications and candidate data. Its range of use is small, and it needs to obtain the relationship between the user identifications and the candidate data in the original space; it is not applicable when recommending for new user identifications or candidate data.
  • for example, collaborative metric learning is used to recommend products for users; see Figure 16.
  • the circle in the figure represents the user, the triangle represents the product the user likes, the rectangle represents the product the user dislikes, and the arrow is used to indicate the direction of the product.
  • the left picture shows the original positions of the products and the user.
  • after the vectors are moved, the result in the right picture is obtained, so that the products that the user likes are close to the user, and the products that the user does not like are far away from the user.
  • the users and commodities in the space are fixed, and only fixed commodities can be recommended for the fixed users. If there are no commodities in the space, it is impossible to infer the commodities that the user may like.
  • the relationship between some user IDs and some candidate data must be known, otherwise the vectors cannot be moved in the target space according to the known relationship.
  • for a new user ID or new candidate data, because the relationship between the new user ID or new candidate data and the other user IDs and candidate data is unknown, the position in the target space of the vector corresponding to the new user ID or new candidate data cannot be determined, and no recommendation can be made.
  • the embodiment of the present application does not need to obtain the relationship between the user identification and the candidate data in advance, and is applicable to any user identification or candidate data, which expands the scope of application.
  • t-SNE (t-distributed Stochastic Neighbor Embedding, an algorithm)
  • the principle of the algorithm is: the distance relationship between any two feature vectors in the high-dimensional space should be similar to the distance relationship between the two corresponding mapping vectors in the low-dimensional space. If two feature vectors are far apart in the high-dimensional space, the two corresponding mapping vectors should also be far apart in the low-dimensional space, and vice versa. If there are multiple feature vectors in the original high-dimensional space, such as n feature vectors, the low-dimensional space will have n corresponding mapping vectors.
  • the effect achieved by this method is shown in Figure 15.
  • the first figure on the left is the manifold structure composed of multiple feature vectors in the original high-dimensional space
  • the second figure is the manifold structure formed by the mapping vectors obtained after the feature vectors in the original high-dimensional space are mapped to the low-dimensional space; the manifold structures of the third and fourth figures are then obtained in turn, and finally the manifold structure of the fifth figure is obtained, which realizes the reconstruction of the high-dimensional space in the low-dimensional space.
  • VaeCF (Variational Autoencoder Collaborative Filtering, a deep model) is also used for data recommendation.
  • This method can accurately obtain the relationship between user identification and candidate data.
  • the candidate data recommended for the user ID cannot be obtained, that is, the user's interest cannot be obtained based on the user ID.
  • according to the characteristic information of any user identification, the candidate data recommended for the user identification can be obtained based on the autoencoder; or, according to the characteristic information of any candidate data, the user identification that is interested in the candidate data can be obtained based on the autoencoder, and the candidate data is then recommended.
  • this method has a wide range of use and can make recommendations based on the feature information of either the user identification or the candidate data.
  • this avoids the problem that both the characteristic information of the user identification and the characteristic information of the candidate data must be obtained, otherwise no recommendation can be made.
  • FIG. 18 is a schematic structural diagram of a recommendation device provided by an embodiment of the present application. Referring to Figure 18, the device includes:
  • the first information obtaining module 1801 is configured to obtain first characteristic information of a first object, and the first object belongs to user identification or candidate data;
  • the first mapping module 1802 is configured to map the first feature information to the target space based on the mapping model to obtain the first mapping vector corresponding to the first object in the target space, where the target space includes the user mapping vector corresponding to the user identification and the data mapping vector corresponding to the candidate data;
  • the recommendation module 1803 is configured to make recommendations based on the first object and the second object according to the distance between any two mapping vectors in the target space, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than the preset distance, and the second mapping vector and the first mapping vector belong to different categories.
  • the device further includes:
  • the second information obtaining module 1804 is configured to obtain second characteristic information of the second object
  • the second mapping module 1805 is configured to map the second feature information to the target space based on the mapping model to obtain a second mapping vector corresponding to the second object in the target space;
  • the recommendation module 1803 also includes:
  • the first distance obtaining unit 18031 is configured to obtain the distance between the first mapping vector and the second mapping vector
  • the first recommendation unit 18032 is configured to make a recommendation based on the first object and the second object if the distance is less than the preset distance.
  • the recommendation module 1803 includes:
  • the vector determining unit 18033 is configured to determine at least one third mapping vector in the target space, where the third mapping vector and the first mapping vector belong to different categories;
  • the second distance obtaining unit 18034 is configured to obtain the distance between the first mapping vector and each third mapping vector
  • the vector selecting unit 18035 is configured to select, from at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than a preset distance;
  • the second recommendation unit 18036 is configured to determine the second object corresponding to the second mapping vector, and make recommendations based on the first object and the second object.
  • the second recommendation unit 18036 is further configured to perform inverse mapping on the second mapping vector based on the inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determine the second object to which the second feature information belongs.
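The inverse-mapping step can be sketched as follows. The text does not say how the second object is determined from the decoded feature information, so the sketch assumes a nearest-feature lookup in a catalog of known objects; the stand-in inverse mapping, the catalog, and all names are hypothetical.

```python
def toy_inverse_mapping(mapping_vector):
    """Stand-in for the inverse mapping (decoding) model: a fixed linear map."""
    a, b = mapping_vector
    return [a + b, a - b, 2.0 * a]

def nearest_object(decoded_features, catalog):
    """Return the id in `catalog` whose stored features are closest to decoded_features."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))
    return min(catalog, key=lambda obj_id: sq_dist(catalog[obj_id], decoded_features))

# Hypothetical catalog of objects and their feature information.
catalog = {
    "item_a": [1.0, 0.0, 1.0],
    "item_b": [0.0, 1.0, 0.0],
    "item_c": [2.0, 0.0, 2.0],
}

second_mapping_vector = [0.9, 1.0]
second_features = toy_inverse_mapping(second_mapping_vector)
second_object = nearest_object(second_features, catalog)
```

In a real system the stand-in function would be replaced by the trained decoding model, and the lookup could use any index over stored feature information.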
  • the device further includes:
  • the first sample acquisition module 1806 is used to acquire sample information, the sample information includes the characteristic information of the sample user identification, the characteristic information of the sample data, and the sample label, and the sample label is used to indicate whether the sample data is recommended to the sample user identification;
  • the first training module 1807 is used to train the mapping model according to the sample information.
  • the loss function for training the mapping model includes at least one of the following:
  • L_neck is the first loss value of the mapping model, the margin is the preset parameter, and the other symbols denote the sample label, the mapping vector corresponding to the sample user ID, and the mapping vector corresponding to the sample data;
  • L cov is the second loss value of the mapping model
  • N is the number of sample information
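  • E is the matrix formed by the mapping vectors corresponding to the sample user identifications and the mapping vectors corresponding to the sample data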
  • Cov(E) is the covariance matrix of matrix E
  • f is the transposition function
  • diag( ⁇ ) is the diagonal element extraction function of the matrix.
  • the mapping model is an encoding model in the autoencoder; the device further includes:
  • the second sample acquisition module 1808 is used to acquire sample information, the sample information includes the characteristic information of the sample user identification, the characteristic information of the sample data, and the sample label, and the sample label is used to indicate whether the sample data is recommended to the sample user identification;
  • the second training module 1809 is used to train the autoencoder according to the sample information.
  • the loss function for training the autoencoder includes at least:
  • L_reconstruct is the loss value of the autoencoder; the other symbols denote the characteristic information of the sample user identification or of the sample data, and the feature information output for it after processing by the autoencoder.
  • the first object is a user identification and the second object is candidate data, or the first object is candidate data and the second object is a user identification;
  • the recommendation module 1803 is also used to recommend candidate data to the user identification.
  • the mapping model includes a user mapping sub-model and a data mapping sub-model
  • the user mapping sub-model is used to map the characteristic information of the user identification to obtain the user mapping vector
  • the data mapping sub-model is used to map the feature information of the candidate data to obtain the data mapping vector.
  • the recommendation device provided in the above embodiment is illustrated only with the division of the above functional modules. In practical applications, the above functions can be allocated to different functional modules as needed to complete all or part of the functions described above.
  • the recommending device provided in the foregoing embodiment and the recommending method embodiment belong to the same concept, and the implementation process is detailed in the method embodiment, which will not be repeated here.
  • FIG. 21 is a schematic structural diagram of a terminal 2100 provided by an embodiment of the present application.
  • the terminal 2100 includes a processor 2101 and a memory 2102.
  • the processor 2101 includes one or more processing cores, such as a 4-core processor, an 8-core processor, and so on.
  • the processor 2101 adopts at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array).
  • the processor 2101 also includes a main processor and a coprocessor.
  • the main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in the standby state.
  • the processor 2101 is integrated with a GPU (Graphics Processing Unit), and the GPU is used to render and draw content that needs to be displayed on the display screen.
  • the processor 2101 further includes an AI (Artificial Intelligence) processor, and the AI processor is used to process computing operations related to machine learning.
  • the memory 2102 includes one or more computer-readable storage media, which are non-transitory.
  • the memory 2102 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 2102 is used to store at least one instruction, and the at least one instruction is executed by the processor 2101 to implement the recommendation method provided in the method embodiment of the present application.
  • the terminal 2100 may optionally further include: a peripheral device interface 2103 and at least one peripheral device.
  • the processor 2101, the memory 2102, and the peripheral device interface 2103 are connected by a bus or signal line.
  • Each peripheral device is connected to the peripheral device interface 2103 through a bus, a signal line or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 2104, a display screen 2105, a camera component 2106, an audio circuit 2107, a positioning component 2108, and a power supply 2109.
  • the peripheral device interface 2103 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 2101 and the memory 2102.
  • in some embodiments, the processor 2101, the memory 2102, and the peripheral device interface 2103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 2101, the memory 2102, and the peripheral device interface 2103 are implemented on separate chips or circuit boards, which is not limited in this embodiment.
  • the radio frequency circuit 2104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 2104 communicates with a communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 2104 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the radio frequency circuit 2104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a user identity module card, and so on.
  • the radio frequency circuit 2104 communicates with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 2104 also includes a circuit related to NFC (Near Field Communication), which is not limited in this application.
  • the display screen 2105 is used to display a UI (User Interface).
  • the UI includes graphics, text, icons, videos, and any combination of them.
  • the display screen 2105 also has the ability to collect touch signals on or above the surface of the display screen 2105.
  • the touch signal is input to the processor 2101 as a control signal for processing.
  • the display screen 2105 is also used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • in some embodiments, there is one display screen 2105, which is provided on the front panel of the terminal 2100; in other embodiments, there are at least two display screens 2105, which are respectively provided on different surfaces of the terminal 2100 or in a folding design;
  • the display screen 2105 is a flexible display screen, which is arranged on the curved surface or the folding surface of the terminal 2100.
  • the display screen 2105 can also be set as a non-rectangular irregular pattern, that is, a special-shaped screen.
  • the display screen 2105 is made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
  • the camera assembly 2106 is used to capture images or videos.
  • the camera assembly 2106 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal 2100
  • the rear camera is set on the back of the terminal 2100.
  • there are at least two rear cameras, each of which is any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize the background blur function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions.
  • the camera assembly 2106 also includes a flash.
  • the flash is a single-color temperature flash or a dual-color temperature flash. Dual color temperature flash refers to a combination of warm light flash and cold light flash used for light compensation under different color temperatures.
  • the audio circuit 2107 includes a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals and input them to the processor 2101 for processing, or input to the radio frequency circuit 2104 to implement voice communication.
  • the microphone is an array microphone or an omnidirectional acquisition microphone.
  • the speaker is used to convert the electrical signal from the processor 2101 or the radio frequency circuit 2104 into sound waves.
  • the speaker is a traditional thin-film speaker or a piezoelectric ceramic speaker.
  • the speaker When the speaker is a piezoelectric ceramic speaker, it not only converts the electrical signal into human audible sound waves, but also converts the electrical signal into human inaudible sound waves for purposes such as distance measurement.
  • the audio circuit 2107 also includes a headphone jack.
  • the positioning component 2108 is used to locate the current geographic location of the terminal 2100 to implement navigation or LBS (Location Based Service).
  • the positioning component 2108 is a positioning component based on the GPS (Global Positioning System) of the United States, the Beidou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • the power supply 2109 is used to supply power to various components in the terminal 2100.
  • the power source 2109 is alternating current, direct current, disposable batteries or rechargeable batteries.
  • the rechargeable battery supports wired charging or wireless charging.
  • the rechargeable battery is also used to support fast charging technology.
  • the terminal 2100 further includes one or more sensors 2110.
  • the one or more sensors 2110 include, but are not limited to: an acceleration sensor 2111, a gyroscope sensor 2112, a pressure sensor 2113, a fingerprint sensor 2114, an optical sensor 2115, and a proximity sensor 2116.
  • the acceleration sensor 2111 detects the magnitude of acceleration on the three coordinate axes of the coordinate system established by the terminal 2100. For example, the acceleration sensor 2111 is used to detect the components of gravitational acceleration on three coordinate axes.
  • the processor 2101 controls the display screen 2105 to display the user interface in a horizontal view or a vertical view according to the gravitational acceleration signal collected by the acceleration sensor 2111.
  • the acceleration sensor 2111 is also used for the collection of game or user motion data.
  • the gyroscope sensor 2112 detects the body direction and rotation angle of the terminal 2100, and the gyroscope sensor 2112 and the acceleration sensor 2111 cooperate to collect the user's 3D actions on the terminal 2100.
  • the processor 2101 implements the following functions based on the data collected by the gyroscope sensor 2112: motion sensing (such as changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 2113 is arranged on the side frame of the terminal 2100 and/or the lower layer of the display screen 2105.
  • the processor 2101 performs left and right hand recognition or quick operation according to the holding signal collected by the pressure sensor 2113.
  • the processor 2101 controls the operable controls on the UI according to the user's pressure operation on the display screen 2105.
  • the operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 2114 is used to collect the user's fingerprint.
  • the processor 2101 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 2114, or the fingerprint sensor 2114 identifies the user's identity according to the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 2101 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings.
  • the fingerprint sensor 2114 is provided on the front, back or side of the terminal 2100. When a physical button or manufacturer logo is provided on the terminal 2100, the fingerprint sensor 2114 is integrated with the physical button or manufacturer logo.
  • the optical sensor 2115 is used to collect the ambient light intensity.
  • the processor 2101 controls the display brightness of the display screen 2105 according to the ambient light intensity collected by the optical sensor 2115.
  • the processor 2101 also dynamically adjusts the shooting parameters of the camera assembly 2106 according to the ambient light intensity collected by the optical sensor 2115.
  • the proximity sensor 2116, also called a distance sensor, is usually arranged on the front panel of the terminal 2100.
  • the proximity sensor 2116 is used to collect the distance between the user and the front of the terminal 2100.
  • when the proximity sensor 2116 detects that the distance between the user and the front of the terminal 2100 gradually decreases, the processor 2101 controls the display screen 2105 to switch from the screen-on state to the screen-off state; when the proximity sensor 2116 detects that the distance between the user and the front of the terminal 2100 gradually increases, the processor 2101 controls the display screen 2105 to switch from the screen-off state to the screen-on state.
  • the structure shown in FIG. 21 does not constitute a limitation on the terminal 2100, which can include more or fewer components than those shown in the figure, combine some components, or adopt a different component arrangement.
  • FIG. 22 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 2200 may vary considerably with configuration or performance, and includes one or more processors (Central Processing Units, CPUs) 2201 and one or more memories 2202, where at least one instruction is stored in the memory 2202 and is loaded and executed by the processor 2201 to implement the methods provided by the foregoing method embodiments.
  • the server also has components such as a wired or wireless network interface, a keyboard, and an input/output interface.
  • the server also includes other components for implementing device functions, which will not be repeated here.
  • the server 2200 is configured to execute the steps executed by the server in the above-mentioned recommendation method.
  • the embodiment of the present application also provides a computer device, the computer device includes a processor and a memory, at least one piece of program code is stored in the memory, and the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the first feature information of the first object is acquired, where the first object belongs to the user identification or the candidate data;
  • based on the mapping model, the first feature information is mapped to the target space to obtain the first mapping vector corresponding to the first object in the target space, where the target space includes the user mapping vector corresponding to the user identification and the data mapping vector corresponding to the candidate data;
  • a recommendation is made based on the first object and the second object according to the distance between any two mapping vectors in the target space, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than the preset distance, and the second mapping vector and the first mapping vector belong to different categories.
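The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weighted-sum `map_to_target_space` stands in for the mapping model, whose real form the source does not fix here, and the names `TARGET_SPACE` and `recommend`, the toy vectors, and the Euclidean distance are all hypothetical.

```python
import math

# Hypothetical target space: each entry is (category, object id, mapping vector).
TARGET_SPACE = [
    ("user", "u1", (0.0, 0.0)),
    ("user", "u2", (5.0, 5.0)),
    ("data", "d1", (0.3, 0.4)),
    ("data", "d2", (4.0, 4.0)),
]


def map_to_target_space(features, weights):
    """Stand-in mapping model: one weighted sum per target-space dimension."""
    return tuple(sum(w * x for w, x in zip(row, features)) for row in weights)


def recommend(first_category, first_vector, preset_distance):
    """Return ids of objects of the other category within preset_distance."""
    return [
        obj_id
        for category, obj_id, vector in TARGET_SPACE
        if category != first_category
        and math.dist(first_vector, vector) < preset_distance
    ]


# Toy user features and toy weights; both are made up for the example.
user_vector = map_to_target_space((0.2, 1.0), ((0.5, 0.0), (0.0, 0.1)))
print(recommend("user", user_vector, preset_distance=1.0))  # ['d1']
```

Because the filter only compares categories and distances, the same function works in the other direction: passing a data vector with `first_category="data"` returns user identifications.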
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • based on the mapping model, the second feature information is mapped to the target space to obtain the second mapping vector corresponding to the second object in the target space;
  • a recommendation is made based on the first object and the second object.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • from at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance is selected;
  • the second object corresponding to the second mapping vector is determined, and recommendations are made based on the first object and the second object.
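Selecting second mapping vectors from a set of third mapping vectors can be sketched as a threshold filter. The function name `select_second_vectors`, the dictionary layout, and the Euclidean distance are assumptions for illustration; returning the nearest candidates first is a design choice of this sketch, not something the source requires.

```python
import math


def select_second_vectors(first_vector, third_vectors, preset_distance):
    """Keep the candidates closer than preset_distance, nearest first."""
    scored = [
        (math.dist(first_vector, vector), obj_id)
        for obj_id, vector in third_vectors.items()
    ]
    return [obj_id for distance, obj_id in sorted(scored) if distance < preset_distance]


# Hypothetical third mapping vectors, keyed by the object they belong to.
third_vectors = {"video_a": (1.0, 0.0), "video_b": (0.0, 0.5), "video_c": (3.0, 4.0)}
print(select_second_vectors((0.0, 0.0), third_vectors, preset_distance=2.0))
# ['video_b', 'video_a']
```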
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the second mapping vector is inversely mapped to obtain the second feature information corresponding to the second mapping vector, and the second object to which the second feature information belongs is determined.
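The inverse-mapping step can be illustrated with a toy, exactly invertible mapping. Everything named here (`encode`, `decode`, `OBJECT_BY_FEATURES`, `object_for_vector`) is hypothetical; a real inverse mapping would generally be approximate, so an exact dictionary lookup is a simplification of this sketch.

```python
def encode(features):
    """Toy mapping model: halve every feature value."""
    return tuple(0.5 * x for x in features)


def decode(mapping_vector):
    """Toy inverse mapping: undo encode by doubling every component."""
    return tuple(2.0 * v for v in mapping_vector)


# Hypothetical catalogue from feature information to the object it belongs to.
OBJECT_BY_FEATURES = {(30.0, 2.0): "article_17", (12.0, 8.0): "article_42"}


def object_for_vector(mapping_vector):
    """Inverse-map a vector, then look up the object owning those features."""
    features = decode(mapping_vector)
    return OBJECT_BY_FEATURES.get(features)


print(object_for_vector(encode((12.0, 8.0))))  # article_42
```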
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the mapping model is trained according to the sample information.
  • the loss function used to train the mapping model includes at least one of the following:
  • L_neck is the first loss value of the mapping model, and its terms are the preset margin parameter, the sample label, the mapping vector corresponding to the sample user identification, and the mapping vector corresponding to the sample data;
  • L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the mapping vectors corresponding to the sample user identifications and the sample data, Cov(E) is the covariance matrix of the matrix E, f is the transposition function, and diag(·) is the diagonal element extraction function of the matrix.
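The two loss terms described above combine a margin on the distance between a user vector and a data vector with a penalty on the covariance of the mapping vectors. The sketch below is not the patent's formula, which is not reproduced in this text: it assumes a common contrastive margin form for the first term and a sum of squared off-diagonal covariance entries, divided by the number of vectors, for the second. All function names are hypothetical.

```python
import math


def margin_loss(label, user_vector, data_vector, margin):
    """Assumed form: pull a recommended pair together, push others margin apart."""
    distance = math.dist(user_vector, data_vector)
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def covariance(rows):
    """Covariance matrix of the rows, treating each column as one dimension."""
    n, dims = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / n
         for j in range(dims)]
        for i in range(dims)
    ]


def covariance_penalty(rows):
    """Assumed form: squared off-diagonal covariance entries over the row count."""
    cov = covariance(rows)
    off_diagonal = sum(
        cov[i][j] ** 2 for i in range(len(cov)) for j in range(len(cov)) if i != j
    )
    return off_diagonal / len(rows)


print(margin_loss(1, (0.0, 0.0), (3.0, 4.0), margin=1.0))  # 25.0
print(covariance_penalty([(1.0, 1.0), (-1.0, -1.0)]))      # 1.0
```

Under these assumptions the covariance penalty is zero when the dimensions of the mapping vectors are uncorrelated, which discourages the target space from wasting dimensions on redundant information.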
  • the mapping model is an encoding model in an autoencoder; the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the autoencoder is trained according to the sample information.
  • the loss function used to train the autoencoder includes at least:
  • L_reconstruct is the loss value of the autoencoder, computed from the feature information of the sample user identification or the feature information of the sample data and from the feature information output after that input is processed by the autoencoder.
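A reconstruction loss compares the input feature information with what the autoencoder outputs for it. The sketch below assumes a mean squared difference, which is one common choice and not necessarily the form used in the source; the truncating `encode` and zero-padding `decode` are a deliberately lossy toy autoencoder invented for the example.

```python
def reconstruction_loss(features, reconstructed):
    """Assumed form: mean squared difference between input and output."""
    return sum((x - y) ** 2 for x, y in zip(features, reconstructed)) / len(features)


def encode(features, kept=2):
    """Toy encoder: keep only the first `kept` feature values."""
    return tuple(features[:kept])


def decode(code, size):
    """Toy decoder: pad the code with zeros back to the input size."""
    return tuple(code) + (0.0,) * (size - len(code))


features = (1.0, 2.0, 3.0)
output = decode(encode(features), len(features))
print(reconstruction_loss(features, output))  # 3.0
```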
  • the first object is a user identification and the second object is candidate data, or the first object is candidate data and the second object is a user identification; the at least one piece of program code is loaded and executed by the processor to implement the following:
  • the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of the user identification to obtain the user mapping vector; the data mapping sub-model is used to map the feature information of the candidate data to obtain the data mapping vector.
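The point of having two sub-models is that user features and data features can have different shapes yet land in the same target space. A minimal sketch, assuming linear sub-models (the class `LinearSubModel` and all weights are hypothetical):

```python
class LinearSubModel:
    """Hypothetical sub-model: one weight row per target-space dimension."""

    def __init__(self, weights):
        self.weights = weights

    def map(self, features):
        """Map feature information to a mapping vector in the target space."""
        return tuple(
            sum(w * x for w, x in zip(row, features)) for row in self.weights
        )


# Three user features and two data features, both mapped to two dimensions.
user_sub_model = LinearSubModel([(1.0, 0.0, 0.0), (0.0, 1.0, 1.0)])
data_sub_model = LinearSubModel([(0.5, 0.0), (0.0, 2.0)])

user_vector = user_sub_model.map((1.0, 2.0, 3.0))
data_vector = data_sub_model.map((2.0, 2.5))
print(user_vector, data_vector)  # (1.0, 5.0) (1.0, 5.0)
```

Since both outputs have the same dimensionality, the distance between a user mapping vector and a data mapping vector is well defined.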
  • the embodiment of the present application also provides a computer-readable storage medium, in which at least one piece of program code is stored, and the at least one piece of program code is loaded and executed by a processor to implement the following steps:
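  • the first feature information of the first object is acquired, where the first object belongs to the user identification or the candidate data;
  • based on the mapping model, the first feature information is mapped to the target space to obtain the first mapping vector corresponding to the first object in the target space, where the target space includes the user mapping vector corresponding to the user identification and the data mapping vector corresponding to the candidate data;
  • a recommendation is made based on the first object and the second object according to the distance between any two mapping vectors in the target space, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than the preset distance, and the second mapping vector and the first mapping vector belong to different categories.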
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • based on the mapping model, the second feature information is mapped to the target space to obtain the second mapping vector corresponding to the second object in the target space;
  • a recommendation is made based on the first object and the second object.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • from at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance is selected;
  • the second object corresponding to the second mapping vector is determined, and recommendations are made based on the first object and the second object.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the second mapping vector is inversely mapped to obtain the second feature information corresponding to the second mapping vector, and the second object to which the second feature information belongs is determined.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the mapping model is trained according to the sample information.
  • the loss function used to train the mapping model includes at least one of the following:
  • L_neck is the first loss value of the mapping model, and its terms are the preset margin parameter, the sample label, the mapping vector corresponding to the sample user identification, and the mapping vector corresponding to the sample data;
  • L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the mapping vectors corresponding to the sample user identifications and the sample data, Cov(E) is the covariance matrix of the matrix E, f is the transposition function, and diag(·) is the diagonal element extraction function of the matrix.
  • the mapping model is an encoding model in an autoencoder; the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the autoencoder is trained according to the sample information.
  • the loss function used to train the autoencoder includes at least:
  • L_reconstruct is the loss value of the autoencoder, computed from the feature information of the sample user identification or the feature information of the sample data and from the feature information output after that input is processed by the autoencoder.
  • the first object is a user identification and the second object is candidate data, or the first object is candidate data and the second object is a user identification; the at least one piece of program code is loaded and executed by the processor to implement the following:
  • the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of the user identification to obtain the user mapping vector; the data mapping sub-model is used to map the feature information of the candidate data to obtain the data mapping vector.
  • the embodiment of the present application also provides a computer program in which at least one piece of program code is stored, and the at least one piece of program code is loaded and executed by a processor to implement the following steps:
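  • the first feature information of the first object is acquired, where the first object belongs to the user identification or the candidate data;
  • based on the mapping model, the first feature information is mapped to the target space to obtain the first mapping vector corresponding to the first object in the target space, where the target space includes the user mapping vector corresponding to the user identification and the data mapping vector corresponding to the candidate data;
  • a recommendation is made based on the first object and the second object according to the distance between any two mapping vectors in the target space, where the second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than the preset distance, and the second mapping vector and the first mapping vector belong to different categories.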
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • based on the mapping model, the second feature information is mapped to the target space to obtain the second mapping vector corresponding to the second object in the target space;
  • a recommendation is made based on the first object and the second object.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • from at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance is selected;
  • the second object corresponding to the second mapping vector is determined, and recommendations are made based on the first object and the second object.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the second mapping vector is inversely mapped to obtain the second feature information corresponding to the second mapping vector, and the second object to which the second feature information belongs is determined.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the mapping model is trained according to the sample information.
  • the loss function used to train the mapping model includes at least one of the following:
  • L_neck is the first loss value of the mapping model, and its terms are the preset margin parameter, the sample label, the mapping vector corresponding to the sample user identification, and the mapping vector corresponding to the sample data;
  • L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the mapping vectors corresponding to the sample user identifications and the sample data, Cov(E) is the covariance matrix of the matrix E, f is the transposition function, and diag(·) is the diagonal element extraction function of the matrix.
  • the mapping model is an encoding model in an autoencoder; the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • sample information is acquired, where the sample information includes the feature information of the sample user identification, the feature information of the sample data, and the sample label, and the sample label indicates whether to recommend the sample data to the sample user identification;
  • the autoencoder is trained according to the sample information.
  • the loss function used to train the autoencoder includes at least:
  • L_reconstruct is the loss value of the autoencoder, computed from the feature information of the sample user identification or the feature information of the sample data and from the feature information output after that input is processed by the autoencoder.
  • the first object is a user identification and the second object is candidate data, or the first object is candidate data and the second object is a user identification; the at least one piece of program code is loaded and executed by the processor to implement the following:
  • the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of the user identification to obtain the user mapping vector; the data mapping sub-model is used to map the feature information of the candidate data to obtain the data mapping vector.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and device for making a recommendation, a computer device, and a storage medium, related to the technical field of computers. The method comprises: acquiring first characteristic information of a first object; mapping the first characteristic information to a target space on the basis of a mapping model to produce a first mapping vector corresponding to the first object in the target space; acquiring, on the basis of the distance between any two mapping vectors in the target space, a second object corresponding to a second mapping vector from which the distance to the first mapping vector is less than a preset distance, and making a recommendation on the basis of the first object and of the second object. The method does not involve other objects besides the first object and the second object during a recommendation process, that is, the method obviates the need to acquire the other objects besides the first object, and is not limited by the other objects during application, thus expanding the range of applications.

Description

Recommendation method, device, computer device, and storage medium
This application claims priority to Chinese patent application No. 201911026124.6, filed on October 25, 2019 and entitled "Recommendation method, device, computer device, and storage medium", the entire content of which is incorporated into this application by reference.
Technical field
The embodiments of the present application relate to the field of computer technology, and in particular to a recommendation method, device, computer device, and storage medium.
Background
With the development of computer technology, more and more users use electronic devices to buy commodities, read articles, or watch videos. As the scale of data gradually expands, how to recommend data such as commodities, articles, or videos to users has become an urgent problem.
When data is recommended to a user, data similar to the data that the user has previously processed is selected and recommended to the user. For example, commodities similar to those the user has previously purchased are recommended to the user.
However, this solution can make a recommendation only after the data previously processed by the user has been obtained, so it is highly restrictive and has a narrow range of application.
Summary of the invention
The embodiments of the present application provide a recommendation method, device, computer device, and storage medium, which expand the range of application. The technical solution is as follows:
In one aspect, a recommendation method is provided, which is applied to a server, and the method includes:
acquiring first feature information of a first object, where the first object belongs to a user identification or candidate data;
mapping, based on a mapping model, the first feature information to a target space to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes user mapping vectors corresponding to user identifications and data mapping vectors corresponding to candidate data;
making a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
In another aspect, a recommendation device is provided, and the device includes:
a first information acquisition module, configured to acquire first feature information of a first object, where the first object belongs to a user identification or candidate data;
a first mapping module, configured to map, based on a mapping model, the first feature information to a target space to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes user mapping vectors corresponding to user identifications and data mapping vectors corresponding to candidate data;
a recommendation module, configured to make a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
In another aspect, a computer device is provided. The computer device includes a processor and a memory, at least one piece of program code is stored in the memory, and the at least one piece of program code is loaded and executed by the processor to implement the operations performed in the recommendation method.
In another aspect, a computer-readable storage medium is provided. At least one piece of program code is stored in the computer-readable storage medium, and the at least one piece of program code is loaded and executed by a processor to implement the operations performed in the recommendation method.
In yet another aspect, a computer program is provided. At least one piece of program code is stored in the computer program, and the at least one piece of program code is loaded and executed by a processor to implement the operations performed in the recommendation method.
With the method, device, computer device, and storage medium provided by the embodiments of the present application, only the first object needs to be acquired and mapped to the target space. The second object to be used for the recommendation can then be obtained according to the distances between the mapping vectors in the target space, so that a recommendation is made based on the first object and the second object. The recommendation process involves no objects other than the first object and the second object, that is, no other objects need to be acquired, so the method is not restricted by other objects in application, which expands the range of application.
Description of the drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
FIG. 2 is a flowchart of a recommendation method provided by an embodiment of the present application.
FIG. 3 is a schematic diagram of a mapping vector distance provided by an embodiment of the present application.
FIG. 4 is a schematic diagram of another mapping vector distance provided by an embodiment of the present application.
FIG. 5 is a flowchart of another recommendation method provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of a recommendation interface provided by an embodiment of the present application.
FIG. 7 is a schematic diagram of an autoencoder provided by an embodiment of the present application.
FIG. 8 is a schematic diagram of a vector mapping provided by an embodiment of the present application.
FIG. 9 is a schematic diagram of another vector mapping provided by an embodiment of the present application.
FIG. 10 is a schematic diagram of another vector mapping provided by an embodiment of the present application.
FIG. 11 is a schematic diagram of a target space vector distribution provided by an embodiment of the present application.
FIG. 12 is a schematic diagram of a manifold structure provided by an embodiment of the present application.
FIG. 13 is a schematic diagram of a decoding process of a mapping vector provided by an embodiment of the present application.
FIG. 14 is a schematic diagram of a dual autoencoder provided by an embodiment of the present application.
FIG. 15 is a schematic diagram of a reconstruction process of a manifold structure provided by an embodiment of the present application.
FIG. 16 is a schematic diagram of a collaborative metric learning effect provided by an embodiment of the present application.
FIG. 17 is a schematic diagram of a deep model provided by an embodiment of the present application.
FIG. 18 is a schematic structural diagram of a recommendation device provided by an embodiment of the present application.
FIG. 19 is a schematic structural diagram of another recommendation device provided by an embodiment of the present application.
FIG. 20 is a schematic structural diagram of another recommendation device provided by an embodiment of the present application.
FIG. 21 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
FIG. 22 is a schematic structural diagram of a server provided by an embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the implementations of the present application are described in further detail below with reference to the accompanying drawings.
The terms "first", "second", and the like used in this application may be used herein to describe various concepts, but unless otherwise specified, these concepts are not limited by these terms. These terms are only used to distinguish one concept from another. For example, without departing from the scope of the present application, a first object could be called a second object, and a second object could be called a first object.
The term "at least one" used in this application means one or more, and the number is an integer; for example, at least one may be 1, 2, 3, and so on. The term "multiple" used in this application means two or more, and the number is an integer; for example, multiple may be 2, 3, 4, and so on.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application. The implementation environment includes at least one terminal 101 and a server 102. The at least one terminal 101 is connected to the server 102 and logs in to the server 102 based on a user identification.
The server 102 stores multiple pieces of data, including video data, audio data, text data, picture data, and the like. While any terminal 101 is running, the server 102 recommends any piece of the data to the terminal 101, and the terminal 101 displays it.
The terminal 101 is a device of any of various types, such as a mobile phone or a tablet computer. The server 102 is a single server, a server cluster composed of several servers, or a cloud computing service center.
FIG. 2 is a flowchart of a recommendation method provided by an embodiment of the present application. The embodiment is executed by a server. Referring to FIG. 2, the method includes:
201. Acquire first feature information of a first object and second feature information of a second object.
The embodiments of the present application involve two categories of objects: user identifications and candidate data. The feature information of a user identification describes the user corresponding to that user identification, and includes information such as the user's age and gender.
In a possible implementation, the feature information of the user identification also includes the user's interest tags. Optionally, the interest tags are obtained from the candidate data that the user identification has previously processed: for example, the types of commodities the user likes are obtained from the user's purchase records, the types of articles the user likes from article reading records, and the types of videos the user likes from video viewing records.
另外,备选数据的特征信息用于描述备选数据。In addition, the feature information of the candidate data is used to describe the candidate data.
在一种可能实现方式中,如果该备选数据为商品,则特征信息为该商品的价格、类型等信息;如果该备选数据为文章,则特征信息为文章的类型、文字数量等信息;如果该备选数据为视频,则特征信息为视频的类型、时长等信息。In a possible implementation, if the candidate data is a commodity, the characteristic information is information such as the price and type of the commodity; if the candidate data is an article, the characteristic information is information such as the type and number of words of the article; If the candidate data is a video, the characteristic information is information such as the type and duration of the video.
本申请实施例中,第一对象属于用户标识或备选数据,第二对象也属于用户标识或备选数据,第一对象与第二对象属于不同的类别,即如果第一对象为用户标识,则第二对象为备选数据;如果第一对象为备选数据,则第二对象为用户标识。In the embodiment of this application, the first object belongs to user identification or candidate data, and the second object also belongs to user identification or candidate data. The first object and the second object belong to different categories, that is, if the first object is a user identification, The second object is the candidate data; if the first object is the candidate data, the second object is the user identification.
202. Based on a mapping model, map the first feature information and the second feature information separately to a target space, to obtain a first mapping vector corresponding to the first feature information in the target space and a second mapping vector corresponding to the second feature information in the target space.
In the embodiments of this application, the mapping model is used to map feature information: feature information is mapped into the target space through the mapping model to obtain the corresponding mapping vector. The target space differs in dimension from the original space of the feature information; it is either a lower-dimensional space or a higher-dimensional space.
Moreover, the feature information of objects of different categories can all be mapped into the target space based on the mapping model to obtain the corresponding mapping vectors. The feature information of a user identifier is mapped to the target space to obtain a corresponding user mapping vector, and the feature information of candidate data is mapped to the target space to obtain a corresponding data mapping vector.
The first feature information is input to the mapping model to obtain the corresponding first mapping vector in the target space, and the second feature information is input to the mapping model to obtain the corresponding second mapping vector in the target space. Because the first feature information and the second feature information belong to objects of different categories, the first mapping vector and the second mapping vector are mapping vectors of different categories: either the first mapping vector is a user mapping vector and the second is a data mapping vector, or the first mapping vector is a data mapping vector and the second is a user mapping vector.
In a possible implementation, the mapping model is an injective (one-to-one) mapping model: feature information and the mapping vectors obtained through the mapping model are in one-to-one correspondence, so each piece of feature information has exactly one corresponding mapping vector and each mapping vector has exactly one corresponding piece of feature information. That is, mapping any piece of feature information with the injective mapping model yields the mapping vector corresponding to that feature information, and no other feature information corresponds to that mapping vector.
In a possible implementation, the mapping model includes a plurality of mapping sub-models used to map feature information of different categories. For example, the mapping model includes a user mapping sub-model and a data mapping sub-model. The user mapping sub-model maps the feature information of a user identifier to obtain a user mapping vector, and the data mapping sub-model maps the feature information of candidate data to obtain a data mapping vector. Optionally, the mapping of the user identifier's feature information and the mapping of the candidate data's feature information are performed simultaneously or one after the other.
Optionally, mapping the first feature information of the first object to the target space based on the mapping model, to obtain the first mapping vector corresponding to the first object in the target space, covers two cases. If the first object is a user identifier, the feature information of the user identifier is mapped to the target space based on the user mapping sub-model, to obtain the user mapping vector corresponding to the user identifier in the target space. If the first object is candidate data, the feature information of the candidate data is mapped to the target space based on the data mapping sub-model, to obtain the data mapping vector corresponding to the candidate data in the target space.
Optionally, mapping the second feature information of the second object to the target space based on the mapping model, to obtain the second mapping vector corresponding to the second object in the target space, covers two cases. If the second object is a user identifier, the feature information of the user identifier is mapped to the target space based on the user mapping sub-model, to obtain the user mapping vector corresponding to the user identifier in the target space. If the second object is candidate data, the feature information of the candidate data is mapped to the target space based on the data mapping sub-model, to obtain the data mapping vector corresponding to the candidate data in the target space.
In a possible implementation, the server stores the feature information together with the mapping vectors obtained through the mapping model: the feature information of each user identifier is stored in association with the corresponding user mapping vector, and the feature information of each piece of candidate data is stored in association with the corresponding data mapping vector. Later, when two mapping vectors are determined and a recommendation is made based on the objects corresponding to them, the feature information corresponding to each mapping vector can be retrieved easily.
It should be noted that, in the embodiments of this application, the first feature information and the second feature information are mapped to the target space based on the mapping model. Optionally, the two are mapped at the same time, or the second feature information is mapped first and the first feature information afterwards. It is only necessary that the second mapping vector already exists in the target space when the distance between the first mapping vector and the second mapping vector is determined.
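The arrangement of a mapping model made up of a user mapping sub-model and a data mapping sub-model can be sketched as follows. This is a minimal illustration, not the model of the embodiments: the linear form of each sub-model, the weight values, the feature encodings, and the names `make_linear_submodel`, `MAPPING_MODEL`, and `map_to_target_space` are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
SubModel = Callable[[Sequence[float]], Vector]


def make_linear_submodel(weights: Sequence[Sequence[float]]) -> SubModel:
    """Build a sub-model that maps a feature vector into the target space.

    Each row of `weights` produces one coordinate of the mapping vector.
    A linear map is an assumption made purely for illustration.
    """
    def submodel(features: Sequence[float]) -> Vector:
        return [sum(w * f for w, f in zip(row, features)) for row in weights]
    return submodel


# Hypothetical encodings: a user is (age / 100, gender flag);
# a piece of candidate data is (price / 1000, type flag, duration in hours).
# Both sub-models output 2-dimensional vectors, so they share one target space.
MAPPING_MODEL: Dict[str, SubModel] = {
    "user": make_linear_submodel([[1.0, 0.0], [0.0, 1.0]]),
    "data": make_linear_submodel([[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]),
}


def map_to_target_space(category: str, features: Sequence[float]) -> Vector:
    """Pick the sub-model for the object's category and apply it."""
    return MAPPING_MODEL[category](features)


user_vector = map_to_target_space("user", [0.25, 1.0])
data_vector = map_to_target_space("data", [0.5, 1.0, 0.0])
```

Because both sub-models write into the same target space, the two resulting vectors can be compared directly, regardless of the order in which they were computed.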
203. Obtain the distance between the first mapping vector and the second mapping vector.
In the embodiments of this application, the distance between the first mapping vector and the second mapping vector must be measured. A metric therefore needs to be defined in the target space to measure the distance between any two mapping vectors. The metric must satisfy at least the following conditions:
(1) It satisfies the mathematical requirements of a metric, namely non-negativity, identity of indiscernibles, symmetry, and the triangle inequality.
(2) It can be computed from a Euclidean metric.
(3) While preserving the distances between mapping vectors, it allows as many mapping vectors as possible to be embedded.
A consistent metric is defined in the embodiments of this application and is used to measure the distance between any two mapping vectors. The consistent metric $d_a(\vec{x}, \vec{y})$ is defined as:
$$d_a(\vec{x}, \vec{y}) = \min\{\, d_\infty(\vec{x}, \vec{y}),\ a \,\}$$
$$d_\infty(\vec{x}, \vec{y}) = \sup_i \{\, |x_i - y_i| \,\}$$
where $\vec{x}$ and $\vec{y}$ are any two mapping vectors in the target space; $d_\infty(\vec{x}, \vec{y})$ is the Chebyshev distance, that is, the $L_\infty$ distance; $a$ is the preset distance, with $a > 0$; and $\sup\{\cdot\}$ is the supremum function. As $i$ takes different values, $|x_i - y_i|$ yields multiple different values, and $\sup\{\cdot\}$ denotes the least upper bound of those values: if the values have a maximum, that maximum is the least upper bound; if the largest values approach some value arbitrarily closely, that value is the least upper bound. $\min\{\cdot\}$ denotes the smallest of multiple values.
The consistent metric uses the $L_\infty$ distance to compute the largest coordinate-wise difference between two mapping vectors, compares that maximum with $a$, and takes the smaller of the two.
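The definition can be written as a short function. The sketch below assumes finite-dimensional vectors, for which the supremum is simply the maximum; the function names are illustrative and do not come from the application.

```python
from typing import Sequence


def chebyshev_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """L-infinity distance: the largest coordinate-wise absolute difference."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return max((abs(xi - yi) for xi, yi in zip(x, y)), default=0.0)


def consistent_metric(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """Chebyshev distance capped at the preset distance a (a must be > 0)."""
    if a <= 0:
        raise ValueError("the preset distance a must be positive")
    return min(chebyshev_distance(x, y), a)


near = consistent_metric([0.0, 0.0], [0.25, 0.5], a=1.0)  # below the cap
far = consistent_metric([0.0, 0.0], [3.0, 0.5], a=1.0)    # capped at a
```

Here `near` is the plain Chebyshev distance 0.5, while `far` is capped at 1.0 even though the underlying Chebyshev distance is 3.0.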
For example, let $\vec{x}$ denote a certain mapping vector. The target space contains multiple mapping vectors, which are regarded as points. The line in FIG. 3 connects the points whose distance from $\vec{x}$ is $a$. FIG. 4 is a top view of FIG. 3; every point on the circle in FIG. 4 is at distance $a$ from the mapping vector $\vec{x}$. Any mapping vector in the hatched part is considered close to $\vec{x}$, and any mapping vector in the region outside the circle is considered far from $\vec{x}$.
Proof that the defined consistent metric satisfies the above conditions
The proof that the consistent metric satisfies condition (1) is as follows:
Non-negativity: since $d_\infty(\vec{x}, \vec{y}) \ge 0$ and $a > 0$, the smaller of the two, $d_a(\vec{x}, \vec{y}) = \min\{ d_\infty(\vec{x}, \vec{y}),\ a \}$, satisfies $d_a(\vec{x}, \vec{y}) \ge 0$. Therefore, non-negativity holds.
Identity of indiscernibles: if $\vec{x} = \vec{y}$, then $d_\infty(\vec{x}, \vec{y}) = 0$, and hence $d_a(\vec{x}, \vec{y}) = \min\{0,\ a\} = 0$. Conversely, if $d_a(\vec{x}, \vec{y}) = 0$, then because $a > 0$ it follows that $d_\infty(\vec{x}, \vec{y}) = 0$, and hence $\vec{x} = \vec{y}$. Therefore, the identity of indiscernibles holds.
Symmetry: since $d_\infty$ is a well-defined metric, it is symmetric, that is, $d_\infty(\vec{x}, \vec{y}) = d_\infty(\vec{y}, \vec{x})$. Then $\min\{ d_\infty(\vec{x}, \vec{y}),\ a \} = \min\{ d_\infty(\vec{y}, \vec{x}),\ a \}$, and hence $d_a(\vec{x}, \vec{y}) = d_a(\vec{y}, \vec{x})$. Therefore, symmetry holds.
Triangle inequality: for any three vectors $\vec{x}$, $\vec{y}$, and $\vec{z}$, the sum of the distances from any one of them to the other two should be no less than the distance between the other two.
The sum of the distances from one vector to the other two is:
$$d_a(\vec{x}, \vec{y}) + d_a(\vec{y}, \vec{z}) = \min\{ d_\infty(\vec{x}, \vec{y}),\ a \} + \min\{ d_\infty(\vec{y}, \vec{z}),\ a \}$$
Define a new metric $\bar{d}(x, y) = \min\{ d(x, y),\ a \}$, where $d$ is a well-defined metric. The non-negativity, identity of indiscernibles, and symmetry of $\bar{d}$ are evident. Its triangle inequality follows from:
$$\bar{d}(x, z) = \min\{ d(x, z),\ a \} \le \min\{ d(x, y) + d(y, z),\ a \} \le \min\{ d(x, y),\ a \} + \min\{ d(y, z),\ a \} = \bar{d}(x, y) + \bar{d}(y, z)$$
Therefore, $\bar{d}(x, y) = \min\{ d(x, y),\ a \}$ is a well-defined metric. Taking $d = d_\infty$, the inequality above is written as:
$$d_a(\vec{x}, \vec{y}) + d_a(\vec{y}, \vec{z}) \ge d_a(\vec{x}, \vec{z})$$
The proof above establishes that the consistent metric is a well-defined metric that meets all the requirements of the definition of a metric.
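The four properties can also be spot-checked numerically. The sketch below is a sanity check on a small random sample, not a substitute for the proof; it assumes the capped Chebyshev form of the consistent metric described above and uses integer coordinates so that every comparison is exact. The helper names and sample parameters are illustrative.

```python
import itertools
import random


def consistent_metric(x, y, a):
    """Chebyshev distance between x and y, capped at the preset distance a."""
    return min(max(abs(xi - yi) for xi, yi in zip(x, y)), a)


def check_metric_axioms(points, a):
    """Return True if all four metric axioms hold on the given sample points."""
    for x, y in itertools.product(points, repeat=2):
        d_xy = consistent_metric(x, y, a)
        if d_xy < 0:                                # non-negativity
            return False
        if (d_xy == 0) != (x == y):                 # identity of indiscernibles
            return False
        if d_xy != consistent_metric(y, x, a):      # symmetry
            return False
    for x, y, z in itertools.product(points, repeat=3):
        d_xz = consistent_metric(x, z, a)
        if d_xz > consistent_metric(x, y, a) + consistent_metric(y, z, a):
            return False                            # triangle inequality
    return True


rng = random.Random(0)  # fixed seed so the sample is reproducible
sample_points = [tuple(rng.randint(-5, 5) for _ in range(3)) for _ in range(20)]
axioms_hold = check_metric_axioms(sample_points, a=3)
```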
Because the consistent metric is based on a metric of Euclidean space, it satisfies condition (2).
The consistent metric differs from other metrics on Euclidean space. In the metric topological space induced by this metric, every vector whose $L_\infty$ distance from $\vec{x}$ is not less than $a$ is considered to be at distance exactly $a$ from $\vec{x}$, so even a low-dimensional space can hold many vectors that are at equal distances from one another. To adjust the mapping capacity of the space, only $a$ needs to be adjusted: the smaller $a$ is, the stronger the mapping capacity; the larger $a$ is, the weaker the mapping capacity; and as $a$ tends to infinity, the space becomes completely equivalent to the metric topological space induced by the $l_\infty$ metric. Therefore, condition (3) is satisfied.
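The effect of the cap on embedding capacity can be illustrated with a small experiment. The sketch below assumes the capped Chebyshev form of the consistent metric; the grid layout and the value of `a` are arbitrary choices made for illustration.

```python
import itertools


def chebyshev_distance(x, y):
    """L-infinity distance between two vectors of equal dimension."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))


def consistent_metric(x, y, a):
    """Chebyshev distance capped at the preset distance a."""
    return min(chebyshev_distance(x, y), a)


a = 1.0
# 25 points of a 5 x 5 grid in a 2-dimensional space, spaced a apart.
grid = [(i * a, j * a) for i in range(5) for j in range(5)]
pairs = list(itertools.combinations(grid, 2))

# Under the capped metric every pair of distinct grid points is at distance a.
capped_distances = {consistent_metric(p, q, a) for p, q in pairs}

# Under the plain L-infinity distance the same points are at varying distances.
plain_distances = {chebyshev_distance(p, q) for p, q in pairs}

# With a cap larger than every pairwise distance, the cap never applies.
large_a_matches_plain = all(
    consistent_metric(p, q, 100.0) == chebyshev_distance(p, q) for p, q in pairs
)
```

In this sample, 25 two-dimensional points are mutually equidistant under the capped metric, whereas the plain $L_\infty$ distance yields four distinct pairwise distances for the same points.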
The distance between the first mapping vector and the second mapping vector is measured with the consistent metric defined above.
204. If the distance is less than a preset distance, make a recommendation based on the first object and the second object.
In the embodiments of this application, if the distance between the first mapping vector and the second mapping vector in the target space is less than the preset distance, a recommendation is made based on the first object and the second object.
If the distance between the first mapping vector and the second mapping vector in the target space is not less than the preset distance, no recommendation is made based on the first object and the second object.
In a possible implementation, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier. Making a recommendation based on the first object and the second object then includes recommending the candidate data to the user identifier.
In a possible implementation, recommending the candidate data to the user identifier includes: the server sends the candidate data to the terminal on which the user identifier is logged in, and the terminal displays the candidate data so that the user can view it.
For example, FIG. 6 shows a recommendation interface displayed by the terminal. The interface includes a user avatar, a follow option, and a recommendation option. Tapping the user avatar shows user information such as the user identifier; tapping the follow option shows articles published by other user identifiers that the user identifier follows; tapping the recommendation option makes the interface display articles recommended to the user as potentially interesting, along with some popular articles. The potentially interesting articles are recommended based on the feature information of the user identifier.
The preset distance is the distance threshold below which the user corresponding to the user identifier is considered interested in the candidate data. The preset distance is determined randomly by the server or set as needed. If higher recommendation accuracy is required, that is, if the recommended candidate data should match the user's interests more closely, a smaller preset distance is set; if as much recommended candidate data as possible should be obtained, a larger preset distance is set.
For example, given a user and a product, the server decides whether to recommend the product to the user as follows. It first obtains the user feature information of the user and the product feature information of the product, and inputs each into the mapping model to obtain the user mapping vector of the user and the product mapping vector of the product. Based on the consistent metric, it obtains the distance between the user mapping vector and the product mapping vector. If the distance is less than the preset distance, the user is considered interested in the product, and the product is recommended to the user. If the distance is not less than the preset distance, the user is considered not interested in the product, and the product does not need to be recommended to the user.
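The decision in this example can be sketched as a single predicate. The sketch assumes that the mapping vectors have already been computed and that the cap of the consistent metric equals the preset distance, as in the definition of the metric; the function name `should_recommend` and the sample vectors are hypothetical.

```python
from typing import Sequence


def consistent_metric(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """Chebyshev distance between x and y, capped at the preset distance a."""
    return min(max(abs(xi - yi) for xi, yi in zip(x, y)), a)


def should_recommend(
    user_vector: Sequence[float],
    product_vector: Sequence[float],
    preset_distance: float,
) -> bool:
    """Recommend only if the consistent distance is below the preset distance."""
    distance = consistent_metric(user_vector, product_vector, preset_distance)
    return distance < preset_distance


# Hypothetical mapping vectors in a 2-dimensional target space.
user_vector = [0.2, 0.9]
close_product = [0.3, 0.8]
distant_product = [0.9, 0.1]

recommend_close = should_recommend(user_vector, close_product, preset_distance=0.25)
recommend_distant = should_recommend(user_vector, distant_product, preset_distance=0.25)
```

Because the metric is capped at the preset distance, any product at or beyond that distance yields exactly the preset distance and is therefore not recommended.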
It should be noted that, in another embodiment, the feature information of user identifiers can be input to the mapping model, similar users can be obtained by a method similar to that of the embodiments of this application, and a recommendation can then be made. Likewise, the feature information of candidate data can be input to the mapping model, similar data can be obtained by a similar method, and a recommendation can then be made.
For example, for two users of an application client that supports adding friends, the feature information of each user is input to the mapping model to obtain two corresponding user mapping vectors. If the distance between the two user mapping vectors is less than the preset distance, the two users are considered similar, and one of them can be recommended to the other.
With the method provided by the embodiments of this application, it is only necessary to obtain the first object and map it into the target space; the second object to recommend can then be obtained from the distances between the mapping vectors in the target space, and a recommendation is made based on the first object and the second object. The recommendation process involves no objects other than the first object and the second object, that is, no other objects need to be obtained. The method is therefore not constrained by other objects when applied, which broadens its range of application.
For example, for a user and a product, the related art needs to obtain the products the user has previously purchased and decides whether to recommend the product according to whether it is similar to those purchased products. In the embodiments of this application, it is only necessary to obtain the distance in the target space between the user mapping vector and the product mapping vector, based on the user feature information of the user and the product feature information of the product, and to make the recommendation according to that distance: if the distance is less than the preset distance, the product is recommended to the user. There is no need to decide indirectly through other products.
FIG. 5 is a flowchart of another recommendation method provided by an embodiment of this application. The method is executed by a server. Referring to FIG. 5, the method includes:
501. Acquire first feature information of a first object.
The specific implementation is similar to that of step 201 in the foregoing embodiment and is not repeated here.
502. Based on a mapping model, map the first feature information to a target space, to obtain a first mapping vector corresponding to the first feature information in the target space.
The target space includes user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data.
The way in which the first feature information is mapped to obtain the corresponding first mapping vector is similar to the implementation of step 202 in the foregoing embodiment and is not repeated here.
It should be noted that, in the embodiments of this application, only the first feature information needs to be mapped to the target space to obtain the corresponding first mapping vector. How the other mapping vectors in the target space, apart from the first mapping vector, are mapped is not limited; optionally, they are mapped with the mapping model of the embodiments of this application or in another manner.
503. Determine at least one third mapping vector in the target space that belongs to a different category from the first mapping vector.
504. Obtain the distance between the first mapping vector and each third mapping vector.
The third mapping vector and the first mapping vector belong to different categories. If the first mapping vector is the mapping vector of a user identifier, the third mapping vector is the mapping vector of candidate data; if the first mapping vector is the mapping vector of candidate data, the third mapping vector is the mapping vector of a user identifier.
The target space includes at least one third mapping vector. The position of each third mapping vector in the space is determined, and the distance between the first mapping vector and each third mapping vector is obtained according to the consistent metric defined in the target space.
The way in which the distance is obtained is similar to the implementation of step 203 above and is not repeated here.
505. From the at least one third mapping vector, select a second mapping vector whose distance from the first mapping vector is less than a preset distance.
After the distance between the first mapping vector and each third mapping vector is obtained in step 504, the second mapping vector is selected from those third mapping vectors whose distance from the first mapping vector is less than the preset distance.
In a possible implementation, one or more second mapping vectors are selected. The number of second mapping vectors selected is set as needed.
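Steps 503 to 505 can be sketched as a filter over the third mapping vectors. The sketch below assumes the capped Chebyshev form of the consistent metric. Ordering the selected vectors by distance and truncating them to `max_results` is one possible way to limit their number and is an assumption made here; the function name, keys, and sample vectors are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def consistent_metric(x: Sequence[float], y: Sequence[float], a: float) -> float:
    """Chebyshev distance between x and y, capped at the preset distance a."""
    return min(max(abs(xi - yi) for xi, yi in zip(x, y)), a)


def select_second_vectors(
    first_vector: Sequence[float],
    third_vectors: Dict[str, Sequence[float]],
    preset_distance: float,
    max_results: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Return (key, distance) pairs for third vectors within the preset distance."""
    # Step 504: distance from the first vector to every third vector.
    scored = [
        (key, consistent_metric(first_vector, vector, preset_distance))
        for key, vector in third_vectors.items()
    ]
    # Step 505: keep only those strictly closer than the preset distance.
    selected = sorted(
        (pair for pair in scored if pair[1] < preset_distance),
        key=lambda pair: pair[1],
    )
    return selected if max_results is None else selected[:max_results]


# Hypothetical data mapping vectors of a different category from the first vector.
third_vectors = {
    "item_a": [0.1, 0.0],
    "item_b": [0.3, 0.2],
    "item_c": [2.0, 0.0],
}
all_selected = select_second_vectors([0.0, 0.0], third_vectors, preset_distance=0.5)
top_selected = select_second_vectors(
    [0.0, 0.0], third_vectors, preset_distance=0.5, max_results=1
)
```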
506. Determine the second object corresponding to the second mapping vector, and make a recommendation based on the first object and the second object.
The second object corresponding to the selected second mapping vector is determined, and a recommendation is made based on the first object and the second object.
In a possible implementation, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier. Making a recommendation based on the first object and the second object then includes recommending the candidate data to the user identifier.
In a possible implementation, recommending the candidate data to the user identifier includes: the server sends the candidate data to the terminal on which the user identifier is logged in, and the terminal displays the candidate data so that the user can view it.
In a possible implementation, the server stores the correspondence between each mapping vector and its object, and determines the object corresponding to each mapping vector by querying that correspondence.
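A stored correspondence of this kind can be sketched as a simple lookup table. The class below is a minimal in-memory illustration under the assumption that mapping vectors are compared by exact value; the class name, method names, and identifiers are hypothetical, and a real server would likely use persistent storage.

```python
from typing import Dict, Optional, Sequence, Tuple


class VectorIndex:
    """In-memory correspondence between mapping vectors and object identifiers."""

    def __init__(self) -> None:
        self._object_by_vector: Dict[Tuple[float, ...], str] = {}

    def add(self, vector: Sequence[float], object_id: str) -> None:
        """Record that `vector` is the mapping vector of `object_id`."""
        self._object_by_vector[tuple(vector)] = object_id

    def lookup(self, vector: Sequence[float]) -> Optional[str]:
        """Return the object for an exactly matching vector, or None."""
        return self._object_by_vector.get(tuple(vector))


index = VectorIndex()
index.add([0.1, 0.0], "item_a")
index.add([0.3, 0.2], "item_b")

found = index.lookup([0.3, 0.2])
missing = index.lookup([9.0, 9.0])
```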
In another possible implementation, the second mapping vector is inversely mapped based on an inverse mapping model to obtain the second object corresponding to the second mapping vector. The inverse mapping model is used to inversely map mapping vectors: a mapping vector is mapped back into the original space through the inverse mapping model to obtain the corresponding feature information.
In a possible implementation, the inverse mapping model is an injective (one-to-one) inverse mapping model, that is, mapping vectors and the feature information obtained through the inverse mapping model are in one-to-one correspondence: each mapping vector has exactly one corresponding piece of feature information, and each piece of feature information has exactly one corresponding mapping vector.
In another possible implementation, the inverse mapping model is a non-injective inverse mapping model, that is, each mapping vector has exactly one corresponding piece of feature information, but one piece of feature information may correspond to multiple mapping vectors.
In addition, if the second object is a user identifier, the inverse mapping model is a user inverse mapping model; if the second object is candidate data, the inverse mapping model is a data inverse mapping model.
需要说明的一点是,如果第一对象为用户标识,第二对象为备选数据,获取用户标识的特征信息,基于映射模型,将该特征信息映射至目标空间,得到该特征信息对应的用户映射向量,然后确定目标空间中的至少一个备选数据的数据映射向量,获取用户映射向量与每个数据映射向量之间的距离,从至少一个数据映射向量中,选取与用户映射向量之间的距离小于预设距离的数据映射向量,确定选取的数据映射向量对应的备选数据,则将选取的备选数据推荐给用户标识。One thing to note is that if the first object is a user ID and the second object is candidate data, the characteristic information of the user ID is obtained, and based on the mapping model, the characteristic information is mapped to the target space, and the user mapping corresponding to the characteristic information is obtained. Vector, and then determine the data mapping vector of at least one candidate data in the target space, obtain the distance between the user mapping vector and each data mapping vector, and select the distance to the user mapping vector from at least one data mapping vector For the data mapping vector less than the preset distance, the candidate data corresponding to the selected data mapping vector is determined, and the selected candidate data is recommended to the user identification.
If the first object is candidate data and the second object is a user identifier, the feature information of the candidate data is obtained and mapped to the target space based on the mapping model, yielding the data mapping vector corresponding to that feature information. The user mapping vector of at least one user identifier in the target space is then determined, the distance between the data mapping vector and each user mapping vector is obtained, the user mapping vectors whose distance to the data mapping vector is less than the preset distance are selected, the user identifiers corresponding to the selected user mapping vectors are determined, and the candidate data is recommended to the selected user identifiers.
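The selection step in both directions can be illustrated with a minimal sketch. It assumes the mapping vectors have already been produced by the mapping model and uses the Euclidean distance as a stand-in for the metric of the target space; the function name `select_within`, the sample vectors, and the threshold value are hypothetical.

```python
import math

def select_within(query_vec, candidates, preset_distance):
    """Return the keys of all candidates whose mapping vector is closer to
    query_vec than preset_distance, nearest first.

    Assumption: Euclidean distance stands in for the target-space metric.
    """
    hits = [(math.dist(query_vec, vec), key) for key, vec in candidates.items()]
    return [key for dist, key in sorted(hits) if dist < preset_distance]

# Hypothetical mapping vectors, assumed to come from the mapping model.
user_vec = [0.2, 0.1]
item_vecs = {"item_a": [0.25, 0.1], "item_b": [0.9, 0.8], "item_c": [0.2, 0.3]}

# First object = user identifier: pick candidate data to recommend.
recommended = select_within(user_vec, item_vecs, preset_distance=0.3)

# First object = candidate data: pick user identifiers to recommend it to.
interested = select_within(item_vecs["item_b"], {"user_1": user_vec}, preset_distance=0.3)
```

The same function serves both cases because only the roles of query and candidates are swapped.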
It should also be noted that, in another embodiment, similar user identifiers or similar candidate data can be obtained for objects of the same category by a method similar to that of the embodiments of the present application.
For example, if the first object and the second object are both user identifiers, the feature information of the first user identifier is obtained and mapped to the target space based on the mapping model, yielding the first user mapping vector corresponding to that feature information. The second user mapping vector of at least one second user identifier in the target space is then determined, the distance between the first user mapping vector and each second user mapping vector is obtained, the second user mapping vectors whose distance to the first user mapping vector is less than the preset distance are selected, and the second user identifiers corresponding to the selected second user mapping vectors are determined. The user represented by the first user identifier is then considered to have interests similar to those of the users represented by the selected second user identifiers, and the selected second user identifiers are recommended to the first user identifier.
As another example, if the candidate data is products and the first object and the second object are both products, the feature information of the first product is obtained and mapped to the target space based on the mapping model, yielding the first data mapping vector corresponding to that feature information. The second data mapping vector of at least one second product in the target space is then determined, the distance between the first data mapping vector and each second data mapping vector is obtained, the second data mapping vectors whose distance to the first data mapping vector is less than the preset distance are selected, and the second products corresponding to the selected second data mapping vectors are determined. The first product and the selected second products are then considered similar, and the selected second products are recommended to users who have purchased the first product.
With the method provided by the embodiments of the present application, it is only necessary to obtain the first object and map it to the target space; the second object to be recommended can then be obtained from the distances between the mapping vectors in the target space, and the recommendation is made based on the first object and the second object. The recommendation process involves no objects other than the first object and the second object, so no other objects need to be obtained and the method is not restricted by other objects in application, which expands the scope of application.
Moreover, in the related art, recommending candidate data for a user identifier requires the data that the user has previously processed. For a new user identifier or new candidate data, the relationship between it and other user identifiers or candidate data cannot be obtained, so candidate data cannot be recommended to the new user identifier, and new candidate data cannot be recommended to a user identifier. In contrast, when the present application recommends candidate data to a user identifier, it involves no user identifiers or candidate data other than that user identifier and that candidate data. Recommendations can therefore also be made for new user identifiers or new candidate data, which expands the scope of application.
Moreover, if the first object is a user identifier and the second object is candidate data, it is only necessary to map the feature information of the user identifier to the target space to obtain the data mapping vectors whose distance to the user mapping vector of that user identifier is less than the preset distance, and thereby determine the candidate data that the corresponding user is interested in. There is no need to derive that candidate data indirectly from other user identifiers or candidate data, which expands the scope of application. Likewise, if the first object is candidate data and the second object is a user identifier, it is only necessary to map the feature information of the candidate data to the target space to obtain the user mapping vectors whose distance to the data mapping vector of that candidate data is less than the preset distance, and thereby determine the user identifiers interested in the candidate data. There is no need to derive those user identifiers indirectly from other candidate data or user identifiers, which expands the scope of application.
Moreover, the method can make the interest points of a user identifier absolute, so that the user's interests are expressed more explicitly, and can infer the features of candidate data that the user likes even when no candidate data is available.
The above embodiments involve a mapping model and a de-mapping model. To make these two models easier to train, an autoencoder can be used. The autoencoder includes an encoding model and a decoding model; the encoding model serves as the mapping model and the decoding model serves as the de-mapping model. The training process of the autoencoder is described below.
(1) Obtain sample data.
Sample information is obtained. The sample information includes the feature information of a sample user identifier, the feature information of sample data, and a sample label; the sample label indicates whether the sample data is recommended to the sample user identifier.
Optionally, the sample label is 1 or -1. A label of 1 indicates a positive relationship between the sample user identifier and the sample data, meaning that the sample data is recommended to the user identifier; a label of -1 indicates a negative relationship, meaning that the sample data is not recommended to the user identifier.
The feature information of the sample user identifier is similar to the feature information of the user identifier described above, and the feature information of the sample data is similar to the feature information of the candidate data described above; details are not repeated here.
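One possible in-memory representation of such sample information is sketched below. The class name `Sample`, its field names, and the example feature values are hypothetical; only the three components and the label values 1 and -1 come from the description above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    user_features: List[float]  # feature information of the sample user identifier
    data_features: List[float]  # feature information of the sample data
    label: int                  # 1: recommend, -1: do not recommend

    def __post_init__(self):
        # Reject any label other than the two values the description allows.
        if self.label not in (1, -1):
            raise ValueError("label must be 1 or -1")

# Two hypothetical samples for the same user: one positive, one negative.
samples = [
    Sample([0.3, 1.0, 0.0], [0.8, 0.0], 1),
    Sample([0.3, 1.0, 0.0], [0.1, 1.0], -1),
]
positives = [s for s in samples if s.label == 1]
```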
(2) Train the autoencoder on the sample data.
The feature information of the sample user identifier and the feature information of the sample data are input to the autoencoder, which outputs predicted feature information of the user identifier or of the sample data. Based on the loss value between the predicted feature information and the corresponding input feature information, the parameters of the autoencoder are adjusted so that this loss value decreases after adjustment, thereby training the autoencoder.
The structure of the autoencoder is shown in Figure 7 and includes an encoding model and a decoding model. The feature vector [Figure PCTCN2020118107-appb-000029] is input to the encoding model to obtain a corresponding mapping vector [Figure PCTCN2020118107-appb-000030]; the decoding model then decodes the mapping vector [Figure PCTCN2020118107-appb-000031] to obtain the corresponding predicted feature vector [Figure PCTCN2020118107-appb-000032]. Optionally, the encoding model and the decoding model each further include multiple hidden layers.
In one possible implementation, the sample labels are also input to the autoencoder. During training, after the mapping vectors are obtained from the encoding model, the distance between the mapping vector of the user identifier and the mapping vector of the sample data is computed to predict whether the sample user identifier and the sample data have a negative or a positive relationship. The predicted relationship is compared with the relationship represented by the input sample label, and the parameters of the autoencoder are adjusted so that, after adjustment, the predicted relationship matches the relationship represented by the sample label, thereby training the autoencoder.
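The comparison between predicted and labelled relationships can be sketched as follows. This is an illustration only: it assumes the Euclidean distance as the metric and a fixed preset distance, and the helper names `predict_relation` and `agreement` are hypothetical. It checks agreement with the labels and does not perform the parameter adjustment itself.

```python
import math

def predict_relation(user_vec, data_vec, preset_distance):
    """Predict 1 (positive) if the two mapping vectors are closer than
    preset_distance, otherwise -1 (negative). Euclidean distance is assumed."""
    return 1 if math.dist(user_vec, data_vec) < preset_distance else -1

def agreement(pairs, preset_distance):
    """Fraction of (user_vec, data_vec, label) triples whose predicted
    relationship matches the sample label."""
    matches = sum(predict_relation(u, v, preset_distance) == y for u, v, y in pairs)
    return matches / len(pairs)

# Hypothetical mapping vectors with their sample labels.
pairs = [
    ([0.0, 0.0], [0.1, 0.0], 1),    # close and labelled positive: match
    ([0.0, 0.0], [1.0, 1.0], -1),   # far and labelled negative: match
    ([0.0, 0.0], [0.9, 0.0], 1),    # far but labelled positive: mismatch
]
score = agreement(pairs, preset_distance=0.5)
```

Training would then adjust the encoder so that mismatched pairs such as the third one move to the correct side of the preset distance.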
In one possible implementation, the loss functions used to train the autoencoder include the following.
The first type:
The embodiments of the present application provide two loss functions of this type. The first loss function is:
[Figure PCTCN2020118107-appb-000033]
where L_neck1 is the loss value of the mapping model, [Figure PCTCN2020118107-appb-000034] is the sample label, whose value is 1 or -1, [Figure PCTCN2020118107-appb-000035] is the mapping vector corresponding to the feature information of the sample user identifier, and [Figure PCTCN2020118107-appb-000036] is the mapping vector corresponding to the feature information of the sample data.
Based on the consistent metric defined in the above embodiments, the distance between the mapping vector of the user identifier and the mapping vector of the candidate data under that metric is obtained, and this distance multiplied by the corresponding label is used as the first loss function.
If the first loss function is used for training, a special case must be considered: the sample user identifier and the sample data in a training sample have a positive relationship, and the encoding model of the untrained autoencoder maps them to the metric space, yielding the mapping vector [Figure PCTCN2020118107-appb-000037] for the sample user identifier and the mapping vector [Figure PCTCN2020118107-appb-000038] for the sample data. The consistent-metric distance between the two vectors [Figure PCTCN2020118107-appb-000039] and [Figure PCTCN2020118107-appb-000040] is then obtained; see Figure 8. If this distance is greater than the preset distance, the gradient is 0, and training cannot continue from this distance by gradient descent.
The second loss function is:
[Figure PCTCN2020118107-appb-000041]
where L_neck2 is the first loss value of the mapping model, λ_margin is a preset parameter, [Figure PCTCN2020118107-appb-000042] is the sample label, whose value is 1 or -1, [Figure PCTCN2020118107-appb-000043] is the mapping vector corresponding to the feature information of the sample user identifier, and [Figure PCTCN2020118107-appb-000044] is the mapping vector corresponding to the feature information of the sample data.
When the second loss function, namely the hinge loss, is used for training, the case where the sample user identifier and the sample data have a negative relationship is shown in Figure 9. There, the loss value is small in the hatched region and large in the blank region, and the arrow indicates the direction in which the vector [Figure PCTCN2020118107-appb-000045] should move so that the distance between [Figure PCTCN2020118107-appb-000046] and [Figure PCTCN2020118107-appb-000047] becomes as large as possible. The left panel shows training with the first loss function; the dashed circle marks the [Figure PCTCN2020118107-appb-000049] whose distance from [Figure PCTCN2020118107-appb-000048] is a. The right panel shows training with the second loss function; the dashed circle is the target safety margin for negative samples, which is the distance obtained by adding a certain value to the distance a. Using this target safety margin makes the trained relationship between user identifiers and sample data more accurate. In this case, training with either loss function gives the same result.
The case where the sample user identifier and the sample data have a positive relationship is shown in Figure 10. There, the loss value is small in the hatched region and large in the blank region, and the arrow indicates the direction in which the vector [Figure PCTCN2020118107-appb-000050] should move so that the distance between [Figure PCTCN2020118107-appb-000051] and [Figure PCTCN2020118107-appb-000052] becomes as small as possible. The left panel shows training with the first loss function, and the right panel shows training with the second loss function. In this case, the left panel corresponds to the situation of Figure 8, in which training cannot proceed, whereas the second loss function in the right panel avoids that situation.
In summary, from a mathematical point of view, training the autoencoder directly on the distance between two vectors defined by the consistent metric has the same effect as training it with the hinge loss. In practice, the second loss function, namely the hinge loss, is used, and the preset distance a in the consistent metric is trained as well, which gives a better training effect.
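Because the formula of the second loss function is available here only as an image, the following is a generic hinge-style sketch rather than the formula of the application. It assumes the Euclidean distance, a preset distance a (`preset_distance`), and a margin standing in for λ_margin; the function name is hypothetical.

```python
import math

def hinge_metric_loss(user_vec, data_vec, label, preset_distance, margin):
    """Hinge-style loss on the distance between two mapping vectors.

    label = 1  (positive): zero once the distance is at most preset_distance - margin.
    label = -1 (negative): zero once the distance is at least preset_distance + margin.
    Assumption: Euclidean distance; this is not the application's exact formula.
    """
    dist = math.dist(user_vec, data_vec)
    return max(0.0, margin + label * (dist - preset_distance))

u, v = [0.0, 0.0], [3.0, 4.0]   # distance 5
loss_pos = hinge_metric_loss(u, v, 1, preset_distance=1.0, margin=0.5)
loss_neg = hinge_metric_loss(u, v, -1, preset_distance=1.0, margin=0.5)
```

In this form a positive pair that starts far apart still has a loss that grows with the distance, so gradient descent can pull the two vectors together, while a negative pair that is already beyond the safety margin contributes no loss.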
The second type:
Because the mapping vectors should be distributed as uniformly as possible when feature information is mapped to a low-dimensional space, the following loss function is used for training:
[Figure PCTCN2020118107-appb-000053]
where L_cov is the second loss value of the mapping model, N is the number of pieces of sample data, E is the matrix formed by [Figure PCTCN2020118107-appb-000054] and [Figure PCTCN2020118107-appb-000055], Cov(E) is the covariance matrix of E, ||·||_f is the Frobenius norm, and diag(·) extracts the diagonal elements of a matrix.
During encoding, when the feature information of a sample user identifier or of sample data is mapped to a low-dimensional space, the distribution of the resulting mapping vectors in that space is as shown in Figure 11, where triangles represent mapping vectors of one category and circles represent mapping vectors of another category. The left panel shows the highly collinear distribution obtained after mapping. Such a distribution wastes space: to accommodate more mapping vectors during training, the dimension of the space has to be increased, with the result that the current training samples are predicted perfectly while new samples are predicted poorly. Training the autoencoder with this loss function achieves the effect shown in the right panel, in which the mapping vectors are distributed uniformly in the space.
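The effect described above can be illustrated with a simple covariance penalty. Since the formula itself is available here only as an image, the sketch below is an assumption: it penalizes the squared off-diagonal entries of the covariance matrix of the mapping vectors, divided by the number of samples, which is one common way to discourage collinear embeddings. The function names and sample points are hypothetical.

```python
def covariance_matrix(rows):
    """Population covariance matrix of a list of equal-length vectors."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(dim)]
            for i in range(dim)]

def off_diagonal_penalty(rows):
    """Sum of squared off-diagonal covariance entries, divided by the sample count.

    Assumption: an illustrative stand-in for L_cov, not the application's formula.
    """
    cov = covariance_matrix(rows)
    dim = len(cov)
    off = sum(cov[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return off / len(rows)

collinear = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]  # all on one line
spread = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]     # corners of a square

penalty_collinear = off_diagonal_penalty(collinear)
penalty_spread = off_diagonal_penalty(spread)
```

The collinear points receive a positive penalty, while the evenly spread points receive none, so minimizing such a term pushes the mapping vectors away from the left-panel pattern.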
The third type:
The feature information output by the autoencoder should be as close as possible to the feature information input to it, so the following loss function is used for training:
[Figure PCTCN2020118107-appb-000056]
where L_reconstruct is the loss value of the autoencoder, [Figure PCTCN2020118107-appb-000057] is the feature information of the sample user identifier or of the sample data, and [Figure PCTCN2020118107-appb-000058] is the feature information output after [Figure PCTCN2020118107-appb-000059] is processed by the autoencoder.
Alternatively, the three loss functions above are combined into an overall loss function:
$L = \alpha L_{neck2} + \beta L_{reconstruct} + \gamma L_{cov}$
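The combination can be written as a one-line helper. Treating α, β, and γ as weights chosen by the practitioner, and the default value 1.0 for each, are assumptions; the text does not state how the coefficients are set.

```python
def total_loss(l_neck2, l_reconstruct, l_cov, alpha=1.0, beta=1.0, gamma=1.0):
    """Overall loss L = alpha * L_neck2 + beta * L_reconstruct + gamma * L_cov.

    Assumption: alpha, beta, gamma are scalar weights; the defaults are illustrative.
    """
    return alpha * l_neck2 + beta * l_reconstruct + gamma * l_cov

# Hypothetical per-batch loss values and weights.
combined = total_loss(0.5, 0.25, 0.125, alpha=1.0, beta=2.0, gamma=4.0)
```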
In the above training process, if the mapped space were unbounded, the distances between all mapping vectors in the space could not be measured, so the low-dimensional target space needs to be bounded. The activation function of the last layer of the encoding model, which feeds the embedding layer, therefore needs to be a bounded activation function, such as the sigmoid function or the tanh (hyperbolic tangent) function.
Moreover, because the output feature information includes numerical features and binary features, it must be normalized during processing so that the resulting values lie between 0 and 1. In the decoding model, the range of the activation function of the last layer, which feeds the output layer, therefore needs to lie between 0 and 1; optionally, this activation function is the sigmoid function or another function.
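The two activation constraints can be seen in a minimal forward pass. The sketch assumes a single dense layer for each of the encoding and decoding models, randomly initialized and untrained; the layer sizes, the helper names, and the input features are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(vec, weights, bias, activation):
    """One fully connected layer: activation(W @ vec + b)."""
    return [activation(sum(w * x for w, x in zip(row, vec)) + b)
            for row, b in zip(weights, bias)]

def random_layer(n_out, n_in, rng):
    """Untrained weights and biases drawn uniformly from [-1, 1]."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias

rng = random.Random(0)
enc_w, enc_b = random_layer(2, 4, rng)  # 4 features -> 2-dimensional embedding
dec_w, dec_b = random_layer(4, 2, rng)  # embedding -> 4 reconstructed features

features = [0.2, 1.0, 0.0, 0.7]  # normalized numerical and binary features
embedding = dense(features, enc_w, enc_b, math.tanh)      # bounded: each entry in (-1, 1)
reconstruction = dense(embedding, dec_w, dec_b, sigmoid)  # each entry in (0, 1)
```

Whatever the weights are, tanh keeps every mapping vector inside a bounded cube, and the sigmoid keeps every reconstructed feature in the same 0-to-1 range as the normalized input.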
In one possible implementation, the autoencoder trained as described above reconstructs, in the low-dimensional space, the interest manifold structure of the user identifiers or candidate data; see Figure 12, where triangles represent mapping vectors of one category and circles represent mapping vectors of another category. The interest manifold structure is formed by the mapping vectors of all user identifiers or candidate data. For any two mapping vectors, whether or not they belong to the same category, the distance between them directly expresses the strength of the preference relationship between them: the smaller the distance, the stronger the preference relationship, and the larger the distance, the weaker it is. If the distance is less than the preset distance, the two mapping vectors have a positive relationship; otherwise they have a negative relationship. Because the consistent metric satisfies the triangle inequality, the mapping vectors are effectively clustered: similar user identifiers are grouped together, similar candidate data is grouped together, and a user identifier is grouped with the candidate data recommended to it.
Moreover, once the interest manifold structure is obtained, the mapping vectors it contains can be decoded by the decoding model to obtain the feature information of the corresponding user identifiers or candidate data. The decoding process is shown in Figure 13, where the circular region is a part of the interest manifold structure after mapping, and the mapping vectors in that region are passed through the decoding model to obtain the decoded manifold structure. The decoded manifold structure is continuous, so no mapping vectors of the interest manifold structure are lost in decoding. If the decoding model is not one-to-one, multiple mapping vectors of the same category may decode to the same feature information, so the decoded structure may contain overlapping regions.
The first point to note is that the embodiments of the present application describe the training process using a single autoencoder only as an example. When an autoencoder is used in the above embodiments, a dual autoencoder can be adopted; its structure is shown in Figure 14. One autoencoder encodes and decodes the feature information of user identifiers, and the other encodes and decodes the feature information of candidate data. In another embodiment, if feature information of multiple categories is involved, a corresponding number of autoencoders can be used to encode and decode the feature information of each category separately.
The second point to note is that, in another embodiment, different types of autoencoder can be used for different data types, such as a VAE (Variational Auto-Encoder) or a contrastive autoencoder. Whichever type is used, the same principle applies: the feature information must be mapped into a bounded space with a clearly defined metric.
For example, when a VAE is used, the target space is the space of high-dimensional Gaussian probability distributions, and the KLD (Kullback-Leibler divergence) is defined on this space to measure the distance between distributions.
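For Gaussian distributions the KL divergence has a closed form. The sketch below assumes diagonal covariance matrices, which is the usual choice for VAE embeddings but is not stated in the text; the function name is hypothetical.

```python
import math

def kl_diag_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, diag(sigma_p^2)) and Q = N(mu_q, diag(sigma_q^2)).

    Assumption: both distributions have diagonal covariance, so the divergence
    is the sum of independent one-dimensional terms.
    """
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        total += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2 * sq ** 2) - 0.5
    return total

# Hypothetical embeddings: a user and an item, each a 2-dimensional Gaussian.
user = ([0.0, 0.0], [1.0, 1.0])
item = ([1.0, 0.0], [1.0, 1.0])
divergence = kl_diag_gaussians(user[0], user[1], item[0], item[1])
```

Note that the KL divergence is not symmetric in general, so an implementation has to fix which distribution plays the role of P and which the role of Q.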
The third point to note is that, depending on the sparsity of the data, an embedding layer can be added to the original model, or another model such as Wide & Deep (a deep learning model) can replace the basic MLP (multilayer perceptron), to better extract the information in sparse data.
The fourth point to note is that, if the input data is time-series data, a model that evolves over time can be used. For example, on the deep learning side, an RNN (Recurrent Neural Network) or LSTM (Long Short-Term Memory) model is used; on the statistical learning side, sequential Bayesian prior-to-posterior updating or a Kalman filter is used to learn from the time-series data.
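As an illustration of the statistical option, a scalar Kalman filter performs exactly such a prior-to-posterior update at each time step. The sketch assumes a random-walk state model and tracks a single number, for instance one coordinate of a user's interest over time; that use, the function name, and the noise values are hypothetical.

```python
def kalman_step(mean, var, observation, process_var, obs_var):
    """One predict-and-update step of a scalar Kalman filter.

    Assumption: random-walk state model, so prediction keeps the mean and
    only adds process_var to the variance.
    """
    prior_var = var + process_var                   # predict
    gain = prior_var / (prior_var + obs_var)        # Kalman gain
    new_mean = mean + gain * (observation - mean)   # posterior mean
    new_var = (1.0 - gain) * prior_var              # posterior variance
    return new_mean, new_var

# Prior belief N(0, 1), one observation of 1.0 with unit observation noise.
mean, var = kalman_step(0.0, 1.0, 1.0, process_var=0.0, obs_var=1.0)

# Feeding a sequence of observations applies the same update repeatedly.
state = (0.0, 1.0)
for obs in [1.0, 1.0, 1.0]:
    state = kalman_step(state[0], state[1], obs, process_var=0.0, obs_var=1.0)
```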
In the related art, CML (Collaborative Metric Learning) is used for recommendation. Given the known relationships between user identifiers and candidate data in the original space, this method moves the corresponding vectors in the target space so as to obtain distance relationships similar to those in the original space. The method works only for a fixed set of user identifiers and candidate data, so its range of use is narrow; moreover, it requires the relationships between user identifiers and candidate data in the original space, so it is not applicable when recommendations are to be made for new user identifiers or new candidate data.
For example, collaborative metric learning is used to recommend products to a user; see Figure 16, in which the circle represents the user, triangles represent products the user likes, rectangles represent products the user dislikes, and arrows indicate the direction in which products move. The left panel shows the original positions of the products and the user. Collaborative metric learning produces the result in the right panel, in which products the user likes are moved toward the user and products the user dislikes are moved away. With this method, the users and products in the space are fixed, so only fixed products can be recommended to fixed users; if the space contains no products, the products the user might like cannot be inferred.
In the above CML method, for multiple user identifiers and multiple pieces of candidate data, the relationships between some of the user identifiers and some of the candidate data must be known; otherwise the vectors cannot be moved in the target space according to known relationships. For a new user identifier or new candidate data, the relationship to the other user identifiers and candidate data is unknown, so the position of the corresponding vector in the target space cannot be determined and no recommendation can be made. In contrast, the embodiments of the present application do not need the relationships between user identifiers and candidate data to be obtained in advance and are applicable to any user identifier or candidate data, which expands the scope of application.
In the related art, t-SNE (t-distributed Stochastic Neighbor Embedding) is used to reconstruct the manifold structure of data. The principle of the algorithm is that the distance relationship between any two feature vectors in the high-dimensional space should be similar to the distance relationship between the two corresponding mapping vectors in the low-dimensional space. If two feature vectors are far apart in the high-dimensional space, the two corresponding mapping vectors should also be far apart in the low-dimensional space, and vice versa. If the original high-dimensional space contains n feature vectors, the low-dimensional space contains n corresponding mapping vectors. The effect of the method is shown in Figure 15. The first panel from the left is the manifold structure formed by multiple feature vectors in the original high-dimensional space; the second panel is the manifold structure formed by the mapping vectors obtained by mapping those feature vectors to the low-dimensional space; the manifold structures of the third and fourth panels are then obtained in turn, and finally that of the fifth panel, in which the manifold structure of the n feature vectors of the high-dimensional space has been reconstructed in the low-dimensional space.
In the related art, VaeCF (Variational Autoencoder Collaborative Filtering, a deep model) is also used for data recommendation; the deep model is shown in FIG. 17. This method can accurately obtain the relationship between user identifiers and candidate data. However, if only the feature information of a user identifier is available and the feature information of the candidate data is not, the candidate data to recommend for that user identifier cannot be obtained; that is, the user's interests cannot be derived from the user identifier alone.
By contrast, the method provided in the embodiments of this application can, from the feature information of any user identifier, obtain the candidate data recommended for that user identifier based on an autoencoder, or, from the feature information of any candidate data, obtain the user identifiers interested in that candidate data based on the autoencoder, and then recommend the candidate data. The method has a wide range of use and can make a recommendation from the feature information of either the user identifier or the candidate data alone, which solves the problem in the related art that no recommendation can be made unless the feature information of both the user identifier and the candidate data is obtained.
FIG. 18 is a schematic structural diagram of a recommendation apparatus provided by an embodiment of this application. Referring to FIG. 18, the apparatus includes:
a first information obtaining module 1801, configured to obtain first feature information of a first object, the first object being a user identifier or candidate data;
a first mapping module 1802, configured to map the first feature information to a target space based on a mapping model to obtain a first mapping vector corresponding to the first object in the target space, the target space including user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
a recommendation module 1803, configured to make a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
Optionally, referring to FIG. 19, the apparatus further includes:
a second information obtaining module 1804, configured to obtain second feature information of the second object;
a second mapping module 1805, configured to map the second feature information to the target space based on the mapping model to obtain the second mapping vector corresponding to the second object in the target space;
the recommendation module 1803 further includes:
a first distance obtaining unit 18031, configured to obtain the distance between the first mapping vector and the second mapping vector;
a first recommendation unit 18032, configured to make a recommendation based on the first object and the second object if the distance is less than the preset distance.
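The pairwise check performed by modules 1804 and 1805 and units 18031 and 18032 can be sketched as follows. This is an illustration only: the mapping model is replaced by a hypothetical bias-free linear layer, Euclidean distance is assumed as the distance measure, and every name and weight value is invented for the example.

```python
import math

def map_to_target_space(features, weights):
    """Hypothetical mapping model: one linear layer without bias."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def euclidean(a, b):
    """Euclidean distance between two mapping vectors (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def should_recommend(user_features, data_features,
                     user_weights, data_weights, preset_distance):
    """Map both objects into the shared target space and compare their distance."""
    user_vector = map_to_target_space(user_features, user_weights)
    data_vector = map_to_target_space(data_features, data_weights)
    return euclidean(user_vector, data_vector) < preset_distance

# Illustrative weights: user features are 3-dimensional, data features 2-dimensional,
# and both are mapped into the same 2-dimensional target space.
user_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
data_weights = [[0.5, 0.0], [0.0, 0.5]]

print(should_recommend([1.0, 2.0, 7.0], [2.0, 4.4], user_weights, data_weights, 0.5))
print(should_recommend([1.0, 2.0, 7.0], [10.0, 10.0], user_weights, data_weights, 0.5))
```

Because user features and data features may have different dimensions, the sketch uses separate weights for each category while keeping one shared target space in which the distance is measured.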
Optionally, referring to FIG. 19, the recommendation module 1803 includes:
a vector determining unit 18033, configured to determine at least one third mapping vector in the target space, the third mapping vector and the first mapping vector belonging to different categories;
a second distance obtaining unit 18034, configured to obtain the distance between the first mapping vector and each third mapping vector;
a vector selecting unit 18035, configured to select, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
a second recommendation unit 18036, configured to determine the second object corresponding to the second mapping vector and make a recommendation based on the first object and the second object.
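The candidate search carried out by units 18033 to 18036 can be sketched as below. The layout of the target space as a dictionary, the identifiers, and the ordering of results by closeness are assumptions made for illustration; the text above only requires that the selected vectors belong to the other category and lie within the preset distance.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two mapping vectors (assumed distance measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_second_objects(first_vector, first_category, space, preset_distance):
    """Return identifiers of objects of the other category within preset_distance.

    `space` maps a hypothetical object identifier to (category, mapping vector).
    """
    # Third mapping vectors: every vector whose category differs from the first one.
    third = {oid: vec for oid, (cat, vec) in space.items() if cat != first_category}
    # Distance from the first mapping vector to each third mapping vector.
    distances = {oid: euclidean(first_vector, vec) for oid, vec in third.items()}
    # Second mapping vectors: those closer than the preset distance.
    selected = [oid for oid, d in distances.items() if d < preset_distance]
    # Ordering by closeness is an extra convenience, not required by the text.
    return sorted(selected, key=distances.get)

space = {
    "video_1": ("data", (0.0, 1.0)),
    "video_2": ("data", (0.2, 0.0)),
    "video_3": ("data", (4.0, 4.0)),
    "user_b": ("user", (0.0, 0.1)),
}
print(select_second_objects((0.0, 0.0), "user", space, 1.5))
```

In the example, `user_b` is skipped even though it is close, because it belongs to the same category as the first object.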
Optionally, the second recommendation unit 18036 is further configured to perform inverse mapping on the second mapping vector based on an inverse mapping model to obtain the second feature information corresponding to the second mapping vector, and to determine the second object to which the second feature information belongs.
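The inverse-mapping step can be sketched under two assumptions that the text does not specify: the inverse mapping model is represented by a hypothetical linear decoder, and the owning object is taken to be the catalog entry whose stored feature information is closest to the decoded features. All names and values are illustrative.

```python
def inverse_map(vector, decoder_weights):
    """Hypothetical inverse mapping model: a linear decoder back to feature space."""
    return [sum(w * v for w, v in zip(row, vector)) for row in decoder_weights]

def find_owner(decoded_features, catalog):
    """Return the identifier whose stored features are closest to the decoded ones.

    `catalog` maps a hypothetical object identifier to its feature information.
    """
    def squared_error(features):
        return sum((a - b) ** 2 for a, b in zip(decoded_features, features))
    return min(catalog, key=lambda oid: squared_error(catalog[oid]))

decoder_weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # 2-d target space to 3-d features
catalog = {"article_1": [1.0, 2.0, 1.4], "article_2": [5.0, 5.0, 5.0]}

decoded = inverse_map([0.5, 1.0], decoder_weights)
print(decoded, find_owner(decoded, catalog))
```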
Optionally, referring to FIG. 19, the apparatus further includes:
a first sample obtaining module 1806, configured to obtain sample information, the sample information including feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether the sample data is recommended to the sample user identifier;
a first training module 1807, configured to train the mapping model according to the sample information.
Optionally, the apparatus further includes a loss function for training the mapping model, the loss function including at least one of the following:
Figure PCTCN2020118107-appb-000060
where L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, the quantity shown in Figure PCTCN2020118107-appb-000061 is the sample label, the quantity shown in Figure PCTCN2020118107-appb-000062 is the mapping vector corresponding to the sample user identifier, and the quantity shown in Figure PCTCN2020118107-appb-000063 is the mapping vector corresponding to the sample data;
Figure PCTCN2020118107-appb-000064
where L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the quantities shown in Figure PCTCN2020118107-appb-000065 and Figure PCTCN2020118107-appb-000066, Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transposition function, and diag(·) is the function that extracts the diagonal elements of a matrix.
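The two loss formulas are available here only as image placeholders, so their exact form cannot be reproduced. The sketch below is therefore an assumption, not the formulas of this application: it uses a standard contrastive loss with a margin for the first loss (pulling recommended pairs together and pushing other pairs apart up to the margin) and a penalty on the off-diagonal entries of the covariance matrix for the second loss. Function names and the absence of any 1/N scaling are likewise assumptions.

```python
def contrastive_loss(user_vector, data_vector, label, margin):
    """Assumed form of a margin-based loss on one (user, data) pair.

    label == 1: the sample data is recommended, so distance is penalized.
    label == 0: not recommended, so only pairs closer than `margin` are penalized.
    """
    squared = sum((u - v) ** 2 for u, v in zip(user_vector, data_vector))
    if label == 1:
        return squared
    return max(0.0, margin - squared ** 0.5) ** 2

def covariance_matrix(rows):
    """Population covariance matrix of the mapping vectors stacked as rows."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(dim)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(dim)] for i in range(dim)]

def covariance_loss(rows):
    """Assumed form: sum of squared off-diagonal covariance entries."""
    cov = covariance_matrix(rows)
    size = len(cov)
    return sum(cov[i][j] ** 2 for i in range(size) for j in range(size) if i != j)

correlated = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
uncorrelated = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
print(covariance_loss(correlated), covariance_loss(uncorrelated))
```

Under these assumptions, the covariance term is zero when the dimensions of the mapping vectors are uncorrelated and grows as they become redundant.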
Optionally, referring to FIG. 20, the mapping model is the encoding model of an autoencoder; the apparatus further includes:
a second sample obtaining module 1808, configured to obtain sample information, the sample information including feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether the sample data is recommended to the sample user identifier;
a second training module 1809, configured to train the autoencoder according to the sample information.
Optionally, the apparatus further includes a loss function for training the autoencoder, the loss function including at least:
Figure PCTCN2020118107-appb-000067
where L_reconstruct is the loss value of the autoencoder, the quantity shown in Figure PCTCN2020118107-appb-000068 is the feature information of the sample user identifier or the feature information of the sample data, and the quantity shown in Figure PCTCN2020118107-appb-000069 is the feature information output after the quantity shown in Figure PCTCN2020118107-appb-000070 is processed by the autoencoder.
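Since the reconstruction formula is available only as an image placeholder, the following sketch assumes a mean squared error between the input feature information and the autoencoder output. The linear encoder and decoder, their weights, and the function names are hypothetical stand-ins for the autoencoder.

```python
def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def autoencode(features, encoder_weights, decoder_weights):
    """Encode features into the target space, then decode them back."""
    mapping_vector = matvec(encoder_weights, features)
    return matvec(decoder_weights, mapping_vector)

def reconstruction_loss(features, reconstructed):
    """Assumed form: mean squared error between input and reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(features, reconstructed)) / len(features)

encoder_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 3-d features to 2-d space
decoder_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 2-d space back to 3-d

features = [1.0, 2.0, 3.0]
output = autoencode(features, encoder_weights, decoder_weights)
print(output, reconstruction_loss(features, output))
```

The third feature is discarded by this toy encoder, so it cannot be reconstructed and contributes the whole loss; training an actual autoencoder would adjust the weights to reduce that value.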
Optionally, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier;
the recommendation module 1803 is further configured to recommend the candidate data to the user identifier.
Optionally, the mapping model includes a user mapping sub-model and a data mapping sub-model;
the user mapping sub-model is used to map the feature information of a user identifier to obtain a user mapping vector;
the data mapping sub-model is used to map the feature information of candidate data to obtain a data mapping vector.
It should be noted that the recommendation apparatus provided in the foregoing embodiments is illustrated only by the division of the functional modules described above. In practical applications, the functions can be assigned to different functional modules as needed to complete all or part of the functions described above. In addition, the recommendation apparatus provided in the foregoing embodiments and the recommendation method embodiments belong to the same concept; for the implementation process, see the method embodiments, which are not repeated here.
FIG. 21 is a schematic structural diagram of a terminal 2100 provided by an embodiment of this application.
Generally, the terminal 2100 includes a processor 2101 and a memory 2102.
The processor 2101 includes one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 2101 is implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 2101 also includes a main processor and a coprocessor. The main processor, also called a CPU, processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 2101 is integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 2101 further includes an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
The memory 2102 includes one or more computer-readable storage media, which are non-transitory. The memory 2102 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 2102 stores at least one instruction, and the at least one instruction is executed by the processor 2101 to implement the recommendation method provided in the method embodiments of this application.
In some embodiments, the terminal 2100 optionally further includes a peripheral device interface 2103 and at least one peripheral device. The processor 2101, the memory 2102, and the peripheral device interface 2103 are connected by a bus or signal line. Each peripheral device is connected to the peripheral device interface 2103 through a bus, a signal line, or a circuit board. Optionally, the peripheral device includes at least one of a radio frequency circuit 2104, a display screen 2105, a camera assembly 2106, an audio circuit 2107, a positioning component 2108, and a power supply 2109.
The peripheral device interface 2103 can be used to connect at least one I/O (Input/Output) related peripheral device to the processor 2101 and the memory 2102. In some embodiments, the processor 2101, the memory 2102, and the peripheral device interface 2103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 2101, the memory 2102, and the peripheral device interface 2103 are implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 2104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 2104 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 2104 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 2104 communicates with other terminals through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to, metropolitan area networks, the successive generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 2104 also includes circuits related to NFC (Near Field Communication), which is not limited in this application.
The display screen 2105 is used to display a UI (User Interface). The UI includes graphics, text, icons, video, and any combination of these. When the display screen 2105 is a touch display screen, it can also collect touch signals on or above its surface. The touch signal is input to the processor 2101 as a control signal for processing. In this case, the display screen 2105 is also used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there is one display screen 2105, arranged on the front panel of the terminal 2100; in other embodiments, there are at least two display screens 2105, arranged on different surfaces of the terminal 2100 or in a folding design; in still other embodiments, the display screen 2105 is a flexible display screen arranged on a curved surface or a folding surface of the terminal 2100. The display screen 2105 can even be set in a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 2105 is made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 2106 is used to capture images or video. Optionally, the camera assembly 2106 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal 2100 and the rear camera on the back of the terminal 2100. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be combined for background blur, and the main camera and the wide-angle camera can be combined for panoramic shooting, VR (Virtual Reality) shooting, or other combined shooting functions. In some embodiments, the camera assembly 2106 also includes a flash. Optionally, the flash is a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and is used for light compensation at different color temperatures.
The audio circuit 2107 includes a microphone and a speaker. The microphone collects sound waves from the user and the environment and converts them into electrical signals that are input to the processor 2101 for processing or to the radio frequency circuit 2104 for voice communication. For stereo collection or noise reduction, there are multiple microphones, arranged at different parts of the terminal 2100. The microphone is an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 2101 or the radio frequency circuit 2104 into sound waves. Optionally, the speaker is a traditional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 2107 also includes a headphone jack.
The positioning component 2108 is used to determine the current geographic location of the terminal 2100 for navigation or LBS (Location Based Service). The positioning component 2108 is based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 2109 supplies power to the components of the terminal 2100. The power supply 2109 is alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 2109 includes a rechargeable battery, the rechargeable battery supports wired charging or wireless charging. The rechargeable battery can also support fast charging technology.
In some embodiments, the terminal 2100 further includes one or more sensors 2110. The one or more sensors 2110 include, but are not limited to, an acceleration sensor 2111, a gyroscope sensor 2112, a pressure sensor 2113, a fingerprint sensor 2114, an optical sensor 2115, and a proximity sensor 2116.
The acceleration sensor 2111 detects the magnitude of acceleration on the three coordinate axes of the coordinate system established with respect to the terminal 2100. For example, the acceleration sensor 2111 is used to detect the components of gravitational acceleration on the three coordinate axes. The processor 2101 controls the display screen 2105 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 2111. The acceleration sensor 2111 is also used to collect motion data of a game or of the user.
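How a gravity signal might select the view can be sketched as follows. The axis convention (x along the short edge of the screen, y along the long edge) and the function name are assumptions; the text above only states that the view is chosen from the gravitational acceleration components.

```python
def choose_view(gravity_x, gravity_y):
    """Pick a view from gravity components along the assumed screen axes.

    Assumption: x runs along the short edge and y along the long edge, so
    gravity mostly along y means the terminal is held upright.
    """
    return "portrait" if abs(gravity_y) >= abs(gravity_x) else "landscape"

print(choose_view(0.3, 9.7))   # terminal held upright
print(choose_view(9.6, 0.5))   # terminal turned on its side
```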
The gyroscope sensor 2112 detects the body orientation and rotation angle of the terminal 2100, and works together with the acceleration sensor 2111 to collect the user's 3D actions on the terminal 2100. Based on the data collected by the gyroscope sensor 2112, the processor 2101 implements the following functions: motion sensing (for example, changing the UI according to a tilt operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 2113 is arranged on the side frame of the terminal 2100 and/or under the display screen 2105. When the pressure sensor 2113 is arranged on the side frame of the terminal 2100, it detects the user's grip signal on the terminal 2100, and the processor 2101 performs left-hand or right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 2113. When the pressure sensor 2113 is arranged under the display screen 2105, the processor 2101 controls the operable controls on the UI according to the user's pressure operation on the display screen 2105. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 2114 is used to collect the user's fingerprint. The processor 2101 identifies the user according to the fingerprint collected by the fingerprint sensor 2114, or the fingerprint sensor 2114 itself identifies the user according to the collected fingerprint. When the user's identity is recognized as trusted, the processor 2101 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 2114 is arranged on the front, back, or side of the terminal 2100. When a physical button or manufacturer logo is provided on the terminal 2100, the fingerprint sensor 2114 is integrated with the physical button or manufacturer logo.
The optical sensor 2115 is used to collect the ambient light intensity. In one embodiment, the processor 2101 controls the display brightness of the display screen 2105 according to the ambient light intensity collected by the optical sensor 2115. Optionally, when the ambient light intensity is high, the display brightness of the display screen 2105 is increased; when the ambient light intensity is low, the display brightness of the display screen 2105 is decreased. In another embodiment, the processor 2101 also dynamically adjusts the shooting parameters of the camera assembly 2106 according to the ambient light intensity collected by the optical sensor 2115.
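The brightness adjustment described above can be sketched as a clamped linear mapping from ambient light intensity to a brightness level. The linear shape, the lux threshold, and the brightness range are assumptions chosen for illustration; the text only states that brightness rises with stronger ambient light and falls with weaker ambient light.

```python
def display_brightness(ambient_lux, min_level=0.1, max_level=1.0, full_lux=1000.0):
    """Map ambient light intensity to a brightness level between the two limits.

    Assumptions: a linear response and saturation at `full_lux`.
    """
    ratio = min(max(ambient_lux / full_lux, 0.0), 1.0)
    return min_level + (max_level - min_level) * ratio

for lux in (0.0, 100.0, 800.0, 2000.0):
    print(lux, display_brightness(lux))
```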
The proximity sensor 2116, also called a distance sensor, is usually arranged on the front panel of the terminal 2100. The proximity sensor 2116 is used to collect the distance between the user and the front of the terminal 2100. In one embodiment, when the proximity sensor 2116 detects that the distance between the user and the front of the terminal 2100 is gradually decreasing, the processor 2101 controls the display screen 2105 to switch from the screen-on state to the screen-off state; when the proximity sensor 2116 detects that the distance between the user and the front of the terminal 2100 is gradually increasing, the processor 2101 controls the display screen 2105 to switch from the screen-off state to the screen-on state.
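The screen-state switching driven by the proximity sensor can be sketched as a small state update. The function name, the state labels, and the use of two consecutive distance readings to detect the trend are assumptions made for illustration.

```python
def next_screen_state(previous_distance, current_distance, state):
    """Return the screen state after comparing two consecutive distance readings."""
    if current_distance < previous_distance:
        return "off"   # user is approaching the front of the terminal
    if current_distance > previous_distance:
        return "on"    # user is moving away from the front of the terminal
    return state       # no change in distance, keep the current state

state = "on"
for previous, current in [(10.0, 4.0), (4.0, 4.0), (4.0, 12.0)]:
    state = next_screen_state(previous, current, state)
    print(previous, current, state)
```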
Those skilled in the art will understand that the structure shown in FIG. 21 does not limit the terminal 2100, which can include more or fewer components than shown, combine certain components, or use a different arrangement of components.
FIG. 22 is a schematic structural diagram of a server provided by an embodiment of this application. The server 2200 may vary considerably with configuration or performance, and includes one or more processors (Central Processing Units, CPU) 2201 and one or more memories 2202, where the memory 2202 stores at least one instruction that is loaded and executed by the processor 2201 to implement the methods provided by the foregoing method embodiments. The server also has components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and includes other components for implementing device functions, which are not described in detail here.
The server 2200 is used to perform the steps performed by the server in the recommendation method described above.
An embodiment of this application further provides a computer device. The computer device includes a processor and a memory, the memory stores at least one piece of program code, and the at least one piece of program code is loaded and executed by the processor to implement the following steps:
obtaining first feature information of a first object, the first object being a user identifier or candidate data;
mapping the first feature information to a target space based on a mapping model to obtain a first mapping vector corresponding to the first object in the target space, the target space including user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
making a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
obtaining second feature information of the second object;
mapping the second feature information to the target space based on the mapping model to obtain the second mapping vector corresponding to the second object in the target space;
obtaining the distance between the first mapping vector and the second mapping vector;
if the distance is less than the preset distance, making a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
determining at least one third mapping vector in the target space, the third mapping vector and the first mapping vector belonging to different categories;
obtaining the distance between the first mapping vector and each third mapping vector;
selecting, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
determining the second object corresponding to the second mapping vector, and making a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
performing inverse mapping on the second mapping vector based on an inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determining the second object to which the second feature information belongs.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the mapping model according to the sample information.
Optionally, the loss function used to train the mapping model includes at least one of the following:
Figure PCTCN2020118107-appb-000071
where L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, the symbol shown in Figure PCTCN2020118107-appb-000072 is the sample label, the symbol shown in Figure PCTCN2020118107-appb-000073 is the mapping vector corresponding to the sample user identifier, and the symbol shown in Figure PCTCN2020118107-appb-000074 is the mapping vector corresponding to the sample data;
Figure PCTCN2020118107-appb-000075
where L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the symbols shown in Figure PCTCN2020118107-appb-000076 and Figure PCTCN2020118107-appb-000077, Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transposition function, and diag(·) is the function that extracts the diagonal elements of a matrix.
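The formula for the first loss value is available here only as an image, so its exact form cannot be reproduced from the text. The following sketch shows one common margin-based loss that uses the same ingredients (a sample label, a user mapping vector, a data mapping vector, and a margin parameter). The specific form, the 0/1 label encoding, and the name `neck_loss` are assumptions and should not be read as the formula of this application.

```python
import math
from typing import Sequence


def neck_loss(label: int,
              user_vec: Sequence[float],
              data_vec: Sequence[float],
              margin: float) -> float:
    """Assumed contrastive form: pull matched pairs together, push unmatched
    pairs at least `margin` apart. label is 1 (recommend) or 0 (do not)."""
    distance = math.dist(user_vec, data_vec)
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2
```

Under this assumed form, a pair labelled "do not recommend" contributes no loss once its vectors are farther apart than the margin, which is what keeps unrelated user and data vectors separated in the target space.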
Optionally, the mapping model is the encoding model in an autoencoder; when the at least one piece of program code is loaded and executed by the processor, the following steps are implemented:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the autoencoder according to the sample information.
Optionally, the loss function used to train the autoencoder includes at least:
Figure PCTCN2020118107-appb-000078
where L_reconstruct is the loss value of the autoencoder, the symbol shown in Figure PCTCN2020118107-appb-000079 is the feature information of the sample user identifier or the feature information of the sample data, and the symbol shown in Figure PCTCN2020118107-appb-000080 is the feature information output after the symbol shown in Figure PCTCN2020118107-appb-000081 is processed by the autoencoder.
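As an illustration of a reconstruction loss of this kind, the sketch below compares the input feature information with the output of the autoencoder. The formula itself is available here only as an image, so the mean squared error used below is an assumption, and `reconstruction_loss`, `autoencoder_loss`, `encode` and `decode` are hypothetical names.

```python
from typing import Callable, List, Sequence


def reconstruction_loss(features: Sequence[float],
                        reconstructed: Sequence[float]) -> float:
    """Mean squared error between the input and the reconstructed features (assumed form)."""
    if not features or len(features) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(features, reconstructed)) / len(features)


def autoencoder_loss(features: List[float],
                     encode: Callable[[List[float]], List[float]],
                     decode: Callable[[List[float]], List[float]]) -> float:
    """Encode to a mapping vector, decode back, and score the round trip."""
    return reconstruction_loss(features, decode(encode(features)))
```

In this sketch `encode` plays the role of the mapping model (the encoding model of the autoencoder) and `decode` plays the role of the inverse mapping model.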
Optionally, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier; the at least one piece of program code is loaded and executed by the processor to implement the following step:
recommending the candidate data to the user identifier.
Optionally, the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of a user identifier to obtain a user mapping vector; the data mapping sub-model is used to map the feature information of candidate data to obtain a data mapping vector.
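The two sub-models can be pictured as two separate functions that accept inputs of different shapes and emit vectors in the same target space. The sketch below uses plain linear projections purely for illustration; the text does not state the structure of either sub-model, and `MappingModel`, `map_user` and `map_data` are hypothetical names.

```python
from typing import List, Sequence


def linear_map(weights: Sequence[Sequence[float]],
               features: Sequence[float]) -> List[float]:
    """Multiply a weight matrix (one row per output dimension) by a feature vector."""
    if any(len(row) != len(features) for row in weights):
        raise ValueError("each weight row must match the feature dimension")
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


class MappingModel:
    """Holds a user mapping sub-model and a data mapping sub-model (linear, as an assumption)."""

    def __init__(self, user_weights, data_weights):
        if len(user_weights) != len(data_weights):
            raise ValueError("both sub-models must map into the same target space")
        self.user_weights = user_weights
        self.data_weights = data_weights

    def map_user(self, user_features: Sequence[float]) -> List[float]:
        return linear_map(self.user_weights, user_features)

    def map_data(self, data_features: Sequence[float]) -> List[float]:
        return linear_map(self.data_weights, data_features)
```

Because both sub-models output vectors of the same dimension, a user mapping vector and a data mapping vector can be compared directly by distance even when the raw feature information has different lengths.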
An embodiment of this application further provides a computer-readable storage medium storing at least one piece of program code, where the at least one piece of program code is loaded and executed by a processor to implement the following steps:
acquiring first feature information of a first object, where the first object is a user identifier or candidate data;
mapping the first feature information to a target space based on a mapping model, to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
performing a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
acquiring second feature information of the second object;
mapping the second feature information to the target space based on the mapping model, to obtain the second mapping vector corresponding to the second object in the target space;
acquiring the distance between the first mapping vector and the second mapping vector;
if the distance is less than the preset distance, performing a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
determining at least one third mapping vector in the target space, where the third mapping vector and the first mapping vector belong to different categories;
acquiring the distance between the first mapping vector and each third mapping vector;
selecting, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
determining the second object corresponding to the second mapping vector, and performing a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
performing inverse mapping on the second mapping vector based on an inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determining the second object to which the second feature information belongs.
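The inverse mapping step can be sketched as decoding the second mapping vector back into feature information and then looking up which known object that feature information belongs to. The text does not say how the lookup is performed; the nearest-match search below, the `inverse_map` callable, and the `known_features` dictionary are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence


def find_second_object(
    second_vec: Sequence[float],
    inverse_map: Callable[[Sequence[float]], List[float]],
    known_features: Dict[str, Sequence[float]],
) -> str:
    """Decode second_vec into feature information, then return the identifier of
    the known object whose stored features are closest to the decoded features."""
    if not known_features:
        raise ValueError("no known objects to match against")
    recovered = inverse_map(second_vec)
    return min(known_features,
               key=lambda object_id: math.dist(known_features[object_id], recovered))
```

A nearest-match lookup tolerates the small reconstruction error that a learned inverse mapping model would typically introduce; an exact-match lookup would be sufficient only if the inverse mapping were lossless.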
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the mapping model according to the sample information.
Optionally, the loss function used to train the mapping model includes at least one of the following:
Figure PCTCN2020118107-appb-000082
where L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, the symbol shown in Figure PCTCN2020118107-appb-000083 is the sample label, the symbol shown in Figure PCTCN2020118107-appb-000084 is the mapping vector corresponding to the sample user identifier, and the symbol shown in Figure PCTCN2020118107-appb-000085 is the mapping vector corresponding to the sample data;
Figure PCTCN2020118107-appb-000086
where L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the symbols shown in Figure PCTCN2020118107-appb-000087 and Figure PCTCN2020118107-appb-000088, Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transposition function, and diag(·) is the function that extracts the diagonal elements of a matrix.
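The second loss value is built from the covariance matrix of the mapping vectors and its diagonal. Since the formula is available here only as an image, the sketch below assumes a familiar decorrelation penalty: the sum of the squared off-diagonal entries of Cov(E), which equals the squared Frobenius norm of Cov(E) minus the squared norm of its diagonal. The text describes ||·||_f as a transposition function, so this reading, the omitted scaling constant, and the names `covariance_matrix` and `covariance_penalty` are all assumptions.

```python
from typing import List, Sequence


def covariance_matrix(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Population covariance (divide by the number of rows) of the vector dimensions."""
    n = len(rows)
    if n == 0:
        raise ValueError("at least one mapping vector is required")
    dim = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    return [
        [sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / n
         for j in range(dim)]
        for i in range(dim)
    ]


def covariance_penalty(rows: Sequence[Sequence[float]]) -> float:
    """Sum of squared off-diagonal covariance entries (assumed form of L_cov)."""
    cov = covariance_matrix(rows)
    dim = len(cov)
    return sum(cov[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
```

Under this assumed form the penalty is zero when the dimensions of the target space are uncorrelated across samples, which discourages the mapping model from encoding the same information in several dimensions.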
Optionally, the mapping model is the encoding model in an autoencoder; when the at least one piece of program code is loaded and executed by the processor, the following steps are implemented:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the autoencoder according to the sample information.
Optionally, the loss function used to train the autoencoder includes at least:
Figure PCTCN2020118107-appb-000089
where L_reconstruct is the loss value of the autoencoder, the symbol shown in Figure PCTCN2020118107-appb-000090 is the feature information of the sample user identifier or the feature information of the sample data, and the symbol shown in Figure PCTCN2020118107-appb-000091 is the feature information output after the symbol shown in Figure PCTCN2020118107-appb-000092 is processed by the autoencoder.
Optionally, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier; the at least one piece of program code is loaded and executed by the processor to implement the following step:
recommending the candidate data to the user identifier.
Optionally, the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of a user identifier to obtain a user mapping vector; the data mapping sub-model is used to map the feature information of candidate data to obtain a data mapping vector.
An embodiment of this application further provides a computer program storing at least one piece of program code, where the at least one piece of program code is loaded and executed by a processor to implement the following steps:
acquiring first feature information of a first object, where the first object is a user identifier or candidate data;
mapping the first feature information to a target space based on a mapping model, to obtain a first mapping vector corresponding to the first object in the target space, where the target space includes user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
performing a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, where a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
acquiring second feature information of the second object;
mapping the second feature information to the target space based on the mapping model, to obtain the second mapping vector corresponding to the second object in the target space;
acquiring the distance between the first mapping vector and the second mapping vector;
if the distance is less than the preset distance, performing a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
determining at least one third mapping vector in the target space, where the third mapping vector and the first mapping vector belong to different categories;
acquiring the distance between the first mapping vector and each third mapping vector;
selecting, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
determining the second object corresponding to the second mapping vector, and performing a recommendation based on the first object and the second object.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
performing inverse mapping on the second mapping vector based on an inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determining the second object to which the second feature information belongs.
Optionally, the at least one piece of program code is loaded and executed by the processor to implement the following steps:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the mapping model according to the sample information.
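One way to hold a piece of sample information is a small record with the three parts named above. The field names, the 0/1 encoding of the sample label, and the helper `split_by_label` are assumptions introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SampleInfo:
    user_features: Tuple[float, ...]   # feature information of the sample user identifier
    data_features: Tuple[float, ...]   # feature information of the sample data
    label: int                         # 1: recommend the data to the user; 0: do not (assumed encoding)

    def __post_init__(self):
        if self.label not in (0, 1):
            raise ValueError("label must be 0 or 1")


def split_by_label(samples: Sequence[SampleInfo]) -> Tuple[List[SampleInfo], List[SampleInfo]]:
    """Separate samples to recommend from samples not to recommend."""
    positives = [s for s in samples if s.label == 1]
    negatives = [s for s in samples if s.label == 0]
    return positives, negatives
```

Training would then iterate over such records, map both feature tuples into the target space, and update the mapping model according to the chosen loss.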
Optionally, the loss function used to train the mapping model includes at least one of the following:
Figure PCTCN2020118107-appb-000093
where L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, the symbol shown in Figure PCTCN2020118107-appb-000094 is the sample label, the symbol shown in Figure PCTCN2020118107-appb-000095 is the mapping vector corresponding to the sample user identifier, and the symbol shown in Figure PCTCN2020118107-appb-000096 is the mapping vector corresponding to the sample data;
Figure PCTCN2020118107-appb-000097
where L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by the symbols shown in Figure PCTCN2020118107-appb-000098 and Figure PCTCN2020118107-appb-000099, Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transposition function, and diag(·) is the function that extracts the diagonal elements of a matrix.
Optionally, the mapping model is the encoding model in an autoencoder; when the at least one piece of program code is loaded and executed by the processor, the following steps are implemented:
acquiring sample information, where the sample information includes feature information of a sample user identifier, feature information of sample data, and a sample label, and the sample label indicates whether to recommend the sample data to the sample user identifier;
training the autoencoder according to the sample information.
Optionally, the loss function used to train the autoencoder includes at least:
Figure PCTCN2020118107-appb-000100
where L_reconstruct is the loss value of the autoencoder, the symbol shown in Figure PCTCN2020118107-appb-000101 is the feature information of the sample user identifier or the feature information of the sample data, and the symbol shown in Figure PCTCN2020118107-appb-000102 is the feature information output after the symbol shown in Figure PCTCN2020118107-appb-000103 is processed by the autoencoder.
Optionally, the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier; the at least one piece of program code is loaded and executed by the processor to implement the following step:
recommending the candidate data to the user identifier.
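Whichever of the two objects is the user identifier, the recommendation always delivers the candidate data to the user. The sketch below resolves that direction; the `MappedObject` type, its category strings, and the function name are hypothetical.

```python
from typing import NamedTuple, Tuple


class MappedObject(NamedTuple):
    object_id: str
    category: str  # "user" for a user identifier, "data" for candidate data


def to_recommendation(first: MappedObject, second: MappedObject) -> Tuple[str, str]:
    """Return (user_id, data_id) regardless of which object was the first object."""
    if {first.category, second.category} != {"user", "data"}:
        raise ValueError("the two objects must belong to different categories")
    user, data = (first, second) if first.category == "user" else (second, first)
    return user.object_id, data.object_id
```

The same routine therefore serves both use cases: finding candidate data for a given user, and finding users for a given piece of candidate data.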
Optionally, the mapping model includes a user mapping sub-model and a data mapping sub-model; the user mapping sub-model is used to map the feature information of a user identifier to obtain a user mapping vector; the data mapping sub-model is used to map the feature information of candidate data to obtain a data mapping vector.
Those of ordinary skill in the art can understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely optional embodiments of the embodiments of this application and are not intended to limit the embodiments of this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the embodiments of this application shall fall within the protection scope of this application.

Claims (21)

  1. A recommendation method, characterized in that the method is applied to a server and comprises:
    acquiring first feature information of a first object, the first object being a user identifier or candidate data;
    mapping the first feature information to a target space based on a mapping model, to obtain a first mapping vector corresponding to the first object in the target space, the target space comprising user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
    performing a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, wherein a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
  2. The method according to claim 1, characterized in that the method further comprises:
    acquiring second feature information of the second object;
    mapping the second feature information to the target space based on the mapping model, to obtain the second mapping vector;
    wherein the performing a recommendation based on the first object and the second object according to the distance between any two mapping vectors in the target space comprises:
    acquiring the distance between the first mapping vector and the second mapping vector;
    if the distance is less than the preset distance, performing a recommendation based on the first object and the second object.
  3. The method according to claim 1, characterized in that the performing a recommendation based on the first object and the second object according to the distance between any two mapping vectors in the target space comprises:
    determining at least one third mapping vector in the target space, the third mapping vector and the first mapping vector belonging to different categories;
    acquiring the distance between the first mapping vector and each third mapping vector;
    selecting, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
    determining the second object corresponding to the second mapping vector, and performing a recommendation based on the first object and the second object.
  4. The method according to claim 3, characterized in that the determining the second object corresponding to the second mapping vector comprises:
    performing inverse mapping on the second mapping vector based on an inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determining the second object to which the second feature information belongs.
  5. The method according to claim 1, characterized in that, before the mapping the first feature information to a target space based on a mapping model to obtain a first mapping vector corresponding to the first object in the target space, the method further comprises:
    acquiring sample information, the sample information comprising feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether to recommend the sample data to the sample user identifier;
    training the mapping model according to the sample information.
  6. The method according to claim 5, characterized in that the loss function used to train the mapping model comprises at least one of the following:
    Figure PCTCN2020118107-appb-100001
    wherein L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, the symbol shown in Figure PCTCN2020118107-appb-100002 is the sample label, the symbol shown in Figure PCTCN2020118107-appb-100003 is the mapping vector corresponding to the sample user identifier, and the symbol shown in Figure PCTCN2020118107-appb-100004 is the mapping vector corresponding to the sample data;
    Figure PCTCN2020118107-appb-100005
    wherein L_cov is the second loss value of the mapping model, N is the number of pieces of the sample information, E is the matrix formed by the symbols shown in Figure PCTCN2020118107-appb-100006 and Figure PCTCN2020118107-appb-100007, Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transposition function, and diag(·) is the function that extracts the diagonal elements of a matrix.
  7. The method according to claim 1, characterized in that the mapping model is the encoding model in an autoencoder;
    before the mapping the first feature information to a target space based on a mapping model to obtain a first mapping vector corresponding to the first object in the target space, the method further comprises:
    acquiring sample information, the sample information comprising feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether to recommend the sample data to the sample user identifier;
    training the autoencoder according to the sample information.
  8. The method according to claim 7, characterized in that the loss function used to train the autoencoder comprises at least:
    Figure PCTCN2020118107-appb-100008
    wherein L_reconstruct is the loss value of the autoencoder, the symbol shown in Figure PCTCN2020118107-appb-100009 is the feature information of the sample user identifier or the feature information of the sample data, and the symbol shown in Figure PCTCN2020118107-appb-100010 is the feature information output after the symbol shown in Figure PCTCN2020118107-appb-100011 is processed by the autoencoder.
  9. The method according to any one of claims 1 to 8, characterized in that the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier;
    the performing a recommendation based on the first object and the second object comprises: recommending the candidate data to the user identifier.
  10. The method according to claim 1, characterized in that the mapping model comprises a user mapping sub-model and a data mapping sub-model;
    the user mapping sub-model is used to map the feature information of a user identifier to obtain a user mapping vector;
    the data mapping sub-model is used to map the feature information of candidate data to obtain a data mapping vector.
  11. A recommendation device, characterized in that the device comprises:
    a first information acquisition module, configured to acquire first feature information of a first object, the first object being a user identifier or candidate data;
    a first mapping module, configured to map the first feature information to a target space based on a mapping model, to obtain a first mapping vector corresponding to the first object in the target space, the target space comprising user mapping vectors corresponding to user identifiers and data mapping vectors corresponding to candidate data;
    a recommendation module, configured to perform a recommendation based on the first object and a second object according to the distance between any two mapping vectors in the target space, wherein a second mapping vector is the vector corresponding to the second object in the target space, the distance between the second mapping vector and the first mapping vector is less than a preset distance, and the second mapping vector and the first mapping vector belong to different categories.
  12. The device according to claim 11, characterized in that the device further comprises:
    a second information acquisition module, configured to acquire second feature information of the second object;
    a second mapping module, configured to map the second feature information to the target space based on the mapping model, to obtain the second mapping vector;
    the recommendation module further comprises:
    a first distance acquisition unit, configured to acquire the distance between the first mapping vector and the second mapping vector;
    a first recommendation unit, configured to perform a recommendation based on the first object and the second object if the distance is less than the preset distance.
  13. The device according to claim 11, wherein the recommendation module comprises:
    a vector determining unit, configured to determine at least one third mapping vector in the target space, the third mapping vector and the first mapping vector belonging to different categories;
    a second distance acquiring unit, configured to acquire the distance between the first mapping vector and each third mapping vector;
    a vector selecting unit, configured to select, from the at least one third mapping vector, a second mapping vector whose distance from the first mapping vector is less than the preset distance;
    a second recommendation unit, configured to determine the second object corresponding to the second mapping vector, and make a recommendation based on the first object and the second object.
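The selection logic of claim 13 can be sketched as follows. The sketch assumes Euclidean distance and an in-memory list of mapping vectors; `MappedObject`, `select_second_objects`, and the category strings `"user"` and `"data"` are hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MappedObject:
    object_id: str           # user identifier or candidate-data identifier
    category: str            # hypothetical labels: "user" or "data"
    vector: Sequence[float]  # mapping vector in the target space


def select_second_objects(
    first: MappedObject,
    target_space: List[MappedObject],
    preset_distance: float,
) -> List[str]:
    """Return ids of objects of the other category that lie within the preset distance."""
    # Third mapping vectors: every vector whose category differs from the first one.
    third = [m for m in target_space if m.category != first.category]
    # Distance from the first mapping vector to each third mapping vector.
    scored = [(math.dist(first.vector, m.vector), m) for m in third]
    # Second mapping vectors: those strictly closer than the preset distance.
    selected = [(d, m) for d, m in scored if d < preset_distance]
    selected.sort(key=lambda pair: pair[0])  # nearest first (a design choice, not claimed)
    return [m.object_id for _, m in selected]
```

A linear scan is enough for a sketch; a production system would more likely use an approximate nearest-neighbour index, which the claim neither requires nor excludes.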
  14. The device according to claim 13, wherein the second recommendation unit is further configured to perform inverse mapping on the second mapping vector based on an inverse mapping model to obtain second feature information corresponding to the second mapping vector, and determine the second object to which the second feature information belongs.
  15. The device according to claim 11, wherein the device further comprises:
    a first sample acquiring module, configured to acquire sample information, the sample information including feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether to recommend the sample data to the sample user identifier;
    a first training module, configured to train the mapping model according to the sample information.
  16. The device according to claim 15, wherein the device further comprises a loss function for training the mapping model, the loss function comprising at least one of the following:
    [Figure PCTCN2020118107-appb-100012]
    wherein L_neck is the first loss value of the mapping model, λ_margin is a preset parameter, [Figure PCTCN2020118107-appb-100013] is the sample label, [Figure PCTCN2020118107-appb-100014] is the mapping vector corresponding to the sample user identifier, and [Figure PCTCN2020118107-appb-100015] is the mapping vector corresponding to the sample data;
    [Figure PCTCN2020118107-appb-100016]
    wherein L_cov is the second loss value of the mapping model, N is the number of pieces of sample information, E is the matrix formed by [Figure PCTCN2020118107-appb-100017] and [Figure PCTCN2020118107-appb-100018], Cov(E) is the covariance matrix of the matrix E, ||·||_f is the transpose function, and diag(·) is the function that extracts the diagonal elements of a matrix.
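The formulas of claim 16 survive here only as image references, so their exact form cannot be read from this text. The Python sketch below shows one plausible shape for each loss and should not be read as the claimed formula. It assumes a contrastive, margin-based pair loss for L_neck and a covariance penalty on the off-diagonal entries of Cov(E) for L_cov, and it treats ||·||_f as a Frobenius norm, which is itself an assumption. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def pair_loss(label: int, user_vec: Vector, data_vec: Vector, margin: float) -> float:
    """Assumed contrastive form: pull positive pairs together, push negatives past the margin."""
    d = math.dist(user_vec, data_vec)
    if label == 1:                       # sample data should be recommended
        return d ** 2
    return max(0.0, margin - d) ** 2     # penalise negatives closer than the margin


def covariance(rows: List[Vector]) -> List[List[float]]:
    """Population covariance matrix of the row vectors in E."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n for j in range(dim)]
        for i in range(dim)
    ]


def covariance_loss(rows: List[Vector]) -> float:
    """Assumed form: norm of Cov(E) with its diagonal removed, divided by N."""
    cov = covariance(rows)
    dim = len(cov)
    off_diagonal = sum(cov[i][j] ** 2 for i in range(dim) for j in range(dim) if i != j)
    return math.sqrt(off_diagonal) / len(rows)
```

Under these assumptions the first term shapes distances between user and data vectors according to the sample label, while the second discourages correlated dimensions in the target space.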
  17. The device according to claim 11, wherein the mapping model is the encoding model of an autoencoder, and the device further comprises:
    a second sample acquiring module, configured to acquire sample information, the sample information including feature information of a sample user identifier, feature information of sample data, and a sample label, the sample label indicating whether to recommend the sample data to the sample user identifier;
    a second training module, configured to train the autoencoder according to the sample information.
  18. The device according to claim 17, wherein the device further comprises a loss function for training the autoencoder, the loss function comprising at least:
    [Figure PCTCN2020118107-appb-100019]
    wherein L_reconstruct is the loss value of the autoencoder, [Figure PCTCN2020118107-appb-100020] is the feature information of the sample user identifier or the feature information of the sample data, and [Figure PCTCN2020118107-appb-100021] is the feature information output after [Figure PCTCN2020118107-appb-100022] is processed by the autoencoder.
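The reconstruction formula of claim 18 is likewise only an image reference in this text. A common choice for an autoencoder is the squared error between each input and its reconstruction, averaged over the samples; the sketch below assumes that form and uses the hypothetical name `reconstruction_loss`.

```python
from typing import List, Sequence


def reconstruction_loss(
    inputs: List[Sequence[float]],
    outputs: List[Sequence[float]],
) -> float:
    """Assumed form: mean over samples of the squared error between x and its reconstruction."""
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must contain the same number of samples")
    total = 0.0
    for x, x_hat in zip(inputs, outputs):
        # Squared Euclidean distance between the original and reconstructed features.
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(inputs)
```

A perfect reconstruction gives a loss of zero; the further the decoder output drifts from the input feature information, the larger the value.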
  19. The device according to any one of claims 11 to 18, wherein the first object is a user identifier and the second object is candidate data, or the first object is candidate data and the second object is a user identifier;
    the recommendation module is further configured to recommend the candidate data to the user identifier.
  20. A computer device, wherein the computer device comprises a processor and a memory, at least one piece of program code is stored in the memory, and the at least one piece of program code is loaded and executed by the processor to implement the operations performed in the recommendation method according to any one of claims 1 to 10.
  21. A computer-readable storage medium, wherein at least one piece of program code is stored in the computer-readable storage medium, and the at least one piece of program code is loaded and executed by a processor to implement the operations performed in the recommendation method according to any one of claims 1 to 10.
PCT/CN2020/118107 2019-10-25 2020-09-27 Method and device for making recommendation, computer device, and storage medium WO2021077989A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911026124.6A CN110795625B (en) 2019-10-25 2019-10-25 Recommendation method and device, computer equipment and storage medium
CN201911026124.6 2019-10-25

Publications (1)

Publication Number Publication Date
WO2021077989A1 (en) 2021-04-29

Family

ID=69441323

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118107 WO2021077989A1 (en) 2019-10-25 2020-09-27 Method and device for making recommendation, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN110795625B (en)
WO (1) WO2021077989A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505304A (en) * 2021-09-10 2021-10-15 明品云(北京)数据科技有限公司 Target object recommendation method and system
CN113704607A (en) * 2021-08-26 2021-11-26 阿里巴巴(中国)有限公司 Recommendation and display method and device and electronic equipment
CN113763927A (en) * 2021-05-13 2021-12-07 腾讯科技(深圳)有限公司 Speech recognition method, speech recognition device, computer equipment and readable storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795625B (en) * 2019-10-25 2021-11-23 腾讯科技(深圳)有限公司 Recommendation method and device, computer equipment and storage medium
CN111445283B (en) * 2020-03-25 2023-09-01 北京百度网讯科技有限公司 Digital person processing method, device and storage medium based on interaction device
CN111651558B (en) * 2020-05-09 2023-04-07 清华大学深圳国际研究生院 Hyperspherical surface cooperative measurement recommendation device and method based on pre-training semantic model
CN111629052B (en) * 2020-05-26 2021-12-07 中国联合网络通信集团有限公司 Content caching method, node, equipment and storage medium based on MEC
CN111918094B (en) * 2020-06-29 2023-01-24 北京百度网讯科技有限公司 Video processing method and device, electronic equipment and storage medium
CN114002949A (en) * 2020-07-28 2022-02-01 华为技术有限公司 Control method and control device based on artificial intelligence
CN113762467B (en) * 2021-08-12 2022-10-21 生态环境部卫星环境应用中心 Method for obtaining near-ground ozone concentration based on ultraviolet and visible hyperspectrum

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060020614A1 (en) * 1997-08-08 2006-01-26 Kolawa Adam K Method and apparatus for automated selection, organization, and recommendation of items based on user preference topography
CN108280738A (en) * 2017-12-13 2018-07-13 西安电子科技大学 Method of Commodity Recommendation based on image and socialized label
CN109710845A (en) * 2018-12-25 2019-05-03 百度在线网络技术(北京)有限公司 Information recommended method, device, computer equipment and readable storage medium storing program for executing
CN110162700A (en) * 2019-04-23 2019-08-23 腾讯科技(深圳)有限公司 The training method of information recommendation and model, device, equipment and storage medium
CN110795625A (en) * 2019-10-25 2020-02-14 腾讯科技(深圳)有限公司 Recommendation method and device, computer equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101094335B (en) * 2006-06-20 2010-10-13 株式会社日立制作所 TV program recommender and method thereof
CN103177093B (en) * 2013-03-13 2016-08-17 北京开心人信息技术有限公司 A kind of general recommendations method and system based on object tag
US10109051B1 (en) * 2016-06-29 2018-10-23 A9.Com, Inc. Item recommendation based on feature match
CN108460073A (en) * 2017-12-27 2018-08-28 广州市百果园信息技术有限公司 Group recommending method, storage medium and server
CN108804670B (en) * 2018-06-11 2023-03-31 腾讯科技(深圳)有限公司 Data recommendation method and device, computer equipment and storage medium
CN110232153A (en) * 2019-05-29 2019-09-13 华南理工大学 A kind of cross-cutting recommended method based on content

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763927A (en) * 2021-05-13 2021-12-07 腾讯科技(深圳)有限公司 Speech recognition method, speech recognition device, computer equipment and readable storage medium
CN113763927B (en) * 2021-05-13 2024-03-08 腾讯科技(深圳)有限公司 Speech recognition method, device, computer equipment and readable storage medium
CN113704607A (en) * 2021-08-26 2021-11-26 阿里巴巴(中国)有限公司 Recommendation and display method and device and electronic equipment
CN113704607B (en) * 2021-08-26 2023-10-20 阿里巴巴(中国)有限公司 Recommendation and display method and device and electronic equipment
CN113505304A (en) * 2021-09-10 2021-10-15 明品云(北京)数据科技有限公司 Target object recommendation method and system

Also Published As

Publication number Publication date
CN110795625A (en) 2020-02-14
CN110795625B (en) 2021-11-23

Similar Documents

Publication Publication Date Title
WO2021077989A1 (en) Method and device for making recommendation, computer device, and storage medium
WO2020215962A1 (en) Video recommendation method and device, computer device and storage medium
EP4266244A1 (en) Surface defect detection method, apparatus, system, storage medium, and program product
CN109086709B (en) Feature extraction model training method and device and storage medium
CN110059744B (en) Method for training neural network, method and equipment for processing image and storage medium
CN110134804B (en) Image retrieval method, device and storage medium
CN111897996B (en) Topic label recommendation method, device, equipment and storage medium
CN110413837B (en) Video recommendation method and device
CN111737573A (en) Resource recommendation method, device, equipment and storage medium
CN111104980B (en) Method, device, equipment and storage medium for determining classification result
CN111680697B (en) Method, device, electronic equipment and medium for realizing field adaptation
WO2020211607A1 (en) Video generation method, apparatus, electronic device, and medium
CN110162604B (en) Statement generation method, device, equipment and storage medium
CN111506758A (en) Method and device for determining article name, computer equipment and storage medium
CN110070143B (en) Method, device and equipment for acquiring training data and storage medium
WO2023066373A1 (en) Sample image determination method and apparatus, device, and storage medium
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition
CN112417263B (en) Data recommendation method, device and storage medium
CN113139614A (en) Feature extraction method and device, electronic equipment and storage medium
CN112287193A (en) Data clustering method and device, computer equipment and storage medium
CN111652432A (en) Method and device for determining user attribute information, electronic equipment and storage medium
CN111429106A (en) Resource transfer certificate processing method, server, electronic device and storage medium
CN112990424A (en) Method and device for training neural network model
CN112308104A (en) Abnormity identification method and device and computer storage medium
CN111984738A (en) Data association method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 20878413; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.08.2022)
122 Ep: pct application non-entry in european phase Ref document number: 20878413; Country of ref document: EP; Kind code of ref document: A1