CN114357242A - Training evaluation method and device based on recall model, equipment and storage medium - Google Patents


Info

Publication number
CN114357242A
Authority
CN
China
Prior art keywords
video
account
features
recall
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111575932.5A
Other languages
Chinese (zh)
Inventor
戴威 (Dai Wei)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111575932.5A priority Critical patent/CN114357242A/en
Publication of CN114357242A publication Critical patent/CN114357242A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73: Querying
    • G06F16/735: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73: Querying
    • G06F16/738: Presentation of query results
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74: Browsing; Visualisation therefor
    • G06F16/743: Browsing; Visualisation therefor of a collection of video files or sequences
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783: Retrieval characterised by using metadata automatically derived from the content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a training evaluation method and apparatus based on a recall model, an electronic device, and a storage medium, which can be applied to fields such as automatic driving and intelligent transportation, and which comprise the following steps: storing the account features and video features extracted from training video samples during the online training of the recall model into an account feature library and a video feature library respectively; obtaining target account features by offline sampling from the account feature library, and, for the target account features, searching the video feature library for a video feature set associated by warehousing time; calculating the matching degree between the target account features and each target video feature in the video feature set, and selecting a specified rank in descending order of matching degree; and calculating the recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with all target video features in the video feature set. The scheme of the embodiments of the application can save online machine resources and improve the effect of model parameter iteration.

Description

Training evaluation method and device based on recall model, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a training evaluation method and apparatus, an electronic device, a storage medium, and a program product based on a recall model.
Background
A recommendation system is the means by which an Internet-era platform automatically selects/matches items on the platform according to a user's interests and presents them to the user. Recommendation systems typically include a recall model for selecting, from a candidate pool, a subset that meets a target and computational constraints.
In order to improve the training effect of the recall model, the recall model needs to be evaluated. At present, recall models are generally evaluated by online AB tests, but an online AB test occupies online resources, runs at a small scale, and requires a long observation period.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present application provide a recall model-based training evaluation method and apparatus, an electronic device, a storage medium, and a program product.
Other features and advantages of the present application will be apparent from the following detailed description, or may be learned by practice of the application.
According to an aspect of an embodiment of the present application, there is provided a training evaluation method based on a recall model, including:
acquiring the account features and video features extracted from training video samples during the online training of a recall model, and storing the account features and video features into an account feature library and a video feature library respectively;
obtaining target account features by offline sampling from the account feature library, and, for the target account features, searching the video feature library for a video feature set associated by warehousing time;
calculating the matching degree between the target account features and each target video feature in the video feature set, and selecting a specified rank in descending order of matching degree;
calculating a recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with all target video features in the video feature set, the recall rate being used to evaluate the training effect of the recall model.
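The four steps above can be sketched offline with plain arrays. This is a minimal illustration, not the patent's exact formulation: the dot product as the matching degree and the names `recall_at_k`, `account_vec`, and `video_vecs` are assumptions for the sketch.

```python
import numpy as np

def recall_at_k(account_vec, video_vecs, positive_ids, video_ids, k):
    """Score every video feature against the account feature (dot product
    as the matching degree), rank in descending order, and compute the
    recall rate over the positive set."""
    scores = video_vecs @ account_vec        # matching degree per video
    top_k = np.argsort(-scores)[:k]          # indices of the k best matches
    top_ids = {video_ids[i] for i in top_k}
    hits = len(top_ids & set(positive_ids))  # positives recovered in top k
    return hits / len(positive_ids)

# Toy example: 4 candidate videos, 2 of them positives for this account.
account = np.array([1.0, 0.0])
videos = np.array([[0.9, 0.1], [0.2, 0.8], [0.8, 0.3], [0.1, 0.9]])
ids = ["v1", "v2", "v3", "v4"]
print(recall_at_k(account, videos, ["v1", "v3"], ids, k=2))
```

With `k=2` the two highest-scoring videos are v1 and v3, so both positives are recovered and the recall rate is 1.0.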
According to an aspect of an embodiment of the present application, there is provided a training evaluation apparatus based on a recall model, including:
a feature acquisition module configured to acquire the account features and video features extracted from training video samples during the online training of the recall model and to store the account features and video features into an account feature library and a video feature library respectively;
an offline sampling module configured to obtain target account features by offline sampling from the account feature library and, for the target account features, to search the video feature library for a video feature set associated by warehousing time;
a matching degree ranking module configured to calculate the matching degree between the target account features and each target video feature in the video feature set and to select a specified rank in descending order of matching degree;
a recall rate calculation module configured to calculate a recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with all target video features in the video feature set, the recall rate being used to evaluate the training effect of the recall model.
According to an aspect of an embodiment of the present application, there is provided an electronic device including: one or more processors; a storage device for storing one or more programs that, when executed by the one or more processors, cause the electronic device to implement a recall model-based training assessment method as previously described.
According to an aspect of the embodiments of the present application, there is provided a computer-readable storage medium having stored thereon computer-readable instructions, which, when executed by a processor of an electronic device, cause the electronic device to execute the recall model-based training evaluation method as described above.
According to an aspect of an embodiment of the present application, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the recall model-based training evaluation method as described above.
According to the technical scheme provided by the embodiments of the application, the account features and video features extracted from training video samples during the online training of the recall model are acquired, and the recall model is evaluated based on these features; the evaluation process therefore occupies no online resources and saves online machine resources. In addition, evaluating the recall model during training improves the timeliness of evaluation and thereby the efficiency of model parameter iteration.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort. In the drawings:
FIG. 1 is a schematic illustration of an implementation environment to which the present application relates;
FIG. 2 is a flow chart illustrating a recall model-based training assessment method in accordance with an exemplary embodiment of the present application;
FIG. 3 is a flow chart of step S130 in the embodiment shown in FIG. 2 in an exemplary embodiment;
FIG. 4 is a flow chart of a recall model-based training assessment method in accordance with another exemplary embodiment of the present application;
FIG. 5 is a flow chart illustrating the training and evaluation of a recall model according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram illustrating an exemplary embodiment of a training assessment apparatus based on a recall model;
FIG. 7 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The block diagrams shown in the figures are functional entities only and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the contents and operations/steps, nor do they necessarily have to be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
It should also be noted that: reference to "a plurality" in this application means two or more. "and/or" describe the association relationship of the associated objects, meaning that there may be three relationships, e.g., A and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Before the technical solutions of the embodiments of the present application are described, terms and expressions referred to in the embodiments of the present application are explained, and the terms and expressions referred to in the embodiments of the present application are applied to the following explanations.
Recommendation system: the means by which an Internet-era platform automatically selects/matches items on the platform according to a user's interests and presents them to the user. Because a user's recent behavior expresses stronger interests or trends, current recommendation systems widely apply real-time techniques to improve timeliness: streaming data transmission, minute-level model training, minute-level online model updating, and online inference, so that the behavior of individual users and user groups can be responded to and inferred from promptly. Due to the computational limits of the recommendation system and the latency constraints of the online system, recommendation systems typically adopt a funnel architecture of recall, coarse ranking (optional), fine ranking, and strategy (re-ranking/mixing).
Recall: selecting, from the entire candidate pool, a subset that meets the target and the computational constraints.
Online AB test: before a new feature fully goes online, the online traffic is split, and the small slice of traffic obtained by the split is used to test the new feature and evaluate its effect.
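A common way to implement such a traffic split is deterministic hashing of the account ID, so that the same account always lands in the same bucket. This is a generic sketch of the technique, not the patent's implementation; `ab_bucket` and the bucket names are illustrative.

```python
import hashlib

def ab_bucket(account_id: str, test_ratio: float = 0.1) -> str:
    """Route a small, stable slice of traffic to the new feature.
    Hashing makes the assignment deterministic per account."""
    h = int(hashlib.md5(account_id.encode()).hexdigest(), 16)
    return "test" if (h % 1000) / 1000.0 < test_ratio else "control"
```

Because the hash is stable, repeated requests from one account always see the same variant, which keeps the experiment's measurements consistent.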
Distributed cloud storage system: a storage system that, through cluster applications, grid technology, distributed file systems, and similar functions, aggregates a large number of storage devices (also called storage nodes) of different types in a network via application software or application interfaces so that they work cooperatively, providing data storage and service access functions externally.
Artificial Intelligence (AI) is the theory, method, technique, and application system of using a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and produce new intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence research covers the design principles and implementation methods of various intelligent machines, so that machines gain the abilities of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. AI software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, intelligent transportation, and so on.
Machine Learning (ML) is a multi-disciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how computers can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and teaching-based learning.
At present, a distributed cloud storage system stores data as follows: logical volumes are created, and each logical volume, when created, is allocated physical storage space, which may comprise the disks of one or several storage devices. A client stores data on a logical volume, that is, the data is stored on a file system. The file system divides the data into multiple parts; each part is an object, and an object contains not only the data itself but also additional information such as a data identifier (ID). The file system writes each object into the physical storage space of the logical volume and records each object's storage location, so that when the client requests access to the data, the file system can let the client access the data according to the recorded storage location of each object.
The distributed cloud storage system allocates physical storage space for a logical volume as follows: physical storage space is divided in advance into stripes according to capacity estimates for the group of objects to be stored in the logical volume (the estimates often leave a large margin over the actual object capacity) and the Redundant Array of Independent Disks (RAID) configuration; a logical volume can be understood as one stripe, and physical storage space is thereby allocated to the logical volume.
In order to improve the training effect of a recall model, the recall model needs to be evaluated. At present, recall models are usually evaluated by online AB tests; however, an online AB test requires an online environment and occupies online machine resources such as memory and external storage, and the traffic allocated to online testing is generally below 20%, so the test scale is small. Based on this, embodiments of the present application provide a training evaluation method and apparatus, an electronic device, a storage medium, and a program product based on a recall model, so that the evaluation process of the recall model occupies no online resources and the evaluation scale can be adjusted adaptively.
Referring to fig. 1, fig. 1 is a schematic diagram of an implementation environment related to the present application. The implementation environment includes a recall model-based training evaluation apparatus 100, a recall model 200, and an online training apparatus 300.
The training evaluation apparatus 100 may be a server or other device. The server may be a server providing various services, may be an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, which is not limited herein.
The online training device 300 may also be a server or other device.
During the online training of the recall model 200, the online training apparatus 300 extracts account features and video features based on the training video samples. The training evaluation apparatus 100 can acquire the account features and video features extracted from the training video samples during the online training of the recall model and store them into an account feature library and a video feature library respectively; obtain target account features by offline sampling from the account feature library and, for the target account features, search the video feature library for a video feature set associated by warehousing time; calculate the matching degree between the target account features and each target video feature in the video feature set and select a specified rank in descending order of matching degree; and calculate a recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with all target video features in the video feature set, the recall rate being used to evaluate the training effect of the recall model. In this way, because the evaluation is based on the account features and video features already extracted during online training, the evaluation process occupies no online resources and saves online machine resources; in addition, evaluating the recall model during training improves the timeliness of evaluation and thereby the efficiency of model parameter iteration.
In addition, the data used in the evaluation process is the same as the data used in the training process, so that the deviation is reduced, and the evaluation accuracy is improved.
The recall model is a model created based on machine learning; the account feature library and the video feature library may be stored in a storage system, which may be a storage system based on cloud storage technology or another type of storage system. Referring to fig. 2, fig. 2 is a flowchart illustrating a training evaluation method based on a recall model according to an exemplary embodiment of the present application. The method may be applied to the implementation environment shown in fig. 1 and performed by the recall model-based training evaluation apparatus 100 in that environment.
After the recall model reaches a certain condition, the recall model can be brought online to provide services for the user terminal, wherein the user terminal comprises but is not limited to a mobile phone, a computer, intelligent voice interaction equipment, intelligent household appliances, a vehicle-mounted terminal and the like.
It should be noted that, in addition to the aforementioned application scenarios, the embodiments of the present application may also be applied to various other scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, assisted driving, and the like; in practical applications, the scheme may be adjusted according to the specific scenario. For example, in a cloud technology scenario, the recall model may be deployed in the cloud, and the account feature library and the video feature library may be stored based on cloud storage technology; in an intelligent transportation or assisted-driving scenario, the recall model may be deployed on a vehicle-mounted terminal, a navigation terminal, or the like, for navigation, driving assistance, and so on.
As shown in fig. 2, in an exemplary embodiment, the training evaluation method based on the recall model may include steps S110 to S140, which are described in detail as follows:
step S110, account characteristics and video characteristics extracted based on training video samples in the online training process of the recall model are obtained, and the account characteristics and the video characteristics are stored in an account characteristic library and a video characteristic library respectively.
It should be noted that, in this embodiment, the recall model is applied to a video recall scenario and is used to select, from the candidate pool, a subset that meets the target and the computational constraints. The recall model may be a model created based on DNN (Deep Neural Networks) or based on other machine learning networks. The candidate pool can be set flexibly according to actual needs and includes, for example, but is not limited to, the videos contained in a video platform.
The training video samples are videos used for training the recall model and may include positive samples and negative samples. A training video sample may be determined based on online real-time messages. For example, during the operation of a video platform, real-time messages are generated, and account data may be acquired from them, including but not limited to account attribute information and account behavior information, where the account attribute information includes but is not limited to the age and gender of the user corresponding to the account, and the account behavior information includes but is not limited to behaviors such as clicking, viewing, liking, commenting, and forwarding. Based on the account data, the account features of the account can be determined, and positive and negative samples can be constructed: positive samples may include videos the account watched for at least a certain duration (e.g., 10 minutes or 3 minutes), videos forwarded by the account, videos liked by the account, and the like, while negative samples may include videos the account did not watch, videos blocked by the account, and the like, or videos randomly selected from the candidate pool.
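The labeling rules above can be sketched as a small function over behavior records. This is an illustrative sketch under assumed field names (`watch_seconds`, `forwarded`, `liked`, `blocked`); the patent does not specify a log schema.

```python
def label_samples(behavior_log, min_watch_seconds=180):
    """Turn account behavior records into (video_id, label) training pairs:
    long watches, forwards, and likes become positives (label 1);
    unwatched or blocked videos become negatives (label 0)."""
    samples = []
    for rec in behavior_log:
        watched = rec.get("watch_seconds", 0)
        if rec.get("forwarded") or rec.get("liked") or watched >= min_watch_seconds:
            samples.append((rec["video_id"], 1))
        elif rec.get("blocked") or watched == 0:
            samples.append((rec["video_id"], 0))
    return samples
```

Records that match neither rule (e.g., short partial watches) are simply skipped; random negatives from the candidate pool, as the text mentions, could be appended separately.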
The account feature library is used for storing account features, and specific types of the account features can be flexibly set according to actual needs, for example, in one example, the account features are updated in real time, so the types of the account feature library include but are not limited to a real-time table, and the real-time table refers to a table file with content updated in real time, so that the requirement of account features extracted in real time in an online training process of a recall model is met.
The video feature library is used for storing video features of videos in the candidate pool, and the specific types of the video features can be flexibly set according to actual needs. In one example, to reduce storage pressure, the type of video feature library includes, but is not limited to, a distributed file system, since the amount of video in the candidate pool is large, typically on the order of millions to billions. The Distributed File System (DFS) is a complete hierarchical File System formed by combining a plurality of different logical disk partitions or volume labels together, and not only can reduce single-point storage pressure, but also can meet the video feature storage requirement based on time points.
When the recall model is trained on line, the recall model extracts the account characteristics based on the training video samples and extracts the characteristics of the videos in the candidate pool to obtain the video characteristics of each video in the candidate pool.
In the online training process of the recall model, an input training video sample associated with an account can be identified, account characteristics of the account are extracted, videos in the candidate pool are also identified by the recall model, and video characteristics of each video are extracted, so that videos corresponding to the account characteristics can be recalled conveniently on the basis of the account characteristics and the video characteristics; in order to evaluate the recall model, in this embodiment, account features and video features extracted based on training video samples in an online training process of the recall model are obtained, the account features are stored in an account feature library, and the video features are stored in a video feature library.
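A minimal in-memory sketch of such a feature library follows. The real system uses a real-time table and a distributed file system; `FeatureStore` and its methods are hypothetical names for illustration. The key point it shows is recording a warehousing time with each entry, which the later evaluation steps rely on.

```python
import time
from collections import defaultdict

class FeatureStore:
    """Append-only feature store keyed by account or video id; each entry
    keeps its warehousing time so evaluation can later select features
    by time window or take the latest snapshot."""

    def __init__(self):
        self._rows = defaultdict(list)  # key -> [(warehousing_time, feature)]

    def put(self, key, feature, ts=None):
        self._rows[key].append((ts if ts is not None else time.time(), feature))

    def latest(self, key):
        """Return the most recently warehoused feature for the key."""
        return self._rows[key][-1][1]
```

Usage mirrors the patent's flow: the training process calls `put` as features are extracted, and the offline evaluator reads them back without touching the online system.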
In some embodiments, the candidate pool is continuously updated, so candidate pools of different versions exist, and the recall model performs feature extraction on the videos in each version of the candidate pool. During recall, qualifying videos are generally recalled from a specific version of the candidate pool. Therefore, when acquiring the video features produced during online training of the recall model, the video features of the videos contained in a candidate pool can be acquired with the candidate pool as the unit.
Step S120: obtain target account features from the account feature library by offline sampling, and, for the target account features, search the video feature library for a video feature set related by warehousing time.
The account feature library stores account features of different accounts acquired at different times, so its data volume is large. In this embodiment, offline sampling is therefore performed on the account feature library to obtain the target account features; the offline sampling method can be set flexibly according to actual needs.
Because the videos in the candidate pool are continuously updated, candidate pools of different versions exist, and the recall model performs feature extraction on the videos in each version. Accordingly, the video feature library stores video features corresponding to the different candidate-pool versions. For example, if at some moment the data in candidate pool 1 is updated, yielding an updated pool recorded as candidate pool 2, then the video feature library stores the video features corresponding to candidate pool 1 and those corresponding to candidate pool 2, where the video features corresponding to a candidate pool are the video features of the videos it contains.
The warehousing time of a video feature can be the time when it is stored in the video feature library or the time when it is acquired; likewise, the warehousing time of an account feature can be the time when it is stored in the account feature library or the time when it is acquired.
In this embodiment, after the target account features are obtained from the account feature library by offline sampling, video features whose warehousing time corresponds to that of the target account features are obtained from the video feature library to form the video feature set; for example, video features whose warehousing time is earlier than that of the target account features may be obtained.
Step S130: calculate the matching degree between the target account features and each target video feature in the video feature set, and select a specified rank in descending order of the matching degree values.
The specified rank can be set flexibly according to actual needs, for example the top 100 or top 500.
In this embodiment, the matching degree between the target account features and each target video feature contained in the video feature set is calculated, and the target video features falling within the specified rank are selected in descending order of the matching degree values.
Step S140: calculate a recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with the target video features in the entire video feature set; the recall rate is used to evaluate the training effect of the recall model.
Here, the number of positive samples associated with the target video features within the specified rank is the number of positive samples among the videos associated with those top-ranked target video features; the number of positive samples associated with the target video features in the video feature set is the number of positive samples among the videos associated with all the target video features in the set.
The recall rate may be the ratio of the two quantities. In one example, suppose the video feature set contains the features of videos a, b, c, d, and e, the positive samples are videos a and c, and the specified rank is 3. If the matching degree values in descending order correspond to the features of videos a, b, e, c, and d, then the top 3 contain only one positive sample, video a, so the number of positive samples within the specified rank is 1 (video a), the number of positive samples associated with the target video features in the whole set is 2 (videos a and c), and the recall rate is 1/2 = 0.5.
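The recall computation in this example can be sketched in a few lines. This is an illustrative sketch, not the patent's implementation; the ranked video list and positive-sample set are assumed inputs.

```python
def recall_at_k(ranked_video_ids, positive_ids, k):
    """Recall@K: positives found in the top-K ranking, divided by
    all positives associated with the video feature set."""
    top_k = set(ranked_video_ids[:k])
    positives = set(positive_ids)
    if not positives:
        return 0.0
    return len(top_k & positives) / len(positives)

# The worked example above: ranking a, b, e, c, d; positives {a, c}; K = 3.
print(recall_at_k(["a", "b", "e", "c", "d"], {"a", "c"}, 3))  # 0.5
```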
Because the account features and video features are obtained from the recall model's online training process, they reflect the training effect of the recall model to a certain extent. The training effect can therefore be evaluated by the recall rate computed from the number of positive samples within the specified rank and the number of positive samples associated with the target video features in the video feature set.
In this embodiment, the account features and video features extracted from the training video samples during online training of the recall model are obtained and stored in the account feature library and the video feature library, respectively; target account features are obtained from the account feature library by offline sampling, and a video feature set related by warehousing time is searched for in the video feature library; the matching degree between the target account features and each target video feature in the set is calculated, and a specified rank is selected in descending order of the matching degree values; finally, the recall rate is calculated from the number of positive samples within the specified rank and the number of positive samples associated with the target video features in the set, and is used to evaluate the training effect of the recall model. Because the evaluation runs offline, it occupies no online resources, saving online machine resources; and because the recall model is evaluated during training, both the timeliness of evaluation and the efficiency of model parameter iteration are improved.
In an exemplary embodiment, since a reference time parameter is required when determining the video feature set, the recall-model-based training evaluation method may further include: while storing the account features and video features in the account feature library and the video feature library respectively, record the acquisition time of the account features and the acquisition time of the video features in the respective libraries. This makes it convenient to search the video feature library for the time-related video feature set for the given target account features.
Referring to fig. 3, fig. 3 is a flowchart, in an exemplary embodiment, of step S120 in the embodiment shown in fig. 2 when the acquisition times of the account features and the video features are recorded in the account feature library and the video feature library respectively. As shown in fig. 3, the process of obtaining the target account features by offline sampling from the account feature library and searching the video feature library for the video feature set related by warehousing time may include steps S131 to S132, detailed as follows:
Step S131: sample the target account features offline from the account feature library, and determine the acquisition time corresponding to the target account features.
To select the video feature set corresponding to the target account features, in this embodiment the acquisition time corresponding to the target account features may be obtained while they are being sampled offline from the account feature library.
Step S132: search the video feature library for the video features whose acquisition time is earlier than, and closest to, the acquisition time of the target account features, and use the found video features as the target video features in the video feature set.
To improve the accuracy of the offline evaluation, in this embodiment, after the target account features are sampled offline from the account feature library and their acquisition time is determined, the video feature library is searched for the video features whose acquisition time is earlier than, and closest to, that of the target account features, and the found video features are used as the target video features in the video feature set. For example, suppose the acquisition time of the target account features is 12:10:05, and the video feature library contains the video features of candidate pools 1, 2, and 3, acquired at 12:03:06, 12:06:06, and 12:20:06 respectively; then the video features corresponding to candidate pool 2 are used as the target video features in the video feature set.
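The search in step S132 amounts to finding the latest candidate-pool snapshot acquired before the account feature. A minimal sketch using the standard-library `bisect` module, with snapshot times assumed to be kept sorted (an assumption, not something the patent specifies):

```python
import bisect

def latest_earlier_snapshot(snapshot_times, account_time):
    """Return the index of the candidate-pool snapshot whose acquisition
    time is earlier than (and closest to) the account feature's time,
    or None if every snapshot is later. snapshot_times must be ascending."""
    i = bisect.bisect_left(snapshot_times, account_time)
    return i - 1 if i > 0 else None

# Times from the example above, as seconds since midnight:
# pools acquired at 12:03:06, 12:06:06, 12:20:06; account at 12:10:05.
pools = [12 * 3600 + 3 * 60 + 6, 12 * 3600 + 6 * 60 + 6, 12 * 3600 + 20 * 60 + 6]
account = 12 * 3600 + 10 * 60 + 5
print(latest_earlier_snapshot(pools, account))  # 1, i.e. candidate pool 2
```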
In some embodiments, if there are multiple target account features, and considering that their acquisition times may differ, a corresponding video feature set may be determined for each target account feature according to its own acquisition time; the recall rate is then calculated for each from the number of positive samples among the videos associated with its video feature set and the number of positive samples among the videos associated with the target video features within the specified rank, and the recall rates corresponding to the multiple target account features are averaged to obtain the final recall rate. Alternatively, the earliest acquisition time among the multiple target account features may be determined and a single video feature set determined from it; then, for each target account feature, the number of positive samples among the videos associated with the video feature set and among the videos associated with the top-ranked target video features is determined, the recall rate is calculated, and the recall rates are averaged to obtain the final recall rate. Of course, other processing methods are also possible, and no limitation is imposed here.
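The first strategy above (per-account feature set selection, then averaging) can be sketched as follows; `pick_feature_set` and `recall_for` are hypothetical stand-ins for the snapshot lookup and the Recall@K computation, not APIs from the patent.

```python
def final_recall(target_accounts, pick_feature_set, recall_for):
    """Per-account strategy: for each target account feature, pick its
    own version of the video feature set, compute its recall rate,
    then average across accounts to get the final recall rate."""
    recalls = [recall_for(acct, pick_feature_set(acct)) for acct in target_accounts]
    return sum(recalls) / len(recalls) if recalls else 0.0

# Toy usage with stand-in callables (both are assumptions for illustration).
fake_sets = {"u1": "pool-2", "u2": "pool-3"}
fake_recall = {("u1", "pool-2"): 0.5, ("u2", "pool-3"): 0.25}
print(final_recall(["u1", "u2"], fake_sets.get,
                   lambda acct, fs: fake_recall[(acct, fs)]))  # 0.375
```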
In this embodiment, the target account features are sampled offline from the account feature library, their acquisition time is determined, and the video feature library is searched for the video features whose acquisition time is earlier than, and closest to, that of the target account features; the found video features serve as the target video features in the video feature set, which improves the accuracy of the evaluation.
In an exemplary embodiment, in step S131 in the embodiment shown in fig. 3, the process of sampling the target account features offline from the account feature library and determining their acquisition time may include: periodically sampling a specified number of target account features offline from the account feature library at a preset time interval, where the preset time interval is longer than the interval at which account features and video features are extracted during online training of the recall model.
The preset time interval is the interval at which the recall model is evaluated; that is, target account features are obtained once per interval and the recall model is evaluated once based on them. Its specific value can be set flexibly according to actual needs, for example 1 hour or 2 hours. The specified number is the number of target account features sampled per interval, and its value can likewise be set flexibly, for example 100,000 or 50,000.
During online training of the recall model, the account features of newly added training samples are extracted at regular intervals. Since the candidate pool is continuously updated, the recall model also periodically extracts features from the videos contained in the new version of the candidate pool. For example, the recall model may extract account features, and the features of the videos in the candidate pool, at minute-level intervals, where minute-level means the interval between two extractions is on the order of minutes (less than 1 hour, e.g., 1 minute or 2 minutes).
To avoid evaluating the recall model multiple times on the same account features and video features, in this embodiment the preset time interval is longer than the interval at which account features and video features are extracted during online training of the recall model.
In this embodiment, a specified number of target account features are periodically sampled offline from the account feature library at the preset time interval, so the recall model can be evaluated once per interval, which improves the timeliness of evaluation and facilitates comparing recall models across different time periods.
In an exemplary embodiment, referring to fig. 4, after the recall rate is calculated from the number of positive samples within the specified rank and the number of positive samples associated with the target video features in the video feature set, the recall-model-based training evaluation method may further include steps S210 to S220, detailed as follows:
step S210, obtaining the recall rates calculated in different preset time intervals.
In this embodiment, the recall model is evaluated once per preset time interval to obtain a recall rate, so the recall rates calculated in different intervals can be collected, for example the recall rates from 3 or 4 different preset time intervals.
Step S220: compare the obtained recall rates numerically, and select the recall model version corresponding to the largest recall rate as the best-trained recall model to be applied to the information recommendation system.
After the recall rates calculated in different preset time intervals are obtained, their values are compared to determine the largest, i.e., the maximum recall rate. A larger recall rate indicates a better recall model, so the model version corresponding to the maximum recall rate can be applied to the information recommendation system as the best-trained recall model.
In this embodiment, the recall rates calculated in different preset time intervals are obtained and compared numerically, and the recall model version corresponding to the largest recall rate is selected as the best-trained model to apply to the information recommendation system, which improves the iteration efficiency of the recall model.
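Selecting the best version from stored recall rates is a simple argmax; a sketch follows, with the version names and recall values purely illustrative.

```python
def best_model_version(recall_by_version):
    """Pick the recall model version whose stored recall rate is largest."""
    return max(recall_by_version, key=recall_by_version.get)

# Illustrative stored recall rates per model version.
print(best_model_version({"v1": 0.42, "v2": 0.57, "v3": 0.51}))  # v2
```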
In an exemplary embodiment, in step S130 in the embodiment shown in fig. 2, the process of calculating the matching degree between the target account features and each target video feature contained in the video feature set may further include: performing a vector inner product operation between the target account features and each target video feature, and using the resulting value as the matching degree between the target account features and the corresponding target video feature.
To determine the matching degree between the target account features and the target video features, in this embodiment a vector inner product may be computed between the target account features and each target video feature, and the operation result used as the matching degree between the target account features and the corresponding target video feature.
In this embodiment, determining the matching degree between the target account features and each target video feature by vector inner product improves both the speed and the accuracy of the calculation.
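The inner-product matching and top-K selection described above can be sketched as follows; the toy vectors are illustrative, and an online system would typically use an ANN index such as Faiss rather than exhaustively scoring every video.

```python
import heapq

def dot(u, v):
    """Vector inner product, used here as the matching degree."""
    return sum(a * b for a, b in zip(u, v))

def top_k_by_inner_product(account_vec, video_vecs, k):
    """Score every target video feature against the account feature and
    keep the K video ids with the largest matching degree."""
    scored = ((dot(account_vec, vec), vid) for vid, vec in video_vecs.items())
    return [vid for _, vid in heapq.nlargest(k, scored)]

videos = {"a": [1.0, 0.0], "b": [0.6, 0.6], "c": [0.0, 1.0]}
print(top_k_by_inner_product([1.0, 0.2], videos, 2))  # scores: a=1.0, b=0.72, c=0.2
```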
A specific application scenario of the embodiment of the present application is described in detail below. Referring to fig. 5, the online training and evaluation of the recall model may include the following processes:
Real-time messages: real-time messages are generated while the video platform runs; to train the information recommendation model online, this embodiment collects these real-time messages.
Real-time data processing: after a real-time message is obtained, it is processed to extract user data, which includes but is not limited to user attribute information (e.g., age and gender) and user behavior information (e.g., clicking, watching, liking, commenting, and forwarding).
Feature pulling and splicing: after the real-time message is processed into user data, features may be extracted from the user data and concatenated.
Positive and negative sample construction: after the features are extracted and concatenated, positive and negative samples can be constructed from them; the construction method can be set flexibly according to actual needs. For example, positive samples may include videos the user watched for a certain duration and videos the user forwarded.
Offline sample center: the offline sample center may randomly select videos from the candidate pool as negative samples and feed them in.
Online training of the recall model: after the positive and negative samples are constructed, the recall model can be trained online on them.
The user-tower DNN extracts account features: the recall model contains a user-tower DNN module. During online training, this module extracts features from the positive and negative samples to obtain account features, which exist as vectors. The user-tower DNN module may extract features from newly added positive and negative samples at regular intervals, and the interval may be minute-level.
The video-tower DNN extracts video features: the recall model contains a video-tower DNN, which extracts features from the videos in the candidate pool to obtain video features, which exist as vectors. Because the videos in the candidate pool change dynamically, the video-tower DNN may extract features from the videos in each new version of the candidate pool at regular intervals, and the interval may be minute-level.
Online index update: after the video features of the videos in the candidate pool are obtained, they may be imported into a search library such as Faiss, a clustering and similarity search library open-sourced by the Facebook AI team.
Online service: the online service retrieves videos matching the account features from the search library.
Account feature storage: after the user-tower DNN module extracts account features from the positive and negative samples, this embodiment stores the obtained account features in the account feature library and records their acquisition time.
Video feature storage: after the video-tower DNN extracts features from the videos contained in the candidate pool, the obtained video features are stored in the video feature library, and the acquisition time of the video features corresponding to each version of the candidate pool is recorded.
Sampling and comparison: at each preset time interval, the corresponding account features are sampled offline by time from the account feature library to obtain target account features, and their acquisition time is determined. The preset time interval and sample size can be set arbitrarily; for example, the interval may be 1 hour and the sample size 100,000, i.e., 100,000 target account features are sampled every hour. Then, for each target account feature, the video feature library is searched for the video features of the candidate pool whose acquisition time is earlier than, and closest to, that of the target account feature; these serve as that target account feature's video feature set. The matching degree between the target account feature and each target video feature in its set is computed, and the top-K target video features are selected in descending order of matching degree. The value of K can be set flexibly according to actual needs, for example by measuring the A/A-experiment fluctuation of the recall metric over repeated random sampling.
Recall rate calculation: determine the number of positive samples among the videos corresponding to the top-K target video features, determine the number of positive samples among the videos corresponding to the whole video feature set, and take their ratio to obtain the recall rate; then average the recall rates corresponding to the target account features sampled in the interval to obtain the recall rate for that interval.
Recall rate storage: after the recall rate of each time interval is obtained, store it together with the recall model's name, the time interval, the number of target account features sampled in the interval, and the training video samples; the per-interval recall rates may be stored in a table or in a distributed file system.
Model comparison: from the stored recall rates, the recall rates of different model versions and of the same model over different time intervals can be compared, so that the best-performing recall model is selected to go online, improving iteration efficiency. The recall rates of a recall model over different time intervals can also be aggregated.
Referring to fig. 6, fig. 6 is a block diagram illustrating a training evaluation apparatus based on a recall model according to an exemplary embodiment of the present application. As shown in fig. 6, the apparatus includes:
the feature acquisition module 610 is configured to acquire account features and video features extracted based on training video samples in an online training process of the recall model, and store the account features and the video features into an account feature library and a video feature library respectively;
the offline sampling module 620 is configured to obtain target account features by offline sampling from the account feature library, and search a video feature set related to the warehousing time from the video feature library for the target account features;
the matching degree ranking module 630 is configured to calculate the matching degree between the target account features and each target video feature in the video feature set, and select a specified rank in descending order of the matching degree values;
and the recall rate calculation module 640 is configured to calculate a recall rate from the number of positive samples associated with the target video features within the specified rank and the number of positive samples associated with the target video features in the video feature set, the recall rate being used to evaluate the training effect of the recall model.
In another exemplary embodiment, the feature obtaining module 610 is further configured to record the obtaining time of the account features and the obtaining time of the video features into the account feature library and the video feature library, respectively, in the process of storing the account features and the video features into the account feature library and the video feature library, respectively.
In another exemplary embodiment, the offline sampling module 620 includes:
the characteristic sampling unit is used for sampling the target account characteristics from the account characteristic library in an off-line manner and determining the acquisition time corresponding to the target account characteristics;
and the feature searching unit is configured to search the video feature library for the video features whose acquisition time is earlier than, and closest to, that of the target account features, and use the found video features as the target video features in the video feature set.
In another exemplary embodiment, the feature sampling unit is further configured to periodically sample a specified number of target account features offline from the account feature library at a preset time interval, where the preset time interval is longer than the interval at which account features and video features are extracted during online training of the recall model.
In another exemplary embodiment, the apparatus further comprises:
the recall rate acquisition subunit is configured to acquire recall rates calculated in different preset time intervals;
and the recall rate comparison subunit is configured to compare the obtained recall rates numerically, and select the recall model version corresponding to the largest recall rate as the best-trained recall model to be applied to the information recommendation system.
In another exemplary embodiment, the matching degree ranking module 630 is configured to perform a vector inner product operation between the target account features and each target video feature, and use the resulting value as the matching degree between the target account features and the corresponding target video feature.
It should be noted that the training and evaluating apparatus based on the recall model provided in the above embodiment and the training and evaluating method based on the recall model provided in the above embodiment belong to the same concept, wherein the specific manner in which each module and unit performs operations has been described in detail in the method embodiment, and is not described herein again.
An embodiment of the present application further provides an electronic device, including: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the electronic equipment is enabled to realize the method provided in each embodiment.
FIG. 7 illustrates a schematic structural diagram of a computer system suitable for use in implementing the electronic device of an embodiment of the present application.
It should be noted that the computer system 1600 of the electronic device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 7, computer system 1600 includes a Central Processing Unit (CPU) 1601, which can perform various appropriate actions and processes, such as executing the methods described in the above embodiments, according to a program stored in a Read-Only Memory (ROM) 1602 or a program loaded from a storage portion 1608 into a Random Access Memory (RAM) 1603. The RAM 1603 also stores various programs and data necessary for system operation. The CPU 1601, ROM 1602, and RAM 1603 are connected to each other via a bus 1604. An Input/Output (I/O) interface 1605 is also connected to the bus 1604.
The following components are connected to the I/O interface 1605: an input portion 1606 including a keyboard, a mouse, and the like; an output section 1607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage portion 1608 including a hard disk and the like; and a communication section 1609 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 1609 performs communication processing via a network such as the internet. The driver 1610 is also connected to the I/O interface 1605 as needed. A removable medium 1611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1610 as necessary, so that a computer program read out therefrom is mounted in the storage portion 1608 as necessary.
In particular, according to embodiments of the application, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising a computer program for performing the method illustrated by the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via the communication portion 1609, and/or installed from the removable media 1611. When the computer program is executed by a Central Processing Unit (CPU)1601, various functions defined in the system of the present application are executed.
It should be noted that the computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave, carrying a computer program therein. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless or wired transmission, or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be disposed in a processor. The names of the units do not, in some cases, constitute a limitation on the units themselves.
Another aspect of the present application also provides a computer-readable storage medium having stored thereon computer-readable instructions, which, when executed by a processor of an electronic device, cause the electronic device to implement the method as described above. The computer-readable storage medium may be included in the electronic device described in the above embodiment, or may exist separately without being incorporated in the electronic device.
Another aspect of the present application also provides a computer program product or computer program comprising computer instructions which, when executed by a processor, implement the methods provided in the various embodiments described above. The computer instructions may be stored in a computer-readable storage medium; a processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes them, causing the electronic device to perform the method provided in the above embodiments.
The above description covers only preferred exemplary embodiments of the present application and is not intended to limit its embodiments; those skilled in the art can readily make various changes and modifications within the main concept and spirit of the present application, so the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (11)

1. A training evaluation method based on a recall model is characterized by comprising the following steps:
acquiring account features and video features extracted from training video samples during online training of a recall model, and storing the account features and the video features into an account feature library and a video feature library, respectively;
obtaining target account features by offline sampling from the account feature library, and searching the video feature library for a video feature set related to warehousing time according to the target account features;
respectively calculating the matching degree between the target account features and each target video feature in the video feature set, and selecting a specified rank in descending order of matching degree values;
calculating a recall rate from the number of positive samples associated with the target video features corresponding to the specified rank and the total number of positive samples associated with the target video features in the video feature set, the recall rate being used to evaluate a training effect of the recall model.
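The evaluation procedure of claim 1 amounts to a recall@k computation. The following is a minimal illustrative sketch, not the patented implementation; all function and variable names are hypothetical:

```python
import numpy as np

def recall_at_k(account_feature, video_features, positive_ids, video_ids, k):
    """Recall@k for one offline-sampled account feature.

    account_feature: (d,) vector sampled offline from the account feature library.
    video_features: (n, d) matrix of target video features from the video feature set.
    positive_ids: set of video ids that are positive samples for this account.
    video_ids: list of n video ids aligned with the rows of video_features.
    """
    # Matching degree: inner product between the account feature and each video feature.
    scores = video_features @ account_feature
    # Sort in descending order of matching degree and keep the top-k ranks.
    top_k = np.argsort(-scores)[:k]
    hits = sum(1 for i in top_k if video_ids[i] in positive_ids)
    total_positives = sum(1 for vid in video_ids if vid in positive_ids)
    # Recall = positives recovered within the specified rank / total positives.
    return hits / total_positives if total_positives else 0.0
```

A higher recall@k on such offline samples indicates that the recall model's account and video embeddings place the positive videos nearer the top of the ranking.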
2. The method of claim 1, further comprising:
in the process of storing the account features and the video features into the account feature library and the video feature library respectively, recording the acquisition time of the account features and the acquisition time of the video features into the account feature library and the video feature library, respectively.
3. The method according to claim 2, wherein the obtaining target account features by offline sampling from the account feature library, and searching the video feature library for a video feature set related to warehousing time according to the target account features, comprises:
sampling the target account features offline from the account feature library, and determining the acquisition time corresponding to the target account features;
searching the video feature library for the video features whose acquisition time is earlier than, and closest to, the acquisition time of the target account features, and taking the found video features as the target video features in the video feature set.
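The time-aligned lookup in claim 3 can be sketched as follows. This is a hypothetical illustration in which the video feature library is modeled as a list of (timestamp, feature set) pairs, one per extraction:

```python
def find_video_feature_set(video_library, account_time):
    """Return the video feature set whose warehousing time is earlier than,
    and closest to, the account feature's acquisition time.

    video_library: list of (timestamp, feature_set) pairs.
    account_time: acquisition time recorded for the sampled account feature.
    Returns None if no entry precedes the account's acquisition time.
    """
    # Keep only entries strictly earlier than the account's acquisition time.
    earlier = [(t, fs) for t, fs in video_library if t < account_time]
    if not earlier:
        return None
    # Among those, the latest (closest) timestamp wins.
    return max(earlier, key=lambda pair: pair[0])[1]
```

Restricting the search to strictly earlier timestamps ensures the evaluation only ranks videos whose features were already in the library when the account feature was captured, mimicking the online serving condition.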
4. The method of claim 3, wherein the offline sampling of target account features from the account feature library comprises:
periodically sampling a specified number of target account features offline from the account feature library based on a preset time interval, wherein the preset time interval is greater than the interval at which the account features and the video features are extracted during online training of the recall model.
5. The method of claim 4, wherein after calculating the recall rate based on the number of positive samples associated with the target video features corresponding to the specified rank and the total number of positive samples associated with the target video features in the video feature set, the method further comprises:
obtaining recall rates calculated over different preset time intervals;
comparing the obtained recall rates, and selecting the recall model version corresponding to the largest recall rate as the best-trained recall model to be applied to an information recommendation system.
6. The method according to any one of claims 1 to 4, wherein the calculating the matching degree between the target account features and each target video feature contained in the video feature set comprises:
performing a vector inner product operation between the target account features and each target video feature respectively, and taking the operation result as the matching degree between the target account features and the corresponding target video feature.
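The inner-product matching degree of claim 6 is simply a dot product per target video feature. A small sketch with hypothetical 4-dimensional embeddings:

```python
import numpy as np

# Hypothetical account embedding and two target video embeddings.
account = np.array([0.5, 1.0, -0.2, 0.3])
videos = np.array([
    [0.4, 0.9, 0.0, 0.1],   # video A
    [-0.5, 0.2, 0.8, 0.0],  # video B
])

# Matching degree between the account feature and each target video feature:
# one inner product per row, computed as a single matrix-vector product.
match = videos @ account
```

Here video A scores higher than video B, so it would be ranked first when sorting in descending order of matching degree.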
7. The method of any one of claims 1 to 4, wherein the type of the account feature library comprises a real-time table and the type of the video feature library comprises a distributed file system.
8. A recall model-based training assessment apparatus comprising:
a feature acquisition module configured to acquire account features and video features extracted from training video samples during online training of a recall model, and to store the account features and the video features into an account feature library and a video feature library, respectively;
an offline sampling module configured to obtain target account features by offline sampling from the account feature library, and to search the video feature library for a video feature set related to warehousing time according to the target account features;
a matching degree ranking module configured to respectively calculate the matching degree between the target account features and each target video feature in the video feature set, and to select a specified rank in descending order of matching degree values;
a recall rate calculation module configured to calculate a recall rate according to the number of positive samples associated with the target video features corresponding to the specified rank and the total number of positive samples associated with the target video features in the video feature set, the recall rate being used to evaluate a training effect of the recall model.
9. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs that, when executed by the one or more processors, cause the electronic device to implement the recall model-based training assessment method of any of claims 1-7.
10. A computer-readable storage medium having stored thereon computer-readable instructions which, when executed by a processor of a computer, cause the computer to perform the recall model-based training assessment method of any of claims 1-7.
11. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, carries out the steps of the recall model based training assessment method of any one of claims 1 to 7.
CN202111575932.5A 2021-12-20 2021-12-20 Training evaluation method and device based on recall model, equipment and storage medium Pending CN114357242A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111575932.5A CN114357242A (en) 2021-12-20 2021-12-20 Training evaluation method and device based on recall model, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111575932.5A CN114357242A (en) 2021-12-20 2021-12-20 Training evaluation method and device based on recall model, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114357242A true CN114357242A (en) 2022-04-15

Family

ID=81100425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111575932.5A Pending CN114357242A (en) 2021-12-20 2021-12-20 Training evaluation method and device based on recall model, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114357242A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186738A (en) * 2022-06-20 2022-10-14 北京百度网讯科技有限公司 Model training method, device and storage medium


Similar Documents

Publication Publication Date Title
CN109104620B (en) Short video recommendation method and device and readable medium
TW202007178A (en) Method, device, apparatus, and storage medium of generating features of user
CN111008332A (en) Content item recommendation method, device, server and storage medium
CN111046275B (en) User label determining method and device based on artificial intelligence and storage medium
CN110119477B (en) Information pushing method, device and storage medium
CN110413867B (en) Method and system for content recommendation
CN112084413B (en) Information recommendation method, device and storage medium
CN111881358B (en) Object recommendation system, method and device, electronic equipment and storage medium
CN114205690A (en) Flow prediction method, flow prediction device, model training method, model training device, electronic equipment and storage medium
CN110866040A (en) User portrait generation method, device and system
CN112883265A (en) Information recommendation method and device, server and computer readable storage medium
CN112395499B (en) Information recommendation method and device, electronic equipment and storage medium
CN112269943B (en) Information recommendation system and method
CN114357242A (en) Training evaluation method and device based on recall model, equipment and storage medium
CN113569018A (en) Question and answer pair mining method and device
CN112989174A (en) Information recommendation method and device, medium and equipment
CN115756821A (en) Online task processing model training and task processing method and device
CN112269942B (en) Method, device and system for recommending object and electronic equipment
CN114266352A (en) Model training result optimization method and device, storage medium and equipment
CN113705683A (en) Recommendation model training method and device, electronic equipment and storage medium
CN113762972A (en) Data storage control method and device, electronic equipment and storage medium
CN113761272A (en) Data processing method, data processing equipment and computer readable storage medium
CN114564653A (en) Information recommendation method and device, server and storage medium
CN112749335B (en) Lifecycle state prediction method, lifecycle state prediction apparatus, computer device, and storage medium
Huang et al. Video Recommendation Method Based on Deep Learning of Group Evaluation Behavior Sequences

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination