CN113722537A - Short video sequencing and model training method and device, electronic equipment and storage medium

Info

Publication number: CN113722537A
Application number: CN202110916738.2A
Authority: CN
Inventors: 温恒一
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2021-08-11
Filing date: 2021-08-11
Publication date: 2021-11-30
Anticipated expiration: 2041-08-11
Also published as: CN113722537B

Abstract

An embodiment of the invention provides a ranking method for short video search and a model training method, together with corresponding devices, for use in coarse ranking of short videos. The ranking method comprises the following steps: acquiring multi-path short video recall files to be ranked; generating feature arrays of the multi-path short video recall files; inputting the multi-path feature arrays into a trained network model, which outputs the corresponding recall scores of the multi-path short video recall files; and selecting the short video recall files whose recall scores meet a preset ranking condition as the ranking result of the multi-path short video recall files. Rather than coarse-ranking the recall files with a general formula, the method ranks the multi-path short video recall files with the trained network model. This addresses the technical problems that existing coarse ranking schemes cannot adapt to multi-path short video recall files and have low coarse ranking efficiency, and achieves the effects of adapting to multi-path short video recall files and improving coarse ranking efficiency.

Description

Short video sequencing and model training method and device, electronic equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for ranking short video searches, a method and an apparatus for training a network model, an electronic device, and a computer-readable storage medium.

Background

In a search system, results are ranked in two stages: coarse ranking and fine ranking. Coarse ranking screens hundreds of results out of thousands of recalled candidates and passes them to fine ranking, so it plays a crucial role in the final search result.

The current search system uses a general formula for coarse ranking, and the parameters of the general formula correspond to fixed features. In a short video scenario, the recall files of each recall path have clearly different features, and a general formula cannot adapt to recall files from different paths with different features. Moreover, coarse ranking of short videos must process a large number of recall files at the same time, whereas the conventional coarse ranking scheme can only score recall files one at a time and then sort them by the scoring results.

Disclosure of Invention

Embodiments of the present invention provide a method and an apparatus for ranking short video searches, a method and an apparatus for training a network model, an electronic device, and a computer-readable storage medium, which solve the problems that a conventional coarse ranking scheme cannot adapt to multi-path recall files and that coarse ranking efficiency is low. The specific technical scheme is as follows:

In a first aspect of the present invention, there is provided a method for ranking short video searches, including: acquiring multi-path short video recall files to be ranked; generating feature arrays of the multi-path short video recall files; inputting the multi-path feature arrays into a trained network model, and correspondingly outputting recall scores of the multi-path short video recall files; and selecting the short video recall files whose recall scores meet a preset ranking condition as the ranking result of the multi-path short video recall files.

Optionally, the generating a feature array of multiple short video recall files includes: acquiring characteristic data of a plurality of paths of video recall files in each dimension; and compressing and storing the feature data to obtain the feature array.

Optionally, the acquiring feature data of multiple paths of the video recall files in each dimension includes: acquiring one of the following characteristics of a plurality of paths of the video recall files in the document dimension: quality, freshness, user characteristics; and/or acquiring query category characteristics of the multiple paths of video recall files in query dimensions; and/or acquiring one of the following characteristics of the plurality of paths of video recall files in query and document dimensions: click rate characteristics, viewing duration characteristics, and presentation characteristics.

Optionally, the compressing and storing the feature data to obtain the feature array includes: and storing the characteristic data into a sparse matrix in a compressed sparse row format to obtain the characteristic array.

Optionally, the selecting the short video recall file corresponding to the recall score meeting the preset sorting condition as the sorting result of the multiple short video recall files includes: according to the recall scores, performing descending order arrangement on the multiple short video recall files; and taking the short video recall files of the preset number at the front after descending order as the ordering result.

In a second aspect of the present invention, there is also provided a method for training a network model, including: acquiring a plurality of paths of short video recall sample files; adding corresponding sample characteristics to the multi-path short video recall sample file; and training a network model according to the multipath short video recall sample file and the corresponding sample characteristics.

Optionally, the acquiring multi-path short video recall sample files includes: acquiring multi-path short video recall positive sample files and multi-path short video recall negative sample files according to the watching duration and/or the display and click conditions.

Optionally, the adding corresponding sample features to the multi-path short video recall sample files includes: adding, to the multi-path short video recall sample files, one of the following sample features: a recall source information sample feature, a recall score sample feature, a recall file sample feature, a user display click sample feature, and a user behavior sample feature.

In a third aspect of the present invention, there is also provided a device for ordering short video searches, including: the file acquisition module is used for acquiring the multi-channel short video recall files to be sorted; the characteristic generating module is used for generating a characteristic array of the multipath short video recall file; the score output module is used for inputting the plurality of paths of feature arrays to a trained network model and correspondingly outputting the recall scores of the plurality of paths of short video recall files; and the file selection module is used for selecting the short video recall files corresponding to the recall scores meeting the preset sorting conditions as the sorting results of the multiple short video recall files.

Optionally, the feature generation module includes: the characteristic data acquisition module is used for acquiring characteristic data of the multiple paths of video recall files in all dimensions; and the compression storage module is used for carrying out compression storage on the feature data to obtain the feature array.

Optionally, the feature data acquiring module is configured to acquire one of the following features of the multiple paths of video recall files in a document dimension: quality, freshness, user characteristics; and/or acquiring query category characteristics of the multiple paths of video recall files in query dimensions; and/or acquiring one of the following characteristics of the plurality of paths of video recall files in query and document dimensions: click rate characteristics, viewing duration characteristics, and presentation characteristics.

Optionally, the compressed storage module is configured to store the feature data into a sparse matrix in a compressed sparse row format, so as to obtain the feature array.

Optionally, the file selection module includes: the score sorting module is used for performing descending sorting on the plurality of short video recall files according to the recall scores; and the result determining module is used for taking the short video recall files with the preset number at the front after the descending order as the ordering result.

In a fourth aspect of the present invention, there is also provided a network model training apparatus, including: the sample acquisition module is used for acquiring a plurality of paths of short video recall sample files; the characteristic adding module is used for adding corresponding sample characteristics for the multi-path short video recall sample file; and the model training module is used for training the network model according to the multipath short video recall sample file and the corresponding sample characteristics.

Optionally, the sample obtaining module is configured to obtain multiple short video recall positive sample files and multiple short video recall negative sample files according to the viewing duration and/or the display click condition.

Optionally, the feature adding module is configured to add, to the multi-path short video recall sample files, one of the following sample features: a recall source information sample feature, a recall score sample feature, a recall file sample feature, a user display click sample feature, and a user behavior sample feature.

In another aspect of the present invention, there is also provided an electronic device, including a processor, a communication interface, a memory and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; a memory for storing a computer program; and a processor, configured to implement the short video search ranking method according to the first aspect or the network model training method according to the second aspect when executing the program stored in the memory.

In yet another aspect of the present invention, there is also provided a computer-readable storage medium having stored therein instructions, which, when run on a computer, cause the computer to perform the ranking method for short video search according to the first aspect or the training method for network model according to the second aspect.

In yet another aspect of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the ranking method for short video search according to the first aspect or the training method for network model according to the second aspect.

According to the short video search ranking scheme provided by the embodiment of the invention, multi-path short video recall files are acquired, the recall scores of the multi-path short video recall files are output according to their feature arrays and the network model, and the ranking result is determined according to the recall scores. The scheme avoids coarse-ranking the recall files with a general formula and instead ranks the multi-path short video recall files with a trained network model. This addresses the technical problems that existing coarse ranking schemes cannot adapt to multi-path short video recall files and have low coarse ranking efficiency, and achieves the effects of adapting to multi-path short video recall files and improving coarse ranking efficiency.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

Fig. 1 is a flowchart illustrating a method for ordering short video searches according to an embodiment of the present invention.

Fig. 2 is a flowchart illustrating steps of a method for training a network model according to an embodiment of the present invention.

Fig. 3a is a schematic diagram of an offline training process of an xgboost model according to an embodiment of the present invention.

Fig. 3b is a schematic diagram of an online application process of the xgboost model according to the embodiment of the present invention.

Fig. 4 is a schematic structural diagram of a sorting apparatus for short video search according to an embodiment of the present invention.

Fig. 5 is a schematic structural diagram of a training apparatus for a network model according to an embodiment of the present invention.

Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

The embodiment of the invention provides a ranking scheme for short video search, which is applied to a short video search scenario. When short video recall files are coarse-ranked, not only metrics such as click rate but also consumption metrics, such as per-user clicks on displayed videos and per-user playing time of displayed videos, need to be considered. Unlike a common search scenario, a short video search scenario places relatively low requirements on the relevance of the short videos and certain requirements on their diversity and divergence. Therefore, the embodiment of the invention can rank multi-path short video recall files from recall paths such as term recall and embedding layer recall.

The embodiment of the invention also provides a training scheme for a network model, which can be trained based on an xgboost (an optimized distributed gradient boosting library) model. During training of the network model, sample features can be set flexibly for the training samples. For example, adding recall source information sample features and recall score sample features can alleviate the bias caused by different recall sources, and adding user behavior sample features enables personalized search of short video recall files.

Fig. 1 is a flowchart illustrating steps of a method for ranking short video searches according to an embodiment of the present invention. The ordering method of the short video search can be applied to a server, and the server can be a short video server. The method for ordering the short video search may specifically include the following steps.

Step 101, obtaining a multi-channel short video recall file to be sorted.

In an embodiment of the present invention, the short video recall file may be a document obtained from multiple recall sources, and each document may contain one short video data and query information of the short video data. The query information may include, but is not limited to: video information and user information of the short video data. The video information includes name, number, type, capacity, duration, author, format, etc. The user information includes a user name, a click operation for video data, a viewing operation, a viewing time period, and the like.

Step 102, generating a feature array of the multi-path short video recall files.

In embodiments of the present invention, since the short video recall file may originate from multiple recall sources, the short video recall files of each recall source have respective characteristics. Therefore, the multi-path short video recall file has a large number of feature types and quantity, and the features of the multi-path short video recall file can be constructed into a feature array.

Step 103, inputting the multi-path feature arrays into the trained network model, and correspondingly outputting the recall scores of the multi-path short video recall files.

In the embodiment of the present invention, each short video recall file may correspond to one path of feature data, that is, the features of one path of short video recall file may be constructed as a group of feature arrays by taking the number of paths of the short video recall file as a unit. The network model may be trained based on the xgboost model. In practical applications, the xgboost model may contain 40 decision trees, each decision tree containing 6 nodes at most. The recall score is a scalar manifestation of the underlying relevance of the short video recall file, which may represent how popular the short video recall file is to the user.
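
As a rough illustration of how a tree-ensemble model such as xgboost can turn one feature vector into a scalar recall score, the following sketch sums the leaf values of a few small decision trees. The Node structure, the feature order, the thresholds and the leaf values are all hypothetical and are not taken from the embodiment; a trained model would contain learned trees instead.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One node of a binary decision tree (hypothetical structure)."""
    feature: Optional[int] = None   # index of the feature to test; None marks a leaf
    threshold: float = 0.0          # go left when x[feature] < threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0              # leaf output, used only when feature is None


def tree_output(node: Node, x: Sequence[float]) -> float:
    """Walk from the root to a leaf and return that leaf's value."""
    while node.feature is not None:
        node = node.left if x[node.feature] < node.threshold else node.right
    return node.value


def recall_score(trees: Sequence[Node], x: Sequence[float]) -> float:
    """Additive ensemble: the raw score is the sum of all tree outputs."""
    return sum(tree_output(tree, x) for tree in trees)


# Two illustrative trees over the assumed features [click_rate, avg_watch_seconds].
TREES = [
    Node(feature=0, threshold=0.1, left=Node(value=-0.5), right=Node(value=0.5)),
    Node(feature=1, threshold=30.0, left=Node(value=-0.25), right=Node(value=0.75)),
]

# Score two recall files in one pass over their feature vectors.
scores = [recall_score(TREES, x) for x in ([0.2, 45.0], [0.05, 10.0])]
# scores -> [1.25, -0.75]
```

Because every file is scored by the same function, a whole batch of feature vectors can be passed to the model at once rather than one file at a time.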

Step 104, selecting the short video recall files whose recall scores meet the preset ranking condition as the ranking result of the multi-path short video recall files.

In the embodiment of the present invention, after the recall score of each short video recall file is output, a part of the short video recall files with higher recall scores can be selected according to the recall score, and the selected short video recall file is used as the sorting result of the multiple short video recall files to be sorted.

According to the short video search ranking scheme provided by the embodiment of the invention, multi-path short video recall files are acquired, the recall scores of the multi-path short video recall files are output according to their feature arrays and the network model, and the ranking result is determined according to the recall scores. The scheme avoids coarse-ranking the recall files with a general formula and instead ranks the multi-path short video recall files with a trained network model. This addresses the technical problems that existing coarse ranking schemes cannot adapt to multi-path short video recall files and have low coarse ranking efficiency, and achieves the effects of adapting to multi-path short video recall files and improving coarse ranking efficiency.

In an exemplary embodiment of the present invention, in the process of generating the feature array of the multi-path short video recall files, feature data of the multi-path short video recall files in each dimension may be acquired, and the feature data of each dimension is then compressed and stored to obtain the feature array. In practical applications, the dimensions include, but are not limited to: the document dimension, the query dimension, and the query-and-document dimension. The feature data of the document dimension may comprise at least one of: quality features, freshness features, and user features. The quality features may be the definition, the code rate, etc. of the short video. The freshness feature may be the upload time of the short video data. The user features may be the users who clicked, watched, or were shown the short video. The feature data of the query dimension may include query category features and the like. The feature data of the query-and-document dimension may contain at least one of: click rate features, viewing duration features, presentation features, click features, and the like. The click rate feature may indicate, after the short video data has been presented to users, the proportion of users who clicked the short video data among all users to whom it was presented. The viewing duration feature may represent the average viewing duration of the short video data. The presentation feature may indicate how many users the short video data was presented to, the duration of the presentation, and the like. The click feature may represent the users who clicked the short video data, the time of the click operation, and the like.

In an exemplary embodiment of the present invention, in the process of compressing and storing the feature data to obtain the feature array, the feature data may be stored in a sparse matrix in Compressed Sparse Row (CSR) format to obtain the feature array. A sparse matrix is a matrix in which the number of elements with value 0 is much larger than the number of non-zero elements and the non-zero elements are not regularly distributed. If a sparse matrix is stored as an ordinary two-dimensional array, a large number of storage units are wasted on the zero elements, and a large amount of time is wasted on invalid operations on those zero elements. Compressed storage of the sparse matrix, which stores only the non-zero elements, must therefore be considered. The CSR format is a storage format for sparse matrices that is efficient for matrix-matrix and matrix-vector operations. CSR uses three one-dimensional arrays: values, row offsets, and column indices. The array data (values) stores all non-zero values of the matrix in row-major order and has as many elements as the matrix has non-zero elements. The array indptr (row offsets) contains integers such that indptr[i] is the index in data of the first non-zero element of row i, and its last entry equals the total number of non-zero elements of the matrix. If the entire row i is 0, then indptr[i] = indptr[i+1]; if the matrix has m rows, then len(indptr) = m + 1. The array indices contains the column index information, such that indices[indptr[i]:indptr[i+1]] is the integer array of the column indices of the non-zero elements in row i. A column index gives the number of the column in which a value is located, starting from 0. The row indices are thus compressed in CSR: no row index is stored explicitly, and rows are represented by the row offsets.
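
The three CSR arrays described above can be illustrated with a minimal sketch in plain Python. The helper names dense_to_csr and csr_row and the example feature matrix are hypothetical and chosen only for illustration; a production system would more likely rely on an existing sparse-matrix library.

```python
from typing import List, Sequence, Tuple


def dense_to_csr(matrix: Sequence[Sequence[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Return (data, indices, indptr) for a dense row-major matrix."""
    data: List[float] = []    # non-zero values, stored row by row
    indices: List[int] = []   # column index of each stored value
    indptr: List[int] = [0]   # indptr[i] = position in data where row i starts
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))  # end of this row, which is the start of the next
    return data, indices, indptr


def csr_row(data, indices, indptr, i, n_cols):
    """Rebuild dense row i from the three CSR arrays."""
    row = [0.0] * n_cols
    for k in range(indptr[i], indptr[i + 1]):
        row[indices[k]] = data[k]
    return row


# Three recall files (rows) with four features (columns); row 1 is all zeros.
FEATURES = [
    [0.0, 0.8, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, 0.0],
]
data, indices, indptr = dense_to_csr(FEATURES)
# data    -> [0.8, 3.0, 1.5]
# indices -> [1, 3, 0]
# indptr  -> [0, 2, 2, 3]   (indptr[1] == indptr[2] because row 1 is empty)
```

Only three values are stored for the twelve cells of the example matrix, and row 1 costs nothing beyond one repeated offset.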

In practical applications, the feature array input to the network model may constitute a TensorProto (tensor prototype), which may be a map structure containing the following four keywords: index, indices, data, option_mask. The values of the keywords are, respectively, TensorProto(dtype = uint64, tensor_shape = n1), TensorProto(dtype = uint32, tensor_shape = n2), TensorProto(dtype = float32, tensor_shape = n3), and int. Here data represents the feature data and option_mask specifies the output recall score; that is, the recall score output by the network model is output through option_mask. n1, n2, and n3 represent the respective numbers of elements.

In an exemplary embodiment of the present invention, in the process of selecting the short video recall files whose recall scores meet the preset ranking condition as the ranking result of the multi-path short video recall files, the multi-path short video recall files may be sorted in descending order of recall score, and a preset number of top-ranked short video recall files after the descending sort are used as the ranking result. For example, if the preset number is 100, the top 100 short video recall files in descending order can be used as the ranking result.
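
The selection step can be sketched as follows, assuming each recall file is represented by a hypothetical (file_id, recall_score) pair. The function name select_top_n and the example scores are illustrative only.

```python
import heapq
from typing import List, Sequence, Tuple


def select_top_n(scored_files: Sequence[Tuple[str, float]], n: int) -> List[Tuple[str, float]]:
    """Return the n files with the highest recall scores, in descending order."""
    # heapq.nlargest avoids fully sorting thousands of candidates
    # when only the preset number n of them is kept.
    return heapq.nlargest(n, scored_files, key=lambda item: item[1])


# Hypothetical recall scores returned by the model for four recall files.
SCORED = [("doc_a", 0.31), ("doc_b", 0.92), ("doc_c", 0.57), ("doc_d", 0.12)]
top = select_top_n(SCORED, 2)
# top -> [("doc_b", 0.92), ("doc_c", 0.57)]
```

With the preset number of 100 from the example above, the call would be select_top_n(scored_files, 100).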

Fig. 2 is a flowchart illustrating steps of a method for training a network model according to an embodiment of the present invention. The network model training method can be applied to a server, and the server can be a short video server. The training method of the network model specifically comprises the following steps.

Step 201, acquiring multi-path short video recall sample files.

In the embodiment of the invention, multi-path short video recall positive sample files and multi-path short video recall negative sample files can be acquired. In practical applications, a duration threshold may be set, which may be understood as a playing duration threshold for a user watching a short video. Whether an acquired short video recall sample file is a positive sample file or a negative sample file is determined according to the duration threshold. Specifically, the average watching duration of the short video data in the short video recall sample file may be compared with the duration threshold: if the average watching duration is greater than the duration threshold, the short video recall sample file is a short video recall positive sample file; if the average watching duration is less than or equal to the duration threshold, the short video recall sample file is a short video recall negative sample file. Besides using the duration threshold alone, the multi-path short video recall positive sample files and negative sample files can also be acquired according to the watching duration and/or the display and click conditions. For example, if the short video data in a short video recall file is shown to the user, the user clicks on the short video data, and the average watching duration of the short video data is greater than the duration threshold, the short video recall sample file is a short video recall positive sample file. If the short video data in a short video recall file is shown to the user but the user does not click on it, or the user clicks on it but the average watching duration is less than or equal to the duration threshold, the short video recall sample file is a short video recall negative sample file.
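
The labeling rule above can be sketched as a small function. The Impression record, its field names and the 30-second threshold in the usage lines are assumptions made for illustration; the text does not fix a threshold value, and it does not say how files that were never displayed are handled, so the sketch leaves them unlabeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Impression:
    """Hypothetical display/click/watch record for one short video recall file."""
    shown: bool
    clicked: bool
    avg_watch_seconds: float


def label_sample(imp: Impression, duration_threshold: float) -> Optional[int]:
    """Return 1 for a positive sample, 0 for a negative sample, None if never shown."""
    if not imp.shown:
        return None  # assumption: files that were never displayed get no label
    if imp.clicked and imp.avg_watch_seconds > duration_threshold:
        return 1     # shown, clicked, and watched longer than the threshold
    return 0         # shown but not clicked, or watched no longer than the threshold


# Assumed threshold of 30 seconds, for illustration only.
labels = [
    label_sample(Impression(True, True, 40.0), 30.0),   # positive
    label_sample(Impression(True, False, 0.0), 30.0),   # negative: not clicked
    label_sample(Impression(True, True, 30.0), 30.0),   # negative: equal to threshold
]
# labels -> [1, 0, 0]
```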

Step 202, adding corresponding sample characteristics for the multi-path short video recall sample file.

In an embodiment of the present invention, one of the following sample features may be added to the multi-path short video recall sample files: a recall source information sample feature, a recall score sample feature, a recall file sample feature, a user display click sample feature, and a user behavior sample feature. The recall source information sample features may include a recall source sample name, a recall source sample type, a recall source sample status, and the like. The recall score sample feature may represent the base relevance score of the short video recall sample file at the corresponding recall source. The recall file sample features may include quality sample features, freshness sample features, click rate sample features, and the like. The user display click sample features may include sample features of the users who click on or view the short video sample data in the short video recall sample file. The user behavior sample feature may represent an operation sample feature of a user performing an interactive operation on the short video sample data in the short video recall sample file, where the interactive operation includes, but is not limited to: download operations, share operations, like operations, comment operations, favorite operations, and the like.

Step 203, training the network model according to the multi-path short video recall sample files and the corresponding sample features.

In the embodiment of the invention, the recall score sample features, the recall source information sample features, the user behavior sample features and the like among the sample features can be used as the label data of the short video recall sample files. During training, the output of the network model is compared with the corresponding label data, and the parameters of the network model are adjusted according to the comparison result until the output is the same as or similar to the corresponding label data, or a preset training condition is met. The trained network model can output the short video recall sample files in ranked order, specifically in descending order of recall score.

Based on the above description about the embodiments of the short video search ranking method and the network model training method, a ranking scheme based on an xgboost model is introduced below. FIG. 3a shows an off-line training flow diagram of the xgboost model. Firstly, obtaining multi-path sample data, extracting characteristics for the sample data, then dividing the sample data into positive sample data and negative sample data according to the extracted characteristics, and finally training the xgboost model based on an artificial intelligence technology. Fig. 3b shows a schematic diagram of an online application flow of the xgboost model. Firstly, receiving a plurality of paths of recall files, then calculating feature data of the recall files, constructing a feature array in a CSR format according to the feature data, then calling artificial intelligence service to input the recall files and the feature array to a trained xgboost model in parallel, and finally receiving recall scores returned by the xgboost model as final scores of sequencing.

Fig. 4 is a schematic structural diagram illustrating an apparatus for sorting short video searches according to an embodiment of the present invention. The ordering means for the short video search may comprise the following modules.

The file acquisition module 41 is configured to acquire a multi-channel short video recall file to be sorted;

the characteristic generating module 42 is configured to generate a characteristic array of the multiple short video recall files;

a score output module 43, configured to input the multiple paths of feature arrays to a trained network model, and correspondingly output recall scores of the multiple paths of short video recall files;

and the file selection module 44 is configured to select the short video recall file corresponding to the recall score meeting the preset sorting condition as a sorting result of the multiple short video recall files.

In an exemplary embodiment of the present invention, the feature generation module 42 includes:

the characteristic data acquisition module is used for acquiring characteristic data of the multiple paths of video recall files in all dimensions;

and the compression storage module is used for carrying out compression storage on the feature data to obtain the feature array.

In an exemplary embodiment of the present invention, the feature data obtaining module is configured to obtain one of the following features of the multiple video recall files in a document dimension: quality, freshness, user characteristics; and/or acquiring query category characteristics of the multiple paths of video recall files in query dimensions; and/or acquiring one of the following characteristics of the plurality of paths of video recall files in query and document dimensions: click rate characteristics, viewing duration characteristics, and presentation characteristics.

In an exemplary embodiment of the invention, the compressed storage module is configured to store the feature data into a sparse matrix in a compressed sparse row format, so as to obtain the feature array.
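To make the compressed sparse row (CSR) layout concrete, the following minimal sketch converts a small dense feature matrix into the three CSR arrays. The function name `to_csr` and the sample values are assumptions for illustration; a production system would more likely rely on an existing sparse-matrix library.

```python
from typing import List, Tuple


def to_csr(dense: List[List[float]]) -> Tuple[List[float], List[int], List[int]]:
    """Return (data, indices, indptr) for a dense row-major matrix.

    data    : the non-zero values, row by row
    indices : the column index of each value in `data`
    indptr  : row i occupies data[indptr[i]:indptr[i + 1]]
    """
    data: List[float] = []
    indices: List[int] = []
    indptr: List[int] = [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))  # end of this row, start of the next
    return data, indices, indptr


# Three recall files, four feature columns, mostly zeros.
features = [
    [0.0, 0.9, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.7],
    [0.0, 0.0, 0.0, 0.0],
]
data, indices, indptr = to_csr(features)
```

Only the three non-zero values are stored, which is why this format suits feature arrays in which most entries are empty.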

In an exemplary embodiment of the present invention, the file selecting module 44 includes:

the score sorting module is used for performing descending sorting on the plurality of short video recall files according to the recall scores;

and the result determining module is used for taking a preset number of top-ranked short video recall files after the descending sort as the ranking result.
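The two sub-modules above amount to a descending sort followed by a top-N cut. A minimal sketch, assuming each recall file is represented by a hypothetical `(doc_id, recall_score)` pair:

```python
from typing import List, Tuple


def select_top_n(scored: List[Tuple[str, float]], n: int) -> List[str]:
    """Sort by recall score in descending order and keep the first n document ids."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in ranked[:n]]


scored = [("v1", 0.65), ("v2", 0.45), ("v3", 0.50), ("v4", 0.91)]
top = select_top_n(scored, 2)
```

Here `n` plays the role of the preset number; if fewer than `n` files were recalled, all of them are returned.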

Fig. 5 is a schematic structural diagram illustrating a training apparatus for a network model according to an embodiment of the present invention. The training device of the network model may include the following modules.

The sample acquisition module 51 is configured to acquire multi-channel short video recall sample files;

a feature adding module 52, configured to add corresponding sample features to the multiple short video recall sample files;

and the model training module 53 is configured to train a network model according to the multiple short video recall sample files and the corresponding sample features.

In an exemplary embodiment of the present invention, the sample obtaining module 51 is configured to obtain multiple short video recall positive sample files and multiple short video recall negative sample files according to the viewing duration and/or the display click condition.
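One way to realize the positive/negative split described above is sketched below. The text only states that viewing duration and/or display-click behavior is used; the concrete rule here (a displayed item that was clicked and watched for at least `min_watch_seconds` is positive, any other displayed item is negative), the 10-second default, and the `Impression` record are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Impression:
    """Hypothetical log record for one recalled short video."""
    doc_id: str
    displayed: bool
    clicked: bool
    watch_seconds: float


def split_samples(
    impressions: List[Impression], min_watch_seconds: float = 10.0
) -> Tuple[List[str], List[str]]:
    """Return (positive doc ids, negative doc ids) under the assumed rule."""
    positives: List[str] = []
    negatives: List[str] = []
    for imp in impressions:
        if not imp.displayed:
            continue  # never shown to the user, so it carries no label
        if imp.clicked and imp.watch_seconds >= min_watch_seconds:
            positives.append(imp.doc_id)
        else:
            negatives.append(imp.doc_id)
    return positives, negatives


log = [
    Impression("a", displayed=True, clicked=True, watch_seconds=42.0),
    Impression("b", displayed=True, clicked=True, watch_seconds=2.0),
    Impression("c", displayed=True, clicked=False, watch_seconds=0.0),
    Impression("d", displayed=False, clicked=False, watch_seconds=0.0),
]
positives, negatives = split_samples(log)
```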

In an exemplary embodiment of the present invention, the feature adding module 52 is configured to add, for the multi-channel short video recall sample files, one of the following sample features: a recall source information sample feature, a recall score sample feature, a recall file sample feature, a user display click sample feature, and a user behavior sample feature.
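As a rough illustration of attaching sample features, the sketch below flattens a sample into a fixed-order numeric row suitable for model training. The column names in `FEATURE_ORDER` are hypothetical, one per feature group named above, and filling absent features with 0.0 is an assumption rather than something the text specifies.

```python
from typing import Dict, List

# Hypothetical column order; one entry per feature group named in the text.
FEATURE_ORDER: List[str] = [
    "recall_source_id",    # recall source information
    "recall_score",        # score assigned by the recall channel
    "doc_quality",         # recall file (document) feature
    "display_click_rate",  # user display click feature
    "user_watch_seconds",  # user behavior feature
]


def add_sample_features(sample: Dict[str, float]) -> List[float]:
    """Flatten a sample into a fixed-order row, filling absent features with 0.0."""
    return [float(sample.get(name, 0.0)) for name in FEATURE_ORDER]


row = add_sample_features(
    {"recall_source_id": 2, "recall_score": 0.8, "doc_quality": 0.6}
)
```

Keeping a single shared column order means rows built offline for training and rows built online for scoring line up feature by feature.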

An embodiment of the present invention further provides an electronic device, as shown in Fig. 6, including a processor 61, a communication interface 62, a memory 63, and a communication bus 64, where the processor 61, the communication interface 62, and the memory 63 communicate with one another through the communication bus 64;

a memory 63 for storing a computer program;

the processor 61 is configured to implement the following steps when executing the program stored in the memory 63:

acquiring a multi-channel short video recall file to be sorted;

generating feature arrays of the multi-channel short video recall files;

inputting the feature arrays into a trained network model, and correspondingly outputting recall scores of the multi-channel short video recall files;

and selecting the short video recall files corresponding to the recall scores meeting the preset sorting conditions as sorting results of the multiple short video recall files.

The step of generating the feature arrays of the multi-channel short video recall files comprises:

acquiring feature data of the multi-channel video recall files in each dimension;

and compressing and storing the feature data to obtain the feature array.

The step of acquiring feature data of the multi-channel video recall files in each dimension comprises:

acquiring one of the following features of the multi-channel video recall files in the document dimension: quality, freshness, and user features; and/or

acquiring query category features of the multi-channel video recall files in the query dimension; and/or

acquiring one of the following features of the multi-channel video recall files in the query and document dimensions: click-rate features, viewing-duration features, and display features.

The step of compressing and storing the feature data to obtain the feature array comprises the following steps:

and storing the characteristic data into a sparse matrix in a compressed sparse row format to obtain the characteristic array.

The step of selecting the short video recall file corresponding to the recall score meeting the preset sorting condition as the sorting result of the plurality of short video recall files comprises the following steps:

according to the recall scores, performing descending order arrangement on the multiple short video recall files;

and taking a preset number of top-ranked short video recall files after the descending sort as the ranking result.

The processor 61 is further configured to implement the following steps when executing the program stored in the memory 63:

acquiring a plurality of paths of short video recall sample files;

adding corresponding sample characteristics to the multi-path short video recall sample file;

and training a network model according to the multipath short video recall sample file and the corresponding sample characteristics.

The step of acquiring the multi-channel short video recall sample files comprises:

and acquiring multiple paths of short video recall positive sample files and multiple paths of short video recall negative sample files according to the watching duration and/or the display click condition.

The step of adding corresponding sample features to the plurality of short video recall sample files comprises:

adding, for the multi-channel short video recall sample files, one of the following sample features: a recall source information sample feature, a recall score sample feature, a recall file sample feature, a user display click sample feature, and a user behavior sample feature.

The communication bus mentioned in the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the terminal and other equipment.

The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In another embodiment of the present invention, there is also provided a computer-readable storage medium, which stores instructions that, when executed on a computer, cause the computer to perform the ranking method of short video search or the training method of network model in any of the above embodiments.

In yet another embodiment of the present invention, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method for ranking short video searches or the method for training the network model according to any of the above embodiments.

The above embodiments may be implemented wholly or partially in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in this specification are described in a related manner; identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.

The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for ordering short video searches, comprising:

acquiring a multi-channel short video recall file to be sorted;

2. The method of claim 1, wherein the generating a feature array of the plurality of short video recall files comprises:

and compressing and storing the feature data to obtain the feature array.

3. The method of claim 2, wherein the obtaining feature data of the plurality of video recall files in each dimension comprises:

4. The method of claim 2, wherein the compressing the feature data to obtain the feature array comprises:

5. The method according to any one of claims 1 to 4, wherein the selecting a short video recall file corresponding to the recall score meeting a preset ranking condition as a ranking result of a plurality of short video recall files comprises:

6. A method for training a network model, comprising:

acquiring a plurality of paths of short video recall sample files;

7. The method of claim 6, wherein obtaining the plurality of short video recall sample files comprises:

8. The method of claim 6, wherein adding corresponding sample features for the plurality of short video recall sample files comprises:

9. An apparatus for ordering short video searches, comprising:

the file acquisition module is used for acquiring the multi-channel short video recall files to be sorted;

the characteristic generating module is used for generating a characteristic array of the multipath short video recall file;

the score output module is used for inputting the plurality of paths of feature arrays to a trained network model and correspondingly outputting the recall scores of the plurality of paths of short video recall files;

and the file selection module is used for selecting the short video recall files corresponding to the recall scores meeting the preset sorting conditions as the sorting results of the multiple short video recall files.

10. An apparatus for training a network model, comprising:

the sample acquisition module is used for acquiring a plurality of paths of short video recall sample files;

the characteristic adding module is used for adding corresponding sample characteristics for the multi-path short video recall sample file;

and the model training module is used for training the network model according to the multipath short video recall sample file and the corresponding sample characteristics.

11. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for implementing the method for ranking short video searches of any of claims 1 to 5 or the method for training a network model of any of claims 6 to 8 when executing a program stored in a memory.

12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of ranking short video searches of any of claims 1 to 5 or the method of training a network model of any of claims 6 to 8.