CN113849666A - Data processing method, data processing device, model training method, model training device, equipment and storage medium


Info

Publication number
CN113849666A
CN113849666A
Authority
CN
China
Prior art keywords: data, multimedia resource, feature vector, text, image feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111040413.9A
Other languages
Chinese (zh)
Inventor
曹文强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111040413.9A
Publication of CN113849666A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/40 — Information retrieval of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F 16/45 — Clustering; Classification
    • G06F 16/43 — Querying
    • G06F 16/48 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a data processing method, a data processing device, a model training method, a model training device, equipment and a storage medium, which relate to the technical field of computers and can accurately determine the degree of association between a topic label and a short video. The data processing method comprises the following steps: determining first characteristic data, second characteristic data and third characteristic data, wherein the first characteristic data is used for representing the matching degree between the multimedia resource and the topic label of the multimedia resource, the second characteristic data is used for representing consumption data of the multimedia resource, and the third characteristic data is used for representing user portrait data of an account that publishes the multimedia resource; and determining the association degree of the multimedia resource and the topic label according to the first characteristic data, the second characteristic data and the third characteristic data.

Description

Data processing method, data processing device, model training method, model training device, equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for data processing and model training.
Background
Short videos have gradually become one of the main means by which people record their lives and one of the main forms of everyday consumption and entertainment. A topic tag (hashtag) is a label that a short video publishing account attaches to its own video and can help viewing accounts quickly learn the basic information of the video. A video may have multiple hashtags.
However, a publishing account may edit hashtags arbitrarily in pursuit of exposure, which can seriously affect the search and recommendation services provided by the short video platform. For example, a short video publishing account publishes a video of running in the evening but, to gain more exposure, labels the video with "#brand A mobile phone is good to use"; when other viewing accounts search for the brand A mobile phone, this video is returned even though it is irrelevant to the phone.
Therefore, how to identify and handle hashtags that are irrelevant to a short video is a technical problem that urgently needs to be solved.
Disclosure of Invention
The present disclosure provides a data processing method, a data processing device, a model training method, a model training device, equipment and a storage medium, which can accurately determine the degree of association between a topic label and a short video.
The technical scheme of the embodiment of the disclosure is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided a data processing method that may be applied to an electronic device. The method can comprise the following steps: determining first characteristic data, second characteristic data and third characteristic data; the first characteristic data is used for representing the matching degree of the multimedia resource and the topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource; and determining the association degree of the multimedia resource and the topic label according to the first characteristic data, the second characteristic data and the third characteristic data.
Optionally, the method for determining the association degree between the multimedia resource and the topic tag according to the first feature data, the second feature data, and the third feature data specifically includes: inputting the first characteristic data, the second characteristic data and the third characteristic data into a correlation degree detection model to obtain a correlation degree; the relevance detection model is a model which is trained to a convergence state in advance and used for detecting the relevance of the multimedia resource and the topic label.
Optionally, the method for determining the first feature data specifically includes: acquiring associated data of multimedia resources; extracting text characteristic vectors and image characteristic vectors of the multimedia resources from the associated data; determining first matching data between the topic label and the text feature vector; acquiring image feature vectors of associated multimedia resources related to the topic labels; determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource; the first matching data and the second matching data are determined as first feature data.
Optionally, the association data comprises: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource.
Optionally, the method for extracting a text feature vector and an image feature vector of a multimedia resource from associated data specifically includes: extracting trusted texts and untrusted texts from the associated data; the credible text is text data with a first association degree with the multimedia resource; the unreliable text is text data with a second association degree with the multimedia resource; the first degree of association is greater than the second degree of association; converting the credible text into a first text feature vector and converting the incredible text into a second text feature vector, and determining that the text feature vectors comprise the first text feature vector and the second text feature vector; and extracting a first image feature vector of the cover data and a second image feature vector of the image frame data by using a pre-trained image feature extraction model, and determining that the image feature vectors comprise the first image feature vector and the second image feature vector.
Optionally, the trusted text comprises: at least one of text in the classification tag, text in the account information, and music title text and singer text extracted from the audio data using a speech recognition technique; the untrusted text comprises: at least one of text in the comment information, lyric text extracted from the audio data using a speech recognition technique, and text recognized from the image frame data using a text recognition technique.
Optionally, the method for determining the first matching data between the topic tag and the text feature vector specifically includes: acquiring at least one associated text feature vector corresponding to the topic label; determining each associated text feature vector in the at least one associated text feature vector, and the literal matching degree and the semantic matching degree of the text feature vector; and determining first matching data based on the literal matching degree and the semantic matching degree.
Optionally, the method for obtaining the image feature vector of the associated multimedia resource related to the topic tag specifically includes: acquiring cover data and image frame data of related multimedia resources: extracting a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model; determining that the image feature vector of the associated multimedia asset includes the third image feature vector and the fourth image feature vector.
Optionally, the method for determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource specifically includes: and determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on the first image feature vector, the second image feature vector, the third image feature vector and the fourth image feature vector.
Optionally, the data processing method further includes: and when the association degree of the multimedia resource and the topic label is smaller than a preset association degree threshold value, removing the association relation between the multimedia resource and the topic label.
According to a second aspect of the embodiments of the present disclosure, a model training method is provided, which can be applied to an electronic device. The model training method comprises the following steps: acquiring training data of multimedia resources in a preset time period; the training data comprises training sample input data and training sample label data; training sample input data includes: matching degree of the multimedia resource and the topic label of the multimedia resource, consumption data of the multimedia resource and user portrait data of an account for issuing the multimedia resource; training the sample label data includes: the relevance of the multimedia resource and the topic label of the multimedia resource; training to obtain a correlation detection model based on training data; and the relevance detection model is used for detecting the relevance of the multimedia resource and the topic label.
According to a third aspect of the embodiments of the present disclosure, there is provided a data processing apparatus, which can be applied to an electronic device. The apparatus may include: a determination unit; a determination unit configured to determine first feature data, second feature data, and third feature data; the first characteristic data is used for representing the matching degree of the multimedia resource and the topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource; and the determining unit is further used for determining the association degree of the multimedia resource and the topic label according to the first characteristic data, the second characteristic data and the third characteristic data.
Optionally, the determining unit is specifically configured to: inputting the first characteristic data, the second characteristic data and the third characteristic data into a correlation degree detection model to obtain a correlation degree; the relevance detection model is a model which is trained to a convergence state in advance and used for detecting the relevance of the multimedia resource and the topic label.
Optionally, the determining unit is specifically configured to: acquiring associated data of multimedia resources; extracting text characteristic vectors and image characteristic vectors of the multimedia resources from the associated data; determining first matching data between the topic label and the text feature vector; acquiring image feature vectors of associated multimedia resources related to the topic labels; determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource; it is determined that the first characteristic data includes first matching data and second matching data.
Optionally, the association data comprises: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource.
Optionally, the determining unit is specifically configured to: extracting trusted texts and untrusted texts from the associated data; the credible text is text data with a first association degree with the multimedia resource; the unreliable text is text data with a second association degree with the multimedia resource; the first association degree is greater than the second association degree, the credible text is converted into a first text feature vector, the incredible text is converted into a second text feature vector, and the text feature vector is determined to comprise the first text feature vector and the second text feature vector; and extracting a first image feature vector of the cover data and a second image feature vector of the image frame data by using a pre-trained image feature extraction model, and determining that the image feature vectors comprise the first image feature vector and the second image feature vector.
Optionally, the trusted text comprises: at least one of text in the classification tag, text in the account information, and music title text and singer text extracted from the audio data using a speech recognition technique; the untrusted text comprises: at least one of text in the comment information, lyric text extracted from the audio data using a speech recognition technique, and text recognized from the image frame data using a text recognition technique.
Optionally, the determining unit is specifically configured to: acquiring at least one associated text feature vector corresponding to the topic label; determining each associated text feature vector in the at least one associated text feature vector, and the literal matching degree and the semantic matching degree of the text feature vector; and determining first matching data based on the literal matching degree and the semantic matching degree.
Optionally, the determining unit is specifically configured to: acquiring cover data and image frame data of related multimedia resources: extracting a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model; determining that the image feature vector of the associated multimedia asset includes the third image feature vector and the fourth image feature vector.
Optionally, the determining unit is specifically configured to: and determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on the first image feature vector, the second image feature vector, the third image feature vector and the fourth image feature vector.
Optionally, the determining unit is further configured to, when the association degree between the multimedia resource and the topic tag is smaller than a preset association degree threshold, release the association relationship between the multimedia resource and the topic tag.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a model training apparatus including: an acquisition unit and a training unit; the acquisition unit is used for acquiring training data of the multimedia resources in a preset time period; the training data comprises training sample input data and training sample label data; the training sample input data includes: the matching degree of the multimedia resource and the topic label of the multimedia resource, consumption data of the multimedia resource and user portrait data of an account that publishes the multimedia resource; the training sample label data includes: the relevance of the multimedia resource and the topic label of the multimedia resource; the training unit is used for training to obtain a relevance detection model based on the training data; and the relevance detection model is used for detecting the relevance of the multimedia resource and the topic label.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic device, which may include: a processor and a memory for storing processor-executable instructions; wherein the processor is configured to execute the instructions to implement any of the above-described optional data processing methods of the first aspect, or to implement the model training method of the second aspect.
According to a sixth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon instructions, which, when executed by a processor of an electronic device, enable the electronic device to perform any one of the above-mentioned optional data processing methods of the first aspect, or implement the model training method of the second aspect.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product, which includes computer instructions, when the computer instructions are run on an electronic device, cause the electronic device to execute the data processing method according to any one of the optional implementations of the first aspect, or implement the model training method of the second aspect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
based on any one of the above aspects, in the present disclosure, after the first feature data, the second feature data, and the third feature data are determined, since the first feature data is used to represent a matching degree of the multimedia resource and the topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource, so that the association degree of the multimedia resource and the topic label can be quickly and accurately obtained according to the first characteristic data, the second characteristic data and the third characteristic data, the search service of the multimedia resource is prevented from being influenced by the irrelevant topic label, and the user experience of the multimedia resource in a search recommendation scene is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a schematic flow chart illustrating a data processing method provided by an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a further data processing method provided by an embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a further data processing method provided by an embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a further data processing method provided by an embodiment of the present disclosure;
FIG. 5 is a flow chart illustrating a further data processing method provided by an embodiment of the present disclosure;
FIG. 6 is a flow chart illustrating a further data processing method provided by an embodiment of the present disclosure;
FIG. 7 is a flow chart diagram illustrating a model training method provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram illustrating a structure of another data processing apparatus provided in an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a model training apparatus provided in an embodiment of the present disclosure;
fig. 10 is a schematic structural diagram of a terminal provided in an embodiment of the present disclosure;
fig. 11 shows a schematic structural diagram of a server provided in an embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, and/or components.
The data involved in the present disclosure may be data that is authorized by the user or fully authorized by all parties.
As described in the background, a publishing account may edit hashtags arbitrarily in pursuit of exposure, which can seriously affect the search and recommendation services provided by the short video platform. For example, a short video publishing account publishes a video of running in the evening but, to gain more exposure, labels the video with "#brand A mobile phone is good to use"; when other viewing accounts search for the brand A mobile phone, this video is returned even though it is irrelevant to the phone. Therefore, how to identify and handle hashtags that are irrelevant to a short video is a technical problem that urgently needs to be solved.
Based on this, the embodiment of the present disclosure provides a data processing method, after determining first feature data, second feature data, and third feature data, because the first feature data is used to represent a matching degree of a multimedia resource and a topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource, so that the association degree of the multimedia resource and the topic label can be quickly and accurately obtained according to the first characteristic data, the second characteristic data and the third characteristic data, the search service of the multimedia resource is prevented from being influenced by the irrelevant topic label, and the user experience of the multimedia resource in a search recommendation scene is improved.
The following is an exemplary description of the data processing method provided by the embodiments of the present disclosure:
the data processing method provided by the disclosure can be applied to electronic equipment.
In some embodiments, the electronic device may be a server, a terminal, or other electronic devices for performing data processing, which is not limited in this disclosure.
The server may be a single server, or may be a server cluster including a plurality of servers. In some embodiments, the server cluster may also be a distributed cluster. The present disclosure is also not limited to a specific implementation of the server.
The terminal may be a mobile phone, a tablet computer, a desktop computer, a laptop computer, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR) device, a virtual reality (VR) device, or another device that can install and use a content community application (e.g., Kuaishou); the specific form of the electronic device is not particularly limited by the present disclosure. The terminal can perform human-computer interaction with a user through one or more of a keyboard, a touch pad, a touch screen, a remote control, voice interaction, or a handwriting device.
The data processing method provided by the embodiments of the present disclosure is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, when the data processing method is applied to an electronic device, the data processing method may include:
s101, the electronic equipment determines first characteristic data, second characteristic data and third characteristic data.
Specifically, when determining the association degree between the multimedia resource and the topic tag, the electronic device may determine first feature data, second feature data, and third feature data.
The first feature data is used for representing the matching degree of the multimedia resource and the topic label of the multimedia resource.
Optionally, the first feature data may be matching data between the topic tag and text data in the associated data of the multimedia resource, may also be matching data between an image feature of the associated multimedia resource of the topic tag and an image feature of the multimedia resource, and may also be other feature data used for representing a matching degree between the multimedia resource and the topic tag of the multimedia resource, which is not limited in this disclosure.
The second characteristic data is used to represent consumption data of the multimedia asset.
Optionally, the consumption data includes the number of the interactive operations performed on the multimedia resource and the content of the interactive operations performed on the multimedia resource.
Illustratively, the number of performed interactive operations includes: play count, click count, comment count, like count, share count, and the like. The content on which interactive operations are performed includes: comment content, clicked search query content, and the like.
The third feature data is used to represent user representation data of an account for publishing the multimedia asset.
Optionally, the user representation data includes account information of a user publishing the multimedia resource and behavior data of the user.
Illustratively, the account information includes: account name information, friend information, etc. The behavior data includes: search term data, category label data, and the like.
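For illustration only, the three groups of feature data described in S101 could be organized as simple data structures before being passed on in S102; this is a minimal sketch, and all field names are hypothetical rather than part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FirstFeatureData:
    """Matching degree between the multimedia resource and its topic label."""
    first_matching_data: List[float] = field(default_factory=list)   # label vs. text features
    second_matching_data: List[float] = field(default_factory=list)  # image features vs. associated resources

@dataclass
class SecondFeatureData:
    """Consumption data of the multimedia resource."""
    play_count: int = 0
    click_count: int = 0
    comment_count: int = 0
    like_count: int = 0
    share_count: int = 0
    clicked_queries: List[str] = field(default_factory=list)

@dataclass
class ThirdFeatureData:
    """User portrait data of the account that published the resource."""
    account_name: str = ""
    friend_count: int = 0
    search_terms: List[str] = field(default_factory=list)
    category_labels: List[str] = field(default_factory=list)
```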
S102, the electronic equipment determines the association degree of the multimedia resource and the topic label according to the first feature data, the second feature data and the third feature data.
Optionally, when the electronic device determines the association degree between the multimedia resource and the topic tag according to the first feature data, the second feature data, and the third feature data, the first feature data, the second feature data, and the third feature data may be input into the association degree detection model to obtain the association degree between the multimedia resource and the topic tag.
The relevancy detection model is a model which is trained to a convergence state in advance and used for detecting the relevancy of the multimedia resource and the topic label.
Specifically, after the first feature data, the second feature data and the third feature data are determined, the first feature data, the second feature data and the third feature data may be input into the association degree detection model to obtain the association degree between the multimedia resource and the topic label.
Optionally, the relevance detection model may be a tree model (e.g., XGBoost), an end-to-end model for determining relevance, or another model, which is not limited in this disclosure.
As a further alternative, the relevance detection model may learn the relevance between the input feature data through feature combinations and output a score, and output the relevance between the multimedia resource and the topic label by empirically grading the score.
Illustratively, the association degree of the multimedia resource and the topic tag includes three levels, which are respectively: class A (corresponding to model output scores of 80-100), class B (corresponding to model output scores of 60-80) and class C (corresponding to model output scores of 0-60). After the first feature data, the second feature data and the third feature data are determined, the first feature data, the second feature data and the third feature data can be input into the association degree detection model, so that the score of the association degree of the multimedia resource and the topic label is 85 points. In this case, the level of the association degree of the multimedia resource and the topic tag is determined to be a level a.
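The following is a minimal sketch of such a relevance detection model, assuming the xgboost Python package; the feature dimensions, training data and hyperparameters are hypothetical, and only the 80/60 grading thresholds follow the example above.

```python
import numpy as np
import xgboost as xgb

def train_relevance_model(features: np.ndarray, scores: np.ndarray) -> xgb.XGBRegressor:
    """Fit a tree model that maps concatenated feature data to a 0-100 relevance score."""
    model = xgb.XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
    model.fit(features, scores)
    return model

def grade(score: float) -> str:
    """Grade the model output empirically, following the A/B/C example above."""
    if score >= 80:
        return "A"
    if score >= 60:
        return "B"
    return "C"

# Hypothetical usage: each row concatenates the first, second and third feature data of one sample.
X = np.random.rand(1000, 32)
y = np.random.uniform(0, 100, size=1000)
model = train_relevance_model(X, y)
print(grade(float(model.predict(X[:1])[0])))
```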
Optionally, when the electronic device determines the association degree of the multimedia resource and the topic tag according to the first feature data, the second feature data and the third feature data, the electronic device may further perform quantization processing on the first feature data, the second feature data and the third feature data, respectively assign weights to the quantized feature data, and determine the association degree of the multimedia resource and the topic tag based on a similarity algorithm.
The similarity algorithm may be a euclidean distance algorithm, a pearson correlation coefficient algorithm, a cosine similarity algorithm, or the like.
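A sketch of this similarity-based alternative is given below, assuming the quantized feature groups are already numeric vectors and that corresponding reference vectors are available for comparison; the weights and the use of cosine similarity are illustrative choices.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two quantized feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def weighted_relevance(feature_groups, reference_groups, weights=(0.5, 0.3, 0.2)) -> float:
    """Combine per-group similarities (first/second/third feature data) with assumed weights."""
    similarities = [cosine_similarity(f, r) for f, r in zip(feature_groups, reference_groups)]
    return sum(w * s for w, s in zip(weights, similarities))
```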
Optionally, the electronic device may further determine, by using another existing association degree detection algorithm, the association degree between the multimedia resource and the topic tag according to the first feature data, the second feature data, and the third feature data, which is not limited in this disclosure.
The technical scheme provided by the embodiment at least has the following beneficial effects: from S101 to S102, after the first feature data, the second feature data, and the third feature data are determined, the first feature data is used to represent a matching degree between the multimedia resource and the topic tag of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource, so that the association degree of the multimedia resource and the topic label can be quickly and accurately obtained according to the first characteristic data, the second characteristic data and the third characteristic data, the search service of the multimedia resource is prevented from being influenced by the irrelevant topic label, and the user experience of the multimedia resource in a search recommendation scene is improved.
In one embodiment, the data processing method further comprises:
and when the association degree of the multimedia resource and the topic label is smaller than a preset association degree threshold value, the electronic equipment releases the association relation between the multimedia resource and the topic label.
Specifically, after the first feature data, the second feature data and the third feature data are input into the association degree detection model to obtain the association degree between the multimedia resource and the topic label, whether the association degree between the multimedia resource and the topic label is smaller than a preset association degree threshold value or not can be judged. When the association degree of the multimedia resource and the topic label is smaller than a preset association degree threshold value, the topic label is not the topic label related to the multimedia resource, and therefore the electronic equipment releases the association relation between the multimedia resource and the topic label.
Correspondingly, when the association degree between the multimedia resource and the topic label is greater than or equal to the preset association degree threshold value, the topic label is a topic label related to the multimedia resource, and therefore the electronic equipment does not need to perform any processing.
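A minimal sketch of this thresholding step is shown below; the threshold value and the association store are hypothetical.

```python
ASSOCIATION_THRESHOLD = 0.6  # hypothetical preset association degree threshold

def update_association(resource_id: str, topic_label: str, association_degree: float,
                       associations: dict) -> None:
    """Release the resource-label association when the degree falls below the threshold."""
    if association_degree < ASSOCIATION_THRESHOLD:
        associations.get(resource_id, set()).discard(topic_label)
```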
The technical scheme provided by the embodiment at least has the following beneficial effects: as can be seen from the above, after the first feature data, the second feature data and the third feature data are input into the association degree detection model to obtain the association degree between the multimedia resource and the topic tag, it can be determined whether the association degree between the multimedia resource and the topic tag is smaller than the preset association degree threshold. When the association degree of the multimedia resource and the topic label is smaller than the preset association degree threshold value, the topic label is not the topic label related to the multimedia resource, so that the electronic equipment removes the association relation between the multimedia resource and the topic label, the influence of the unrelated topic label on the search service of the multimedia resource is avoided, and the user experience of the multimedia resource in the search recommendation scene is improved.
In an embodiment, with reference to fig. 1 and as shown in fig. 2, in the above S101, the method for determining, by an electronic device, first feature data specifically includes:
s201, the electronic equipment acquires the associated data of the multimedia resources.
Specifically, the electronic device may obtain the associated data of the multimedia resource when determining the first feature data.
Optionally, the association data comprises: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource.
Since the associated data is detailed data associated with the multimedia resource, the electronic device may subsequently determine the first feature data accurately based on the associated data, and further determine the association degree between the target multimedia resource and the topic tag accurately based on the first feature data.
Illustratively, when the multimedia resource is a short video, the audio data may be the background music of the short video. The cover data may be the cover image of the short video. The image frame data may be the image of each frame in the short video. The classification tag may be a machine-generated category label derived for the short video by a classifier (e.g., the classification tags of a basketball short video include "basketball", "sports", etc.). The account information may be the account name, friend names, etc. of the account that publishes the short video. The comment information may be the comment content under the short video.
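As a sketch, the associated data of a short video could be gathered into a structure like the following; the field names and types are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AssociatedData:
    """Associated data of a short video, per S201 (field names and types are illustrative)."""
    audio: bytes = b""                                            # background music
    cover: bytes = b""                                            # cover image
    frames: List[bytes] = field(default_factory=list)             # per-frame images
    classification_tags: List[str] = field(default_factory=list)  # e.g. ["basketball", "sports"]
    account_info: Dict[str, str] = field(default_factory=dict)    # account name, friend names, ...
    comments: List[str] = field(default_factory=list)             # comment content
```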
S202, the electronic equipment extracts the text feature vector and the image feature vector of the multimedia resource from the associated data.
Specifically, after obtaining the associated data of the multimedia resource, the associated data includes: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource, and thus, the electronic device may extract text feature vectors and image feature vectors of the multimedia resource from the associated data.
Optionally, when the electronic device extracts the text feature vector of the multimedia resource from the associated data, the text feature vector of the multimedia resource may be extracted from the associated data by using a pre-trained text feature vector extraction model.
Optionally, when the electronic device extracts the image feature vector of the multimedia resource from the associated data, the image feature vector of the multimedia resource may be extracted from the associated data by using a pre-trained image feature vector extraction model.
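A minimal sketch of an image feature extraction model is shown below, assuming PyTorch and torchvision (0.13 or later for the weights argument); the disclosure does not name a specific network, so a ResNet-18 backbone with its classification head removed is used purely as an example.

```python
import torch
import torchvision

# A ResNet-18 backbone with its classification head removed stands in for the
# pre-trained image feature extraction model (the disclosure does not name one).
backbone = torchvision.models.resnet18(weights=None)
backbone.fc = torch.nn.Identity()
backbone.eval()

def image_feature_vector(image: torch.Tensor) -> torch.Tensor:
    """Map a (3, 224, 224) image tensor to a 512-dimensional feature vector."""
    with torch.no_grad():
        return backbone(image.unsqueeze(0)).squeeze(0)

cover_vector = image_feature_vector(torch.rand(3, 224, 224))  # e.g. from the cover data
frame_vector = image_feature_vector(torch.rand(3, 224, 224))  # e.g. from one image frame
```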
S203, the electronic equipment determines first matching data between the topic label and the text feature vector.
Specifically, after extracting the text feature vector and the image feature vector of the multimedia resource from the associated data, the electronic device may determine first matching data between the topic tag and the text feature vector.
Optionally, when the electronic device determines the first matching data between the topic tag and the text feature vector, the first matching data between the topic tag and the text feature vector may be determined by using a pre-trained text semantic matching model.
S204, the electronic equipment obtains image feature vectors of the associated multimedia resources related to the topic labels.
Specifically, after extracting the text feature vector and the image feature vector of the multimedia resource from the associated data, the electronic device may further obtain the image feature vector of the associated multimedia resource related to the topic tag.
The related multimedia resources related to the topic tag may be all multimedia resources obtained by searching the topic tag, may also be multimedia resources of which a part of the relevance meets a preset condition, and may also be other related multimedia resources related to the topic tag, which is not limited in this disclosure.
Illustratively, the topic tag is "brand A mobile phone". According to the search logs, the electronic device can mine the top 5 videos with the best consumption performance when "brand A mobile phone" is used as a search query, and use them as the associated multimedia resources related to the topic tag "brand A mobile phone".
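A sketch of this mining step is shown below; the search log schema and the use of watch time as the consumption metric are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List

def top_associated_videos(search_log: List[Dict], topic_label: str, k: int = 5) -> List[str]:
    """Return the k videos with the best consumption metric for the given query.

    Each log entry is assumed to look like
    {"query": "brand A mobile phone", "video_id": "v1", "watch_time": 35.2}.
    """
    scores = defaultdict(float)
    for entry in search_log:
        if entry["query"] == topic_label:
            scores[entry["video_id"]] += entry["watch_time"]
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [video_id for video_id, _ in ranked[:k]]
```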
S205, the electronic equipment determines second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource.
Specifically, after obtaining the image feature vector of the associated multimedia resource related to the topic tag, the electronic device may determine second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource.
Specifically, when second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource is determined, the image feature vector of the associated multimedia resource may be obtained.
Specifically, when the image feature vector of the associated multimedia resource is obtained, a cover image and an image frame of the associated multimedia resource may be obtained, and the cover image and the image frame of the associated multimedia resource may be converted into the image feature vector of the associated multimedia resource.
When determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource, the image feature vector of the multimedia resource can also be obtained.
Specifically, when the image feature vector of the multimedia resource is obtained, a cover image and an image frame of the multimedia resource can be obtained, and the cover image and the image frame of the multimedia resource are converted into the image feature vector of the multimedia resource.
After the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource are obtained, second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource are determined according to the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource.
It should be noted that the order in which the electronic device obtains the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource is not limited.
Optionally, the electronic device may obtain the image feature vector of the associated multimedia resource first, and then obtain the image feature vector of the multimedia resource; or obtaining the image characteristic vector of the multimedia resource first and then obtaining the image characteristic vector of the associated multimedia resource; and the image characteristic vector of the associated multimedia resource and the image characteristic vector of the multimedia resource can be obtained simultaneously.
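As a sketch, the second matching data could be assembled from pairwise cosine similarities between the two resources' image feature vectors; the aggregation into four pairwise scores is an illustrative choice.

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def second_matching_data(first_vec: np.ndarray, second_vec: np.ndarray,
                         third_vec: np.ndarray, fourth_vec: np.ndarray) -> dict:
    """Pairwise similarities between the resource's vectors (cover, frames) and
    the associated resource's vectors (cover, frames)."""
    return {
        "cover_vs_cover": cosine(first_vec, third_vec),
        "cover_vs_frames": cosine(first_vec, fourth_vec),
        "frames_vs_cover": cosine(second_vec, third_vec),
        "frames_vs_frames": cosine(second_vec, fourth_vec),
    }
```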
S206, the electronic equipment determines the first matching data and the second matching data as first characteristic data.
After determining the first matching data and the second matching data, the electronic device determines the first matching data and the second matching data as being included in the first feature data.
The technical scheme provided by the embodiment at least has the following beneficial effects: as known from S201-S206, after acquiring the associated data of the multimedia resource, the electronic device may extract a text feature vector and an image feature vector of the multimedia resource from the associated data, and determine first matching data between the topic tag and the text feature vector. The electronic device may also obtain an image feature vector of the associated multimedia resource related to the topic tag and determine second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource. Subsequently, the electronic device determines the first matching data and the second matching data as the first feature data, and a specific implementation manner for determining the first feature data is provided.
Because the first feature data comprise the first matching data and the second matching data, the first matching data are the matching data between the topic labels and the text feature vectors, and the second matching data are the matching data between the image feature vectors of the associated multimedia resources and the image feature vectors of the multimedia resources, the electronic equipment can accurately determine the association degree of the multimedia resources and the topic labels through the first feature data, the condition that the search service of the multimedia resources is influenced by the irrelevant topic labels is avoided, and the user experience of the multimedia resources under the search recommendation scene is improved.
In an embodiment, referring to fig. 2 and as shown in fig. 3, in the above S202, the method for extracting, by an electronic device, a text feature vector and an image feature vector of a multimedia resource from associated data specifically includes:
s301, the electronic equipment extracts the credible text and the incredible text from the associated data.
Specifically, when the electronic device extracts the text feature vector of the multimedia resource from the associated data, the electronic device may extract the trusted text and the untrusted text from the associated data.
The credible text is text data with a first association degree with the multimedia resource; the unreliable text is text data with a second association degree with the multimedia resource; the first degree of association is greater than the second degree of association.
Optionally, since the trusted text has a high confidence of being associated with the multimedia resource, the trusted text includes: at least one of text in the classification tag, text in the account information, and music title text and singer text extracted from the audio data using a speech recognition technique.
Optionally, since the untrusted text has a lower confidence of being associated with the multimedia resource, the untrusted text includes: at least one of text in the comment information, lyric text extracted from the audio data using a speech recognition technique, and text recognized from the image frame data using a text recognition technique.
Because the trusted text and the untrusted text are text data in the associated data of the multimedia resource, the electronic device can subsequently accurately determine the first feature data based on the associated data, and further accurately determine the association degree of the target multimedia resource and the topic label based on the first feature data.
S302, the electronic equipment converts the credible text into a first text feature vector and converts the incredible text into a second text feature vector, and the text feature vectors are determined to comprise the first text feature vector and the second text feature vector.
Specifically, after extracting the trusted text and the untrusted text from the associated data, the electronic device may convert the trusted text into a first text feature vector and convert the untrusted text into a second text feature vector, and determine that the text feature vectors include the first text feature vector and the second text feature vector.
Optionally, when the electronic device converts the trusted text into the first text feature vector and converts the untrusted text into the second text feature vector, the electronic device may convert the trusted text into the first text feature vector and convert the untrusted text into the second text feature vector by using a feature vector conversion algorithm.
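A minimal sketch of such a text-to-vector conversion step is shown below, using scikit-learn's TF-IDF vectorizer as a stand-in for the unspecified feature vector conversion algorithm; the sample texts are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# A TF-IDF vectorizer stands in for the unspecified feature vector conversion algorithm.
vectorizer = TfidfVectorizer()
corpus = [
    "basketball sports evening run",          # hypothetical trusted text
    "great video please like and follow me",  # hypothetical untrusted text
]
vectors = vectorizer.fit_transform(corpus)
first_text_feature_vector = vectors[0]   # converted from the trusted text
second_text_feature_vector = vectors[1]  # converted from the untrusted text
```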
S303, the electronic equipment extracts a first image feature vector of the cover data and a second image feature vector of the image frame data by using the pre-trained image feature extraction model, and determines that the image feature vectors comprise the first image feature vector and the second image feature vector.
Specifically, when the electronic device extracts the image feature vector of the multimedia resource from the associated data, a pre-trained image feature extraction model may be used to extract a first image feature vector of the cover data and a second image feature vector of the image frame data, and determine that the image feature vectors include the first image feature vector and the second image feature vector.
The technical scheme provided by the embodiment at least has the following beneficial effects: as known from S301 to S303, when the electronic device extracts the text feature vector of the multimedia resource from the associated data, the electronic device may first extract the trusted text and the untrusted text from the associated data. Next, the electronic device may convert the trusted text into a first text feature vector and convert the untrusted text into a second text feature vector and determine that the text feature vectors include the first text feature vector and the second text feature vector. When the electronic equipment extracts the image feature vector of the multimedia resource from the associated data, a first image feature vector of cover data and a second image feature vector of image frame data can be extracted by using a pre-trained image feature extraction model, and the image feature vectors are determined to comprise the first image feature vector and the second image feature vector, so that a specific implementation mode for acquiring the text feature vector and the image feature vector of the multimedia resource is provided.
Because the text feature vector comprises the first text feature vector (the text feature vector of the trusted text) and the second text feature vector (the text feature vector of the untrusted text), and the image feature vector comprises the first image feature vector (the image feature vector of the cover data) and the second image feature vector (the image feature vector of the image frame data), the association degree of the multimedia resource and the topic tag can be accurately determined through the text feature vector and the image feature vector of the multimedia resource, the influence of irrelevant topic tags on the search service of the multimedia resource is avoided, and the user experience of the multimedia resource in the search recommendation scene is improved.
In an embodiment, referring to fig. 3, as shown in fig. 4, in the above S203, the method for determining, by an electronic device, first matching data between a topic tag and a text feature vector specifically includes:
s401, the electronic equipment obtains at least one associated text feature vector corresponding to the topic label.
Specifically, upon determining first match data between the topic tag and the text feature vector, the electronic device may obtain at least one associated text feature vector corresponding to the topic tag.
Optionally, when the electronic device obtains the at least one associated text feature vector corresponding to the topic tag, the electronic device may perform rewrite transformation and entity association on the topic tag to obtain a reliable transformation form, and then perform feature vector processing on the transformed topic tag to obtain the at least one associated text feature vector.
Illustratively, the topic tag is: "how does the brand A mobile phone look". The electronic device can perform rewriting transformation and entity association on the topic tag "how does the brand A mobile phone look" to obtain reliable transformed forms, including: "brand A mobile phone", "brand A", "mobile phone", etc.
S402, the electronic equipment determines each associated text feature vector in the at least one associated text feature vector, and the literal matching degree and the semantic matching degree of the text feature vector.
Specifically, after obtaining at least one associated text feature vector corresponding to the topic tag, the electronic device determines each associated text feature vector in the at least one associated text feature vector, and a literal matching degree and a semantic matching degree of the text feature vector.
Optionally, when the electronic device determines each associated text feature vector in the at least one associated text feature vector and the literal matching degree of the text feature vector, the electronic device may first obtain a text weight in the text feature vector, and then determine each associated text feature vector in the at least one associated text feature vector and the literal matching degree of the text feature vector according to the text weight in the text feature vector.
Optionally, when the electronic device determines the semantic matching degree between each associated text feature vector in the at least one associated text feature vector and the text feature vector, the electronic device may determine the semantic matching degree between each associated text feature vector in the at least one associated text feature vector and the text feature vector based on a pre-trained text semantic matching model.
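The two matching degrees could be computed along the following lines; the weighted token-overlap formula for the literal matching degree and the cosine similarity for the semantic matching degree are illustrative assumptions, since the disclosure does not fix the exact formulas.

```python
import numpy as np
from typing import Dict, Set

def literal_match(associated_tokens: Set[str], text_tokens: Set[str],
                  text_weights: Dict[str, float]) -> float:
    """Weighted token overlap between one associated text and one text of the resource."""
    common = associated_tokens & text_tokens
    total = sum(text_weights.get(t, 1.0) for t in text_tokens) or 1.0
    return sum(text_weights.get(t, 1.0) for t in common) / total

def semantic_match(associated_vec: np.ndarray, text_vec: np.ndarray) -> float:
    """Cosine similarity between embeddings from a text semantic matching model."""
    return float(associated_vec @ text_vec /
                 (np.linalg.norm(associated_vec) * np.linalg.norm(text_vec) + 1e-12))
```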
S403, the electronic equipment determines first matching data based on the literal matching degree and the semantic matching degree.
Specifically, after the literal matching degree and the semantic matching degree are obtained, the electronic device determines first matching data based on the literal matching degree and the semantic matching degree.
For example, for the trusted text, the electronic device may calculate a literal matching score between each associated text in the at least one associated text and each text in the trusted text, and take the average value or the maximum value as the literal matching degree between the associated text and the trusted text.
Accordingly, the electronic device may calculate a semantic matching score between each associated text in the at least one associated text and each text in the trusted text, and take the average value or the maximum value as the semantic matching degree between the associated text and the trusted text.
Optionally, from the literal matching scores and semantic matching scores between each associated text in the at least one associated text and each text in the trusted text, the electronic device may further select the associated texts whose score exceeds a preset trusted-text score, and obtain the number of associated texts whose score exceeds the preset trusted-text score.
For the untrusted text, the electronic device may calculate a literal matching score between each associated text in the at least one associated text and each text in the untrusted text, and take the average value or the maximum value as the literal matching degree between the associated text and the untrusted text.
Accordingly, the electronic device may calculate a semantic matching score between each associated text in the at least one associated text and each text in the untrusted text, and take the average value or the maximum value as the semantic matching degree between the associated text and the untrusted text.
Optionally, from the literal matching scores and semantic matching scores between each associated text in the at least one associated text and each text in the untrusted text, the electronic device may further select the associated texts whose score exceeds a preset untrusted-text score, and obtain the number of associated texts whose score exceeds the preset untrusted-text score.
Subsequently, the first matching data determined by the electronic device includes: the literal matching degree between the associated text and the trusted text, the semantic matching degree between the associated text and the trusted text, the literal matching degree between the associated text and the untrusted text, the semantic matching degree between the associated text and the untrusted text, the number of associated texts whose score exceeds the preset trusted-text score, and the number of associated texts whose score exceeds the preset untrusted-text score.
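A minimal sketch of this aggregation step is shown below, assuming the per-pair scores are already available as matrices; the array shapes, the mean/max choice, and the threshold comparison are illustrative assumptions rather than details fixed by the embodiment.

```python
import numpy as np

def aggregate_first_matching_data(literal_scores, semantic_scores, preset_score,
                                  use_max=False):
    """literal_scores / semantic_scores: arrays of shape
    (num_associated_texts, num_trusted_or_untrusted_texts).
    Returns the matching degrees plus the count of associated texts whose
    best score exceeds the preset score (assumed aggregation)."""
    literal = np.asarray(literal_scores)
    semantic = np.asarray(semantic_scores)
    reduce_fn = np.max if use_max else np.mean
    literal_degree = float(reduce_fn(literal))
    semantic_degree = float(reduce_fn(semantic))
    # Count associated texts whose best score (literal or semantic) exceeds the preset score.
    best_per_text = np.maximum(literal.max(axis=1), semantic.max(axis=1))
    count_above = int(np.sum(best_per_text > preset_score))
    return literal_degree, semantic_degree, count_above

# Running this once against the trusted texts and once against the untrusted
# texts yields the six quantities listed above.
```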
The technical scheme provided by this embodiment has at least the following beneficial effects: as can be seen from S401 to S403, when determining the first matching data between the topic tag and the text feature vector, the electronic device may obtain at least one associated text feature vector corresponding to the topic tag, determine the literal matching degree and the semantic matching degree between each associated text feature vector and the text feature vector, and then determine the first matching data based on the literal matching degree and the semantic matching degree. This provides a specific implementation for obtaining the first matching data.
Since the first matching data includes the literal matching degree and the semantic matching degree between each associated text feature vector in the at least one associated text feature vector and the text feature vector, the electronic device can accurately determine the association degree between the multimedia resource and the topic tag from the first matching data, which prevents irrelevant topic tags from affecting the search service of the multimedia resource and improves the user experience of the multimedia resource in search and recommendation scenarios.
In an embodiment, with reference to fig. 4 and as shown in fig. 5, in the above S204, the method for acquiring, by an electronic device, an image feature vector of an associated multimedia resource related to a topic tag specifically includes:
S501, the electronic device acquires cover data and image frame data of the associated multimedia resource.
Specifically, when the electronic device acquires the image feature vector of the associated multimedia resource related to the topic tag, cover data and image frame data of the associated multimedia resource may be acquired.
For example, when the associated multimedia asset is a short video, the electronic device may obtain a video cover of the short video and each image frame in the video.
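As a concrete illustration of this acquisition step, the sketch below reads a short video with OpenCV; treating the first frame as the cover and subsampling frames by a fixed stride are practical assumptions, whereas the embodiment itself simply refers to the video cover and each image frame.

```python
import cv2

def get_cover_and_frames(video_path, frame_stride=30):
    """Read a video and return an assumed cover (first frame) plus a
    stride-subsampled list of image frames."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % frame_stride == 0:
            frames.append(frame)
        index += 1
    cap.release()
    cover = frames[0] if frames else None
    return cover, frames
```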
S502, the electronic equipment extracts a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model.
Specifically, after acquiring cover data and image frame data of the associated multimedia resource, the electronic device extracts a third image feature vector of the cover data of the associated multimedia resource and a fourth image feature vector of the image frame data of the associated multimedia resource by using the image feature extraction model.
S503, the electronic equipment determines that the image feature vectors of the associated multimedia resources comprise a third image feature vector and a fourth image feature vector.
Specifically, after extracting a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model, the electronic device determines that the image feature vectors of the associated multimedia resource include the third image feature vector and the fourth image feature vector.
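As an illustration of this extraction step, a frozen CNN backbone can serve as the image feature extraction model; the choice of ResNet-18, the input shapes, and the placeholder tensors below are assumptions for the sketch, not details specified by the embodiment.

```python
import torch
from torch import nn
import torchvision.models as models

# Any pre-trained image backbone can play the role of the image feature
# extraction model; here ResNet-18 with its classifier head removed
# (weight loading is omitted in this sketch).
backbone = models.resnet18()
backbone.fc = nn.Identity()   # expose the 512-d pooled feature vector
backbone.eval()

@torch.no_grad()
def extract_image_features(images):
    """images: float tensor of shape (N, 3, 224, 224), already normalized.
    Returns an (N, 512) matrix, one feature vector per image."""
    return backbone(images)

# Third image feature vector: the cover of the associated resource;
# fourth image feature vectors: one per sampled frame.
cover = torch.randn(1, 3, 224, 224)    # placeholder cover data
frames = torch.randn(8, 3, 224, 224)   # placeholder sampled image frames
third_vector = extract_image_features(cover)
fourth_vectors = extract_image_features(frames)
```

The same extraction applies to the cover data and image frame data of the multimedia resource itself, yielding the first and second image feature vectors used in S601 below.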
The technical scheme provided by the embodiment at least has the following beneficial effects: from S501-S503, when the electronic device acquires the image feature vector of the associated multimedia resource related to the topic tag, the cover data and the image frame data of the associated multimedia resource may be acquired. Subsequently, the electronic device may extract a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model, and determine that the image feature vector of the associated multimedia resource includes the third image feature vector and the fourth image feature vector, which provides a specific implementation manner for obtaining the image feature vector of the associated multimedia resource related to the topic tag.
Since the image feature vectors of the associated multimedia resource include the third image feature vector (the image feature vector of the cover data of the associated multimedia resource) and the fourth image feature vector (the image feature vector of the image frame data of the associated multimedia resource), the electronic device can accurately determine the association degree between the multimedia resource and the topic tag from the image feature vectors of the associated multimedia resource, which prevents irrelevant topic tags from affecting the search service of the multimedia resource and improves the user experience of the multimedia resource in search and recommendation scenarios.
In an embodiment, referring to fig. 5 and fig. 6, in the above S205, the method for determining, by the electronic device, second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource specifically includes:
S601, the electronic device determines second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on the first image feature vector, the second image feature vector, the third image feature vector and the fourth image feature vector.
Specifically, after obtaining the first image feature vector, the second image feature vector, the third image feature vector, and the fourth image feature vector, the electronic device may determine second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on the first image feature vector, the second image feature vector, the third image feature vector, and the fourth image feature vector.
Optionally, the electronic device may obtain a similarity between the first image feature vector and the third image feature vector, obtain a similarity between the second image feature vector and the fourth image feature vector, and determine the obtained similarity as second matching data. The electronic device may also obtain the similarity between the first image feature vector and the fourth image feature vector, obtain the similarity between the second image feature vector and the third image feature vector, and determine the obtained similarity as the second matching data, which is not limited in this disclosure.
Still alternatively, when there are multiple first, second, third and fourth image feature vectors, the electronic device may first compute an average or maximum vector for each of the four groups. It may then obtain the similarity between the averaged (or maximum) first image feature vector and the averaged (or maximum) fourth image feature vector, and the similarity between the averaged (or maximum) second image feature vector and the averaged (or maximum) third image feature vector, and determine the obtained similarities as the second matching data, which is not limited by the present disclosure.
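A sketch of one such computation follows; the cosine similarity, the mean/max pooling, and the particular pairings reported are illustrative choices, since the embodiment leaves both the similarity measure and the pairing open.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def second_matching_data(first_vecs, second_vecs, third_vecs, fourth_vecs,
                         use_max=False):
    """Each argument is an (n_i, d) array of image feature vectors:
    first/second for the cover and frames of the multimedia resource,
    third/fourth for the cover and frames of the associated resource."""
    pool = (lambda a: np.asarray(a).max(axis=0)) if use_max \
        else (lambda a: np.asarray(a).mean(axis=0))
    f1, f2, f3, f4 = pool(first_vecs), pool(second_vecs), pool(third_vecs), pool(fourth_vecs)
    return {
        "cover_vs_assoc_cover": cosine(f1, f3),
        "frames_vs_assoc_frames": cosine(f2, f4),
        "cover_vs_assoc_frames": cosine(f1, f4),
        "frames_vs_assoc_cover": cosine(f2, f3),
    }
```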
The technical scheme provided by this embodiment has at least the following beneficial effects: as can be seen from S601, after obtaining the first, second, third and fourth image feature vectors, the electronic device may determine the second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on these four feature vectors. This provides a specific implementation for determining the second matching data.
Since the second matching data is determined based on the first image feature vector (the feature vector of the cover data of the multimedia resource), the second image feature vector (the feature vector of the image frame data of the multimedia resource), the third image feature vector (the feature vector of the cover data of the associated multimedia resource) and the fourth image feature vector (the feature vector of the image frame data of the associated multimedia resource), the electronic device can accurately determine the association degree between the multimedia resource and the topic tag from the second matching data, which prevents irrelevant topic tags from affecting the search service of the multimedia resource and improves the user experience of the multimedia resource in search and recommendation scenarios.
In an embodiment, as shown in fig. 7, an embodiment of the present application further provides a model training method, including:
S701, the electronic device obtains training data of the multimedia resource in a preset time period.
The training data includes training sample input data and training sample label data. The training sample input data includes: the matching degree between the multimedia resource and the topic tag of the multimedia resource, the consumption data of the multimedia resource, and the user portrait data of the account issuing the multimedia resource. The training sample label data includes: the association degree between the multimedia resource and the topic tag of the multimedia resource.
Specifically, before the first feature data, the second feature data and the third feature data are input into the association degree detection model to obtain the association degree between the multimedia resource and the topic tag, the electronic device may further obtain the training data of the multimedia resource in the preset time period, for training the association degree detection model.
S702, the electronic device trains the association degree detection model based on the training data.
Specifically, after training data of the multimedia resource in a preset time period is acquired, the electronic device trains to obtain the association degree detection model based on the training data.
The relevancy detection model is used for detecting the relevancy of the multimedia resource and the topic label.
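The embodiment does not name a model family for the association degree detection model; the sketch below assumes a gradient-boosted regressor over tabular features purely for illustration, with placeholder values standing in for the training sample input data and label data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Each row concatenates, for one multimedia resource in the preset time
# period: the matching degree features, the consumption data, and the user
# portrait data of the publishing account (placeholder values here).
X_train = rng.random((1000, 12))
# Label: the association degree between the resource and its topic tag.
y_train = rng.random(1000)

# Train the (assumed) association degree detection model.
relevance_model = GradientBoostingRegressor().fit(X_train, y_train)

# Inference: predict the association degree for a new resource/topic-tag
# pair and drop the tag when it falls below a preset threshold (assumed 0.5).
x_new = rng.random((1, 12))
association_degree = float(relevance_model.predict(x_new)[0])
keep_topic_tag = association_degree >= 0.5
```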
The technical scheme provided by this embodiment has at least the following beneficial effects: the electronic device can obtain training data of the multimedia resource in a preset time period and train the association degree detection model based on the training data, so that it can subsequently obtain the association degree between the multimedia resource and the topic tag quickly and accurately from the trained model, which prevents irrelevant topic tags from affecting the search service of the multimedia resource and improves the user experience of the multimedia resource in search and recommendation scenarios.
It is understood that, in practical implementation, the terminal/server according to the embodiments of the present disclosure may include one or more hardware structures and/or software modules for implementing the corresponding data processing methods, and these hardware structures and/or software modules may constitute an electronic device. Those skilled in the art will readily appreciate that the exemplary algorithm steps described in connection with the embodiments disclosed herein can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
Based on such understanding, the embodiment of the present disclosure also provides a data processing apparatus, which can be applied to an electronic device. Fig. 8 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of the present disclosure. As shown in fig. 8, the data processing apparatus may include: a determination unit 801;
a determining unit 801 for determining first characteristic data, second characteristic data and third characteristic data; the first characteristic data is used for representing the matching degree of the multimedia resource and the topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third characteristic data is used for representing user portrait data of an account for issuing the multimedia resource;
The determining unit 801 is further configured to determine, according to the first feature data, the second feature data and the third feature data, the association degree between the multimedia resource and the topic tag.
Optionally, the determining unit 801 is specifically configured to: input the first feature data, the second feature data and the third feature data into the association degree detection model to obtain the association degree; the association degree detection model is a model trained in advance to a convergence state and used for detecting the association degree between the multimedia resource and the topic tag.
Optionally, the determining unit 801 is specifically configured to:
acquiring associated data of multimedia resources;
extracting text characteristic vectors and image characteristic vectors of the multimedia resources from the associated data;
determining first matching data between the topic label and the text feature vector;
acquiring image feature vectors of associated multimedia resources related to the topic labels;
determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource;
it is determined that the first characteristic data includes first matching data and second matching data.
Optionally, the associated data includes: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource.
Optionally, the determining unit 801 is specifically configured to:
extracting trusted text and untrusted text from the associated data; the trusted text is text data having a first association degree with the multimedia resource; the untrusted text is text data having a second association degree with the multimedia resource; the first association degree is greater than the second association degree;
converting the trusted text into a first text feature vector and the untrusted text into a second text feature vector, and determining that the text feature vectors include the first text feature vector and the second text feature vector;
and extracting a first image feature vector of the cover data and a second image feature vector of the image frame data by using a pre-trained image feature extraction model, and determining that the image feature vectors comprise the first image feature vector and the second image feature vector.
Optionally, the trusted text includes: at least one of the text in the classification tag, the text in the account information, and the music title text and singer text extracted from the audio data using a speech recognition technique; the untrusted text includes: at least one of the text in the comment information, the lyrics text extracted from the audio data using a speech recognition technique, and the text recognized from the image frame data using a text recognition technique.
Optionally, the determining unit 801 is specifically configured to:
acquiring at least one associated text feature vector corresponding to the topic label;
determining each associated text feature vector in the at least one associated text feature vector, and the literal matching degree and the semantic matching degree of the text feature vector;
and determining first matching data based on the literal matching degree and the semantic matching degree.
Optionally, the determining unit 801 is specifically configured to:
acquiring cover data and image frame data of the associated multimedia resource;
extracting a third image feature vector of cover data of the associated multimedia resource and a fourth image feature vector of image frame data of the associated multimedia resource by using the image feature extraction model;
determining that the image feature vector of the associated multimedia asset includes the third image feature vector and the fourth image feature vector.
Optionally, the determining unit 801 is specifically configured to:
and determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource based on the first image feature vector, the second image feature vector, the third image feature vector and the fourth image feature vector.
Optionally, the determining unit 801 is further configured to, when the association degree between the multimedia resource and the topic tag is smaller than a preset association degree threshold, remove the association relationship between the multimedia resource and the topic tag.
The embodiment of the disclosure also correspondingly provides a model training device which can be applied to electronic equipment. Fig. 9 shows a schematic structural diagram of a model training apparatus provided in an embodiment of the present disclosure. As shown in fig. 9, the model training apparatus may include: an acquisition unit 901 and a training unit 902;
an obtaining unit 901, configured to obtain training data of a multimedia resource within a preset time period; the training data comprises training sample input data and training sample label data; training sample input data includes: matching degree of the multimedia resource and the topic label of the multimedia resource, consumption data of the multimedia resource and user portrait data of an account for issuing the multimedia resource; training the sample label data includes: the relevance of the multimedia resource and the topic label of the multimedia resource;
a training unit 902, configured to train to obtain a relevance detection model based on training data; and the relevance detection model is used for detecting the relevance of the multimedia resource and the topic label.
As described above, the embodiments of the present disclosure may divide the electronic device into functional modules according to the above method examples. An integrated module may be implemented in the form of hardware or in the form of a software functional module. In addition, it should be noted that the division of the modules in the embodiments of the present disclosure is schematic and is only a logical function division; there may be other division manners in actual implementation. For example, each functional module may be divided according to its corresponding function, or two or more functions may be integrated into one processing module.
With regard to the data processing apparatus in the foregoing embodiments, the specific manner in which each module performs operations and the beneficial effects thereof have been described in detail in the foregoing method embodiments, and are not described herein again.
The embodiment of the disclosure also provides a terminal, which can be a user terminal such as a mobile phone, a computer and the like. Fig. 10 shows a schematic structural diagram of a terminal provided in an embodiment of the present disclosure. The terminal, which may be a data processing device, may include at least one processor 61, a communication bus 62, a memory 63, and at least one communication interface 64.
The processor 61 may be a Central Processing Unit (CPU), a micro-processing unit, an ASIC, or one or more integrated circuits for controlling the execution of programs according to the present disclosure. As an example, in connection with fig. 8, the determining unit 801 in the electronic device implements the same functions as the processor 61 in fig. 10.
The communication bus 62 may include a path that carries information between the aforementioned components.
The communication interface 64 may be any device, such as a transceiver, for communicating with other devices or communication networks, such as a server, an Ethernet, a Radio Access Network (RAN), or a Wireless Local Area Network (WLAN).
The memory 63 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be self-contained and connected to the processor by a bus, or the memory may be integrated with the processor.
The memory 63 is used for storing application program codes for executing the disclosed solution, and is controlled by the processor 61. The processor 61 is configured to execute application program code stored in the memory 63 to implement the functions in the disclosed method.
In particular implementations, processor 61 may include one or more CPUs such as CPU0 and CPU1 in fig. 10, for example, as one embodiment.
In one implementation, the terminal may include multiple processors, such as processor 61 and processor 65 in fig. 10, for example, as an example. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In one implementation, as an example, the terminal may further include an input device 66 and an output device 67. The input device 66 communicates with the output device 67 and may accept user input in a variety of ways. For example, the input device 66 may be a mouse, a keyboard, a touch screen device, a sensing device, or the like. The output device 67 is in communication with the processor 61 and may display information in a variety of ways. For example, the output device 67 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, or the like.
Those skilled in the art will appreciate that the configuration shown in fig. 10 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.
The embodiment of the disclosure also provides a server. Fig. 11 shows a schematic structural diagram of a server provided by an embodiment of the present disclosure. The server may be a data processing device. The server, which may vary widely in configuration or performance, may include one or more processors 71 and one or more memories 72. At least one instruction is stored in the memory 72, and the at least one instruction is loaded and executed by the processor 71 to implement the data processing method provided by the above-mentioned method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input/output, and the server may also include other components for implementing the functions of the device, which are not described herein again.
The present disclosure also provides a computer-readable storage medium including instructions stored thereon, which, when executed by a processor of a computer device, enable a computer to perform the data processing method provided by the above-described illustrated embodiment. For example, the computer readable storage medium may be a memory 63 comprising instructions executable by the processor 61 of the terminal to perform the above described method. Also for example, the computer readable storage medium may be a memory 72 comprising instructions executable by a processor 71 of the server to perform the above-described method. Alternatively, the computer readable storage medium may be a non-transitory computer readable storage medium, for example, which may be a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
The present disclosure also provides a computer program product comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the data processing method shown in any of the above figs. 1-6, or the model training method shown in fig. 7.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A data processing method, comprising:
determining first characteristic data, second characteristic data and third characteristic data; the first feature data is used for representing the matching degree of a multimedia resource and a topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third feature data is used for representing user portrait data of an account issuing the multimedia resource;
and determining the association degree of the multimedia resource and the topic label according to the first characteristic data, the second characteristic data and the third characteristic data.
2. The data processing method of claim 1, wherein the determining the association of the multimedia resource and the topic tag according to the first feature data, the second feature data, and the third feature data comprises:
inputting the first feature data, the second feature data and the third feature data into a relevance detection model to obtain the relevance; the relevancy detection model is a model which is trained to a convergence state in advance and used for detecting the relevancy of the multimedia resource and the topic label.
3. The data processing method of claim 1, wherein the determining the first characteristic data comprises:
acquiring the associated data of the multimedia resource;
extracting text feature vectors and image feature vectors of the multimedia resources from the associated data;
determining first matching data between the topic tag and the text feature vector;
acquiring image feature vectors of associated multimedia resources related to the topic labels;
determining second matching data between the image feature vector of the associated multimedia resource and the image feature vector of the multimedia resource;
determining the first matching data and the second matching data as first feature data.
4. The data processing method of claim 3, wherein the association data comprises: at least one of audio data, cover data, image frame data, classification tags, account information, and comment information of the multimedia resource.
5. A method of model training, comprising:
acquiring training data of multimedia resources in a preset time period; the training data comprises training sample input data and training sample label data; the training sample input data comprises: matching degree of the multimedia resource and a topic label of the multimedia resource, consumption data of the multimedia resource, and user portrait data of an account issuing the multimedia resource; the training sample label data comprises: the relevancy of the multimedia resource and the topic label of the multimedia resource;
training to obtain a correlation detection model based on the training data; the relevancy detection model is used for detecting the relevancy of the multimedia resource and the topic label of the multimedia resource.
6. A data processing apparatus, comprising: a determination unit;
the determining unit is used for determining first characteristic data, second characteristic data and third characteristic data; the first feature data is used for representing the matching degree of a multimedia resource and a topic label of the multimedia resource; the second characteristic data is used for representing consumption data of the multimedia resource; the third feature data is used for representing user portrait data of an account issuing the multimedia resource;
the determining unit is further configured to determine a degree of association between the multimedia resource and the topic tag according to the first feature data, the second feature data, and the third feature data.
7. A model training apparatus, comprising: an acquisition unit and a training unit;
the acquisition unit is used for acquiring training data of the multimedia resources in a preset time period; the training data comprises training sample input data and training sample label data; the training sample input data comprises: matching degree of the multimedia resource and a topic label of the multimedia resource, consumption data of the multimedia resource, and user portrait data of an account issuing the multimedia resource; the training sample label data comprises: the relevancy of the multimedia resource and the topic label of the multimedia resource;
the training unit is used for training to obtain a correlation detection model based on the training data; the relevancy detection model is used for detecting the relevancy of the multimedia resource and the topic label of the multimedia resource.
8. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the data processing method of any one of claims 1-4 or the model training method of claim 5.
9. A computer-readable storage medium having instructions stored thereon, wherein the instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the data processing method of any of claims 1-4, or the model training method of claim 5.
10. A computer program product comprising instructions that, when run on an electronic device, cause the electronic device to perform the data processing method of any one of claims 1-4, or the model training method of claim 5.