CN115114459A: Label correction method, device, equipment and computer readable storage medium

Info

Publication number
CN115114459A
Authority
CN
China
Prior art keywords
label
corrected
content
search
tag
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110287224.5A
Other languages
Chinese (zh)
Inventor
陈小帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/432 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a label correction method, a label correction device, label correction equipment and a computer-readable storage medium; the method comprises the following steps: acquiring content to be corrected, a label set comprising each label of the content to be corrected and historical behavior data corresponding to the content to be corrected; the historical behavior data represent data generated when searching for the content to be corrected in a historical time period, and the content to be corrected is multimedia content waiting for label correction; analyzing each label based on historical behavior data, and determining a label to be corrected for the content to be corrected; wherein the label to be corrected is a missing label or a false label of the content to be corrected; and correcting the label set of the content to be corrected according to the label to be corrected to obtain a corrected label set. Through the method and the device, the accuracy of the label set can be improved based on artificial intelligence.

Description

Label correction method, device, equipment and computer readable storage medium
Technical Field
The present application relates to video understanding technology in the field of artificial intelligence, and in particular to a label correction method, apparatus, device, and computer-readable storage medium.
Background
Content recommendation and content search are common applications in the field of artificial intelligence. Content recommendation and content search are mostly implemented based on tags of multimedia content, that is, according to the tags, it is determined which multimedia content is recommended to the user or which multimedia content is fed back to the user as a search result. It can be seen that the accuracy of the tags can affect the accuracy of content recommendation and content search.
In the related art, most tag sets are constructed manually, with machine assistance, based on the multimedia content. However, tag sets constructed in this way are prone to mislabeling and missing labels, so the tags in the tag set cannot accurately represent the multimedia content, and the accuracy of the tag set is low.
Disclosure of Invention
The embodiment of the application provides a label correction method, a label correction device, label correction equipment and a computer-readable storage medium, which can improve the accuracy of a label set.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a label correction method, which comprises the following steps:
acquiring content to be corrected, a label set comprising each label of the content to be corrected, and historical behavior data corresponding to the content to be corrected;
the historical behavior data represents data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction;
analyzing each label based on the historical behavior data, and determining a label to be corrected for the content to be corrected; wherein the label to be corrected is a missing label or a false label of the content to be corrected;
and correcting the label set of the content to be corrected according to the label to be corrected to obtain a corrected label set.
The embodiment of the application provides a label correcting device, including:
the data acquisition module is used for acquiring content to be corrected, a label set comprising each label of the content to be corrected and historical behavior data corresponding to the content to be corrected; the historical behavior data represents data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction;
the label determining module is used for analyzing each label based on the historical behavior data and determining a label to be corrected for the content to be corrected; the label to be corrected is a missing label or a false label of the content to be corrected;
and the label correction module is used for correcting the label set of the content to be corrected according to the label to be corrected to obtain a corrected label set.
In some embodiments of the application, the tag determining module is further configured to extract, from the historical behavior data, a plurality of search sentences used to search out the content to be corrected, the number of searches corresponding to each search sentence, the number of times the content to be corrected was viewed under each search sentence, and the total number of search behaviors, wherein the total number of search behaviors is the total number of search behaviors that occurred within the historical time period; perform mislabel analysis on each label by combining the plurality of search sentences, the numbers of searches, the numbers of views and the total number of search behaviors, and screen out the mislabeled labels of the content to be corrected from the labels; and determine the mislabeled labels of the content to be corrected as the labels to be corrected.
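The extraction step above can be illustrated with a minimal sketch. The record layout (`SearchEvent` with a `query` and a `viewed` flag), the function name `summarize_searches` and the returned dictionary keys are assumptions made for illustration; the application does not specify how the historical behavior data is stored.

```python
from collections import Counter
from typing import NamedTuple


class SearchEvent(NamedTuple):
    """One historical search that returned the content to be corrected (hypothetical layout)."""
    query: str    # the search sentence entered by the user
    viewed: bool  # whether the content to be corrected was viewed after this search


def summarize_searches(events, total_search_count):
    """Aggregate per-sentence search and view counts for one content item.

    total_search_count is assumed to be supplied by the caller: it is the total
    number of search behaviors in the historical time period.
    """
    searches = Counter()
    views = Counter()
    for event in events:
        searches[event.query] += 1
        if event.viewed:
            views[event.query] += 1
    return {
        "search_counts": dict(searches),
        # Sentences that never led to a view get an explicit count of 0.
        "view_counts": {query: views[query] for query in searches},
        "total_search_count": total_search_count,
    }
```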
In some embodiments of the application, the tag determining module is further configured to take the ratio of the number of views to the number of searches to obtain the viewing probability that the content to be corrected is viewed under each search sentence; screen at least one sentence to be analyzed from the plurality of search sentences according to the viewing probability corresponding to each search sentence, the at least one sentence to be analyzed being a search sentence whose viewing probability is lower than a preset threshold; screen out, from the numbers of searches, the number of searches to be analyzed corresponding to each sentence to be analyzed; perform mislabel analysis on each label based on the numbers of searches to be analyzed, the total number of search behaviors and the relevance between each label and each sentence to be analyzed, to obtain the mislabel probability of each label; and screen, from the labels, the labels whose mislabel probability is greater than a probability threshold to obtain the mislabeled labels of the content to be corrected.
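As a rough sketch of the screening step, the function below computes the viewing probability of each search sentence and keeps the sentences that fall below the threshold. The function name and the default threshold value of 0.1 are illustrative assumptions; the application only states that a preset threshold is used.

```python
def select_sentences_to_analyze(search_counts, view_counts, view_threshold=0.1):
    """Return {sentence: number of searches} for sentences with a low viewing probability.

    view_threshold is a hypothetical value chosen for illustration.
    """
    selected = {}
    for sentence, searches in search_counts.items():
        if searches == 0:
            continue  # no searches recorded, so the probability is undefined
        view_probability = view_counts.get(sentence, 0) / searches
        if view_probability < view_threshold:
            selected[sentence] = searches
    return selected
```

A sentence that often retrieves the content but rarely leads to a view suggests that the label matching the sentence may not describe the content.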
In some embodiments of the present application, the tag determining module is further configured to take the ratio of the number of searches to be analyzed to the total number of search behaviors to obtain a first search proportion corresponding to each sentence to be analyzed; calculate, one by one, the relevance between each label and each sentence to be analyzed to obtain the sentence-label relevance of each label and each sentence to be analyzed; multiply the sentence-label relevance by the first search proportion to obtain the sub-mislabel probability of each label under each sentence to be analyzed; and accumulate the sub-mislabel probabilities over the sentences to be analyzed to obtain the mislabel probability of each label.
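The accumulation described above amounts to summing, over the sentences to be analyzed, the product of the sentence-label relevance and the first search proportion. The sketch below shows that arithmetic only; the `relevance` callable stands in for the relevance model, whose interface is an assumption here.

```python
def mislabel_probability(label, searches_to_analyze, total_search_count, relevance):
    """Accumulate sub-mislabel probabilities of one label over all sentences to be analyzed.

    searches_to_analyze: {sentence: number of searches to be analyzed}
    relevance: hypothetical callable (label, sentence) -> score in [0, 1]
    """
    probability = 0.0
    for sentence, searches in searches_to_analyze.items():
        first_search_proportion = searches / total_search_count
        probability += relevance(label, sentence) * first_search_proportion
    return probability


def mislabeled_labels(labels, searches_to_analyze, total_search_count,
                      relevance, probability_threshold):
    """Keep the labels whose mislabel probability exceeds the (assumed) threshold."""
    return [
        label for label in labels
        if mislabel_probability(label, searches_to_analyze,
                                total_search_count, relevance) > probability_threshold
    ]
```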
In some embodiments of the application, the tag determining module is further configured to perform text feature extraction on the label text and the label type of each label to obtain the label feature of each label; perform text feature extraction on each sentence to be analyzed to obtain the sentence feature of each sentence to be analyzed; interact the sentence feature of each sentence to be analyzed with the label feature of each label to obtain an interaction feature; and perform relevance identification on the interaction feature to obtain the sentence-label relevance of each label and each sentence to be analyzed.
In some embodiments of the present application, the tag determining module is further configured to obtain, from the historical behavior data, a plurality of search sentences corresponding to the content to be corrected, the number of searches corresponding to each of the search sentences, and the total number of search behaviors; acquire the labels of the other contents, apart from the content to be corrected, that are searched out by each search sentence; count, from the historical behavior data, the number of synchronous views, that is, the number of times the content to be corrected and the other contents were both viewed under each search sentence; perform missing-label analysis according to the labels of the other contents, the numbers of synchronous views, the number of searches corresponding to each search sentence and the total number of search behaviors, and determine the missing labels of the content to be corrected; and determine the missing labels of the content to be corrected as the labels to be corrected.
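Counting synchronous views can be sketched as follows. The input layout (one `(sentence, viewed_ids)` pair per search) and the function name are assumptions for illustration; the application does not describe the log format.

```python
from collections import Counter


def synchronous_view_counts(search_records, target_id):
    """Count, per (sentence, other content), the searches in which both were viewed.

    search_records: iterable of (sentence, set of content ids viewed after that search)
    target_id: id of the content to be corrected
    """
    counts = Counter()
    for sentence, viewed_ids in search_records:
        if target_id not in viewed_ids:
            continue  # the content to be corrected was not viewed in this search
        for other_id in viewed_ids:
            if other_id != target_id:
                counts[(sentence, other_id)] += 1
    return dict(counts)
```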
In some embodiments of the application, the tag determining module is further configured to take the ratio of the number of searches corresponding to each search sentence to the total number of search behaviors to obtain a second search proportion corresponding to each search sentence; compare each label with the labels of the other contents to obtain a difference label, the difference label being a label that is attached to the other contents but not to the content to be corrected; calculate the sub-missing-label probability of the difference label under each search sentence based on the relevance between the difference label and the content to be corrected, the second search proportion, the number of synchronous views and the number of searches; and accumulate the sub-missing-label probabilities over the search sentences to obtain the missing-label probability of the difference label, and take the difference label as a missing label of the content to be corrected when the missing-label probability is greater than a probability threshold.
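The comparison that yields the difference labels is essentially a set difference. A minimal sketch, with hypothetical names and a plain dictionary standing in for the labels of the other contents:

```python
def difference_labels(own_labels, other_contents_labels):
    """Labels attached to other contents but not to the content to be corrected.

    other_contents_labels: {content id: iterable of labels} (assumed layout)
    """
    own = set(own_labels)
    candidates = set()
    for labels in other_contents_labels.values():
        candidates.update(set(labels) - own)
    return candidates
```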
In some embodiments of the present application, the tag determining module is further configured to take the ratio of the number of synchronous views to the number of searches to obtain the synchronous viewing probability that the content to be corrected and the other contents are both viewed under each search sentence; calculate the relevance between the content to be corrected and the difference label to obtain the content-label relevance; and take the product of the synchronous viewing probability, the content-label relevance and the second search proportion as the sub-missing-label probability of the difference label under each search sentence.
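The product and accumulation described above can be written out as a short sketch. It assumes that the synchronous view counts passed in have already been restricted to other contents carrying the difference label, and that the content-label relevance is a single precomputed score; both are illustrative assumptions.

```python
def missing_label_probability(search_counts, synchronous_views,
                              total_search_count, content_label_relevance):
    """Accumulate sub-missing-label probabilities of one difference label.

    search_counts: {sentence: number of searches}
    synchronous_views: {sentence: number of synchronous views} for this label (assumed)
    content_label_relevance: score in [0, 1] from a relevance model (assumed given)
    """
    probability = 0.0
    for sentence, searches in search_counts.items():
        if searches == 0:
            continue  # avoid division by zero for sentences without searches
        synchronous_view_probability = synchronous_views.get(sentence, 0) / searches
        second_search_proportion = searches / total_search_count
        probability += (synchronous_view_probability
                        * content_label_relevance
                        * second_search_proportion)
    return probability
```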
In some embodiments of the present application, the content to be corrected comprises a video; the label determining module is further configured to extract audio features, image features and text features from the video respectively, and fuse the audio features, the image features and the text features to obtain a fusion feature corresponding to the video; perform text feature extraction on the label type and the label text of the difference label to obtain the label feature of the difference label; interact the label feature of the difference label with the fusion feature to obtain an interaction feature; and perform relevance identification on the interaction feature to obtain the content-label relevance.
In some embodiments of the application, the data obtaining module is further configured to obtain, from a multimedia content library, the multimedia content whose number of exposures is greater than a preset threshold and whose number of clicks is less than the preset threshold, so as to obtain the content to be corrected.
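The selection of candidates from the content library can be sketched as a simple filter. The text does not say whether the exposure threshold and the click threshold are the same value, so this sketch takes two separate parameters as an assumption; the field names `id`, `exposures` and `clicks` are likewise hypothetical.

```python
def select_contents_to_correct(library, min_exposures, max_clicks):
    """Return ids of contents that are shown often but clicked rarely.

    library: iterable of dicts with hypothetical keys "id", "exposures", "clicks"
    """
    return [
        item["id"]
        for item in library
        if item["exposures"] > min_exposures and item["clicks"] < max_clicks
    ]
```

Passing the same number for both parameters reproduces the single-threshold reading of the paragraph above.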
In some embodiments of the present application, the tag correction module is further configured to, when the label to be corrected is a missing label of the content to be corrected, add the label to be corrected to the label set of the content to be corrected to obtain the corrected label set; or, when the label to be corrected is a mislabeled label of the content to be corrected, remove the label to be corrected from the label set of the content to be corrected to obtain the corrected label set.
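The two correction branches above map onto set union and set difference. A minimal sketch with hypothetical names:

```python
def correct_label_set(label_set, missing_labels=(), mislabeled=()):
    """Add missing labels to the label set and remove mislabeled ones.

    Returns a new set; the input label set is left unchanged.
    """
    return (set(label_set) | set(missing_labels)) - set(mislabeled)
```

For example, `correct_label_set({"comedy", "war"}, missing_labels=["family"], mislabeled=["war"])` yields a set containing "comedy" and "family".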
The embodiment of the application provides a label correction device, including:
a memory for storing executable tag correction instructions;
and the processor is used for realizing the tag correction method provided by the embodiment of the application when executing the executable tag correction instruction stored in the memory.
The embodiment of the application provides a computer-readable storage medium, which stores executable tag correction instructions and is used for causing a processor to execute the executable tag correction instructions so as to realize the tag correction method provided by the embodiment of the application.
The embodiment of the application has the following beneficial effects: the label correction equipment can analyze each label in the label set of the content to be corrected based on users' historical behavior data for that content, and determine the label to be corrected, that is, a mislabeled label or a missing label of the content to be corrected. It then corrects the label set using the mislabeled label or the missing label, obtaining a label set that reflects the content to be corrected more accurately and thereby improving the accuracy of the label set.
Drawings
FIG. 1 is a schematic diagram of an alternative architecture of a tag correction system 100 provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of the server in FIG. 1 according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart of an alternative label correction method provided by the embodiments of the present application;
FIG. 4 is a schematic flow chart of another alternative label correction method provided in the embodiments of the present application;
FIG. 5 is a schematic diagram of extracting tag features provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of extracting text features provided in an embodiment of the present application;
FIG. 7 is a schematic diagram illustrating a calculation of relevance between a tag and a sentence to be analyzed according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of another alternative label correction method provided in the embodiments of the present application;
FIG. 9 is a schematic diagram of generating a fusion feature of a video according to an embodiment of the present application;
FIG. 10 is a diagram illustrating a method for calculating relevance of content tags according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a process for performing tag correction on a video according to an embodiment of the present application;
FIG. 12 is a schematic diagram of data after sorting user search behavior data according to an embodiment of the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first", "second", and the like are used only to distinguish similar objects and do not denote a particular order. Where permissible, the specific order or sequence of items so labeled may be interchanged, so that the embodiments of the present application can be practiced in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to these terms and expressions.
1) Artificial Intelligence (AI) is the theory, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce intelligent machines that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines can perceive, reason and make decisions.
Artificial intelligence is a comprehensive discipline that covers a wide range of fields and includes both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
2) Computer Vision (CV) is the science of how to make a machine "see". More specifically, it uses cameras and computers in place of human eyes to identify, track and measure targets, and further processes the resulting images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technologies, virtual reality, augmented reality, and simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.
3) Machine Learning (ML) is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It studies how a computer can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, and inductive learning.
4) Content search is the process of retrieving, from a multimedia information base, the multimedia content a user wants according to the information contained in the search phrase (query) entered by the user. For example, if the user enters the search phrase "comedy movie", searching the video library for movies with the "comedy" tag is a multimedia content search.
5) Content recommendation is the process of recommending multimedia content that may interest a user according to information such as a user portrait. For example, recommending other comedy movies to a user based on a comedy movie the user has recently watched is content recommendation.
6) Content tags are tags, such as roles, actors, places and plots, attached to multimedia content such as videos and pictures; they provide intuitive semantic understanding features for content search or content recommendation.
With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, smart medical services and smart customer service.
Correcting the tags of multimedia content is also an important application of artificial intelligence technology. The embodiment of the application provides a label correction method, a label correction device, label correction equipment and a computer-readable storage medium, which can improve the accuracy of the labels of multimedia content. An exemplary application of the label correction device provided in the embodiment of the present application is described below. The label correction device may be implemented as various types of terminals, and may also be implemented as a server. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a smart television, a smart vehicle-mounted terminal, and the like. A client, such as a video client, a browser client, an information flow client or an instant messaging client, is installed on the terminal. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of the present application.
Next, an exemplary application in which the label correction device is implemented as a server will be described. Referring to FIG. 1, FIG. 1 is a schematic diagram of an alternative architecture of the tag correction system 100 provided in this embodiment of the present application. To support a tag correction application, a terminal 200 is connected to a server 400 through a network 300, where the network 300 may be a wide area network, a local area network, or a combination of the two. The server 400 is also connected to a multimedia content library 500, which stores multimedia content and provides data support for the server 400.
The server 400 is configured to obtain, from the multimedia content library 500, content to be corrected, a tag set including each tag of the content to be corrected, and historical behavior data corresponding to the content to be corrected. The historical behavior data represents data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction. The server 400 analyzes each tag based on the historical behavior data, and determines a tag to be corrected for the content to be corrected, wherein the tag to be corrected is a missing tag or a mislabeled tag of the content to be corrected. Then, the server 400 corrects the tag set of the content to be corrected according to the tag to be corrected, so as to obtain a corrected tag set.
Then, the server 400 may distribute the content to be corrected to the terminal 200 as a search result or as recommended content according to each tag in the corrected tag set. The terminal 200 may present the content to be corrected on its graphical interface 210.
Referring to fig. 2, fig. 2 is a schematic structural diagram of the server in fig. 1 according to an embodiment of the present disclosure, where the server 400 shown in fig. 2 includes: at least one processor 410, memory 450, at least one network interface 420, and a user interface 430. The various components in server 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable communications among the components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 440 in FIG. 2.
The processor 410 may be an integrated circuit chip having signal processing capabilities, such as a general purpose processor, a Digital Signal Processor (DSP), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component, wherein the general purpose processor may be a microprocessor or any conventional processor.
The user interface 430 includes one or more output devices 431, including one or more speakers and/or one or more visual displays, that enable the presentation of media content. The user interface 430 also includes one or more input devices 432, including user interface components that facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 410.
The memory 450 may include volatile memory or nonvolatile memory, or both. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 450 described in the embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data, examples of which include programs, modules, and data structures, or a subset or superset thereof, to support various operations, as exemplified below.
An operating system 451, including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
a network communication module 452 for communicating with other computing devices via one or more (wired or wireless) network interfaces 420, exemplary network interfaces 420 including: Bluetooth, Wireless Fidelity (Wi-Fi), and Universal Serial Bus (USB), etc.;
a presentation module 453 for enabling presentation of information (e.g., user interfaces for operating peripherals and displaying content and information) via one or more output devices 431 (e.g., display screens, speakers, etc.) associated with user interface 430;
an input processing module 454 for detecting one or more user inputs or interactions from one of the one or more input devices 432 and translating the detected inputs or interactions.
In some embodiments, the tag correction apparatus provided in the embodiments of the present application may be implemented in software, and fig. 2 illustrates the tag correction apparatus 455 stored in the memory 450, which may be software in the form of programs and plug-ins, and includes the following software modules: a data acquisition module 4551, a tag determination module 4552 and a tag correction module 4553, which are logical and thus may be arbitrarily combined or further separated depending on the functions implemented.
The functions of the respective modules will be explained below.
In other embodiments, the tag correction apparatus provided in the embodiments of the present application may be implemented in hardware. For example, it may be a processor in the form of a hardware decoding processor that is programmed to execute the tag correction method provided in the embodiments of the present application; such a processor may employ one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
Illustratively, an embodiment of the present application provides a label correction apparatus, including:
a memory for storing executable tag correction instructions;
a processor for implementing the tag correction method provided by the embodiments of the present application when executing the executable tag correction instructions stored in the memory.
In the following, the label correction method provided by the embodiment of the present application will be described in conjunction with an exemplary application and implementation of the label correction apparatus provided by the embodiment of the present application. It should be noted that the tag correction device may be implemented as a terminal or a server, so that the tag correction method provided by the embodiment of the present application may be executed by the terminal or the server; the tag correction device can also be implemented as a device cluster composed of a terminal and a server, so that the tag correction method provided by the embodiment of the application can also be executed by the terminal and the server together. Next, a description will be given of an example in which the tag correction method provided in the embodiment of the present application is executed by a server.
Referring to fig. 3, fig. 3 is an alternative flow chart of a label correction method provided in the embodiment of the present application, which will be described with reference to the steps shown in fig. 3.
S101, obtaining the content to be corrected, a label set comprising each label of the content to be corrected, and historical behavior data corresponding to the content to be corrected.
The embodiments of the present application are implemented in scenarios where the tag set of multimedia content is corrected to obtain a more accurate tag set, for example, correcting the tags of a video so that the tags in the tag set describe the content of the video completely and accurately. First, the server acquires, from a multimedia content platform, the content to be corrected, that is, the content whose tags need correction, together with the tag set of that content. The server also acquires the historical behavior data corresponding to the content to be corrected, so that the tag set can be analyzed on the basis of the historical behavior data.
It should be noted that the historical behavior data is data generated when the content to be corrected was searched for during a historical time period. It may include the search sentences that users entered during the historical time period and that return the content to be corrected; the number of times and the times at which each search sentence was searched; the number of times and the times at which users viewed the content to be corrected after finding it with a search sentence; and the total number of search behaviors performed by users (i.e., the total number of searches on the multimedia content platform).
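As a rough illustration of the kind of record described above, the historical behavior data could be held in a structure like the following. This is only a sketch: the class names, field names, and sample values are hypothetical and do not come from the application, which does not prescribe any storage layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchRecord:
    """Statistics for one search sentence that returned the content (hypothetical layout)."""
    sentence: str      # search sentence typed by users
    search_count: int  # number of times the sentence was searched
    view_count: int    # number of times the content was viewed after that search


@dataclass
class HistoricalBehavior:
    """Historical behavior data for one content item (hypothetical layout)."""
    content_id: str
    total_search_actions: int  # all searches on the platform in the period
    records: List[SearchRecord] = field(default_factory=list)


# Made-up sample values, for illustration only.
history = HistoricalBehavior(
    content_id="video-001",
    total_search_actions=10_000,
    records=[
        SearchRecord("beijing travel", search_count=400, view_count=20),
        SearchRecord("new song", search_count=1_000, view_count=700),
    ],
)
```

Timestamps of searches and views, which the text also mentions, are left out here to keep the sketch short.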
In some embodiments, there may be more than one search statement capable of searching out the content to be corrected, that is, the historical behavior data includes a plurality of search statements for searching out the content to be corrected. Of course, in other embodiments, not only the content to be corrected but also other content may be searched by using any one search statement.
In the embodiment of the present application, a search sentence refers to a text sequence for content search, which is input by a user, and the search sentence may be one character input by the user, one word input by the user, or a sentence composed of a plurality of characters or a plurality of words. The content to be corrected may refer to content such as video, audio, and the like, and may also refer to content such as pictures, articles, and the like, and the present application is not limited herein.
The tag set is composed of the individual tags with which the content to be corrected has been marked. The tag set may include only one tag of the content to be corrected, or may include multiple different tags, and the tags may be ordered or unordered. Where the tags in the tag set are ordered, the order may be determined, for example, by the time of marking or by how well each tag fits the content to be corrected.
Further, a tag is annotation information that reflects a semantic understanding of the content to be corrected. Tags may include the characters, animals, plots, styles, provenance, authors, and so on that appear in the content to be corrected. For example, if the content to be corrected is a video, the tags may be the characters, actors, and plot that appear in it; if the content to be corrected is a song, the tags may be the genre and author of the song.
In the embodiments of the present application, the content to be corrected is multimedia content awaiting tag correction. It may be any multimedia content on the multimedia content platform, or specific multimedia content selected from the platform according to information such as popularity, time, search status, and viewing status.
S102, analyzing each tag based on the historical behavior data, and determining the tag to be corrected for the content to be corrected.
The historical behavior data reflects users' search behavior and viewing behavior with respect to the content to be corrected. From these behaviors, both the degree of correlation between the content to be corrected and each of its tags and the degree of similarity between its tags and the tags of other content can be analyzed. The server can therefore analyze each tag of the content to be corrected based on the historical behavior data, and either determine, from the tags in the tag set, a tag that is poorly correlated with the content to be corrected, or determine, from the tags of other content, a tag that is sufficiently correlated with the content to be corrected. The determined tag is taken as the tag to be corrected.
It can be understood that a tag poorly correlated with the content to be corrected is essentially a tag with which the content was marked by mistake (a mislabeled tag), whereas a tag determined from the tags of other content and sufficiently correlated with the content to be corrected could be applied to it, and is therefore a tag that was omitted (a missed tag). The tag to be corrected is thus either a missed tag or a mislabeled tag of the content to be corrected.
In some embodiments, the server may determine, from the search behavior and the viewing behavior characterized by the historical behavior data, whether users view the content to be corrected after finding it with a search sentence. When users do not view the content to be corrected and a certain tag is sufficiently related to the search sentence, the content to be corrected has a low degree of correlation with that tag, so the tag should be a mislabeled tag of the content to be corrected; otherwise, the tag is a correct tag of the content to be corrected.
In other embodiments, the server may determine, from the search behavior and the viewing behavior characterized by the historical behavior data, whether the content to be corrected and other content returned by the same search sentence are viewed together. When they are, the tag distributions of the content to be corrected and of the other content are similar. By comparing the tags of the content to be corrected with the tags of the other content, the server can determine a tag with which the other content is marked but the content to be corrected is not; the determined tag is a missed tag of the content to be corrected.
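The tag comparison described for missed tags can be pictured as a set difference. The following minimal sketch assumes tags are plain strings and that the other content has already been restricted to items viewed together with the content to be corrected; the function name and sample tags are hypothetical, and any further screening of the candidates is outside this sketch.

```python
def candidate_missed_tags(own_tags, other_contents_tags):
    """Collect tags carried by co-viewed content but absent from the content's own tag set.

    own_tags: tags of the content to be corrected.
    other_contents_tags: one tag list per co-viewed content item.
    Returns candidates in first-seen order, without duplicates.
    """
    own = set(own_tags)
    candidates = []
    for tags in other_contents_tags:
        for tag in tags:
            if tag not in own and tag not in candidates:
                candidates.append(tag)
    return candidates


candidates = candidate_missed_tags(
    ["pop", "concert"],
    [["pop", "live"], ["live", "2021"]],
)
```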
S103, according to the label to be corrected, correcting the label set of the content to be corrected to obtain a corrected label set.
After the server determines the to-be-corrected label of the to-be-corrected content, the server corrects the label set according to the to-be-corrected label, so that the label in the corrected label set is more attached to the to-be-corrected content, and the accuracy of the label set of the to-be-corrected content is improved.
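One plausible reading of the correction in S103 is that mislabeled tags are removed from the tag set and missed tags are appended to it. The sketch below follows that assumption; the function name and the order-preserving behavior are illustrative choices rather than details stated in this passage.

```python
def correct_tag_set(tags, mislabeled=(), missed=()):
    """Return a corrected copy of the tag set.

    Assumption: correction means dropping mislabeled tags and adding missed tags.
    The original order is kept, and missed tags are appended at the end.
    """
    to_drop = set(mislabeled)
    corrected = [tag for tag in tags if tag not in to_drop]
    for tag in missed:
        if tag not in corrected:
            corrected.append(tag)
    return corrected


corrected = correct_tag_set(
    ["Beijing", "pop", "concert"],
    mislabeled=["Beijing"],
    missed=["live"],
)
```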
In the embodiment of the application, the server can analyze each tag in the tag set of the content to be corrected based on the historical behavior data of the content to be corrected by the user, determine the tag to be corrected, namely determine the error marking tag or the missing marking tag of the content to be corrected, and correct the tag set by using the error marking tag or the missing marking tag, so that the tag set capable of reflecting the content to be corrected more accurately is obtained, and the accuracy of the tag set is also improved. Meanwhile, the embodiment of the application realizes the correction of the label set according to the historical behaviors of the user, so that the labor cost of label correction is reduced.
The server may perform mislabeled-tag analysis on the content to be corrected. Referring to fig. 4, in some embodiments of the present application, analyzing each tag based on the historical behavior data and determining the tag to be corrected for the content to be corrected, that is, the specific implementation of S102, may include S1021-S1023, as follows:
S1021, extracting, from the historical behavior data, a plurality of search sentences used for searching out the content to be corrected, the search count corresponding to each search sentence, the viewing count of the content to be corrected under each search sentence, and the total number of search behaviors.
When analyzing mislabeled tags, the server reads the historical behavior data and finds the plurality of search sentences that return the content to be corrected to users as a search result, the number of times the content to be corrected was viewed under each search sentence, the number of times each search sentence was typed by users, and the total number of search behaviors.
It should be noted that the total number of search behaviors is the total number of searches occurring within the historical time period, obtained by counting all search behaviors of all users (whether or not the search sentence returns the content to be corrected). The search count is specific to a search sentence and is, in essence, the number of times that search sentence was repeated.
S1022, performing mislabel analysis on each tag by combining the plurality of search sentences, the search counts, the viewing counts, and the total number of search behaviors, and screening out the mislabeled tags of the content to be corrected from the tags.
The server can determine, from the search counts and the viewing counts of the content to be corrected, whether the content to be corrected is viewed by users after being returned; determine, from the search counts and the total number of search behaviors, whether the search sentences are commonly used to search for the content to be corrected; and analyze the relevance of each tag to the search sentences. When the search sentences are commonly used and are relatively relevant to a certain tag, but the content to be corrected returned for them is not viewed by users, the tag does not reflect the content to be corrected well, and is therefore a mislabeled tag of the content to be corrected.
It is understood that the reason for determining whether the plurality of search sentences are common sentences for searching for the content to be corrected is that: if a certain search statement is not a common statement, the search times of the search statement and the viewing times of the content to be corrected viewed under the search statement may lack statistical significance and may not well reflect the historical behavior of the user, so that the finally determined mis-marked tag may be different from the actual mis-marked tag.
S1023, determining the mislabeled tag of the content to be corrected as the tag to be corrected.
After determining the mismarked tag from each tag of the content to be corrected, the server takes the mismarked tag as the tag to be corrected, so that the tag set of the content to be corrected is corrected based on the mismarked tag.
In the embodiment of the application, the server may determine whether each tag of the content to be corrected is a mislabel or not by using the number of times of checking the content to be corrected, the plurality of search statements and the corresponding plurality of times of searching, and the total number of times of searching, in the historical behavior data, so as to obtain the tag to be corrected, so that the tag set of the content to be corrected is corrected by using the tag to be corrected in the subsequent process.
In some embodiments of the present application, performing mislabel analysis on each tag by combining the plurality of search sentences, the search counts, the viewing counts, and the total number of search behaviors, and screening out the mislabeled tags of the content to be corrected from the tags, that is, the specific implementation of S1022, may include S1022a-S1022e, as follows:
S1022a, performing a ratio operation on the viewing count and the search count to obtain the viewing probability that the content to be corrected is viewed under each search sentence.
The server takes the viewing times of the content to be corrected under each search statement as numerators, takes the search times corresponding to each search statement as denominators, takes the ratio as the probability of the content to be corrected being viewed under each search statement, and records the probability as the viewing probability.
It should be noted that, the process of viewing the content to be corrected in each search statement refers to a process of directly triggering and opening the content to be corrected to view in the search result fed back corresponding to each search statement when the user enters each search statement. Furthermore, the content to be corrected can be triggered and opened in the modes of clicking, double clicking, long pressing and the like.
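The ratio in S1022a, with the viewing count as numerator and the search count as denominator, can be written directly. In this minimal sketch the function name is hypothetical, and returning 0.0 for a sentence with no recorded searches is an assumption made only to avoid division by zero.

```python
def viewing_probability(view_count: int, search_count: int) -> float:
    """Viewing probability of the content under one search sentence (S1022a)."""
    if search_count <= 0:
        # Assumption: with no recorded searches there is no evidence of viewing.
        return 0.0
    return view_count / search_count


p_low = viewing_probability(view_count=20, search_count=400)
p_high = viewing_probability(view_count=700, search_count=1000)
```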
S1022b, screening at least one sentence to be analyzed from the plurality of search sentences according to the viewing probability corresponding to each search sentence.
The server compares the viewing probability of each search sentence with a preset threshold one by one and picks out the search sentences whose viewing probability is lower than the preset threshold, thereby obtaining at least one sentence to be analyzed that requires focused analysis. In other words, the at least one sentence to be analyzed is a search sentence whose viewing probability is lower than the preset threshold.
It should be noted that the viewing probability indirectly reflects whether the content to be corrected, returned for a search sentence, is what the user wants to view. During a search, the entities in the search sentence typed by the user, or the semantics the search sentence expresses, are extracted and matched against the tags of the content to be corrected, and the matched content to be corrected is returned to the user as a search result. If the viewing probability is high, the content to be corrected is very likely what the user wants, so both the association between the content to be corrected and the tag and the association between the tag and the search sentence are high. If the viewing probability is low, the content to be corrected is not what the user wants; in that case the degree of association between the content to be corrected and the tag has to be judged from the degree of association between the tag and the search sentence. Search sentences with a low viewing probability therefore require focused analysis.
It is understood that the preset threshold may be set according to practical situations, for example, set to 0.5, or set to 0.3, etc., and the application is not limited herein.
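The screening in S1022b amounts to a threshold filter over the per-sentence viewing probabilities. The sketch below assumes the statistics are held in a dictionary mapping each search sentence to a (search count, viewing count) pair; that layout, the function name, and the sample numbers are hypothetical.

```python
def select_sentences_to_analyze(stats, threshold=0.5):
    """Keep the search sentences whose viewing probability is below the threshold.

    stats: dict mapping search sentence -> (search_count, view_count).
    """
    selected = []
    for sentence, (search_count, view_count) in stats.items():
        # Assumption: a sentence with no searches gets probability 0.0.
        probability = view_count / search_count if search_count else 0.0
        if probability < threshold:
            selected.append(sentence)
    return selected


stats = {"beijing travel": (400, 20), "new song": (1000, 700)}
to_analyze = select_sentences_to_analyze(stats, threshold=0.5)
```

With the threshold set to 0.5, only the sentence whose content was rarely opened is kept for the later mislabel analysis.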
S1022c, screening out the number of to-be-analyzed searches corresponding to each to-be-analyzed sentence in at least one to-be-analyzed sentence from the plurality of search times.
Because the at least one sentence to be analyzed is screened from the plurality of search sentences, and the search sentences correspond one-to-one to the search counts, the server can, according to this correspondence, pick out from the plurality of search counts the to-be-analyzed search count corresponding to each sentence to be analyzed, thereby obtaining at least one to-be-analyzed search count.
S1022d, performing mislabel analysis on each tag based on the to-be-analyzed search counts, the total number of search behaviors, and the degree of correlation between each tag and each sentence to be analyzed, to obtain the mislabel probability of each tag.
The server first uses the at least one to-be-analyzed search count and the total number of search behaviors to analyze whether the at least one sentence to be analyzed is frequently used in users' search behavior. It then calculates the degree of correlation between each tag and the at least one sentence to be analyzed, and calculates the mislabel probability of each tag of the content to be corrected from that degree of correlation and from whether the at least one sentence to be analyzed is frequently used.
S1022e, screening the labels with the false mark probability larger than the probability threshold value from each label to obtain the false mark label of the content to be corrected.
After the server calculates the false mark probability of each label, a preset probability threshold value is obtained, and then the false mark probability and the probability threshold value are compared to obtain a comparison result. And when the comparison result shows that the false marking probability of a certain label is greater than the probability threshold, the server determines the label as the false marking label of the content to be corrected, otherwise, when the comparison result shows that the false marking probability of the certain label is less than or equal to the probability threshold, the server considers the label as the correct label of the content to be corrected. The server can thus distinguish which tags are mismarked for the content to be corrected.
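The comparison in S1022e can be sketched as a strict greater-than filter, so that a tag whose mislabel probability equals the threshold is treated as correct, as the text states. The function name, the dictionary layout, and the sample probabilities are hypothetical.

```python
def screen_mislabeled_tags(mislabel_probability_by_tag, probability_threshold):
    """Return the tags whose mislabel probability is strictly above the threshold."""
    return [
        tag
        for tag, probability in mislabel_probability_by_tag.items()
        if probability > probability_threshold
    ]


mislabeled = screen_mislabeled_tags({"Beijing": 0.41, "pop": 0.02, "live": 0.1}, 0.1)
```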
In the embodiment of the application, the server determines which search sentences are to-be-analyzed sentences needing to be analyzed in a key manner based on a plurality of search times of a plurality of search sentences and the viewing times of contents to be corrected, then synthesizes the to-be-analyzed search times corresponding to the to-be-analyzed sentences, and analyzes the association degree between the to-be-analyzed sentences and each label, so that the probability that each label is marked by mistake can be determined, and therefore the label marked by mistake is selected.
In some embodiments of the present application, performing mislabel analysis on each tag based on the to-be-analyzed search counts, the total number of search behaviors, and the degree of correlation between each tag and each sentence to be analyzed, to obtain the mislabel probability of each tag, that is, the specific implementation of S1022d, may include S201-S204, as follows:
S201, performing a ratio operation on the to-be-analyzed search count and the total number of search behaviors to obtain the first search ratio corresponding to each sentence to be analyzed.
The server takes the number of times of searching to be analyzed corresponding to each statement to be analyzed as a numerator and takes the total number of times of searching behaviors as a denominator, so that the ratio operation of the number of times of searching to be analyzed and the total number of times of searching behaviors is realized, and the calculated ratio is the first searching ratio corresponding to each statement to be analyzed.
S202, carrying out relevance calculation on each label in each label and each statement to be analyzed one by one to obtain statement label relevance of each label and each statement to be analyzed.
The server may use each tag and each sentence to be analyzed as a relevancy calculation unit, then input the composed relevancy calculation unit into the trained sentence tag relevancy calculation model, and output the sentence tag relevancy calculation model as the sentence tag relevancy between each tag and each sentence to be analyzed. The server can also perform text matching on each label and each statement to be analyzed, and then the matching degree is used as the statement label correlation degree of each label and each statement to be analyzed.
It is understood that the sentence-tag relevance calculation model is trained using constructed search sentences and tags that are prepared in advance. The sentence-tag relevance calculation model may be an ALBERT model, or a model constructed on the basis of an ALBERT model, and the present application is not limited herein.
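For the text-matching alternative mentioned in S202, a very simple stand-in for the matching degree is a character-level similarity ratio. The sketch below uses Python's standard `difflib.SequenceMatcher`; choosing this particular measure, and lowercasing both strings first, are assumptions made for illustration. The application does not say which matching method is used, and this sketch is unrelated to the trained model.

```python
from difflib import SequenceMatcher


def text_match_relevance(tag_text: str, sentence: str) -> float:
    """Matching degree in [0, 1] between a tag text and a sentence to be analyzed.

    Assumption: character-level similarity stands in for 'text matching'.
    """
    return SequenceMatcher(None, tag_text.lower(), sentence.lower()).ratio()


score = text_match_relevance("Beijing", "beijing travel guide")
```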
S203, multiplying the sentence-tag relevance by the first search ratio to obtain the sub-mislabel probability of each tag under each sentence to be analyzed.
After calculating the sentence-tag relevance and the first search ratio, the server multiplies them, and the product is the sub-mislabel probability of each tag under each sentence to be analyzed. The sub-mislabel probability indicates the probability that each tag is mislabeled given each sentence to be analyzed.
For example, the embodiment of the present application provides a formula for calculating the sub-mislabeling probability of each label under each statement to be analyzed, see formula (1):
$$q_{ij} = I_{ij} \times R_j \qquad (1)$$
where $i$ denotes each tag, $j$ denotes each sentence to be analyzed, $I_{ij}$ is the sentence-tag relevance between each tag and each sentence to be analyzed, $R_j$ is the first search ratio corresponding to each sentence to be analyzed, and $q_{ij}$ is the sub-mislabel probability of each tag under each sentence to be analyzed.
S204, accumulating the sub-mislabel probabilities corresponding to each sentence to be analyzed to obtain the mislabel probability of each tag.
The server accumulates the sub-mislabel probabilities corresponding to each sentence to be analyzed, and the resulting sum is the mislabel probability of each tag.
Illustratively, on the basis of equation (1), the embodiment of the present application provides a formula for calculating the mislabeling probability of each tag, see equation (2), as follows:
$$PE = \sum_{j} q_{ij} \qquad (2)$$
where $q_{ij}$ is the sub-mislabel probability of each tag under each sentence to be analyzed, that is, the sub-mislabel probability corresponding to each sentence to be analyzed, and $PE$ is the mislabel probability of each tag.
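Formulas (1) and (2) can be combined into a short routine that computes the mislabel probability of a single tag. This is a minimal sketch: the function name, the dictionary inputs, and the sample relevance values are hypothetical, and the relevance scores are assumed to have been computed already in S202.

```python
def mislabel_probability(relevance_by_sentence, search_count_by_sentence, total_search_actions):
    """Mislabel probability PE of one tag over all sentences to be analyzed.

    relevance_by_sentence: sentence -> sentence-tag relevance I_ij for this tag.
    search_count_by_sentence: sentence -> to-be-analyzed search count.
    total_search_actions: total number of search behaviors on the platform.
    """
    pe = 0.0
    for sentence, relevance in relevance_by_sentence.items():
        # First search ratio R_j (S201).
        r_j = search_count_by_sentence[sentence] / total_search_actions
        # Sub-mislabel probability q_ij, formula (1), accumulated as in formula (2).
        pe += relevance * r_j
    return pe


pe = mislabel_probability(
    {"beijing travel": 0.9, "beijing food": 0.5},
    {"beijing travel": 400, "beijing food": 100},
    total_search_actions=1000,
)
```

In this made-up example the two sentences contribute 0.9 × 0.4 and 0.5 × 0.1, so the tag's mislabel probability is about 0.41.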
In the embodiment of the application, the server respectively calculates the first search proportion of each statement to be analyzed and the statement label correlation degree of each label and each statement to be analyzed, so as to determine whether each label is wrongly marked under each statement to be analyzed according to the search condition of each statement to be analyzed and the correlation condition of each label and each statement to be analyzed, and then accumulates the sub-mismarking probabilities corresponding to each statement to be analyzed, so that the possibility that each label is wrongly marked under all statements to be analyzed can be determined, and the mismarking probability of each label can be obtained.
It should be noted that, in the present application, first calculating the first search proportion of each statement to be analyzed, or first calculating the statement label correlation of each label and each statement to be analyzed, has no influence on the calculation of the sub-mislabeling probability of each label under each statement to be analyzed, and therefore, in some embodiments, the server may further perform S202, then perform S201, and finally perform S203-S204. Of course, in other embodiments, the server may also perform S201 and S202 simultaneously, and then perform S203-S204.
In some embodiments of the present application, the sentence tag relevancy calculation model may be composed of a text feature extraction model, a feature interaction part, and a relevancy calculation part, in which case, the relevancy calculation is performed on each tag in each tag and each sentence to be analyzed one by one to obtain the sentence tag relevancy between each tag and each sentence to be analyzed, that is, the specific implementation process of S202 may include: S2021-S2024, as follows:
S2021, extracting text features of the tag text and the tag type of each tag to obtain the tag feature of each tag.
The server inputs the label text and the label type of each label into a text feature extraction model in the sentence label correlation degree calculation model, so that the label feature of each label is obtained by extraction of the text feature extraction model.
The label features extracted by the server by using the text feature extraction model may be feature vectors, feature matrices, or feature values, which is not limited herein. The text feature extraction model may be an ALBERT model or a KNN model, and the present application is not limited herein.
It will be appreciated that the server may distinguish between tag text and tag type using a separator, so that the text feature extraction model may distinguish between tag text and tag type. Of course, the server may also stitch the tag text and tag type directly together for input.
For example, fig. 5 is a schematic diagram of the extracted tag feature provided in the embodiment of the present application, and referring to fig. 5, the server may separate the tag type 5-2 (e.g., place name, attraction, etc.) of each tag, the tag text 5-3 of each tag by a separator SEP 5-4, and then input the separated tag text into the text feature extraction model 5-1 to obtain the tag feature: vector 5-5.
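The input construction described for fig. 5, with the tag type and tag text joined by a separator, could look like the sketch below. The literal separator string "[SEP]", the whitespace around it, and the function name are assumptions; the text only says that a separator SEP is placed between the tag type and the tag text, or that the two are spliced directly.

```python
SEP = "[SEP]"  # assumption: a BERT-style separator token


def build_tag_input(tag_type: str, tag_text: str, use_separator: bool = True) -> str:
    """Build the text fed to the text feature extraction model for one tag."""
    if use_separator:
        return f"{tag_type} {SEP} {tag_text}"
    # Alternative mentioned in the text: splice type and text directly together.
    return f"{tag_type}{tag_text}"


model_input = build_tag_input("place name", "Beijing")
```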
S2022, extracting text features of each sentence to be analyzed to obtain the sentence feature of each sentence to be analyzed.
The server directly inputs each sentence to be analyzed into the text feature extraction model, which extracts the sentence feature. Likewise, the sentence feature may be a feature vector, a feature matrix, or a feature value, and the application is not limited herein.
For example, fig. 6 is a schematic diagram of extracting text features provided in an embodiment of the present application. The server inputs each sentence 6-2 to be analyzed (e.g., the most popular new song) into the text feature extraction model 6-1 to obtain sentence features: vector 6-3.
S2023, interacting the sentence characteristics of each sentence to be analyzed and the label characteristics of each label to obtain interaction characteristics.
S2024, performing relevance identification on the interaction feature to obtain the sentence-tag relevance between each tag and each sentence to be analyzed.
The server can use the feature interaction part directly to make the sentence feature of each sentence to be analyzed and the tag feature of each tag interact by splicing or weighting, thereby obtaining the interaction feature; it can also input the sentence feature and the tag feature into a feature extraction model (this model being the feature interaction part) to perform the interaction and obtain the interaction feature. The server then performs relevance identification on the interaction feature using the relevance calculation part, thereby outputting the sentence-tag relevance between each tag and each sentence to be analyzed.
Illustratively, based on fig. 5 and fig. 6, the embodiment of the present application provides a schematic diagram of calculating the relevance between a tag and a sentence to be analyzed. Referring to fig. 7, the sentence to be analyzed is "zhougelon New song" 7-1, the tag type is "place name" 7-2, the tag text is "Beijing" 7-3, and the separator is SEP 7-4. The server inputs the sentence to be analyzed into a text feature extraction model 7-5 and inputs the tag type and the tag text, divided by the separator, into a text feature extraction model 7-6 (the two models may share parameters during training). It then splices the vector output by the text feature extraction model 7-5 with the vector output by the text feature extraction model 7-6 to obtain a spliced vector 7-7 (namely, the interaction feature), and finally performs relevance identification on the spliced vector to output the relevance 7-8 between the tag and the sentence to be analyzed.
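The splicing step in fig. 7 and a possible relevance head can be sketched without any model library. Everything numeric here is hypothetical: the vectors stand in for the outputs of the two text feature extraction models, and the single logistic layer is an assumed stand-in for the relevance calculation part, whose actual structure the text does not specify.

```python
import math


def splice_features(sentence_vector, tag_vector):
    """Interaction by splicing: concatenate the sentence feature and the tag feature."""
    return list(sentence_vector) + list(tag_vector)


def relevance_from_interaction(interaction, weights, bias=0.0):
    """Map an interaction feature to a relevance in (0, 1).

    Assumption: a single logistic layer plays the role of the relevance head.
    """
    if len(interaction) != len(weights):
        raise ValueError("interaction and weights must have the same length")
    z = sum(x * w for x, w in zip(interaction, weights)) + bias
    # Numerically stable sigmoid.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


spliced = splice_features([0.1, 0.2], [0.3])
relevance = relevance_from_interaction(spliced, weights=[1.0, -1.0, 0.5])
```

In a trained model the weights would be learned together with the feature extractors; here they are fixed by hand only to make the sketch executable.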
In the embodiment of the application, the server can extract the tag characteristics from the tag text and the tag type, extract the sentence characteristics from the sentence to be analyzed, and interact the sentence characteristics and the tag characteristics, so that the calculation of the sentence tag correlation degree between each tag and each sentence to be analyzed is realized.
The server may also analyze the content to be corrected for missing tags. Referring to fig. 8, in some embodiments of the present application, analyzing each tag based on the historical behavior data and determining a tag to be corrected for the content to be corrected, that is, a specific implementation process of S102, may include S1024-S1028, as follows:
S1024: Obtain, from the historical behavior data, a plurality of search sentences corresponding to the content to be corrected, the number of searches corresponding to each of the search sentences, and the total number of search behaviors.
When analyzing whether the content to be corrected has missing tags, the server obtains from the historical behavior data the plurality of search sentences corresponding to the content to be corrected, the number of times users searched with each search sentence, and the total number of search behaviors, so that this information can subsequently be used for the missing-tag analysis.
S1025: Obtain the tags of the other content, namely the content other than the content to be corrected that is retrieved by each search sentence.
Because the same search sentence may retrieve more than one piece of content, the content retrieved from the multimedia content library by each search sentence, excluding the content to be corrected, is recorded as the other content. The server finds the other content in the multimedia content library and obtains its tags, so that these tags can subsequently be used to analyze whether the content to be corrected has missing tags.
S1026: Count, from the historical behavior data, the number of times the content to be corrected and the other content are viewed together under each search sentence, to obtain the synchronous viewing count.
Only when the content to be corrected and the other content are viewed together under a search sentence does this indicate similarity between the tags of the content to be corrected and the tags of the other content. The server therefore counts, from the historical behavior data, the number of times the content to be corrected and the other content are viewed together under each search sentence, and uses this number as the synchronous viewing count.
For example, when the content to be corrected is a popular cartoon video, a certain search sentence is "the current popular cartoon", and the other content is another video returned to the user for that search sentence, the server counts the number of times users view both the popular cartoon video and the other video in the search results, and thereby obtains the synchronous viewing count.
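The counting in S1026 can be sketched as below. The log layout is an assumption made for illustration: each record is a pair of a search sentence and the set of content identifiers the user viewed in that search's results.

```python
from collections import Counter

def count_sync_views(search_records, target_id):
    """Count, per (search sentence, other content), how often it was viewed with the target."""
    counts = Counter()
    for query, viewed_ids in search_records:
        if target_id not in viewed_ids:
            continue  # the content to be corrected was not viewed in this search
        for other_id in viewed_ids:
            if other_id != target_id:
                counts[(query, other_id)] += 1
    return counts

# Hypothetical search log: (search sentence, content viewed in the results).
records = [
    ("popular cartoon", {"v1", "v2"}),
    ("popular cartoon", {"v1"}),
    ("popular cartoon", {"v1", "v2", "v3"}),
    ("new cartoon", {"v2", "v3"}),
]
sync = count_sync_views(records, target_id="v1")
```

Here "v1" plays the role of the content to be corrected, and each counter entry is the synchronous viewing count for one search sentence and one piece of other content.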
S1027: Perform missing-tag analysis according to the tags of the other content, the synchronous viewing count, the number of searches corresponding to each search sentence, and the total number of search behaviors, and determine the missing tag for the content to be corrected.
The server uses the synchronous viewing count and the number of searches to judge whether users are likely to view the content to be corrected and the other content together; uses the number of searches of each search sentence and the total number of search behaviors to judge whether each search sentence is commonly used in searches; and uses the tags of the other content to judge whether they differ from the tags of the content to be corrected. Finally, the server combines these three judgments to decide whether the content to be corrected has a missing tag, that is, to determine the missing tag.
It can be understood that, in the embodiment of the present application, determining the missing tag from the difference between the tags of the other content and the tags of the content to be corrected essentially means first using the tags of the other content to preliminarily determine candidate tags that could be migrated, and then deciding whether those candidates can actually be migrated based on whether the content to be corrected and the other content are viewed together and whether each search sentence is commonly used.
S1028: Determine the missing tag of the content to be corrected as the tag to be corrected.
The server takes the determined missing tag of the content to be corrected as the tag to be corrected, so that the tag set of the content to be corrected can be corrected according to the missing tag.
In the embodiment of the application, the server can determine, for each search sentence used to retrieve the content to be corrected, the other content that the search sentence also retrieves. Based on the number of times the other content and the content to be corrected are viewed together under each search sentence, it then judges whether the two have something in common at the user level, and hence whether tags of the other content can be migrated to the content to be corrected. This determines the missing tag of the content to be corrected, which can subsequently be used to obtain a more accurate tag set.
In some embodiments of the present application, performing the missing-tag analysis according to the tags of the other content, the synchronous viewing count, the number of searches corresponding to each search sentence, and the total number of search behaviors, and determining the missing tag for the content to be corrected, that is, a specific implementation process of S1027, may include S1027a-S1027d, as follows:
S1027a: Compute the ratio of the number of searches corresponding to each search sentence to the total number of search behaviors, to obtain the second search ratio corresponding to each search sentence.
The server takes the number of searches of each search sentence as the numerator and the total number of search behaviors as the denominator; the resulting ratio is the second search ratio corresponding to that search sentence. It should be noted that the second search ratio reflects the share of all search activity across the whole network that is performed with that search sentence.
S1027b: Compare each tag with the tags of the other content to obtain a difference tag.
The server compares each tag of the content to be corrected with the tags of the other content to judge whether there are tags that are labeled on the other content but not on the content to be corrected, and takes such tags as difference tags. That is, a difference tag is a tag that the other content carries and the content to be corrected does not.
It should be noted that the difference tags do not include tags that are labeled on the content to be corrected but not on the other content. This is because the embodiment of the present application migrates tags from the other content to the content to be corrected, so the server only needs to find the tags that the other content has and the content to be corrected lacks.
For example, if the tags of the other content are "Beijing", "the old palace" and "museum", and the tags of the content to be corrected are "Beijing" and "museum", then the tag "the old palace" is a difference tag between the other content and the content to be corrected.
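The one-directional comparison in S1027b corresponds to a set difference. The sketch below assumes tags are plain strings compared by exact match; the function name is hypothetical.

```python
def difference_tags(target_tags, other_tags):
    """Tags carried by the other content but absent from the content to be corrected."""
    return set(other_tags) - set(target_tags)

other = {"Beijing", "the old palace", "museum"}   # tags of the other content
target = {"Beijing", "museum"}                    # tags of the content to be corrected
diff = difference_tags(target, other)
```

Because the difference is taken in one direction only, a tag that appears on the content to be corrected but not on the other content never becomes a migration candidate.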
S1027c: Calculate the sub-missing-tag probability of the difference tag under each search sentence based on the relevance between the difference tag and the content to be corrected, the second search ratio, the synchronous viewing count, and the number of searches.
After determining the difference tag, the server first calculates the relevance between the difference tag and the content to be corrected. It then uses the synchronous viewing count and the number of searches to determine the probability that the content to be corrected and the other content are viewed together by users under each search sentence. Finally, it combines the relevance, the second search ratio, and this probability to calculate the sub-missing-tag probability of the difference tag under each search sentence.
S1027d: Accumulate the sub-missing-tag probabilities corresponding to the search sentences to obtain the missing-tag probability of the difference tag, and when the missing-tag probability is greater than the probability threshold, take the difference tag as a missing tag of the content to be corrected.
The server adds up the sub-missing-tag probabilities corresponding to the search sentences; the accumulated result is the missing-tag probability of the difference tag. The server then compares the missing-tag probability with the probability threshold. When the missing-tag probability is greater than the probability threshold, the difference tag is determined to be a missing tag of the content to be corrected; when the missing-tag probability is less than or equal to the probability threshold, the difference tag is not a missing tag of the content to be corrected.
Illustratively, when the sub-missing-tag probability of the difference tag under each search sentence is denoted $p_i$ (where $i$ is the index of the search sentence), the missing-tag probability can be calculated as shown in formula (3):
$$PL = \sum_{i} p_i \tag{3}$$
where $PL$ is the missing-tag probability of the difference tag.
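The accumulation in formula (3) and the threshold comparison in S1027d can be sketched as follows. The sub-probabilities and the threshold values are made up for illustration.

```python
def missing_tag_probability(sub_probs):
    """Formula (3): sum the sub-missing-tag probabilities over all search sentences."""
    return sum(sub_probs)

def is_missing_tag(sub_probs, threshold):
    """A difference tag counts as missing only when PL is strictly greater than the threshold."""
    return missing_tag_probability(sub_probs) > threshold

sub_probs = [0.02, 0.05, 0.01]   # assumed p_i values for three search sentences
pl = missing_tag_probability(sub_probs)
accepted = is_missing_tag(sub_probs, threshold=0.05)
rejected = is_missing_tag([0.02, 0.01], threshold=0.05)
```

The comparison is strict, matching the text: a missing-tag probability equal to the threshold does not qualify.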
In the embodiment of the application, the server compares each tag of the content to be corrected with the tags of the other content to determine the difference tag. It then calculates the missing-tag probability of the difference tag from three quantities: the relevance between the difference tag and the content to be corrected; the second search ratio, that is, the share of all searches across the whole network that use each search sentence; and the probability that the content to be corrected and the other content are viewed together, computed from the synchronous viewing count and the number of searches. This determines whether the difference tag is a missing tag.
In some embodiments of the present application, calculating the sub-missing-tag probability of the difference tag under each search sentence based on the relevance between the difference tag and the content to be corrected, the second search ratio, the synchronous viewing count, and the number of searches, that is, a specific implementation process of S1027c, may include S301-S303, as follows:
S301: Compute the ratio of the synchronous viewing count to the number of searches, to obtain the synchronous viewing probability that the content to be corrected and the other content are viewed together under each search sentence.
The server takes the synchronous viewing count as the numerator and the number of searches as the denominator. The resulting ratio is the probability that the content to be corrected and the other content are viewed together under each search sentence, and is recorded as the synchronous viewing probability.
S302: Calculate the relevance between the content to be corrected and the difference tag to obtain the content-tag relevance.
The server first determines the type of the content to be corrected, for example whether it is audio, video, or an image, so as to select a suitable feature extraction method and extract the features of the content to be corrected. The server also extracts features from the difference tag to obtain tag features, and then performs interaction and relevance analysis on the tag features and the features of the content to be corrected to obtain the content-tag relevance.
It will be appreciated that the server selects the feature extraction method according to the type of the content to be corrected. For example, when the content to be corrected is a video, the server extracts features in the three dimensions of sound, image, and text simultaneously; when it is an image, the server extracts features only in the image dimension; and when it is audio, the server extracts features only in the sound dimension.
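The type-dependent choice of feature dimensions can be written as a small lookup. The type names, the modality names, and the behavior for unknown types are assumptions for this sketch; the text itself only lists the video, image, and audio cases.

```python
# Feature dimensions to extract for each content type, following the examples in the text.
MODALITIES_BY_TYPE = {
    "video": ("audio", "image", "text"),
    "image": ("image",),
    "audio": ("audio",),
}

def select_modalities(content_type):
    """Return the feature dimensions to extract for a given content type."""
    try:
        return MODALITIES_BY_TYPE[content_type]
    except KeyError:
        # Assumed behavior: unknown types are rejected rather than silently skipped.
        raise ValueError(f"unsupported content type: {content_type!r}") from None

video_modalities = select_modalities("video")
```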
S303: Take the product of the synchronous viewing probability, the content-tag relevance, and the second search ratio as the sub-missing-tag probability of the difference tag under each search sentence.
The server multiplies the calculated synchronous viewing probability, the content-tag relevance, and the second search ratio, and uses the product as the sub-missing-tag probability of the difference tag under each search sentence.
By way of example, the embodiment of the present application provides a formula for calculating the sub-missing-tag probability, see formula (4):
$$p_i = r_i \times c_i \times d \tag{4}$$
where $i$ is the index of the search sentence, $r_i$ is the second search ratio of each search sentence, $c_i$ is the synchronous viewing probability that the content to be corrected and the other content are viewed together under each search sentence, $d$ is the content-tag relevance between the difference tag and the content to be corrected, and $p_i$ is the sub-missing-tag probability of the difference tag under each search sentence.
In the embodiment of the application, the server calculates the synchronous viewing probability of the content to be corrected and the other content under each search sentence, then calculates the content-tag relevance between the content to be corrected and the difference tag, and multiplies the synchronous viewing probability, the content-tag relevance, and the second search ratio. This yields the sub-missing-tag probability of the difference tag under each search sentence, from which the missing-tag probability of the difference tag can subsequently be calculated.
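Formula (4) can be evaluated directly from the raw counts, as in the sketch below. The function and parameter names are hypothetical, the content-tag relevance is passed in as a precomputed number, and returning 0.0 for a search sentence with no searches is an assumption added to avoid division by zero.

```python
def sub_missing_probability(search_count, total_searches, sync_views, relevance):
    """Formula (4): p_i = r_i * c_i * d for a single search sentence."""
    if search_count == 0 or total_searches == 0:
        return 0.0                          # assumed convention for empty counts
    r = search_count / total_searches       # second search ratio r_i
    c = sync_views / search_count           # synchronous viewing probability c_i
    return r * c * relevance                # d is the content-tag relevance

# Assumed counts: 200 of 1000 searches use this sentence, 50 of them view both items.
p = sub_missing_probability(search_count=200, total_searches=1000,
                            sync_views=50, relevance=0.8)
```

With these numbers, r is 0.2 and c is 0.25, so p is approximately 0.04.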
In some embodiments of the present application, the content to be corrected includes a video. In this case, calculating the relevance between the content to be corrected and the difference tag to obtain the content-tag relevance, that is, a specific implementation process of S302, may include S3021 to S3024, as follows:
S3021: Extract audio features, image features, and text features from the video, and fuse the audio features, the image features, and the text features to obtain the fusion features corresponding to the video.
The server extracts audio features from the audio of the video, extracts image features from the video frames, recognizes caption keywords from the pictures of the video, and extracts text features from the captions and the title of the video. Finally, the server fuses the audio features, the image features, and the text features by weighting, by splicing, or by feeding them into a higher-dimensional feature extraction network, so as to obtain fusion features that comprehensively represent the video in the three dimensions of sound, image, and text.
Further, in some embodiments, the server may split the video into individual video frames, extract the sub-image features of each video frame from its picture by using an image feature extraction model, and then fuse the sub-image features of all video frames (for example by splicing them, or by feeding them together into a 1 × 1 convolutional layer) to obtain the image features. Similarly, the server may segment the audio of the video according to the video frames to obtain the audio segment corresponding to each video frame, extract the sub-audio features of each video frame from its audio segment by using an audio feature extraction model, and finally fuse the sub-audio features of all video frames to obtain the audio features of the video.
Illustratively, fig. 9 is a schematic diagram of generating the fusion features of a video provided by an embodiment of the present application. Referring to fig. 9, the server splits the video into video frames 9-1, inputs the video frames 9-1 into an image feature extraction model 9-2 to obtain the sub-image features of each video frame, and then fuses the sub-image features of all the video frames 9-3 to obtain the image features. The server inputs the audio segment 9-4 corresponding to each video frame 9-1 into the audio feature extraction model 9-5 to obtain the sub-audio features corresponding to each video frame, and then fuses the sub-audio features of all the video frames 9-6 to obtain the audio features. Meanwhile, the server inputs the title 9-7 of the video and the caption keywords 9-8 extracted from the pictures of the video into the text feature extraction model 9-9 to obtain the text features 9-10. Finally, the server performs multi-modal feature fusion 9-11 on the image features, the audio features, and the text features 9-10 to obtain the fusion features 9-12 of the video.
S3022: Extract text features from the tag type and the tag text of the difference tag to obtain the tag features of the difference tag.
S3023: Let the tag features of the difference tag and the fusion features interact to obtain the interaction features.
The server inputs the tag type and the tag text of the difference tag into the text feature extraction model to obtain the tag features of the difference tag, and then combines the tag features with the fusion features by splicing, weighting, or a similar method to obtain the interaction features between the difference tag and the video.
S3024: Perform relevance identification on the interaction features to obtain the content-tag relevance.
The server performs relevance identification on the interaction features to obtain the content-tag relevance between the video and the difference tag. It will be appreciated that, in some embodiments, the server may input the interaction features into a trained classifier to obtain the content-tag relevance.
Illustratively, based on fig. 9, fig. 10 is a schematic diagram of calculating the content-tag relevance provided in the embodiment of the present application. After determining the fusion features, the server inputs the tag text 10-1 and the tag type 10-2 of the difference tag into a text feature extraction model 10-3 to obtain the tag features 10-4 of the difference tag. It then splices the fusion features obtained through the process of fig. 9 with the tag features 10-4, which realizes the interaction 10-5 between the tag features and the fusion features of the video and yields the interaction features. Finally, it performs identification on the interaction features to obtain the final content-tag relevance 10-6.
In the embodiment of the application, when the content to be corrected includes a video, the server extracts features of the video in the three dimensions of image, audio, and text, so as to obtain fusion features that fully express the video content. It also extracts tag features from the difference tag, lets the tag features and the fusion features interact, and performs relevance identification on the resulting interaction features to obtain the content-tag relevance.
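The data flow of S3021 to S3023 can be sketched with plain lists. Two choices here are assumptions made for the example rather than the method of the application: per-frame features are fused by mean pooling (the text mentions splicing or a 1 × 1 convolutional layer), and the toy feature values stand in for the outputs of the image, audio, and text feature extraction models.

```python
def mean_pool(per_frame_features):
    """Fuse per-frame feature vectors by averaging each dimension (assumed fusion choice)."""
    n = len(per_frame_features)
    return [sum(column) / n for column in zip(*per_frame_features)]

def fuse_modalities(image_feat, audio_feat, text_feat):
    """Multi-modal fusion by splicing (concatenation)."""
    return list(image_feat) + list(audio_feat) + list(text_feat)

def interact(fused_feat, tag_feat):
    """Interaction of video fusion features and tag features by splicing."""
    return list(fused_feat) + list(tag_feat)

# Toy stand-ins for model outputs on a two-frame video.
sub_image = [[1.0, 3.0], [3.0, 5.0]]   # sub-image features per frame
sub_audio = [[0.0, 2.0], [2.0, 2.0]]   # sub-audio features per frame
text_feat = [0.5]                      # title and caption keyword features
tag_feat = [0.1, 0.9]                  # features of "tag type [SEP] tag text"

fused = fuse_modalities(mean_pool(sub_image), mean_pool(sub_audio), text_feat)
interaction = interact(fused, tag_feat)
```

The interaction vector would then be passed to a trained classifier to produce the content-tag relevance; that classifier is outside the scope of this sketch.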
In some embodiments of the present application, the server selects only some multimedia content from the multimedia content library as the content to be corrected, rather than taking all multimedia content in the library, so as to reduce the amount of content that needs tag correction. In this case, obtaining the content to be corrected, that is, a specific implementation process of S101, may include S1011, as follows:
S1011: Obtain, from the multimedia content library, the multimedia content whose number of exposures is greater than a preset threshold and whose number of clicks is smaller than a preset threshold, to obtain the content to be corrected.
The server obtains the number of exposures and the number of clicks of each piece of multimedia content in the multimedia content library, which shows how often each piece of content is surfaced by user searches and how often it is viewed. The server compares the number of exposures of each piece of multimedia content with a preset threshold and first selects the content whose number of exposures is greater than the threshold, namely the content that is displayed frequently in user searches. The server then compares the number of clicks of each piece of multimedia content with a preset threshold and selects the content whose number of clicks is smaller than the threshold, namely the content that users rarely view. Finally, the server takes the intersection of the frequently displayed content and the rarely viewed content. This is the multimedia content whose tags need correction, and the server takes it as the content to be corrected.
In the embodiment of the application, the server can analyze the number of exposures and the number of clicks of each piece of multimedia content together to select the content that needs tag correction, which reduces the amount of content to be corrected and speeds up tag correction.
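The screening in S1011 amounts to a filter over per-content statistics. The sketch below assumes that the exposure threshold and the click threshold are two separate preset values, and that the statistics are available as a mapping from a content identifier to an (exposures, clicks) pair; both are assumptions for illustration.

```python
def select_content_to_correct(stats, exposure_threshold, click_threshold):
    """Keep content that is shown often in searches but rarely clicked."""
    return [
        content_id
        for content_id, (exposures, clicks) in stats.items()
        if exposures > exposure_threshold and clicks < click_threshold
    ]

# Hypothetical statistics: content id -> (number of exposures, number of clicks).
stats = {
    "v1": (1000, 5),     # shown often, rarely clicked: candidate
    "v2": (1000, 400),   # shown often, clicked often: tags probably fine
    "v3": (20, 1),       # rarely shown: too little evidence
}
candidates = select_content_to_correct(stats, exposure_threshold=500, click_threshold=50)
```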
In some embodiments of the present application, the server may apply different correction processing to the tag set according to the type of the tag to be corrected. Correcting the tag set of the content to be corrected according to the tag to be corrected to obtain a corrected tag set, that is, a specific implementation process of S103, may include S1031 or S1032, as follows:
S1031: When the tag to be corrected is a missing tag of the content to be corrected, add the tag to be corrected to the tag set of the content to be corrected to obtain the corrected tag set.
When the tag to be corrected is a missing tag, the server adds it to the tag set, so that the corrected tag set is complete and has no omissions.
S1032: When the tag to be corrected is a mislabeled tag of the content to be corrected, remove the tag to be corrected from the tag set of the content to be corrected to obtain the corrected tag set.
When the tag to be corrected is a mislabeled tag, the server removes it from the tag set, so that the corrected tag set accurately reflects the content to be corrected.
In the embodiment of the application, the server decides whether to remove the tag to be corrected from the tag set or add it to the tag set according to whether the tag to be corrected is a mislabeled tag or a missing tag. This yields a more accurate corrected tag set.
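The two branches S1031 and S1032 can be sketched as a single function. The string values "missing" and "mislabeled" used to distinguish the two kinds of tag are hypothetical names chosen for this example.

```python
def correct_tag_set(tag_set, tag_to_correct, kind):
    """Return a corrected copy of the tag set; the input set is left unchanged."""
    corrected = set(tag_set)
    if kind == "missing":
        corrected.add(tag_to_correct)        # S1031: add the missing tag
    elif kind == "mislabeled":
        corrected.discard(tag_to_correct)    # S1032: remove the mislabeled tag
    else:
        raise ValueError(f"unknown tag kind: {kind!r}")
    return corrected

original = {"Beijing", "museum"}
with_added = correct_tag_set(original, "the old palace", "missing")
with_removed = correct_tag_set({"Beijing", "Shanghai"}, "Shanghai", "mislabeled")
```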
In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described.
The embodiments of the present application are implemented in a scenario where tag correction is performed automatically on videos (the content to be corrected) in a video platform (the multimedia content library) based on user behavior, so as to determine a more accurate tag sequence (tag set) for each video.
Fig. 11 is a schematic diagram of a process of performing tag correction on a video according to an embodiment of the present application. Referring to fig. 11, a server (an implementation of the tag correction device) corrects video tags (the tag set of the content to be corrected) based on the behavior 11-1 of users searching for videos and clicking to play them. The correction can be divided into two sub-processes: correcting mislabeled video tags 11-11 based on user search and viewing behavior, and correcting missing video tags 11-12 based on user search and viewing behavior. Both sub-processes require a relevance model 11-2 between the sentences searched by users (search sentences) and the video tags (each tag in the tag set). After the video tag correction is finished, the server may distribute 11-3 the video based on the corrected video tags (the corrected tag set).
After the tag correction process starts, the server acquires user search behavior data (the historical behavior data) in order to correct the video tags based on it.
Specifically, the server obtains the user search behavior log of the video platform and summarizes and sorts the data. Fig. 12 is a schematic diagram of the data obtained by sorting the user search behavior data according to the embodiment of the present application. The sorted data includes the sentences used by users to search for videos (sentence_1 to sentence_q in fig. 12), the searched videos 1 to v, the tag sequence of each of videos 1 to v (tag 1, tag 2, ..., tag ti, where i denotes the number of the video), the search ratio of each sentence (i.e., the search ratio of sentence_i), and the user click-to-play probability of each video (i.e., the user click-to-play probability of video i).
The search ratio of each sentence = the number of times platform users searched with that sentence (the number of searches corresponding to each search sentence) / the number of searches by all users (the total number of search behaviors).
The user click-to-play probability of each video = the number of times the video is clicked and played by users under the sentence (the number of views of the content to be corrected under each search sentence) / the number of searches of the sentence (the number of searches corresponding to each search sentence).
When correcting mislabeled video tags based on user search behavior, suppose a user's search sentence retrieves a certain video. If the video in the search results is not clicked by the user even though one of its tags is related to the search sentence, that tag is probably not very relevant to the video; otherwise, the user would have been more likely to click the video. The server can therefore determine whether a tag of a video that is not clicked under the user's search sentences is a tag to be deleted.
The server can identify mislabeled tags for each video v (the content to be corrected) in the platform video library whose number of search exposures is greater than a certain threshold (the number of exposures is greater than a preset threshold) and whose number of user clicks is lower than a certain threshold (the number of clicks is smaller than a preset threshold). For each labeled tag of video v, the mislabel probability is calculated over the search sentences (the at least one sentence to be analyzed) under which the click probability (viewing probability) of the video is less than a certain threshold:
The mislabel probability $PE_q$ of tag x under sentence q = the relevance between tag x and the user's search sentence q (the sentence-tag relevance) × the search ratio of sentence q (the first search ratio). All the user search sentences considered here are sentences under which the click probability is less than a certain threshold. The relevance between tag x and the sentence searched by the user may be calculated using the process illustrated in fig. 7.
The overall mislabel probability of tag x of video v is $PE = \sum_{q} PE_q$ (the sub-mislabel probabilities corresponding to the sentences to be analyzed are accumulated to obtain the mislabel probability of each tag).
When the overall mislabel probability PE of tag x of video v is greater than a certain threshold, tag x can be removed from the tag sequence of video v (the tag to be corrected is removed from the tag set), which improves the accuracy of the tag sequence. Alternatively, the platform operations staff can be notified to confirm manually, which improves the efficiency of manually finding mislabeled video tags.
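The mislabel computation described above can be sketched as follows. The record layout, the field names, and all numbers are assumptions for illustration; the sentence-tag relevance is treated as a precomputed value, for example the output of the process shown in fig. 7.

```python
def mislabel_probability(query_stats, total_searches, click_prob_threshold):
    """Sum PE_q over the search sentences under which the video is rarely clicked."""
    pe = 0.0
    for q in query_stats:
        click_prob = q["clicks"] / q["searches"]
        if click_prob >= click_prob_threshold:
            continue  # only low-click sentences are sentences to be analyzed
        search_ratio = q["searches"] / total_searches   # first search ratio
        pe += q["relevance"] * search_ratio             # PE_q for this sentence
    return pe

# Hypothetical per-sentence statistics for one tag x of one video v.
query_stats = [
    {"searches": 100, "clicks": 2, "relevance": 0.9},    # low click probability
    {"searches": 200, "clicks": 100, "relevance": 0.8},  # clicked often: skipped
    {"searches": 50, "clicks": 1, "relevance": 0.6},     # low click probability
]
pe = mislabel_probability(query_stats, total_searches=1000, click_prob_threshold=0.1)
remove_tag = pe > 0.1   # assumed decision threshold
```

Here the first and third sentences contribute 0.09 and 0.03 respectively, so PE is approximately 0.12 and the tag would be flagged for removal or manual confirmation.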
When correcting missing video tags based on user search behavior, suppose video A and video B are both exposed under a user search sentence q and both clicked by the user. Then the tags of video A can be tentatively migrated to video B, and the tags of video B can likewise be tentatively migrated to video A. The following takes the migration of a tag of video A (the other content) to video B (the content to be corrected) as an example:
For a tag x (the difference tag) that is labeled on video A but not on video B, and for a search sentence q (each of the plurality of search sentences of the content to be corrected), the probability $PL_q$ that tag x is a missing tag of video B = the search ratio of search sentence q (the second search ratio) × the probability that video A and video B are both clicked under search sentence q (the synchronous viewing probability) × the relevance between tag x and video B (the content-tag relevance).
The probability that video A and video B are both clicked under search sentence q = the number of times video A and video B are both clicked under search sentence q (the synchronous viewing count) / the number of searches of search sentence q. The relevance between tag x and video B can be calculated by the process shown in fig. 10. In some cases, it may also be required that the search ratio of search sentence q, the probability that video A and video B are both clicked under search sentence q, and the relevance between tag x and video B each satisfy a certain threshold, so that the calculation of $PL_q$ is more accurate.
The probability that tag x is a missing tag of video B is $PL = \sum_{q} PL_q$ (the sub-missing-tag probabilities corresponding to the search sentences are accumulated to obtain the missing-tag probability of the difference tag). When PL satisfies a certain threshold, tag x can be added to the tag list of video B (that is, tag x is a missing tag), which improves the accuracy of the tag list of video B. Alternatively, the platform operations staff can be notified to confirm manually, which improves the efficiency of manually finding missing video tags.
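The optional per-factor thresholds mentioned above can be added as a gate before accumulation. In this sketch the parameter names, the default of no gating, and the sample counts are all assumptions; each entry of per_query is a pair of the number of searches of a sentence and the number of those searches in which both videos were clicked.

```python
def missing_probability(per_query, total_searches, relevance,
                        min_ratio=0.0, min_sync_prob=0.0, min_relevance=0.0):
    """Accumulate PL_q over search sentences, skipping factors below their minimums."""
    if relevance < min_relevance:
        return 0.0  # the tag is not relevant enough to video B to be migrated
    pl = 0.0
    for searches, both_clicked in per_query:
        ratio = searches / total_searches       # search ratio of sentence q
        sync_prob = both_clicked / searches     # A and B both clicked under q
        if ratio < min_ratio or sync_prob < min_sync_prob:
            continue
        pl += ratio * sync_prob * relevance     # PL_q
    return pl

per_query = [(200, 50), (100, 1)]   # assumed (searches, joint clicks) per sentence
pl_all = missing_probability(per_query, total_searches=1000, relevance=0.8)
pl_gated = missing_probability(per_query, total_searches=1000, relevance=0.8,
                               min_sync_prob=0.05)
```

With gating, the second sentence, where the two videos are clicked together in only 1% of searches, no longer contributes, so weak co-click evidence does not accumulate into the final probability.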
Through the steps, the server removes the mistaken label of the video based on the user behavior and supplements the missed label, so that the coverage and the accuracy of the video label sequence are improved, and the video can be searched, recommended and distributed based on each label in the corrected label sequence.
Through the mode, the server can identify the false mark label and the missing mark label of the video based on the historical behaviors of the user, effectively correct the label sequence of the video, improve the accuracy of the video label sequence in reflecting the video content, improve the usability of the label sequence and reduce the cost of manually finding and correcting the label.
The following continues with an exemplary structure in which the tag correction device 455 provided by the embodiments of the present application is implemented as software modules. In some embodiments, as shown in Fig. 2, the software modules of the tag correction device 455 stored in the memory 440 may include:
a data obtaining module 4551, configured to obtain content to be corrected, a tag set including each tag of the content to be corrected, and historical behavior data corresponding to the content to be corrected; the historical behavior data represent data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction;
a tag determining module 4552, configured to analyze each tag based on the historical behavior data, and determine a tag to be corrected for the content to be corrected; wherein the label to be corrected is a missing label or a false label of the content to be corrected;
and the tag correction module 4553 is configured to correct the tag set of the content to be corrected according to the tag to be corrected, so as to obtain a corrected tag set.
In some embodiments of the present application, the tag determining module 4552 is further configured to: extract, from the historical behavior data, a plurality of search sentences by which the content to be corrected was found, the number of searches corresponding to each search sentence, the number of views of the content to be corrected under each search sentence, and the total number of search behaviors, where the total number of search behaviors represents the total number of searches that occurred within the historical time period; perform mislabel analysis on each label by combining the plurality of search sentences, the numbers of searches, the numbers of views, and the total number of search behaviors, and screen out the mislabeled labels of the content to be corrected from the labels; and determine the mislabeled labels of the content to be corrected as the labels to be corrected.
In some embodiments of the present application, the tag determining module 4552 is further configured to: compute the ratio of the number of views to the number of searches for each search sentence, to obtain the viewing probability that the content to be corrected is viewed under that search sentence; screen out at least one sentence to be analyzed from the plurality of search sentences according to the viewing probability corresponding to each search sentence, where each sentence to be analyzed is a search sentence whose viewing probability is lower than a preset threshold; select, from the numbers of searches, the number of searches to be analyzed corresponding to each sentence to be analyzed; perform mislabel analysis on each label based on the number of searches to be analyzed, the total number of search behaviors, and the relevance of each label to each sentence to be analyzed, to obtain the mislabel probability of each label; and select, from the labels, those whose mislabel probability is greater than a probability threshold as the mislabeled labels of the content to be corrected.
In some embodiments of the application, the tag determining module 4552 is further configured to perform a ratio operation on the number of searches to be analyzed and the total number of search behaviors, so as to obtain a first search proportion corresponding to each statement to be analyzed; performing relevance calculation on each label in each label and each statement to be analyzed one by one to obtain statement label relevance of each label and each statement to be analyzed; multiplying the statement label correlation degree and the first search proportion to obtain the sub-error marking probability of each label under each statement to be analyzed; and accumulating the sub-false mark probabilities corresponding to the statements to be analyzed to obtain the false mark probability of each label.
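The mislabel analysis of the two preceding paragraphs can be illustrated with a short Python sketch. It is a sketch under stated assumptions: the function name `mislabel_probabilities`, the `(searches, views)` tuple layout, and the default threshold value are hypothetical, the viewing probability is assumed to be views divided by searches, and `relevance` is a caller-supplied function standing in for the statement-label relevance model, which is not reproduced here.

```python
def mislabel_probabilities(labels, queries, total_searches, relevance, view_threshold=0.1):
    """Return a mapping label -> mislabel probability.

    queries maps each search sentence to (searches, views) for the content to be corrected.
    relevance(label, sentence) returns a statement-label relevance in [0, 1].
    """
    # Keep only sentences under which the content was rarely viewed.
    to_analyze = {
        sentence: searches
        for sentence, (searches, views) in queries.items()
        if searches > 0 and views / searches < view_threshold
    }
    result = {}
    for label in labels:
        # Sub-mislabel probability per sentence = relevance * first search proportion.
        result[label] = sum(
            relevance(label, sentence) * (searches / total_searches)
            for sentence, searches in to_analyze.items()
        )
    return result
```

The intuition is that a label closely related to a frequently searched sentence, under which the content is nevertheless rarely viewed, is a likely mislabel.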
In some embodiments of the present application, the tag determining module 4552 is further configured to perform text feature extraction on the tag text and the tag type of each tag, so as to obtain a tag feature of each tag; extracting text characteristics of each sentence to be analyzed to obtain sentence characteristics of each sentence to be analyzed; interacting the sentence characteristics of each sentence to be analyzed and the label characteristics of each label to obtain interaction characteristics; and performing relevance identification on the interactive features to obtain relevance of each label and the statement label of each statement to be analyzed.
In some embodiments of the present application, the tag determining module 4552 is further configured to obtain, from the historical behavior data, a plurality of search sentences corresponding to the content to be modified, search times corresponding to each search sentence in the plurality of search sentences, and a total number of search behaviors; acquiring tags of other contents except the contents to be corrected, which are searched by utilizing each search statement; counting the synchronous viewing times of the content to be corrected and the other contents viewed under each search statement from the historical behavior data; analyzing the missing marks of the labels according to the labels of the other contents, the synchronous viewing times, the searching times corresponding to each searching statement and the total searching behavior times, and determining the missing mark labels for the contents to be corrected; and determining the label of missing marks of the content to be corrected as the label to be corrected.
In some embodiments of the application, the tag determining module 4552 is further configured to perform a ratio operation on the number of searches corresponding to each search statement and the total number of search behaviors to obtain a second search proportion corresponding to each search statement; comparing each label with the labels of the other contents to obtain a difference label; the difference label is a label marked by the other content and unmarked by the content to be corrected; calculating the sub-missing label probability of the difference label under each search statement based on the correlation degree of the difference label and the content to be corrected, the second search proportion, the synchronous viewing times and the search times; and accumulating the sub-missing mark probabilities corresponding to each search statement to obtain the missing mark probability of the difference label, and taking the difference label as the missing mark label of the content to be corrected when the missing mark probability is greater than a probability threshold.
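The comparison and thresholding steps in the paragraph above amount to a set difference followed by a filter. The following Python sketch shows one way this could look; the function name `missing_labels` and the dictionary of precomputed missing-label probabilities are hypothetical conveniences, not part of the application.

```python
def missing_labels(target_labels, other_labels, missing_prob, threshold):
    """Return the difference labels whose missing-label probability exceeds the threshold.

    target_labels: labels of the content to be corrected.
    other_labels:  labels of other content found by the same search sentences.
    missing_prob:  mapping label -> accumulated missing-label probability.
    """
    # Difference labels: marked on the other content, not on the content to be corrected.
    difference = set(other_labels) - set(target_labels)
    return {label for label in difference if missing_prob.get(label, 0.0) > threshold}
```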
In some embodiments of the present application, the tag determining module 4552 is further configured to compare the number of synchronous viewing times with the number of search times, and obtain a synchronous viewing probability that the content to be modified and the other content are simultaneously viewed under each search statement; calculating the correlation degree of the content to be corrected and the difference label to obtain the correlation degree of the content label; and taking the product of the synchronous viewing probability, the content label correlation degree and the second search proportion as the sub-missing label probability of the difference label under each search statement.
In some embodiments of the present application, the content to be modified comprises a video; the tag determination module 4552 is further configured to extract audio features, image features, and text features from the video, and fuse the audio features, the image features, and the text features to obtain fusion features corresponding to the video; extracting text features of the label types and the label texts of the difference labels to obtain the label features of the difference labels; interacting the label features of the difference labels with the fusion features to obtain interaction features; and performing relevance identification on the interactive features to obtain the relevance of the content label.
In some embodiments of the application, the data obtaining module 4551 is further configured to obtain, from a multimedia content library, a multimedia content whose exposure number is greater than a preset threshold and whose click number is less than a preset threshold, so as to obtain the content to be corrected.
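As an illustration of this selection rule, the sketch below filters a content library for items that are exposed often but clicked rarely. The record fields (`id`, `exposures`, `clicks`), the function name `select_candidates`, and the use of two separate threshold parameters are assumptions made for the example.

```python
def select_candidates(library, min_exposures, max_clicks):
    """Return ids of items with exposures > min_exposures and clicks < max_clicks."""
    return [
        item["id"]
        for item in library
        if item["exposures"] > min_exposures and item["clicks"] < max_clicks
    ]
```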
In some embodiments of the application, the tag correction module 4553 is further configured to, when the tag to be corrected is a missing tag of the content to be corrected, add the tag to be corrected to the tag set of the content to be corrected, to obtain the corrected tag set; or, when the to-be-corrected label is the error marking label of the to-be-corrected content, removing the to-be-corrected label from the label set of the to-be-corrected content to obtain the corrected label set.
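The two correction branches can be sketched as simple set operations. The function name `correct_label_set` and its parameters are hypothetical; the sketch only shows that missing labels are added and mislabeled labels are removed.

```python
def correct_label_set(labels, missing=(), mislabeled=()):
    """Return a corrected copy of the label set; the input is not modified."""
    corrected = set(labels)
    corrected |= set(missing)      # add labels identified as missing
    corrected -= set(mislabeled)   # remove labels identified as mislabeled
    return corrected
```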
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the label correction device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the label correction device executes the label correction method described above in the embodiments of the present application.
Embodiments of the present application provide a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to execute a tag correction method provided by embodiments of the present application, for example, the method shown in Fig. 3.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, the executable tag revision instruction may be written in any form of programming language (including compiled or interpreted languages), in the form of a program, software module, script, or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, the executable tag revision instructions may, but need not, correspond to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, the executable tag revision instructions may be deployed for execution on one tag revision device, or on multiple tag revision devices located at one site, or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (14)

1. A label correction method, comprising:
acquiring content to be corrected, a label set comprising each label of the content to be corrected and historical behavior data corresponding to the content to be corrected;
the historical behavior data represents data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction;
analyzing each label based on the historical behavior data, and determining a label to be corrected for the content to be corrected; the label to be corrected is a missing label or a false label of the content to be corrected;
and correcting the label set of the content to be corrected according to the label to be corrected to obtain a corrected label set.
2. The method according to claim 1, wherein the analyzing the tags based on the historical behavior data to determine tags to be modified for the content to be modified comprises:
extracting a plurality of search sentences used for searching out the content to be corrected, the search times corresponding to each search sentence, the viewing times of the content to be corrected under each search sentence and the total search behavior times from the historical behavior data;
wherein the total number of search behaviors represents the total number of searches that occurred within the historical time period;
combining the plurality of search sentences, the search times, the viewing times and the total search behavior times, performing error marking analysis on each label, and screening out error marking labels of the content to be corrected from each label;
and determining the error marking label of the content to be corrected as the label to be corrected.
3. The method according to claim 2, wherein the combining the plurality of search sentences, the number of searches, the number of views, and the total number of search actions to perform mis-tag analysis on the tags and screen out mis-tag tags of the content to be corrected from the tags includes:
performing ratio operation on the search times and the viewing times to obtain viewing probabilities of the content to be corrected viewed under each search statement respectively;
screening at least one sentence to be analyzed from the plurality of search sentences according to the viewing probability corresponding to each search sentence; the at least one sentence to be analyzed is a search sentence with the viewing probability lower than a preset threshold value;
screening out the number of times of to-be-analyzed search corresponding to each to-be-analyzed statement in the at least one to-be-analyzed statement from the plurality of search times;
analyzing the mislabel of each label based on the number of times of searching to be analyzed, the total number of times of searching behaviors and the correlation degree of each label and each statement to be analyzed to obtain the mislabel probability of each label;
and screening the labels with the mislabeling probability larger than a probability threshold value from each label to obtain the mislabeling labels of the content to be corrected.
4. The method of claim 3, wherein the analyzing each tag for mislabeling based on the number of searches to be analyzed, the total number of search actions, and the degree of correlation between each tag and each sentence to be analyzed to obtain the mislabeling probability of each tag comprises:
performing ratio operation on the search times to be analyzed and the total search behavior times to obtain a first search ratio corresponding to each statement to be analyzed;
performing relevance calculation on each label in the labels and each statement to be analyzed one by one to obtain statement label relevance of each label and each statement to be analyzed;
multiplying the statement label correlation degree and the first search proportion to obtain the sub-error marking probability of each label under each statement to be analyzed;
and accumulating the sub-false mark probabilities corresponding to the statements to be analyzed to obtain the false mark probability of each label.
5. The method according to claim 4, wherein the calculating the relevance of each label in the labels with each sentence to be analyzed one by one to obtain the relevance of each label with the sentence label of each sentence to be analyzed comprises:
extracting text characteristics of the label text and the label type of each label to obtain the label characteristics of each label;
extracting text features of each sentence to be analyzed to obtain the sentence features of each sentence to be analyzed;
interacting the sentence characteristics of each sentence to be analyzed with the label characteristics of each label to obtain interaction characteristics;
and performing relevance identification on the interactive features to obtain relevance of each label and the statement label of each statement to be analyzed.
6. The method according to claim 1, wherein the analyzing the respective tags based on the historical behavior data to determine a tag to be modified for the content to be modified comprises:
obtaining a plurality of search sentences corresponding to the content to be corrected, the number of search times corresponding to each search sentence in the plurality of search sentences and the total number of search behaviors from the historical behavior data;
acquiring tags of other contents except the contents to be corrected searched by utilizing each search statement;
counting the synchronous viewing times of the content to be corrected and the other contents viewed under each search statement from the historical behavior data;
analyzing the missed marks of the labels according to the labels of the other contents, the synchronous viewing times, the searching times corresponding to each searching statement and the total searching behavior times, and determining the missed mark labels for the contents to be corrected;
and determining the label of the missing mark of the content to be corrected as the label to be corrected.
7. The method according to claim 6, wherein the analyzing the missing marks of the tags according to the tags of the other contents, the synchronous viewing times, the search times corresponding to each search statement, and the total number of search actions to determine a missing mark tag for the content to be corrected comprises:
performing ratio operation on the search times corresponding to each search statement and the total search behavior times to obtain a second search ratio corresponding to each search statement;
comparing each label with the labels of the other contents to obtain a difference label; the difference label is a label marked by the other content and unmarked by the content to be corrected;
calculating the sub-missing label probability of the difference label under each search statement based on the correlation degree of the difference label and the content to be corrected, the second search proportion, the synchronous viewing times and the search times;
and accumulating the sub-missing mark probabilities corresponding to each search statement to obtain the missing mark probability of the difference label, and taking the difference label as the missing mark label of the content to be corrected when the missing mark probability is greater than a probability threshold.
8. The method according to claim 7, wherein the calculating the sub-missing tag probability of the difference tag under each search statement based on the correlation degree of the difference tag and the content to be corrected, the second search proportion, the synchronous viewing times and the search times comprises:
comparing the synchronous viewing times with the searching times to obtain synchronous viewing probability that the content to be corrected and the other content are simultaneously viewed under each searching statement;
calculating the correlation degree of the content to be corrected and the difference label to obtain the correlation degree of the content label;
and taking the product of the synchronous viewing probability, the content label correlation degree and the second search proportion as the sub-missing label probability of the difference label under each search statement.
9. The method according to claim 8, wherein the content to be modified comprises video; the calculating the correlation degree of the content to be corrected and the difference label to obtain the correlation degree of the content label comprises the following steps:
respectively extracting audio features, image features and text features from the video, and fusing the audio features, the image features and the text features to obtain fused features corresponding to the video;
extracting text features of the label types and the label texts of the difference labels to obtain the label features of the difference labels;
interacting the label features of the difference labels with the fusion features to obtain interaction features;
and performing relevance identification on the interactive features to obtain the relevance of the content label.
10. The method according to any one of claims 2 to 9, wherein the obtaining of the content to be modified includes:
and acquiring the multimedia content with the exposure times larger than a preset threshold and the click times smaller than the preset threshold from a multimedia content library to obtain the content to be corrected.
11. The method according to any one of claims 1 to 9, wherein the modifying the labelset of the content to be modified according to the label to be modified to obtain a modified labelset includes:
when the label to be corrected is a missing label of the content to be corrected, adding the label to be corrected to the label set of the content to be corrected to obtain the corrected label set; or,
and when the label to be corrected is the error marking label of the content to be corrected, removing the label to be corrected from the label set of the content to be corrected to obtain the corrected label set.
12. A label correction device, comprising:
the data acquisition module is used for acquiring content to be corrected, a label set comprising each label of the content to be corrected and historical behavior data corresponding to the content to be corrected; the historical behavior data represent data generated when the content to be corrected is searched in a historical time period, and the content to be corrected is multimedia content waiting for label correction;
the label determining module is used for analyzing each label based on the historical behavior data and determining a label to be corrected for the content to be corrected; wherein the label to be corrected is a missing label or a false label of the content to be corrected;
and the label correction module is used for correcting the label set of the content to be corrected according to the label to be corrected to obtain a corrected label set.
13. A label correction apparatus, characterized by comprising:
a memory for storing executable tag revision instructions;
a processor for implementing the method of any one of claims 1 to 11 when executing executable tag revision instructions stored in the memory.
14. A computer-readable storage medium having stored thereon executable tag modification instructions for, when executed by a processor, implementing the method of any one of claims 1 to 11.
CN202110287224.5A 2021-03-17 2021-03-17 Label correction method, device, equipment and computer readable storage medium Pending CN115114459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110287224.5A CN115114459A (en) 2021-03-17 2021-03-17 Label correction method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110287224.5A CN115114459A (en) 2021-03-17 2021-03-17 Label correction method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115114459A true CN115114459A (en) 2022-09-27

Family

ID=83323383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110287224.5A Pending CN115114459A (en) 2021-03-17 2021-03-17 Label correction method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115114459A (en)

Similar Documents

Publication Publication Date Title
CN111444428B (en) Information recommendation method and device based on artificial intelligence, electronic equipment and storage medium
CN111143610B (en) Content recommendation method and device, electronic equipment and storage medium
CN110781347B (en) Video processing method, device and equipment and readable storage medium
CN111259215B (en) Multi-mode-based topic classification method, device, equipment and storage medium
CN112015949B (en) Video generation method and device, storage medium and electronic equipment
WO2018177139A1 (en) Method and apparatus for generating video abstract, server and storage medium
CN112100438A (en) Label extraction method and device and computer readable storage medium
CN110287375B (en) Method and device for determining video tag and server
CN113705299A (en) Video identification method and device and storage medium
CN112015928A (en) Information extraction method and device of multimedia resource, electronic equipment and storage medium
CN114419515A (en) Video processing method, machine learning model training method, related device and equipment
CN114845149B (en) Video clip method, video recommendation method, device, equipment and medium
CN113989476A (en) Object identification method and electronic equipment
CN113704420A (en) Method and device for identifying role in text, electronic equipment and storage medium
TWI725375B (en) Data search method and data search system thereof
CN116977701A (en) Video classification model training method, video classification method and device
CN112256917B (en) User interest identification method, device, equipment and computer readable storage medium
CN115640790A (en) Information processing method and device and electronic equipment
CN115114459A (en) Label correction method, device, equipment and computer readable storage medium
CN114662002A (en) Object recommendation method, medium, device and computing equipment
CN111507065A (en) Reading information processing method and device and storage medium
US11907705B1 (en) Systems and methods for generating dynamically updated metadata using real-time artificial intelligence models
CN113672820B (en) Training method of feature extraction network, information recommendation method, device and equipment
CN116483946A (en) Data processing method, device, equipment and computer program product
CN114765702A (en) Video processing method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination