CN117528152A - Method and device for determining sensitive video clips and electronic equipment


Info

Publication number
CN117528152A
CN117528152A (application CN202311454738.0A)
Authority
CN
China
Prior art keywords
sensitive
video
attribute value
comments
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311454738.0A
Other languages
Chinese (zh)
Inventor
向贵军
黄真
于嘉慧
唐子腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202311454738.0A priority Critical patent/CN117528152A/en
Priority to CN202410158244.6A priority patent/CN118018813A/en
Publication of CN117528152A publication Critical patent/CN117528152A/en
Pending legal-status Critical Current


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content by decomposing the content in the time domain, e.g. in time segments
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454 Content or additional data filtering, e.g. blocking advertisements
    • H04N21/4545 Input to filtering algorithms, e.g. filtering a region of the image
    • H04N21/45457 Input to filtering algorithms applied to a time segment

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a method and device for determining sensitive video clips, and an electronic device, relating to the technical field of video risk control. The method includes: for each video segment contained in the target video to be detected, determining, from the comments posted for the video segment, the comments carrying sensitive information as the sensitive comments of the video segment; determining, based on the sensitive comments of the video segment, a sensitive attribute value characterizing the sensitivity degree of the video segment as the attribute value to be utilized of the video segment; and determining sensitive video segments from the target video based on the target sensitive attribute value of each video segment, where the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment. This scheme reduces the resources occupied by video detection and improves detection efficiency.

Description

Method and device for determining sensitive video clips and electronic equipment
Technical Field
The present invention relates to the field of video risk control technologies, and in particular, to a method and an apparatus for determining sensitive video clips, and an electronic device.
Background
When watching a video, a user can post comments on the video content at the current playing progress. Because user comments may carry negative sensitive information, and their number keeps growing over time, a video provider must continuously detect the posted comments so as to delete sensitive comments and reduce comment risk. However, the number of videos a provider must detect is huge, detection occupies a large amount of human and computing resources, and efficiency is low.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for determining a sensitive video clip and electronic equipment, so as to improve the efficiency of determining the sensitive video clip. The specific technical scheme is as follows:
in a first aspect of the present invention, there is provided a method for determining a sensitive video clip, the method comprising:
for each video segment contained in the target video to be detected, determining comments with sensitive information from comments issued for the video segment, and taking the comments as sensitive comments of the video segment;
based on the sensitive comment of the video clip, determining a sensitive attribute value for representing the sensitivity degree of the video clip as an attribute value to be utilized of the video clip;
determining sensitive video segments from the target video based on the target sensitive attribute value of each video segment; the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
Optionally, the determining, based on the sensitive comment of the video clip, a sensitive attribute value for characterizing the sensitivity degree of the video clip as the attribute value to be utilized of the video clip includes:
determining a first sensitive comment which does not contain a sensitive word and has sensitive semantics and a second sensitive comment which contains the sensitive word from the sensitive comments of the video clip;
determining the number of sensitive comments belonging to each sensitive category in the first sensitive comments, and calculating the ratio of the number to the total number of comments published for the video clip as a first ratio corresponding to each sensitive category;
determining the number of sensitive words belonging to each sensitivity degree in the second sensitive comment, and calculating the ratio of the number to the total number of all sensitive words contained in all second sensitive comments as a second ratio corresponding to each sensitivity degree;
for each sensitive category, calculating a weighted sum of first ratios corresponding to the sensitive category according to preset weights of the sensitive category to obtain a category attribute value;
For each sensitivity degree, calculating a weighted sum of a second ratio corresponding to the sensitivity degree according to a preset weight of the sensitivity degree to obtain a degree attribute value;
and calculating the sum value of the obtained category attribute value and the degree attribute value to obtain a sensitivity attribute value for representing the sensitivity degree of the video clip.
Optionally, the determining the sensitive video segment from the target video based on the target sensitive attribute value of each video segment includes:
performing relation fitting on the playing progress of the target video and the sensitive attribute value based on the target sensitive attribute value of each video segment to obtain a mapping relation between the playing progress of the target video and the sensitive attribute value;
and determining the range of the playing progress of which the sensitive attribute value is larger than a first sensitive threshold value in the target video according to the mapping relation to obtain the sensitive video fragment in the target video.
Optionally, the method further comprises:
acquiring, as reference comments, the comments that were determined to contain sensitive information when the determined sensitive video clip was detected at a historical moment;
calculating the text similarity between the comment to be detected in the determined sensitive video fragment and the reference comment; the comments to be detected are comments except the determined sensitive comments in the video clip;
and determining, among the comments to be detected, the comments whose text similarity to a reference comment is greater than a preset similarity threshold as comments containing sensitive information.
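As an illustrative sketch of this optional step (function names and the 0.8 default threshold are assumptions, and character-level `SequenceMatcher` merely stands in for whichever text-similarity measure an embodiment actually uses), the reuse of previously deleted comments as references might look like:

```python
from difflib import SequenceMatcher

def find_suspect_comments(candidates, reference_comments, threshold=0.8):
    """Return the comments to be detected whose text similarity to any
    reference comment (one previously determined to contain sensitive
    information) exceeds the preset similarity threshold."""
    def similarity(a, b):
        # Character-level similarity in [0, 1]; a stand-in metric.
        return SequenceMatcher(None, a, b).ratio()
    return [c for c in candidates
            if any(similarity(c, r) > threshold for r in reference_comments)]
```

Near-duplicates of a deleted comment are caught even when they differ by a few characters, which is exactly the variant-evasion case this step targets.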
Optionally, before the determining the sensitive video segments from the target video based on the target sensitive attribute values of the video segments, the method further includes:
calculating the average value of the attribute value to be utilized of the video segment and at least one historical sensitive attribute value to obtain the target sensitive attribute value of the video segment; wherein each historical sensitive attribute value is a sensitive attribute value, determined at a historical moment, for characterizing the sensitivity degree of the video segment.
Optionally, the method further comprises:
calculating the average value of the determined target sensitive attribute values of each video segment;
judging whether the obtained average value is larger than a second sensitivity threshold;
and if the average value is greater than the second sensitivity threshold, determining the target video as a sensitive video.
In a second aspect of the implementation of the present invention, there is also provided a device for determining a sensitive video clip, the device including:
the sensitive comment determining module, configured to determine, for each video clip contained in the target video currently to be detected, comments with sensitive information from the comments posted for the video clip as the sensitive comments of the video clip;
the attribute value determining module, configured to determine, based on the sensitive comments of the video clip, a sensitive attribute value characterizing the sensitivity degree of the video clip as the attribute value to be utilized of the video clip;
the sensitive segment determining module, configured to determine sensitive video segments from the target video based on the target sensitive attribute value of each video segment; the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
In yet another aspect of the present invention, an electronic device is provided, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement any of the above methods for determining sensitive video clips when executing the program stored in the memory.
In yet another aspect of the present invention, there is also provided a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements the method for determining a sensitive video clip described in any of the above.
In yet another aspect of the present invention there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of determining a sensitive video clip as defined in any one of the above.
According to the method for determining sensitive video clips provided by the embodiments of the invention, for each video segment contained in the target video currently to be detected, comments carrying sensitive information are determined from the comments posted for the video segment as its sensitive comments; then, based on the sensitive comments of the video segment, a sensitive attribute value characterizing the sensitivity degree of the video segment is determined as its attribute value to be utilized, the determined sensitive attribute value being positively correlated with the number of sensitive comments; finally, sensitive video segments are determined from the target video based on the target sensitive attribute value of each video segment, which is determined based on the attribute value to be utilized of that segment. The scheme can therefore serve as a pre-detection step: the subsequent detection stage can examine the determined sensitive video segments in a targeted manner instead of consuming large resources detecting the whole video, which reduces the occupation of detection resources and improves detection efficiency.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
FIG. 1 is a flowchart illustrating a method for determining a sensitive video clip according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of mapping relationship between playback progress and sensitive attribute values in an embodiment of the present invention;
FIG. 3 is a flowchart of another method for determining a sensitive video clip according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a device for determining a sensitive video clip according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention.
In order to improve the efficiency of determining the sensitive video clips, the embodiment of the invention provides a method and a device for determining the sensitive video clips and electronic equipment. The method can be applied to electronic equipment, such as a server, a computer and the like.
The method for determining the sensitive video clips provided by the embodiment of the invention comprises the following steps:
for each video segment contained in the target video to be detected, determining comments with sensitive information from comments issued for the video segment, and taking the comments as sensitive comments of the video segment;
Based on the sensitive comment of the video clip, determining a sensitive attribute value for representing the sensitivity degree of the video clip as an attribute value to be utilized of the video clip;
determining sensitive video segments from the target video based on the target sensitive attribute value of each video segment; the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
In this embodiment, the target sensitive attribute value of each video segment may be automatically determined based on the sensitive comment containing the sensitive information in each video segment, and then the sensitive video segment may be determined from the target video based on the target sensitive attribute value of each video segment, so that the sensitive video segment may be quickly determined.
The method for determining the sensitive video clips provided by the embodiment of the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the method for determining a sensitive video clip provided by the embodiment of the present invention may include the following steps:
s101, for each video segment contained in a target video to be detected currently, determining a sensitive comment with sensitive information from comments issued for the video segment;
the video clips included in the target video may be divided in advance, for example, the target video may be divided into a plurality of video clips according to the playing progress of the video. Alternatively, the video clips included in the target video may be video clips within a specified playing progress range in the target video.
After the target video goes online, users can post comments on it, and the set of posted comments keeps growing over time. The target video can therefore be subjected to sensitive-segment detection multiple times after going online, for example periodically.
The comments posted for each video clip may be all comments posted by the user for that video clip, or comments posted by the user for that video clip during a specified period of time. The sensitive comment of each video clip can be determined when the sensitive video clip needs to be determined, or the sensitive comment of each video clip can be predetermined and stored in a preset comment database, and then the sensitive comment is directly obtained from the comment database when the sensitive comment is used.
The comment posted may be a comment having an association with the playing progress of the target video, for example, it may be a bullet screen. For each comment, the playing progress of the target video when the comment is posted may be recorded in advance, and as the timestamp of the comment, for example, if a comment is posted when the target video is played for 5 minutes, the timestamp of the comment may be 5 minutes.
Further, for each video clip, a comment whose timestamp is within the range may be determined as a comment posted for the video clip according to the range of play progress of the video clip in the target video.
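The timestamp-based assignment of comments to video clips described above can be sketched as follows (the comment representation, a dict with a `timestamp` field holding the play progress in seconds, is an illustrative assumption):

```python
def comments_for_segment(comments, start, end):
    """Comments posted for a video clip: those whose timestamp (the
    play progress at which the comment was posted) falls within the
    clip's play-progress range [start, end)."""
    return [c for c in comments if start <= c["timestamp"] < end]
```

For a comment posted at the 5-minute mark, `timestamp` would be 300, so the comment is attributed to whichever clip covers that progress.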
To detect comments with sensitive information in the video clip, comments containing preset sensitive words can be regarded as comments with sensitive information; alternatively, the comments of the video clip can be processed with a natural language understanding model to identify those with sensitive information, or the comments with sensitive information can be determined by manual auditing.
S102, based on the sensitive comment of the video clip, determining a sensitive attribute value for representing the sensitivity degree of the video clip as an attribute value to be utilized of the video clip;
In one implementation, the sensitive attribute value characterizing the sensitivity degree of the video segment may be determined based on the number of sensitive comments, with which it is positively correlated: the greater the number of sensitive comments, the higher the sensitive attribute value. Alternatively, the ratio of the number of sensitive comments to the total number of comments posted for the video segment may be calculated, and the sensitive attribute value determined based on this ratio, with which it is positively correlated.
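A minimal sketch of the ratio-based variant (names are illustrative assumptions; the embodiment only requires that the attribute value be positively correlated with the ratio):

```python
def ratio_based_attribute(num_sensitive, total_comments):
    """Sensitive attribute value as the share of sensitive comments
    among all comments posted for the segment; positively correlated
    with the ratio, and 0 when the segment has no comments at all."""
    return num_sensitive / total_comments if total_comments else 0.0
```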
Further, the preset type of each sensitive comment of the video segment can be determined during detection. When determining the sensitive attribute value, the ratio of the number of sensitive comments of each preset type to the total number of comments posted for the video segment can be calculated, and the resulting ratios weighted and summed according to the preset weight of each type, yielding the sensitive attribute value characterizing the sensitivity degree of the video segment.
In an implementation manner, the determining, based on the sensitive comment of the video clip, a sensitive attribute value for characterizing the sensitivity level of the video clip, as the attribute value to be utilized of the video clip, may include the following steps:
Step A1, determining, from the sensitive comments of the video clip, first sensitive comments which do not contain a sensitive word but have sensitive semantics, and second sensitive comments which contain sensitive words;
sensitive comments can take two forms: comments containing sensitive words, and comments that contain no sensitive word but have sensitive semantics, i.e. sensitive information in their meaning. To determine the second sensitive comments, a set of sensitive words may be preset, and each comment to be detected checked for these preset sensitive words; if a comment contains one, it is determined to be a second sensitive comment. The first sensitive comments may be determined using a natural language understanding model or by manual auditing.
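The split into first and second sensitive comments might be sketched as follows (all names are assumptions; the `has_sensitive_semantics` callback stands in for the natural language understanding model or the manual audit):

```python
def split_sensitive_comments(sensitive_comments, sensitive_words,
                             has_sensitive_semantics):
    """Split the segment's sensitive comments into:
    - first:  contain no preset sensitive word, but have sensitive semantics
    - second: contain at least one preset sensitive word"""
    first, second = [], []
    for c in sensitive_comments:
        if any(w in c for w in sensitive_words):
            second.append(c)
        elif has_sensitive_semantics(c):
            first.append(c)
    return first, second
```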
Step A2, determining the number of sensitive comments belonging to each sensitive category in the first sensitive comments, and calculating the ratio of the number to the total number of comments published for the video clip as a first ratio corresponding to each sensitive category;
in this step, the sensitive category of each sensitive comment may be determined during detection; for example, the sensitive categories may include: a category related to violence, a category containing vulgar content, a category containing illegal or violating information, and so on. The number of first sensitive comments belonging to each sensitive category can then be determined, and the ratio of that number to the total number of comments posted for the video clip calculated as the first ratio corresponding to that sensitive category. Since comments with sensitive semantics generally need to be deleted, it is also possible to first obtain the comments deleted for containing sensitive information, and then, for each sensitive category, calculate the ratio of the number of deleted comments belonging to that category to the total number of comments posted for the video clip as the first ratio corresponding to that category.
Step A3, determining the number of sensitive words belonging to each sensitivity degree in the second sensitive comment, and calculating the ratio of the number to the total number of all the sensitive words contained in all the second sensitive comments as a second ratio corresponding to each sensitivity degree;
in the step, a plurality of sensitive words can be preset, and corresponding sensitive degrees are set for each preset sensitive word, so that after the sensitive words in the video clips are determined, the corresponding sensitive degrees of the sensitive words can be further determined; wherein the sensitivity level may represent a sensitivity level of the sensitive word.
Step A4, calculating a weighted sum of a first ratio corresponding to each sensitive category according to preset weight of the sensitive category to obtain a category attribute value;
step A5, calculating a weighted sum of a second ratio corresponding to each sensitivity level according to preset weight of the sensitivity level to obtain a level attribute value;
and step A6, calculating the sum value of the obtained category attribute value and the degree attribute value to obtain a sensitivity attribute value for representing the sensitivity degree of the video fragment.
The preset weight of the sensitive category and the preset weight of the sensitive degree can be set according to experience and requirements.
In this implementation, the sensitive attribute value characterizing the sensitivity degree of the video segment is calculated by combining sensitive semantics and sensitive words, so the obtained value characterizes the sensitivity degree more accurately. Of course, to improve calculation efficiency, the category attribute value or the degree attribute value alone may also be used directly as the sensitive attribute value characterizing the sensitivity degree of the video segment.
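Steps A2 to A6 can be sketched as a single computation (the data shapes and names are illustrative assumptions; the preset weights per category and per degree are whatever the operator configures):

```python
def sensitivity_attribute_value(category_counts, total_comments,
                                degree_counts, category_weights,
                                degree_weights):
    """Steps A2-A6: weighted sum of first ratios (count of first
    sensitive comments per category / total comments of the segment)
    plus weighted sum of second ratios (count of sensitive words per
    degree / total sensitive words in all second sensitive comments)."""
    total_words = sum(degree_counts.values())
    category_value = sum(                        # steps A2 + A4
        category_weights[cat] * (n / total_comments)
        for cat, n in category_counts.items())
    degree_value = sum(                          # steps A3 + A5
        degree_weights[deg] * (n / total_words)
        for deg, n in degree_counts.items()) if total_words else 0.0
    return category_value + degree_value         # step A6
```

With 2 of 10 comments in a "violence" category (weight 1.0) and sensitive words split 3 "high" (weight 2.0) / 1 "low" (weight 0.5), the value is 0.2 + 1.5 + 0.125 = 1.825.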
In addition, when calculating the sensitive attribute value characterizing the sensitivity degree of the video segment, information such as the target detection mode used to detect the comments of the segment and the video type of the target video may also be taken into account, where the detection mode may include machine auditing, manual auditing, and so on. For example, a different preset weight may be set for each video type and for each detection mode; the preset weight corresponding to the video type of the target video and/or to the target detection mode is determined, and after the sum of the category attribute value and the degree attribute value is obtained, this preset weight may be added to it to obtain the sensitive attribute value characterizing the sensitivity degree of the video segment.
S103, determining sensitive video segments from the target video based on the target sensitive attribute value of each video segment; the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
In one implementation, the determined attribute value to be utilized may be directly used as the target sensitive attribute value of the video clip.
In another implementation manner, before determining whether the video segment is a sensitive video segment based on the target sensitive attribute value of the video segment, the attribute value to be utilized of the video segment and an average value of at least one historical sensitive attribute value may be calculated to obtain the target sensitive attribute value of the video segment; wherein each history sensitive attribute value is: and a sensitivity attribute value for characterizing the sensitivity of the video segment, as determined by the historical moment.
Because the comments on a video grow over time, each video segment in the target video can be detected multiple times, for example periodically, yielding historical sensitive attribute values determined at multiple historical moments. Combining the attribute value to be utilized with several historical sensitive attribute values to determine the target sensitive attribute value lets the resulting value reflect the sensitivity degree of the video segment more accurately.
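Under the averaging scheme described above, the combination of the attribute value to be utilized with the historical sensitive attribute values reduces to (names are illustrative):

```python
def target_attribute_value(current_value, historical_values):
    """Target sensitive attribute value of a video segment: the mean of
    the attribute value to be utilized (determined now) and the sensitive
    attribute values determined at historical detection moments."""
    values = [current_value] + list(historical_values)
    return sum(values) / len(values)
```

When no historical detection has happened yet, the list of historical values is empty and the target value is simply the current one, matching the first implementation described in S103.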
After the target sensitive attribute value of each video segment is determined, in one implementation the video segment may be determined to be a sensitive video segment if its target sensitive attribute value is greater than a preset threshold, and determined not to be one otherwise.
If a video clip is a sensitive video clip, it can be marked so that its auditing priority is raised in subsequent auditing, that is, the clip receives focused detection later. Because a video platform often has to manage a very large number of videos, it is difficult to detect all of them promptly and comprehensively; determining the sensitive video clips first and then detecting them with priority improves video detection efficiency. For example, a comment may carry no sensitive information when viewed alone yet become sensitive in a specific context, and the risk of such comments appearing in a sensitive video clip is higher, so sensitive video clips need focused detection: their comments can be examined together with the video content of the clip, such as the plot and the dialogue, either manually or with a natural language understanding model. In addition, the auditing priority of each video segment can be set according to its determined target sensitive attribute value: the higher the value, the higher the priority, and segments with higher priority are detected first in the subsequent detection stage.
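The threshold test and priority ordering described above might be sketched as (names and data shapes are assumptions):

```python
def audit_order(segment_values, threshold):
    """Mark segments whose target sensitive attribute value exceeds the
    preset threshold as sensitive, and order all segments so that higher
    values are audited first in the subsequent detection stage."""
    flagged = {seg for seg, v in segment_values.items() if v > threshold}
    order = sorted(segment_values, key=segment_values.get, reverse=True)
    return flagged, order
```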
In this embodiment, the target sensitive attribute value of each video segment may be automatically determined based on the sensitive comment containing the sensitive information in each video segment, and then the sensitive video segment may be determined from the target video based on the target sensitive attribute value of each video segment.
In another embodiment of the present invention, the determining the sensitive video segment from the target video based on the target sensitive attribute value of each video segment may include the following steps:
step B1, performing a relationship fit between the playing progress of the target video and the sensitive attribute value, to obtain a mapping relationship between the playing progress of the target video and the sensitive attribute value;
and step B2, determining, according to the mapping relationship, the playing-progress range of the target video in which the sensitive attribute value is greater than a first sensitivity threshold, to obtain the sensitive video segments in the target video.
In this implementation, to determine sensitive video segments more accurately, a relationship fit may be performed between the playing progress of the target video and the sensitive attribute value, based on the obtained target sensitive attribute values of the video segments. For example, for each video segment, a correspondence may be established between the playing progress of the target video at the center moment of the segment and the segment's target sensitive attribute value; a Bayesian smoothing algorithm is then used to fit these points, yielding a mapping relationship between the playing progress of the target video and the sensitive attribute value, that is, a function describing how the sensitive attribute value changes with the playing progress. This mapping relationship may be represented by a curve, as shown in fig. 2.
Furthermore, the playing-progress range of the target video in which the sensitive attribute value is greater than the first sensitivity threshold can be determined from the mapping relationship and taken as the sensitive video segments of the target video. Specifically, according to the mapping relationship, each playing progress at which the sensitive attribute value equals the first sensitivity threshold can be determined as a target playing progress, and the video segment between every two adjacent target playing progresses is taken as an intermediate video segment; every intermediate video segment whose sensitive attribute value exceeds the first sensitivity threshold at every playing progress other than its endpoints is then determined to be a sensitive video segment. The first sensitivity threshold may be preset according to experience and requirements.
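Steps B1 and B2 can be illustrated with a simplified sketch that substitutes piecewise-linear interpolation for the Bayesian-smoothing fit mentioned above (the function name and the sample format are assumptions for illustration):

```python
def sensitive_ranges(points, threshold):
    """Given (progress, sensitive_value) samples sorted by playing progress,
    return the playing-progress ranges where the piecewise-linear mapping
    exceeds `threshold` (step B2). Linear interpolation stands in for the
    Bayesian-smoothing fit of step B1."""
    ranges = []
    start = points[0][0] if points and points[0][1] > threshold else None
    for (p0, v0), (p1, v1) in zip(points, points[1:]):
        if v0 <= threshold < v1:        # rising crossing: a range opens
            start = p0 + (threshold - v0) * (p1 - p0) / (v1 - v0)
        elif v1 <= threshold < v0:      # falling crossing: the range closes
            end = p0 + (threshold - v0) * (p1 - p0) / (v1 - v0)
            ranges.append((start, end))
            start = None
    if start is not None:               # still above threshold at the end
        ranges.append((start, points[-1][0]))
    return ranges
```

With samples at progresses 0, 10, 20, 30 and values 0.1, 0.5, 0.9, 0.2, a threshold of 0.6 yields a single sensitive range whose endpoints are the two interpolated crossing points.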
In this embodiment, the target sensitive attribute value of each video segment can be determined automatically from the historical sensitive comments of each video segment contained in the target video to be detected, and sensitive video segments are then determined from the target video based on those values. Sensitive video segments therefore do not need to be identified manually, which improves the efficiency of determining sensitive video segments from the target video.
In another embodiment of the present invention, the method for determining a sensitive video clip may further include the steps of:
Step C1, calculating the average of the determined target sensitive attribute values of the video segments;
step C2, judging whether the obtained average is greater than a second sensitivity threshold;
and step C3, if it is greater, determining that the target video is a sensitive video.
In this embodiment, the average of the determined target sensitive attribute values of the video segments is calculated, and if the obtained average is greater than the second sensitivity threshold, the target video is determined to be a sensitive video; if it is not greater, the target video is determined not to be a sensitive video. It can be seen that this scheme automatically determines whether the target video to be detected is a sensitive video.
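Steps C1 to C3 amount to a mean-and-threshold test, which can be sketched as follows (the function name is illustrative):

```python
def is_sensitive_video(segment_values, second_threshold):
    """Steps C1-C3: the target video is a sensitive video when the average
    of its segments' target sensitive attribute values exceeds the second
    sensitivity threshold."""
    average = sum(segment_values) / len(segment_values)
    return average > second_threshold
```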
In another embodiment of the present invention, the method for determining a sensitive video clip may further include the steps of:
step D1, obtaining the comments that were determined to contain sensitive information when the determined sensitive video segment was detected at a historical moment, as reference comments;
step D2, calculating the text similarity between each comment to be detected in the determined sensitive video segment and the reference comments; the comments to be detected are the comments in the video segment other than the comments already determined to be sensitive;
for example, when calculating the text similarity between the comment to be detected and the reference comment, vectorization processing may be performed on the comment to be detected and the reference comment to obtain a vector corresponding to the comment to be detected and the reference comment, and further, the cosine distance of the vector corresponding to the comment to be detected and the reference comment is calculated to obtain the text similarity between the comment to be detected and the reference comment.
And D3, determining the comment with the text similarity with the reference comment being larger than a preset similarity threshold value in the comments to be detected as the comment containing the sensitive information.
Because sensitive video segments are the segments at greater risk of containing sensitive comments, further risk-prevention processing can be performed on them: any comment to be detected whose text similarity to a reference comment is greater than a preset similarity threshold can be determined to be a comment containing sensitive information, and comments containing sensitive information can then be deleted automatically.
In this embodiment, the reference comments, determined to contain sensitive information when the sensitive video segment was detected at a historical moment, are obtained; the text similarity between each comment to be detected in the sensitive video segment and the reference comments is calculated; and comments to be detected whose text similarity exceeds the preset similarity threshold are determined to contain sensitive information. Comments containing sensitive information in a sensitive video segment can thus be determined more accurately and comprehensively, reducing risk for the video platform and better safeguarding the network environment.
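A minimal sketch of the similarity check in steps D2 and D3, using whitespace-token bag-of-words vectors (the patent does not specify a vectorization method, so this choice is an assumption):

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two comments
    (step D2); returns 0.0 when either comment is empty."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def flag_sensitive(comments_to_detect, reference_comments, sim_threshold):
    """Step D3: a comment to be detected is flagged as containing sensitive
    information when its similarity to any reference comment exceeds the
    preset similarity threshold."""
    return [c for c in comments_to_detect
            if any(cosine_similarity(c, r) > sim_threshold
                   for r in reference_comments)]
```

In practice, the vectorization would more likely use embeddings or TF-IDF; the bag-of-words form keeps the sketch self-contained.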
In a specific application scenario, the method for determining a sensitive video clip provided by the embodiment of the present invention may be as shown in fig. 3:
Comments posted by users while watching a video are delivered to a content auditing end. The content auditing end synchronizes the received comments, the video each comment belongs to, the playing progress at which each comment was posted, the detection result of each comment, and so on into a comment database, and groups the comments by the video they belong to. The detection result may include: whether each comment is a sensitive comment, the sensitive category of each sensitive comment, whether each comment has been deleted, the sensitive words detected in each comment, the sensitivity degree of each sensitive word, and the like. In addition, the video type of the target video, the auditing mode of each comment, and so on may also be synchronized into the comment database.
When sensitive-segment detection is performed on the target video, the comments belonging to the target video and their detection results can be obtained from the comment database, and the sensitive attribute values of the video segments contained in the target video are calculated according to the steps of the method for determining sensitive video segments above. The sensitive video segments in the target video, and whether the target video is a sensitive video, are thereby determined and used as the segment detection result, which can assist the content auditing end in subsequent video detection.
In this embodiment, the target sensitive attribute value of each video segment may be automatically determined based on the sensitive comment containing the sensitive information in each video segment, and then the sensitive video segment may be determined from the target video based on the target sensitive attribute value of each video segment.
Based on the same inventive concept, the embodiment of the invention also provides a device for determining a sensitive video clip, as shown in fig. 4, the device comprises:
the sensitive comment determining module 401 is configured to determine, for each video clip included in the target video to be detected, a comment with sensitive information from comments posted for the video clip, as a sensitive comment of the video clip;
an attribute value determining module 402, configured to determine, based on the number of sensitive comments of the video segment, a sensitive attribute value for characterizing a sensitivity level of the video segment, as an attribute value to be utilized of the video segment; wherein the degree of sensitivity characterized by the determined sensitivity attribute value is positively correlated with the number of sensitive reviews for the video segment;
A sensitive segment determining module 403, configured to determine sensitive video segments from the target video based on the target sensitive attribute value of each video segment; wherein the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
Optionally, the attribute value determining module 402 includes:
the comment determination submodule is used for determining a first sensitive comment which does not contain a sensitive word and has sensitive semantics and a second sensitive comment which contains the sensitive word from the sensitive comments of the video clip;
the first computing sub-module is used for determining the number of the sensitive comments belonging to each sensitive category in the first sensitive comments, and computing the ratio of the number to the total number of the comments published for the video clip as a first ratio corresponding to each sensitive category;
the second computing sub-module is used for determining the number of the sensitive words belonging to each sensitivity degree in the second sensitive comment, and computing the ratio of the number to the total number of all the sensitive words contained in all the second sensitive comments as a second ratio corresponding to each sensitivity degree;
a third calculating sub-module, configured to calculate, for each sensitive category, a weighted sum of first ratios corresponding to the sensitive category according to preset weights of the sensitive category, to obtain a category attribute value;
A fourth computing sub-module, configured to calculate, for each sensitivity level, a weighted sum of the second ratios corresponding to the sensitivity level according to a preset weight of the sensitivity level, to obtain a level attribute value;
and a fifth calculation sub-module, configured to calculate a sum value of the obtained category attribute value and the degree attribute value, to obtain a sensitivity attribute value for representing the sensitivity degree of the video clip.
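The pipeline implemented by these sub-modules can be condensed into one function (a sketch under the assumption that the per-category and per-degree counts have already been extracted from the comments; all names are illustrative):

```python
def sensitivity_attribute_value(category_counts, total_comments,
                                degree_counts, total_sensitive_words,
                                category_weights, degree_weights):
    """Weighted sum of first ratios (per sensitive category) plus weighted
    sum of second ratios (per sensitivity degree), as computed by the
    first through fifth calculation sub-modules."""
    # category attribute value: weighted first ratios
    category_value = sum(category_weights[c] * n / total_comments
                         for c, n in category_counts.items())
    # degree attribute value: weighted second ratios
    degree_value = sum(degree_weights[d] * n / total_sensitive_words
                       for d, n in degree_counts.items())
    return category_value + degree_value
```

For instance, 2 of 10 comments in category "abuse" (weight 1.0) plus 3 of 6 sensitive words of degree "high" (weight 0.5) would give 0.2 + 0.25 = 0.45.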
Optionally, the sensitive fragment determination module 403 includes:
the relation determination submodule is used for carrying out relation fitting on the playing progress and the sensitive attribute value of the target video based on the target sensitive attribute value of each video segment to obtain the mapping relation between the playing progress and the sensitive attribute value of the target video;
and the sensitive video segment determining submodule is used for determining the range of the playing progress of the target video, in which the sensitive attribute value is larger than the first sensitive threshold value, according to the mapping relation to obtain the sensitive video segment in the target video.
Optionally, the apparatus further comprises:
the reference comment acquisition module is used for acquiring the comments determined to contain sensitive information when the determined sensitive video segment was detected at a historical moment, as reference comments;
The similarity calculation module is used for calculating the text similarity between the comment to be detected in the determined sensitive video fragment and the reference comment; the comment to be detected is a comment except the determined sensitive comment in the video clip;
and the sensitive information comment determining module is used for determining the comment with the text similarity with the reference comment being larger than a preset similarity threshold value in the comments to be detected as the comment containing the sensitive information.
Optionally, the apparatus further comprises: a target sensitive attribute value determining module, configured to calculate, before the sensitive segment determining module 403 determines sensitive video segments from the target video based on the target sensitive attribute value of each video segment, the average of the attribute value to be utilized of the video segment and at least one historical sensitive attribute value, to obtain the target sensitive attribute value of the video segment; wherein each historical sensitive attribute value is a sensitivity attribute value, determined at a historical moment, for characterizing the sensitivity degree of the video segment.
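The averaging performed by this module can be sketched as follows (names are illustrative):

```python
def target_sensitive_value(current_value, historical_values):
    """Target sensitive attribute value of a segment: the mean of the
    current attribute value to be utilized and the historical sensitive
    attribute values determined for the same segment."""
    values = [current_value, *historical_values]
    return sum(values) / len(values)
```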
Optionally, the apparatus further comprises:
the sensitive video determining module is used for calculating the average value of the determined target sensitive attribute values of each video segment; judging whether the obtained average value is larger than a second sensitivity threshold; and if the target video is larger than the sensitive video, determining the target video as the sensitive video.
The embodiment of the invention also provides an electronic device, as shown in fig. 5, which comprises a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 complete communication with each other through the communication bus 504,
a memory 503 for storing a computer program;
the processor 501 is configured to implement the steps of the method for determining a sensitive video clip according to any one of the above embodiments when executing the program stored in the memory 503.
The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is drawn in the figure, but this does not mean there is only one bus or only one type of bus.
The communication interface is used for communication between the terminal and other devices.
The memory may include random access memory (Random Access Memory, RAM) or non-volatile memory (non-volatile memory), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In yet another embodiment of the present invention, a computer readable storage medium is provided, where a computer program is stored, the computer program, when executed by a processor, implements the method for determining a sensitive video clip according to any of the foregoing embodiments.
In a further embodiment of the present invention, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of determining a sensitive video clip as described in any of the above embodiments is also provided.
In the above embodiments, the implementation may be carried out in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example by wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
In this specification, the embodiments are described in a related manner; identical or similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A method of determining a sensitive video clip, the method comprising:
for each video segment contained in the target video to be detected, determining comments with sensitive information from comments issued for the video segment, and taking the comments as sensitive comments of the video segment;
based on the sensitive comment of the video clip, determining a sensitive attribute value for representing the sensitivity degree of the video clip as an attribute value to be utilized of the video clip;
determining sensitive video clips from the target video based on target sensitive attribute values of the video clips; wherein the target sensitive attribute value of each video clip is determined based on the attribute value to be utilized of that video clip.
2. The method according to claim 1, wherein the determining a sensitivity attribute value for characterizing the sensitivity level of the video clip based on the sensitivity comment of the video clip, as the attribute value to be utilized for the video clip, comprises:
determining a first sensitive comment which does not contain a sensitive word and has sensitive semantics and a second sensitive comment which contains the sensitive word from the sensitive comments of the video clip;
Determining the number of sensitive comments belonging to each sensitive category in the first sensitive comments, and calculating the ratio of the number to the total number of comments published for the video clip as a first ratio corresponding to each sensitive category;
determining the number of sensitive words belonging to each sensitivity degree in the second sensitive comment, and calculating the ratio of the number to the total number of all sensitive words contained in all second sensitive comments as a second ratio corresponding to each sensitivity degree;
for each sensitive category, calculating a weighted sum of first ratios corresponding to the sensitive category according to preset weights of the sensitive category to obtain a category attribute value;
for each sensitivity degree, calculating a weighted sum of a second ratio corresponding to the sensitivity degree according to a preset weight of the sensitivity degree to obtain a degree attribute value;
and calculating the sum value of the obtained category attribute value and the degree attribute value to obtain a sensitivity attribute value for representing the sensitivity degree of the video clip.
3. The method of claim 1, wherein determining a sensitive video clip from the target video based on the target sensitive attribute value for each video clip comprises:
Performing relation fitting on the playing progress of the target video and the sensitive attribute value based on the target sensitive attribute value of each video segment to obtain a mapping relation between the playing progress of the target video and the sensitive attribute value;
and determining the range of the playing progress of which the sensitive attribute value is larger than a first sensitive threshold value in the target video according to the mapping relation to obtain the sensitive video fragment in the target video.
4. The method according to claim 1, wherein the method further comprises:
acquiring the comments determined to contain sensitive information when the determined sensitive video segment was detected at a historical moment, as reference comments;
calculating the text similarity between the comment to be detected in the determined sensitive video fragment and the reference comment; the comments to be detected are comments except the determined sensitive comments in the video clip;
and determining the comment with the text similarity with the reference comment being larger than a preset similarity threshold value in the comments to be detected as the comment containing the sensitive information.
5. The method of claim 1, wherein prior to determining sensitive video segments from the target video based on target sensitive attribute values for each video segment, the method further comprises:
Calculating the average of the attribute value to be utilized of the video segment and at least one historical sensitive attribute value, to obtain a target sensitive attribute value of the video segment; wherein each historical sensitive attribute value is a sensitivity attribute value, determined at a historical moment, for characterizing the sensitivity degree of the video segment.
6. The method according to claim 1, wherein the method further comprises:
calculating the average value of the determined target sensitive attribute values of each video segment;
judging whether the obtained average value is larger than a second sensitivity threshold;
and if it is greater, determining that the target video is a sensitive video.
7. A device for determining a sensitive video clip, the device comprising:
the sensitive comment determining module is used for determining comments with sensitive information from comments posted for each video clip contained in the target video to be detected currently as sensitive comments of the video clip;
the attribute value determining module is used for determining a sensitive attribute value used for representing the sensitivity degree of the video clip based on the sensitive comment of the video clip, and the sensitive attribute value is used as an attribute value to be utilized of the video clip;
The sensitive segment determining module is used for determining sensitive video segments from the target video based on the target sensitive attribute values of the video segments; wherein the target sensitive attribute value of each video segment is determined based on the attribute value to be utilized of that video segment.
8. The apparatus of claim 7, wherein the attribute value determination module comprises:
the comment determination submodule is used for determining a first sensitive comment which does not contain a sensitive word and has sensitive semantics and a second sensitive comment which contains the sensitive word from the sensitive comments of the video clip;
the first computing sub-module is used for determining the number of the sensitive comments belonging to each sensitive category in the first sensitive comments, and computing the ratio of the number to the total number of the comments published for the video clip as a first ratio corresponding to each sensitive category;
the second computing sub-module is used for determining the number of the sensitive words belonging to each sensitivity degree in the second sensitive comment, and computing the ratio of the number to the total number of all the sensitive words contained in all the second sensitive comments as a second ratio corresponding to each sensitivity degree;
a third calculating sub-module, configured to calculate, for each sensitive category, a weighted sum of first ratios corresponding to the sensitive category according to preset weights of the sensitive category, to obtain a category attribute value;
A fourth computing sub-module, configured to calculate, for each sensitivity level, a weighted sum of the second ratios corresponding to the sensitivity level according to a preset weight of the sensitivity level, to obtain a level attribute value;
and a fifth calculation sub-module, configured to calculate a sum value of the obtained category attribute value and the degree attribute value, to obtain a sensitivity attribute value for representing the sensitivity degree of the video clip.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for carrying out the method steps of any one of claims 1-6 when executing a program stored on a memory.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-6.
CN202311454738.0A 2023-11-03 2023-11-03 Method and device for determining sensitive video clips and electronic equipment Pending CN117528152A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202311454738.0A CN117528152A (en) 2023-11-03 2023-11-03 Method and device for determining sensitive video clips and electronic equipment
CN202410158244.6A CN118018813A (en) 2023-11-03 2024-02-04 Method and device for determining sensitive video clips and electronic equipment

Publications (1)

Publication Number Publication Date
CN117528152A 2024-02-06

Family

ID=89763671


Also Published As

Publication number Publication date
CN118018813A (en) 2024-05-10


Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20240206