CN106973305B - Method and device for detecting bad content in video - Google Patents


Info

Publication number
CN106973305B
CN106973305B CN201710166928.0A CN201710166928A CN106973305B CN 106973305 B CN106973305 B CN 106973305B CN 201710166928 A CN201710166928 A CN 201710166928A CN 106973305 B CN106973305 B CN 106973305B
Authority
CN
China
Prior art keywords
detected
video file
video
content
audio
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710166928.0A
Other languages
Chinese (zh)
Other versions
CN106973305A (en)
Inventor
李应斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201710166928.0A
Publication of CN106973305A
Application granted
Publication of CN106973305B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream

Abstract

The embodiment of the invention relates to the technical field of video processing and discloses a method and a device for detecting bad content in a video. The method comprises the following steps: acquiring a video file to be detected; performing video and audio separation on the video file to be detected to obtain audio information and image information; converting the audio information into first text content and converting the image information into second text content; merging and de-duplicating the first text content and the second text content to obtain target text content; comparing the target text content with a sensitive vocabulary list, finding the sensitive vocabulary in the target text content, and obtaining the total word number of the sensitive vocabulary; obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive vocabulary and the total word number of the target text content; and processing the video file to be detected according to the bad content proportion value. Implementing the embodiment of the invention improves the identification accuracy of bad content in a video and reduces the misjudgment rate of bad videos.

Description

Method and device for detecting bad content in video
Technical Field
The invention relates to the technical field of video processing, in particular to a method and a device for detecting bad content in a video.
Background
Network video has become part of people's daily life and is a means of acquiring knowledge and entertainment. Network video covers a wide range of subjects, its quality is uneven, and bad content such as violence, reactionary material, or fraud is often mixed in. The spread of videos containing bad content disturbs social order, damages the social atmosphere, and has a strongly negative influence on the healthy growth of people, particularly teenagers. It is therefore often necessary to examine the content of network videos in order to filter out those with objectionable content. However, a video usually carries a large amount of information, and existing filtering methods cannot find bad videos quickly and are prone to misjudging them.
Disclosure of Invention
The embodiment of the invention discloses a method and a device for detecting bad contents in a video, which are used for improving the identification accuracy of the bad contents in the video and reducing the misjudgment rate of the bad video.
A first aspect of the invention discloses a method for detecting bad content in a video, which comprises the following steps:
acquiring a video file to be detected;
carrying out video and audio separation on the video file to be detected to obtain audio information and image information;
converting the audio information into first text content and converting the image information into second text content;
merging and de-duplicating the first text content and the second text content to obtain target text content;
comparing the target text content with the sensitive vocabulary list, finding out the sensitive vocabulary in the target text content and obtaining the total word number of the sensitive vocabulary;
obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive words and the total word number of the target text content;
and processing the video file to be detected according to the bad content proportion value.
As an optional implementation manner, in the first aspect of the present invention, the processing the video file to be detected according to the objectionable content ratio value includes:
when the bad content proportion value is smaller than or equal to a preset threshold value, determining that the video file to be detected is a video file with healthy content; and when the bad content proportion value is larger than the preset threshold value, starting a deleting program to delete the video file to be detected.
As an optional implementation manner, in the first aspect of the present invention, after the obtaining the video file to be detected, and before the performing video and audio separation on the video file to be detected to obtain the audio information and the image information, the method further includes:
acquiring the file name of the video file to be detected;
comparing the file name with the sensitive vocabulary list;
when the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number, starting a deleting program to delete the video file to be detected;
and when the number of the sensitive words contained in the file name does not reach the preset number, executing the step of performing video and audio separation on the video file to be detected to obtain audio information and image information.
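The file-name pre-check above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the case-insensitive substring matching, and the string return values are all assumptions, and the preset number is assumed to be at least 1.

```python
def count_sensitive_terms(file_name: str, sensitive_terms: set[str]) -> int:
    """Count how many distinct sensitive terms occur in the file name.

    Case-insensitive substring matching is an assumption; the source only
    says the file name is compared with the sensitive vocabulary list.
    """
    lowered = file_name.lower()
    return sum(1 for term in sensitive_terms if term.lower() in lowered)


def precheck_file_name(file_name: str, sensitive_terms: set[str], preset_number: int) -> str:
    """Return "delete" when the count reaches the preset number, else "separate".

    "separate" means: continue with video and audio separation.
    """
    if count_sensitive_terms(file_name, sensitive_terms) >= preset_number:
        return "delete"
    return "separate"
```

A file name that does not reach the preset number is not declared healthy; it simply proceeds to the full content check.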
As an optional implementation manner, in the first aspect of the present invention, after the obtaining the video file to be detected, and before the performing video and audio separation on the video file to be detected to obtain the audio information and the image information, the method further includes:
acquiring source information of the video file to be detected;
judging whether the source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not;
if matched, starting a deleting program to delete the video file to be detected;
and if not, executing the step of carrying out video and audio separation on the video file to be detected to obtain audio information and image information.
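The source check above might look like the sketch below. The patent does not say how a source address is "matched" against the illegal source address list; comparing host names is an assumption made here for illustration, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def source_host(source_address: str) -> str:
    """Return the lower-cased host of an address, or the address itself
    when it has no scheme (for example a bare host name)."""
    host = urlsplit(source_address).hostname
    return (host or source_address).lower()


def is_illegal_source(source_address: str, illegal_sources: list[str]) -> bool:
    """True when the source host equals the host of any listed illegal source."""
    host = source_host(source_address)
    return any(host == source_host(entry) for entry in illegal_sources)
```

When `is_illegal_source` returns `True` the file would be deleted; otherwise processing continues with video and audio separation.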
As an optional implementation manner, in the first aspect of the present invention, the processing the video file to be detected according to the objectionable content ratio value includes:
when the bad content proportion value is larger than the preset threshold value, starting a deleting program to delete the video file to be detected;
when the bad content proportion value is smaller than or equal to a preset threshold value, extracting a plurality of continuous key frames from the video file to be detected, wherein the plurality of continuous key frames present a certain key scene in the video file to be detected;
acquiring the average motion intensity of the shots in the certain key scene;
judging whether the average motion intensity is greater than a preset intensity value;
if the motion intensity is larger than the preset intensity value, extracting image characteristic data and audio characteristic data from the continuous key frames;
when the image characteristic data is in a preset range of objectionable image characteristic data and the audio characteristic data is in a preset range of objectionable audio characteristic data, starting a deleting program to delete the video file to be detected;
and when the image characteristic data is not in a preset range of objectionable image characteristic data and the audio characteristic data is not in a preset range of objectionable audio characteristic data, determining that the video file to be detected is a video file with healthy content.
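The two-stage decision above can be summarised in a short sketch. The `SceneFeatures` type and the return labels are hypothetical. The text does not state what happens when the motion intensity does not exceed the preset value, or when only one of the two feature checks falls in the objectionable range, so the sketch returns "unspecified" for those cases rather than guessing.

```python
from dataclasses import dataclass


@dataclass
class SceneFeatures:
    """Pre-computed measurements for one key scene (hypothetical container)."""
    motion_intensity: float   # average motion intensity of the shots
    image_in_bad_range: bool  # image features fall in the objectionable range
    audio_in_bad_range: bool  # audio features fall in the objectionable range


def second_stage_decision(ratio: float, threshold: float,
                          scene: SceneFeatures, preset_intensity: float) -> str:
    """Return "delete", "healthy", or "unspecified" for a video file."""
    if ratio > threshold:
        return "delete"
    if scene.motion_intensity <= preset_intensity:
        return "unspecified"  # outcome not described in the source text
    if scene.image_in_bad_range and scene.audio_in_bad_range:
        return "delete"
    if not scene.image_in_bad_range and not scene.audio_in_bad_range:
        return "healthy"
    return "unspecified"      # mixed result is not described in the source text
```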
The second aspect of the present invention discloses an apparatus for detecting objectionable content in a video, which may include:
the acquisition unit is used for acquiring a video file to be detected;
the separation unit is used for carrying out video and audio separation on the video file to be detected to obtain audio information and image information;
a text conversion unit for converting the audio information into a first text content and converting the image information into a second text content;
a merging and deduplication unit, configured to merge and deduplicate the first text content and the second text content to obtain a target text content;
the searching unit is used for comparing the target text content with the sensitive vocabulary list, searching the sensitive vocabulary in the target text content and obtaining the total word number of the sensitive vocabulary;
the calculating unit is used for obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive vocabulary and the total word number of the target text content;
and the processing unit is used for processing the video file to be detected according to the bad content proportion value.
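One way to picture how these units fit together is a small composition sketch. Everything here is illustrative: the `DetectionDevice` class, its field names, and the use of injected callables for the separation and text conversion units are assumptions, and counting characters as the "word number" is one possible reading of the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DetectionDevice:
    separate: Callable[[str], tuple[object, object]]  # separation unit
    audio_to_text: Callable[[object], list[str]]      # text conversion unit (audio)
    image_to_text: Callable[[object], list[str]]      # text conversion unit (image)
    sensitive_terms: frozenset[str]                   # sensitive vocabulary list
    threshold: float                                  # preset threshold

    def run(self, video_path: str) -> str:
        audio, images = self.separate(video_path)
        # Merging and de-duplication unit: keep first occurrences, in order.
        merged = list(dict.fromkeys(self.audio_to_text(audio) + self.image_to_text(images)))
        # Searching unit: sensitive terms that appear in any merged line.
        found = [t for t in self.sensitive_terms if any(t in line for line in merged)]
        # Calculating unit: proportion of sensitive characters.
        n = sum(len(t) for t in found)
        m = sum(len(line) for line in merged)
        ratio = n / m if m else 0.0
        # Processing unit: act on the proportion value.
        return "delete" if ratio > self.threshold else "healthy"
```

Injecting the units as callables keeps the sketch independent of any particular speech or image recognition tool.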
As an optional implementation manner, in the second aspect of the present invention, the processing unit processes the video file to be detected according to the bad content proportion value specifically as follows:
the processing unit is used for determining that the video file to be detected is a video file with healthy content when the calculating unit determines that the bad content proportion value is smaller than or equal to a preset threshold value; and for starting a deleting program to delete the video file to be detected when the calculating unit determines that the bad content proportion value is greater than the preset threshold value.
As an alternative embodiment, in the second aspect of the present invention, the apparatus further comprises:
the name detection unit is used for acquiring the file name of the video file to be detected and comparing the file name with the sensitive vocabulary list after the acquisition unit acquires the video file to be detected and before the separation unit performs video and audio separation on the video file to be detected and acquires audio information and image information;
the processing unit is further used for starting a deleting program to delete the video file to be detected when the name detection unit determines that the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number;
the separation unit is further used for executing video and audio separation on the video file to be detected to obtain audio information and image information when the name detection unit determines that the number of the sensitive words contained in the file name does not reach the preset number.
As an alternative embodiment, in the second aspect of the present invention, the apparatus further comprises:
the source detection unit is used for acquiring the source information of the video file to be detected and judging whether a source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not after the acquisition unit acquires the video file to be detected and before the separation unit performs video and audio separation on the video file to be detected and acquires audio information and image information;
the processing unit is further used for starting a deleting program to delete the video file to be detected when the source detection unit determines that the source address matches;
and the separation unit is further used for performing video and audio separation on the video file to be detected to obtain audio information and image information when the source detection unit determines that the source address does not match.
As an optional implementation manner, in the second aspect of the present invention, the processing unit processes the video file to be detected according to the bad content proportion value specifically as follows:
the processing unit is used for starting a deleting program to delete the video file to be detected when the calculating unit determines that the bad content proportion value is greater than the preset threshold value;
when the bad content proportion value is smaller than or equal to the preset threshold value, the processing unit extracts a plurality of continuous key frames from the video file to be detected, wherein the plurality of continuous key frames present a certain key scene in the video file to be detected; acquires the average motion intensity of the shots in the key scene; judges whether the average motion intensity is greater than a preset intensity value; if the motion intensity is greater than the preset intensity value, extracts image characteristic data and audio characteristic data from the continuous key frames; when the image characteristic data is in a preset range of objectionable image characteristic data and the audio characteristic data is in a preset range of objectionable audio characteristic data, starts a deleting program to delete the video file to be detected; and when the image characteristic data is not in the preset range of objectionable image characteristic data and the audio characteristic data is not in the preset range of objectionable audio characteristic data, determines that the video file to be detected is a video file with healthy content.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
In the embodiment of the invention, a video file to be detected is obtained, and the audio and the video images in the video file to be detected are separated to obtain audio information and image information. The audio information is then converted into first text content and the image information into second text content, and the first text content and the second text content are merged and de-duplicated to obtain target text content. The target text content is then compared one by one with the sensitive vocabulary in the sensitive vocabulary list, the sensitive vocabulary in the target text content is found, and the total word number of all the sensitive vocabulary found in the target text content is obtained. The bad content proportion value of the video file to be detected is then obtained according to the total word number of the sensitive vocabulary and the total word number of the target text content, and the video file to be detected is processed according to the bad content proportion value. By merging and de-duplicating the first text content and the second text content, the embodiment of the invention ensures that the target text content contains no repeated content, improves the speed and accuracy of comparing the text content with the sensitive vocabulary list, improves the identification accuracy of bad content in the video, and reduces the misjudgment rate of bad videos.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of a method for detecting objectionable content in a video according to an embodiment of the present invention;
Fig. 2 is another schematic flowchart of a method for detecting objectionable content in a video according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention;
Fig. 5 is another schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, of embodiments of the present invention are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiment of the invention discloses a method for detecting bad content in a video, which is used for improving the identification accuracy of the bad content in the video and reducing the misjudgment rate of the bad video. The embodiment of the invention also discloses a device for detecting the bad content in the video.
The technical solution of the present invention will be described in detail with reference to the following embodiments.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a method for detecting objectionable content in a video according to an embodiment of the present invention; as shown in fig. 1, a method for detecting objectionable content in a video may include:
101. Acquire the video file to be detected.
It will be appreciated that the video content of the video file to be detected consists of audio and video images. In some embodiments, the video file to be detected may be a video file that a user is about to play, and content detection is performed on the video file before playback. Specifically: a playing instruction input by the user for the video file to be detected is received; based on the playing instruction, a video detection interface is called to start the device for detecting bad content in a video; and the step of acquiring the video file to be detected is executed. The video detection interface is implicitly associated with the detection device.
Further, after the playing instruction input by the user for the video file to be detected is received, and before the video detection interface is called based on the playing instruction to start the detection device, it is detected whether the user allows the video detection interface to be called. If the user does not allow it, the user is prompted to enable the function of calling the video detection interface. After the prompt, it is judged whether an enabling operation by the user for that function is received: if so, the step of calling the video detection interface based on the playing instruction to start the detection device is executed; if not, playback of the video file to be detected is refused.
In other embodiments, the video file to be detected is a video file that the user specifies for detection, and step 101 specifically includes: receiving a file name, input by the user, that corresponds to the video file to be detected, and searching a video library for the video file corresponding to the file name to obtain the video file to be detected.
Further, after the file name input by the user is received, and before the video library is searched for the corresponding video file, it is detected whether the user allows the video detection interface to be called, wherein the video detection interface is implicitly associated with the device for detecting bad content in a video. If the user does not allow it, the user is prompted to enable the function of calling the video detection interface. After the prompt, it is judged whether an enabling operation by the user for that function is received: if so, the step of searching the video library for the video file corresponding to the file name is executed; if not, the flow ends.
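The permission check described in both variants of step 101 follows the same pattern, which can be sketched as below. The function name and the `prompt_user_to_enable` callback are hypothetical; the callback stands for prompting the user and reporting whether an enabling operation was received.

```python
from typing import Callable


def may_run_detection(interface_allowed: bool,
                      prompt_user_to_enable: Callable[[], bool]) -> bool:
    """Return True when the video detection interface may be called."""
    if interface_allowed:
        return True
    # Calls are currently not allowed: prompt the user once and continue
    # only if the user performs the enabling operation.
    return prompt_user_to_enable()
```

On a `False` result the caller would refuse playback (first variant) or end the flow (second variant).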
102. Perform video and audio separation on the video file to be detected to obtain audio information and image information.
The audio information and the image information in the video file to be detected can be separated with video editing software. For example, the video file to be detected is imported into a video track (time axis) and the audio is split off, that is, the audio and the video images are separated. The audio is then saved as a file in an audio format to obtain the audio information, and the remaining video images are saved as image files to obtain the image information.
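The patent describes separation with video editing software. As one scripted alternative, the sketch below builds command lines for the `ffmpeg` tool; using ffmpeg at all is an assumption, as are the function name, the WAV output, and the one-frame-per-second sampling rate. The function only constructs the argument lists and does not run them.

```python
def build_separation_commands(video_path: str, audio_out: str,
                              frame_pattern: str, fps: float = 1.0):
    """Return (extract_audio, extract_frames) argument lists for ffmpeg."""
    # -vn drops the video stream; the audio is written as 16-bit PCM.
    extract_audio = ["ffmpeg", "-i", video_path, "-vn",
                     "-acodec", "pcm_s16le", audio_out]
    # -an drops the audio stream; the fps filter samples still images.
    extract_frames = ["ffmpeg", "-i", video_path, "-an",
                      "-vf", f"fps={fps}", frame_pattern]
    return extract_audio, extract_frames
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed, with a frame pattern such as `frames/%04d.png`.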
103. Convert the audio information into first text content and convert the image information into second text content.
As an optional implementation, converting the audio information into the first text content specifically includes:
converting the speech contained in the audio information into text in the time-axis order of the audio information. Specifically, speech is extracted from the audio information in time-axis order and converted into text through a speech-to-text (STT) function or algorithm; the text is then split into sentences and laid out according to the pauses of the speech in the audio information to obtain the first text content.
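The sentence-breaking step can be illustrated with a small sketch. It assumes the speech-to-text stage returns words with start and end times, which the patent does not specify; the `Word` type, the function name, and the 0.6-second pause threshold are all hypothetical.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    start: float  # seconds from the beginning of the audio
    end: float


def split_on_pauses(words: list[Word], min_pause: float = 0.6) -> list[str]:
    """Group time-ordered words into sentences, breaking at long pauses."""
    sentences: list[str] = []
    current: list[str] = []
    previous_end = None
    for word in words:
        # A gap of at least min_pause since the previous word ends a sentence.
        if current and previous_end is not None and word.start - previous_end >= min_pause:
            sentences.append(" ".join(current))
            current = []
        current.append(word.text)
        previous_end = word.end
    if current:
        sentences.append(" ".join(current))
    return sentences
```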
As an optional implementation, converting the image information into the second text content specifically includes:
recognizing the image information with an image recognition tool to convert it into text content, thereby obtaining the second text content.
104. Merge and de-duplicate the first text content and the second text content to obtain the target text content.
The first text content and the second text content are merged and then de-duplicated, that is, repeated content is removed, so that the target text content contains no repeated content.
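A minimal sketch of merging and de-duplication follows. The patent does not state the granularity at which duplicates are detected; treating each text content as a list of lines and comparing lines after normalising case and whitespace is an assumption, as is the function name.

```python
def merge_and_deduplicate(first_text: list[str], second_text: list[str]) -> list[str]:
    """Merge two lists of text lines and drop repeated lines, keeping order."""
    seen: set[str] = set()
    target: list[str] = []
    for line in first_text + second_text:
        # Normalise whitespace and case so trivially different copies match.
        key = " ".join(line.split()).lower()
        if key and key not in seen:
            seen.add(key)
            target.append(line.strip())
    return target
```

A subtitle that is both spoken and shown on screen is a typical case where the audio-derived and image-derived text repeat each other.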
105. Compare the target text content with the sensitive vocabulary list, find the sensitive vocabulary in the target text content, and obtain the total word number of the sensitive vocabulary.
The sensitive vocabulary list is established in advance. Specifically, a basic database of sensitive vocabulary covering bad content such as violence, reactionary material, and fraud can be established. Sensitive vocabulary related to these categories is automatically captured from the network and stored in the basic database, after which vocabulary similar or related to the sensitive vocabulary is captured and stored as well; alternatively, unhealthy vocabulary fed back by users is obtained and stored in the basic database as sensitive vocabulary. Finally, the sensitive vocabulary in the basic database can be reviewed manually, and the sensitive vocabulary list is built from the vocabulary that is finally confirmed.
106. Obtain a bad content proportion value of the video file to be detected according to the total word number of the sensitive vocabulary and the total word number of the target text content.
The calculation formula of the bad content proportion value K is as follows:
K=N/M;
where N is the total word number of all the sensitive words from the sensitive vocabulary list that appear in the target text content, and M is the total word number of the target text content.
For example, if comparison with the sensitive vocabulary list shows that the target text content includes 5 sensitive words, of which 2 consist of 2 words (characters) each and the other 3 consist of 3 words (characters) each, then the total word number of all the sensitive words in the target text content is: 2 × 2 + 3 × 3 = 13.
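The calculation K = N/M can be sketched as follows. The function name is hypothetical. The sketch counts characters as the unit of "word number" and counts each sensitive term found once, both of which are assumptions about how the translated text should be read; the placeholder terms merely have the lengths used in the worked example.

```python
def bad_content_ratio(target_text: str, sensitive_terms: list[str]) -> float:
    """Return K = N / M for a target text and a sensitive vocabulary list."""
    found = [term for term in sensitive_terms if term in target_text]
    n = sum(len(term) for term in found)  # N: total length of the terms found
    m = len(target_text)                  # M: total length of the target text
    return n / m if m else 0.0


# Worked example: two 2-character terms and three 3-character terms give N = 13.
terms = ["ab", "cd", "efg", "hij", "klm", "zzz"]  # "zzz" does not occur
text = "ab-cd-efg-hij-klm-" + "x" * 34            # M = 52
print(bad_content_ratio(text, terms))             # 13 / 52 = 0.25
```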
107. Process the video file to be detected according to the bad content proportion value.
As an optional implementation manner, processing the video file to be detected according to the bad content proportion value specifically includes:
when the bad content proportion value is smaller than or equal to a preset threshold value, determining that the video file to be detected is a video file with healthy content; and when the bad content proportion value is larger than the preset threshold value, starting a deleting program to delete the video file to be detected. In this embodiment, when the bad content proportion value is greater than the preset threshold value, it is determined that the video file to be detected contains a large amount of objectionable content, exceeding the preset acceptable range, and playback of the video file is prohibited. A deleting program is therefore started to delete the video file to be detected, which prevents the video file with objectionable content from spreading over the network.
In the embodiment of the invention, a video file to be detected is obtained, and the audio and the video images in the video file to be detected are separated to obtain audio information and image information. The audio information is then converted into first text content and the image information into second text content, and the first text content and the second text content are merged and de-duplicated to obtain target text content. The target text content is then compared one by one with the sensitive vocabulary in the sensitive vocabulary list, the sensitive vocabulary in the target text content is found, and the total word number of all the sensitive vocabulary found in the target text content is obtained. The bad content proportion value of the video file to be detected is then obtained according to the total word number of the sensitive vocabulary and the total word number of the target text content, and the video file to be detected is processed according to the bad content proportion value. By merging and de-duplicating the first text content and the second text content, the embodiment of the invention ensures that the target text content contains no repeated content, improves the speed and accuracy of comparing the text content with the sensitive vocabulary list, improves the identification accuracy of bad content in the video, and reduces the misjudgment rate of bad videos.
Example two
Referring to fig. 2, fig. 2 is another schematic flow chart illustrating a method for detecting objectionable content in a video according to an embodiment of the present invention; as shown in fig. 2, a method for detecting objectionable content in a video may include:
201. Acquiring the video file to be detected.
Reference may be made to the detailed description in step 101, which is not described herein again.
202. And carrying out video and audio separation on the video file to be detected to obtain audio information and image information.
As an optional implementation manner, after the video file to be detected is acquired in step 201, and before the video and audio separation is performed on the video file to be detected in step 202 to obtain the audio information and the image information, the embodiment of the present invention further includes:
acquiring a file name of a video file to be detected;
comparing the file name with a sensitive vocabulary list;
when the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number, starting a deleting program to delete the video file to be detected;
and when the number of the sensitive words contained in the file name does not reach the preset number, performing video and audio separation on the video file to be detected to obtain audio information and image information.
In the above embodiment, after the video file to be detected is obtained, the file name of the video file to be detected is further obtained, and if it is determined that the file name includes a certain number of sensitive words and the sensitive words are sensitive words in the sensitive word list, the video file to be detected is directly deleted.
It can also be understood that the file name is compared with the sensitive vocabulary list to obtain the sensitive vocabulary of the file name, then the total word number of all the sensitive vocabularies in the file name is obtained, the total word number is compared with the total word number of the file name to obtain a proportional value, and if the proportional value exceeds the specified value, the video file to be detected is deleted. If the value is less than or equal to the specified value, the step of performing video and audio separation on the video file to be detected to obtain audio information and image information can be further executed.
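Both file-name checks described above (the count-based check and the ratio-based check) can be illustrated as follows. This is a hedged sketch: the tokenization rule, the function names, and the parameter names `preset_number` and `specified_value` are assumptions introduced for the example.

```python
import os
import re
from typing import List, Set


def filename_tokens(path: str) -> List[str]:
    """Split the file name (without extension) into lower-cased tokens.

    Assumption: tokens are separated by non-alphanumeric characters or underscores.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    return [t.lower() for t in re.split(r"[\W_]+", stem) if t]


def filename_hits(path: str, sensitive_words: Set[str]) -> int:
    """Number of file-name tokens that appear in the sensitive vocabulary list."""
    return sum(1 for t in filename_tokens(path) if t in sensitive_words)


def reject_by_count(path: str, sensitive_words: Set[str], preset_number: int) -> bool:
    """Count-based check: delete when the hits reach the preset number."""
    return filename_hits(path, sensitive_words) >= preset_number


def reject_by_ratio(path: str, sensitive_words: Set[str], specified_value: float) -> bool:
    """Ratio-based check: delete when hits / total tokens exceeds the specified value."""
    tokens = filename_tokens(path)
    if not tokens:
        return False
    return filename_hits(path, sensitive_words) / len(tokens) > specified_value
```

When either function returns `False`, the flow would continue to the video and audio separation step.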
As another optional implementation manner, after the video file to be detected is acquired in step 201, and before the video and audio separation is performed on the video file to be detected in step 202 to obtain the audio information and the image information, the embodiment of the present invention further includes:
acquiring source information of a video file to be detected;
judging whether a source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not;
if they match, starting a deleting program to delete the video file to be detected;
and if not, executing the step of carrying out video and audio separation on the video file to be detected to obtain audio information and image information.
The source information includes a source Internet Protocol (IP) address, a source gateway, and the like. In this embodiment, the source information of the video file to be detected is used to preliminarily judge whether the source of the video file to be detected is a legal source; if so, the step of carrying out video and audio separation on the video file to be detected to obtain audio information and image information is further executed. If the video file to be detected comes from an illegal source, it is a video prohibited from being played, and it is directly deleted to prevent the illegal video from being transmitted on the network.
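A source check of this kind might look like the sketch below, which uses Python's standard `ipaddress` module. The function name is hypothetical, and allowing list entries to be either single addresses or CIDR ranges is an assumption; the text only speaks of a list of illegal source addresses.

```python
import ipaddress
from typing import Iterable


def is_illegal_source(source_ip: str, illegal_sources: Iterable[str]) -> bool:
    """Return True when source_ip matches an entry in the illegal source list.

    Assumption: each entry is a single address ("203.0.113.7") or a CIDR
    range ("198.51.100.0/24"); a single address is treated as a one-host network.
    """
    addr = ipaddress.ip_address(source_ip)
    for entry in illegal_sources:
        network = ipaddress.ip_network(entry, strict=False)
        if addr.version == network.version and addr in network:
            return True
    return False
```

A `True` result would trigger deletion; a `False` result would let the video and audio separation step proceed.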
As another optional implementation manner, after the video file to be detected is acquired in step 201, and before the video and audio separation is performed on the video file to be detected in step 202 to obtain the audio information and the image information, the embodiment of the present invention further includes:
acquiring source information of a video file to be detected;
judging whether a source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not;
if they match, starting a deleting program to delete the video file to be detected;
if not, acquiring the file name of the video file to be detected;
comparing the file name with a sensitive vocabulary list;
when the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number, starting a deleting program to delete the video file to be detected;
and when the number of the sensitive words contained in the file name does not reach the preset number, performing video and audio separation on the video file to be detected to obtain audio information and image information.
Through this embodiment, the source and the file name of the video file to be detected can be combined to perform preliminary filtering on the video file to be detected, and the misjudgment rate of bad videos can be reduced through multi-layer identification.
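The combined pre-filter (source first, then file name) can be sketched as a single function. The names, the exact-match source test, and the substring-based counting of distinct sensitive words are simplifying assumptions for illustration only.

```python
from typing import Set


def prefilter(source_address: str, file_name: str, illegal_sources: Set[str],
              sensitive_words: Set[str], preset_number: int) -> str:
    """Apply the source check, then the file-name check, and report the outcome."""
    # Layer 1: a source found in the illegal source list leads straight to deletion.
    if source_address in illegal_sources:
        return "delete (illegal source)"
    # Layer 2: count distinct sensitive words occurring in the file name.
    # Assumption: a word counts when it appears as a substring of the name.
    name = file_name.lower()
    hits = sum(1 for word in sensitive_words if word in name)
    if hits >= preset_number:
        return "delete (file name)"
    # Neither layer rejected the file, so the main detection flow continues.
    return "separate audio and video"
```

Ordering the cheaper source check first means the file name is only examined for files whose source is not already listed.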
203. The audio information is converted into a first text content and the image information is converted into a second text content.
204. And merging and de-duplicating the first text content and the second text content to obtain the target text content.
205. And comparing the target text content with the sensitive vocabulary list, finding out the sensitive vocabulary in the target text content and obtaining the total word number of the sensitive vocabulary.
206. And obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive words and the total word number of the target text content.
207. And when the bad content proportion value is larger than a preset threshold value, starting a deleting program to delete the video file to be detected.
208. And when the bad content proportion value is smaller than or equal to a preset threshold value, extracting a plurality of continuous key frames from the video file to be detected, wherein the plurality of continuous key frames present a certain key scene in the video file to be detected.
In the embodiment of the invention, when the proportion value of the objectionable content is less than or equal to the preset threshold value, a method for correctly processing the video file to be detected is further obtained by combining a certain key scene, and the identification accuracy rate of the objectionable content in the video is improved.
209. And acquiring the average motion intensity of the shots in a certain key scene.
The average motion intensity of the shots is equal to the ratio of the sum of the motion intensities of all the shots in the scene to the number of the shots in the scene, and the specific calculation method is the prior art and is not described herein again.
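Written as a formula, with symbols that are introduced here only for illustration ($M_i$ for the motion intensity of the $i$-th shot and $N$ for the number of shots in the scene), the stated ratio is:

```latex
\bar{M} = \frac{1}{N} \sum_{i=1}^{N} M_i
```

How each $M_i$ is computed is left to the prior art, as noted above.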
210. And judging whether the average motion intensity is greater than a preset intensity value.
And if the average motion intensity is less than or equal to the preset intensity value, ending the process.
211. If the motion intensity is greater than a preset intensity value, image feature data and audio feature data are extracted from a plurality of consecutive key frames.
Wherein the image feature data comprises image feature data for each key frame and the audio feature data comprises audio feature data for the scene.
Specifically, the image feature data of each key frame comprises a color histogram of each key frame, and extracting the image feature data from a plurality of consecutive key frames comprises: and extracting a color histogram of each frame of image from a plurality of continuous key frames.
Specifically, the audio feature data includes a sample vector and a covariance matrix of the audio data. Further, the audio feature data may also include an energy entropy of the audio data.
212. And when the image characteristic data is in the preset range of the objectionable image characteristic data and the audio characteristic data is in the preset range of the objectionable audio characteristic data, starting a deleting program to delete the video file to be detected.
When the image feature data of each key frame comprises the color histogram of each key frame, determining that the image feature data is in a preset poor image feature data range comprises:
and when the statistical number of the preset number of colors in the color histogram of the key frame is determined to be within the statistical number range of the corresponding colors in the color histogram of the video frame extracted from the specific scene in advance, determining that the image characteristic data is within the preset range of the poor image characteristic data.
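One possible reading of the color histogram test is sketched below in plain Python. The quantization into a fixed number of bins per channel, the function names, and the rule that at least `preset_number` bins must fall inside their reference ranges are all assumptions; the text does not fix these details.

```python
from typing import Dict, Iterable, List, Tuple


def color_histogram(pixels: Iterable[Tuple[int, int, int]],
                    bins_per_channel: int = 4) -> List[int]:
    """Count pixels per quantised RGB bin; pixels are (r, g, b) tuples in 0..255."""
    step = 256 // bins_per_channel
    top = bins_per_channel - 1
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Clamp each channel index so values near 255 stay inside the last bin.
        ri, gi, bi = min(r // step, top), min(g // step, top), min(b // step, top)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1
    return hist


def image_in_reference_range(hist: List[int],
                             reference_ranges: Dict[int, Tuple[int, int]],
                             preset_number: int) -> bool:
    """True when at least preset_number bins fall inside their reference ranges.

    reference_ranges maps a bin index to a (low, high) count range that is
    assumed to have been measured in advance on frames from the specific scene.
    """
    matches = sum(1 for idx, (low, high) in reference_ranges.items()
                  if low <= hist[idx] <= high)
    return matches >= preset_number
```

In practice the counts would probably be normalized by frame size before comparison, so that frames of different resolutions can share one set of reference ranges.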
Determining that the audio characteristic data is within a preset range of objectionable audio characteristic data comprises: and calculating a sample vector and a covariance matrix of the audio data in the scene, and when the similarity between the sample vector and the covariance matrix of the audio data in the scene and the sample vector and the covariance matrix of the audio data extracted from the specific scene in advance is larger than a third preset threshold value, determining that the audio characteristic data is in a preset range of bad audio characteristic data.
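The comparison of sample vectors and covariance matrices could be sketched as follows. The text does not define the similarity measure, so this example makes several assumptions: the "sample vector" is read as the sample mean of per-frame audio feature vectors, the covariance is the unbiased sample covariance, and similarity is the cosine similarity of the concatenated mean and flattened covariance.

```python
import math
from typing import List, Sequence


def mean_vector(frames: Sequence[Sequence[float]]) -> List[float]:
    """Sample mean of per-frame feature vectors (assumed reading of 'sample vector')."""
    n, d = len(frames), len(frames[0])
    return [sum(f[j] for f in frames) / n for j in range(d)]


def covariance_matrix(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Unbiased sample covariance matrix; needs at least two frames."""
    n, d = len(frames), len(frames[0])
    mu = mean_vector(frames)
    return [[sum((f[i] - mu[i]) * (f[j] - mu[j]) for f in frames) / (n - 1)
             for j in range(d)] for i in range(d)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def audio_similarity(frames: Sequence[Sequence[float]],
                     reference_frames: Sequence[Sequence[float]]) -> float:
    """Hypothetical similarity: cosine of [mean, flattened covariance] signatures."""
    def signature(fr: Sequence[Sequence[float]]) -> List[float]:
        return mean_vector(fr) + [v for row in covariance_matrix(fr) for v in row]
    return cosine(signature(frames), signature(reference_frames))
```

The resulting value would then be compared against the third preset threshold; any other similarity or distance measure could be substituted without changing the surrounding flow.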
When the audio feature data further includes energy entropy of the audio data, determining that the audio feature data is within a preset objectionable audio feature data range includes: dividing the audio data in the scene into multiple sections, calculating the energy entropy of each section of audio data, and determining that the audio characteristic data is in a preset range of bad audio characteristic data when the energy entropy of at least one section of audio data in the energy entropies of the multiple sections of audio data is smaller than a fourth preset threshold value.
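The energy entropy test can be illustrated with a commonly used definition: split a segment into short blocks, normalize the block energies into a probability distribution, and take its Shannon entropy. The block count, the function names, and this particular definition are assumptions; the text does not spell out the formula.

```python
import math
from typing import List, Sequence


def energy_entropy(segment: Sequence[float], n_blocks: int = 8) -> float:
    """Shannon entropy (in bits) of the normalised block energies of one segment."""
    block_len = len(segment) // n_blocks
    if block_len == 0:
        return 0.0
    energies = [sum(x * x for x in segment[i * block_len:(i + 1) * block_len])
                for i in range(n_blocks)]
    total = sum(energies)
    if total == 0:
        return 0.0  # silent segment: treated as zero entropy in this sketch
    return sum(-(e / total) * math.log2(e / total) for e in energies if e > 0)


def audio_flagged_by_entropy(samples: Sequence[float], n_segments: int,
                             threshold: float) -> bool:
    """True when at least one segment has energy entropy below the threshold."""
    seg_len = len(samples) // n_segments
    segments: List[Sequence[float]] = [samples[i * seg_len:(i + 1) * seg_len]
                                       for i in range(n_segments)]
    return any(energy_entropy(s) < threshold for s in segments)
```

A steady signal spreads its energy evenly across blocks and yields high entropy, while a sudden burst concentrates energy in a few blocks and yields low entropy. Note that this sketch also assigns zero entropy to silent segments, so a real system would likely need to exclude silence before applying the fourth preset threshold.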
213. And when the image characteristic data is not in the preset range of the objectionable image characteristic data and the audio characteristic data is not in the preset range of the objectionable audio characteristic data, determining that the video file to be detected is a video file with healthy content.
It can be seen that, in the embodiment of the present invention, when the bad content proportion value is less than or equal to the preset threshold, the audio information and the image information in the video file to be detected are further analyzed to assess the proportion of objectionable content they contain, thereby improving the accuracy of determining the objectionable video.
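Steps 207 to 213 amount to a small decision procedure, sketched below. The function name and return labels are hypothetical. The steps as written do not say what happens when only one of the two feature checks falls in its objectionable range, so the sketch reports that case as "undetermined" rather than guessing.

```python
def second_stage_decision(ratio: float, threshold: float,
                          avg_motion: float, preset_intensity: float,
                          image_bad: bool, audio_bad: bool) -> str:
    """Decision flow of steps 207-213, given results of the individual checks."""
    if ratio > threshold:
        return "delete"        # step 207
    if avg_motion <= preset_intensity:
        return "end"           # step 210: low motion ends the process
    if image_bad and audio_bad:
        return "delete"        # step 212
    if not image_bad and not audio_bad:
        return "healthy"       # step 213
    return "undetermined"      # mixed result: not covered by the steps above
```

Here `image_bad` and `audio_bad` stand for "the feature data is within the preset objectionable range", however those ranges are evaluated.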
Example three
Referring to fig. 3, fig. 3 is a schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention; as shown in fig. 3, an apparatus for detecting objectionable content in a video may comprise:
an obtaining unit 310, configured to obtain a video file to be detected;
the separation unit 320 is configured to perform video and audio separation on the video file to be detected, so as to obtain audio information and image information;
a text conversion unit 330 for converting the audio information into first text content and converting the image information into second text content;
a merging and deduplication unit 340, configured to merge and deduplicate the first text content and the second text content to obtain a target text content;
the searching unit 350 is configured to compare the target text content with the sensitive vocabulary list, search for a sensitive vocabulary in the target text content, and obtain a total word number of the sensitive vocabulary;
the calculating unit 360 is used for obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive vocabulary and the total word number of the target text content;
and the processing unit 370 is configured to process the video file to be detected according to the objectionable content ratio value.
In the embodiment of the present invention, the obtaining unit 310 obtains a video file to be detected, the separating unit 320 separates the audio and the video images in the video file to be detected to obtain audio information and image information, the text converting unit 330 converts the audio information into a first text content and the image information into a second text content, respectively, and the merging and deduplication unit 340 merges the first text content and the second text content and then deduplicates the merged content to obtain a target text content. Then, the searching unit 350 compares the target text content with the sensitive words in the sensitive vocabulary list one by one, searches for the sensitive words in the target text content and obtains the total word count of all the sensitive words found in the target text content, the calculating unit 360 further obtains the bad content proportion value of the video file to be detected according to the total word count of the sensitive words and the total word count of the target text content, and the processing unit 370 processes the video file to be detected according to the bad content proportion value. By merging and de-duplicating the first text content and the second text content, the embodiment of the invention can ensure the uniqueness of the content in the target text content, improve the speed and accuracy of comparing the text content with the sensitive vocabulary list, improve the identification accuracy of the bad content in the video and reduce the misjudgment rate of the bad video.
As an optional implementation manner, the video file to be detected may be a video file to be played by a user, content detection is performed on the video file before playing, and the manner for acquiring the video file to be detected by the acquiring unit 310 is specifically: the obtaining unit 310 is configured to receive a play instruction input by a user for the video file to be detected, and based on the play instruction, invoke a video detection interface to enable a detection device for detecting undesirable content in a video, and execute obtaining of the video file to be detected, where the video detection interface is implicitly associated with the detection device for detecting undesirable content in the video.
Further, after receiving a play instruction input by a user for the video file to be detected and before invoking a video detection interface to enable a detection device for objectionable content in a video based on the play instruction, the obtaining unit 310 detects whether the user allows invoking the video detection interface, and if the user does not allow invoking the video detection interface, prompts the user to enable the function of invoking the video detection interface; after prompting the user, the obtaining unit 310 judges whether an enabling operation by the user for the function of invoking the video detection interface is received; if so, it invokes the video detection interface based on the play instruction to enable the detection device for objectionable content in the video; if not, playback of the video file to be detected is refused.
In other embodiments, the video file to be detected is a video file that needs to be detected and is specified by a user, and the obtaining unit 310 is specifically configured to receive a file name corresponding to the video file to be detected and input by the user, and search the video file to be detected and corresponding to the file name in a video library to obtain the video file to be detected.
Further, after receiving a file name corresponding to the video file to be detected input by the user and before searching the video file to be detected corresponding to the file name in the video library to obtain the video file to be detected, the obtaining unit 310 detects whether the user allows to invoke a video detection interface, where the video detection interface is implicitly associated with a detection device for undesirable content in the video, and if the user does not allow to invoke the video detection interface, prompts the user to enable a function of invoking the video detection interface; after prompting the user to enable the function of calling the video detection interface, judging whether an enabling operation of the user for calling the function of the video detection interface is received, if so, searching the video library for the video file to be detected corresponding to the file name to obtain the video file to be detected; if not, the flow ends.
As an optional implementation manner, the processing unit 370 is configured to process the video file to be detected according to the objectionable content ratio value in a specific manner:
the processing unit 370 is configured to determine that the video file to be detected is a video file with healthy content when the calculating unit 360 determines that the ratio of the objectionable content is smaller than or equal to the preset threshold; when the calculating unit 360 determines that the ratio of the bad content is greater than the preset threshold, a deleting program is started to delete the video file to be detected.
In the above embodiment, when the ratio of the objectionable content is greater than the preset threshold, it indicates that the video file to be detected contains most objectionable content, and the objectionable content exceeds the preset acceptable range (preset threshold), the video file to be detected is prohibited from being played, so as to start a deletion program and delete the video file to be detected, so as to prevent the video file of the objectionable content from being transmitted over the network.
As an optional implementation manner, the manner for converting the audio information into the first text content by the merging and deduplication unit 340 is specifically: the merging and deduplication unit 340 is configured to convert the speech contained in the audio information into text according to the time axis sequence of the audio information. Specifically, the merging and deduplication unit 340 is configured to sequentially extract voices from the audio information according to a time axis sequence of the audio information, convert the voices into texts through a Speech-to-text (STT) function or algorithm, and perform text sentence-breaking and typesetting according to pauses of the voices in the audio information to obtain the first text content.
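The pause-based sentence breaking can be sketched independently of any particular speech-to-text engine. The example assumes a hypothetical engine that returns each recognized word with start and end times in seconds; the function name and the 0.6 second pause threshold are illustrative choices, not values from the text.

```python
from typing import Iterable, List, Tuple


def break_sentences(words: Iterable[Tuple[str, float, float]],
                    pause_seconds: float = 0.6) -> List[str]:
    """Group (text, start, end) words into sentences, splitting at long pauses.

    Words are processed in time-axis order; a new sentence starts whenever the
    silence between two consecutive words is at least pause_seconds.
    """
    sentences: List[str] = []
    current: List[str] = []
    prev_end = None
    for text, start, end in sorted(words, key=lambda w: w[1]):
        if current and prev_end is not None and start - prev_end >= pause_seconds:
            sentences.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        sentences.append(" ".join(current))
    return sentences
```

Joining the returned sentences with line breaks would give one possible form of the first text content.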
As an optional implementation manner, the manner for converting the image information into the second text content by the merging and deduplication unit 340 is specifically: the merging and deduplication unit 340 is configured to recognize the image information through an image recognition tool to convert the image information into text content, so as to obtain the second text content.
Example four
Referring to fig. 4, fig. 4 is another schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention; the apparatus for detecting the defective content in the video shown in fig. 4 is optimized based on the apparatus for detecting the defective content in the video shown in fig. 3, and as shown in fig. 4, the apparatus for detecting the defective content in the video further includes:
the name detection unit 410 is configured to, after the acquisition unit 310 acquires the video file to be detected, and before the separation unit 320 performs video-audio separation on the video file to be detected to obtain audio information and image information, acquire a file name of the video file to be detected, and compare the file name with the sensitive vocabulary list;
the processing unit 370 is further configured to, when the name detecting unit 410 determines that the file name includes a sensitive vocabulary in the sensitive vocabulary list and the number of the included sensitive vocabularies reaches a preset number, start a deleting program to delete the video file to be detected;
the separating unit 320 is further configured to, when the name detecting unit 410 determines that the number of the sensitive words included in the file name does not reach the preset number, perform video and audio separation on the video file to be detected to obtain audio information and image information.
The separating unit 320 performs video and audio separation on the video file to be detected, and the manner of obtaining the audio information and the image information specifically includes: the separating unit 320 imports the video file to be detected into a video track (time axis) of the video editing software, then splits off the audio, i.e., separates the audio from the video images, then stores the audio as a file in the corresponding audio format to obtain audio information, and stores the video images as image files to obtain image information.
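As an alternative to interactive video editing software, the same separation could be scripted with the `ffmpeg` command-line tool. This is a substitute technique, not the one described above. The sketch below only builds the command lines; the function name, output file names, and the one-frame-per-second sampling rate are assumptions.

```python
from typing import List, Tuple


def separation_commands(video_path: str,
                        audio_out: str = "audio.wav",
                        frame_pattern: str = "frame_%04d.png",
                        frames_per_second: int = 1) -> Tuple[List[str], List[str]]:
    """Build ffmpeg command lines for extracting the audio track and still images."""
    # -vn drops the video stream, leaving only audio in the output file.
    audio_cmd = ["ffmpeg", "-i", video_path, "-vn", audio_out]
    # -an drops the audio stream; the fps filter samples frames at a fixed rate.
    image_cmd = ["ffmpeg", "-i", video_path, "-an",
                 "-vf", f"fps={frames_per_second}", frame_pattern]
    return audio_cmd, image_cmd
```

On a machine with `ffmpeg` installed, each list could be passed to `subprocess.run(cmd, check=True)` to produce the audio file and the image files.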
Example five
Referring to fig. 5, fig. 5 is another schematic structural diagram of an apparatus for detecting objectionable content in a video according to an embodiment of the present invention; the apparatus for detecting the defective content in the video shown in fig. 5 is optimized based on the apparatus for detecting the defective content in the video shown in fig. 3, and as shown in fig. 5, the apparatus for detecting the defective content in the video further includes:
a source detecting unit 510, configured to, after the obtaining unit 310 obtains the video file to be detected, and before the separating unit 320 performs video and audio separation on the video file to be detected, and obtains audio information and image information, obtain source information of the video file to be detected, and determine whether a source address indicated by the source information matches with an illegal source address in a preset illegal source address list;
the processing unit 370 is further configured to, when the determination result of the source detecting unit 510 is a match, start a deleting program to delete the video file to be detected;
the separating unit 320 is further configured to, when the determination result of the source detecting unit 510 is not matching, perform video and audio separation on the video file to be detected to obtain audio information and image information.
As an optional implementation manner, the processing unit 370 is configured to process the video file to be detected according to the objectionable content ratio value in a specific manner:
the processing unit 370 is configured to, when the calculating unit 360 determines that the ratio of the objectionable content is greater than the preset threshold, start a deleting program to delete the video file to be detected;
when the calculating unit 360 determines that the bad content proportion value is smaller than or equal to the preset threshold value, extracting video key frames from the video file to be detected; extracting motion characteristic information of the video key frames, wherein the motion characteristic information is used for representing the motion intensity presented by the shots of the video key frames; judging whether the motion intensity is greater than a preset intensity value; if the motion intensity is larger than the preset intensity value, extracting image characteristic data and audio characteristic data from the video key frames; when the image characteristic data is in a preset range of the objectionable image characteristic data and the audio characteristic data is in a preset range of the objectionable audio characteristic data, starting a deleting program to delete the video file to be detected; and when the image characteristic data is not in the preset range of the objectionable image characteristic data and the audio characteristic data is not in the preset range of the objectionable audio characteristic data, determining that the video file to be detected is a video file with healthy content.
When the image feature data of each key frame includes the color histogram of each key frame, the manner for determining that the image feature data is in the preset poor image feature data range by the processing unit 370 is specifically:
the processing unit 370 determines that the image feature data is within the preset range of poor image feature data when determining that the statistical number of the preset number of colors in the color histogram of the key frame is within the statistical number range of the corresponding colors in the color histogram of the video frame extracted from the specific scene in advance.
The processing unit 370 is configured to determine that the audio characteristic data is within the preset poor audio characteristic data range specifically as follows: the processing unit 370 is configured to calculate a sample vector and a covariance matrix of the audio data in the scene, and when it is determined that the similarity between the sample vector and the covariance matrix of the audio data in the scene and the sample vector and the covariance matrix of the audio data extracted from the specific scene in advance is greater than a third preset threshold, it is determined that the audio feature data is within a preset range of bad audio feature data.
When the audio feature data further includes the energy entropy of the audio data, the processing unit 370 is configured to determine that the audio feature data is within the preset range of the objectionable audio feature data by: the processing unit 370 is configured to divide the audio data in the scene into multiple segments, calculate an energy entropy of each segment of the audio data, and determine that the audio feature data is within a preset range of objectionable audio feature data when the energy entropy of at least one segment of the audio data in the multiple segments of the audio data is smaller than a fourth preset threshold.
By implementing the device, merging and de-duplicating the first text content and the second text content ensures the uniqueness of the content in the target text content, improves the speed and accuracy of comparing the text content with the sensitive vocabulary list, improves the identification accuracy of the bad content in the video, and reduces the misjudgment rate of the bad video. In addition, when the bad content proportion value is smaller than or equal to the preset threshold value, the audio information and the image information in the video file to be detected are further analyzed to assess the proportion of the objectionable content contained in the video file, which improves the accuracy of determining objectionable videos.
It will be understood by those skilled in the art that all or part of the steps in the methods of the embodiments described above may be implemented by program instructions, which may be stored in a computer-readable storage medium, where the storage medium includes Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other memory, magnetic disk, magnetic tape, or any other medium which can be used to carry or store data and which can be read by a computer.
The method and the device for detecting the bad content in the video disclosed by the embodiment of the invention have been described in detail above. Specific examples are used herein to explain the principle and the implementation of the invention, and the description of the embodiments is only intended to help in understanding the method and the core idea of the invention; meanwhile, for a person skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A method for detecting objectionable content in a video, comprising:
acquiring a video file to be detected;
carrying out video and audio separation on the video file to be detected to obtain audio information and image information;
converting voice contained in the audio information into text according to a time axis sequence of the audio information, converting the audio information into first text content, identifying image information to convert the image information into text content, and converting the image information into second text content;
merging and de-duplicating the first text content and the second text content to obtain target text content;
comparing the target text content with the sensitive vocabulary list, finding out the sensitive vocabulary in the target text content and obtaining the total word number of the sensitive vocabulary;
obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive words and the total word number of the target text content;
and processing the video file to be detected according to the bad content proportion value.
2. The method according to claim 1, wherein said processing the video file to be detected according to the bad content ratio value comprises:
when the bad content proportion value is smaller than or equal to a preset threshold value, determining that the video file to be detected is a video file with healthy content; and when the bad content proportion value is larger than the preset threshold value, starting a deleting program to delete the video file to be detected.
3. The method according to claim 1 or 2, wherein after the acquiring the video file to be detected, and before the performing video-audio separation on the video file to be detected to obtain audio information and image information, the method further comprises:
acquiring the file name of the video file to be detected;
comparing the file name with the sensitive vocabulary list;
when the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number, starting a deleting program to delete the video file to be detected;
and when the number of the sensitive words contained in the file name does not reach the preset number, executing the step of performing video and audio separation on the video file to be detected to obtain audio information and image information.
4. The method according to claim 1 or 2, wherein after the acquiring the video file to be detected, and before the performing video-audio separation on the video file to be detected to obtain audio information and image information, the method further comprises:
acquiring source information of the video file to be detected;
judging whether the source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not;
if they match, starting a deleting program to delete the video file to be detected;
and if not, executing the step of carrying out video and audio separation on the video file to be detected to obtain audio information and image information.
5. The method according to claim 1, wherein said processing the video file to be detected according to the bad content ratio value comprises:
when the bad content proportion value is larger than a preset threshold value, starting a deleting program to delete the video file to be detected;
when the bad content proportion value is smaller than or equal to a preset threshold value, extracting a plurality of continuous key frames from the video file to be detected, wherein the plurality of continuous key frames present a certain key scene in the video file to be detected;
acquiring the average motion intensity of the shots in the key scene;
judging whether the average motion intensity is greater than a preset intensity value;
if the average motion intensity is greater than the preset intensity value, extracting image characteristic data and audio characteristic data from the plurality of continuous key frames;
when the image characteristic data is in a preset range of objectionable image characteristic data and the audio characteristic data is in a preset range of objectionable audio characteristic data, starting a deleting program to delete the video file to be detected;
and when the image characteristic data is not in the preset range of objectionable image characteristic data and the audio characteristic data is not in the preset range of objectionable audio characteristic data, determining the video file to be detected to be a video file with healthy content.
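The two-stage decision in claim 5 can be summarized in a short sketch. It assumes that the motion intensity and the image and audio characteristic data have already been reduced to single numbers, and that each "preset range" is a closed interval; both are simplifications for illustration. The names are hypothetical. Branches the claim leaves open (motion intensity not above the preset value, or only one of the two features in range) return "unspecified" rather than a guessed outcome.

```python
from dataclasses import dataclass


@dataclass
class SceneFeatures:
    motion_intensity: float  # average motion intensity of the key scene
    image_score: float       # scalar stand-in for image characteristic data
    audio_score: float       # scalar stand-in for audio characteristic data


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def process_video(ratio, threshold, scene, intensity_limit, image_bounds, audio_bounds):
    # Stage 1: text-based bad content proportion value.
    if ratio > threshold:
        return "delete"
    # Stage 2: key-scene analysis, only for scenes with strong motion.
    if scene.motion_intensity <= intensity_limit:
        return "unspecified"  # the claim does not define this branch
    image_bad = in_range(scene.image_score, image_bounds)
    audio_bad = in_range(scene.audio_score, audio_bounds)
    if image_bad and audio_bad:
        return "delete"
    if not image_bad and not audio_bad:
        return "healthy"
    return "unspecified"  # mixed result is not covered by the claim
```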
6. An apparatus for detecting objectionable content in a video, comprising:
the acquisition unit is used for acquiring a video file to be detected;
the separation unit is used for carrying out video and audio separation on the video file to be detected to obtain audio information and image information;
a text conversion unit for converting the audio information into a first text content and converting the image information into a second text content;
the merging and de-duplicating unit is used for converting the voice contained in the audio information into first text content according to the time shaft sequence of the audio information, identifying the image information to convert the image information into second text content, and merging and de-duplicating the first text content and the second text content to obtain target text content;
the searching unit is used for comparing the target text content with the sensitive vocabulary list, searching the sensitive vocabulary in the target text content and obtaining the total word number of the sensitive vocabulary;
the calculating unit is used for obtaining a bad content proportion value of the video file to be detected according to the total word number of the sensitive vocabulary and the total word number of the target text content;
and the processing unit is used for processing the video file to be detected according to the bad content proportion value.
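The work of the merging and de-duplicating, searching, and calculating units can be illustrated with a small sketch. It assumes the speech-recognition and image-recognition outputs are already available as lists of text segments, that de-duplication means dropping repeated segments, and that words are separated by whitespace (Chinese text would need a word segmenter instead). The function names are hypothetical; the ratio is the total number of sensitive words divided by the total number of words in the target text content, as the claim describes.

```python
def merge_and_deduplicate(first_text, second_text):
    """Merge two lists of text segments, keeping the first copy of each segment."""
    seen = set()
    merged = []
    for segment in list(first_text) + list(second_text):
        key = segment.strip().lower()
        if key and key not in seen:
            seen.add(key)
            merged.append(segment.strip())
    return merged


def bad_content_ratio(target_segments, sensitive_words):
    """Sensitive word count divided by total word count of the target text."""
    vocabulary = {word.lower() for word in sensitive_words}
    words = [w.lower() for segment in target_segments for w in segment.split()]
    if not words:
        return 0.0  # no recognizable text, so nothing to flag
    hits = sum(1 for w in words if w in vocabulary)
    return hits / len(words)
```

The processing unit would then compare the returned value with the preset threshold.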
7. The apparatus according to claim 6, wherein the processing unit is configured to process the video file to be detected according to the bad content proportion value by:
the processing unit is used for determining that the video file to be detected is a video file with healthy content when the calculating unit determines that the bad content proportion value is smaller than or equal to a preset threshold value; and when the calculating unit determines that the bad content proportion value is greater than the preset threshold value, starting a deleting program to delete the video file to be detected.
8. The apparatus of claim 6 or 7, further comprising:
the name detection unit is used for acquiring the file name of the video file to be detected and comparing the file name with the sensitive vocabulary list after the acquisition unit acquires the video file to be detected and before the separation unit performs video and audio separation on the video file to be detected and acquires audio information and image information;
the processing unit is further used for starting a deleting program to delete the video file to be detected when the name detection unit determines that the file name contains the sensitive words in the sensitive word list and the number of the contained sensitive words reaches a preset number;
the separation unit is further used for executing video and audio separation on the video file to be detected to obtain audio information and image information when the name detection unit determines that the number of the sensitive words contained in the file name does not reach the preset number.
9. The apparatus of claim 6 or 7, further comprising:
the source detection unit is used for acquiring the source information of the video file to be detected and judging whether a source address indicated by the source information is matched with one illegal source address in a preset illegal source address list or not after the acquisition unit acquires the video file to be detected and before the separation unit performs video and audio separation on the video file to be detected and acquires audio information and image information;
the processing unit is further used for starting a deleting program to delete the video file to be detected when the source detection unit determines that the source address matches an illegal source address in the list;
and the separation unit is further used for performing video-audio separation on the video file to be detected to obtain audio information and image information when the source detection unit determines that the source address does not match any illegal source address in the list.
10. The apparatus according to claim 6, wherein the processing unit is configured to process the video file to be detected according to the bad content proportion value by:
the processing unit is used for starting a deleting program to delete the video file to be detected when the calculating unit determines that the bad content proportion value is greater than a preset threshold value;
when the bad content proportion value is smaller than or equal to the preset threshold value, extracting a plurality of continuous key frames from the video file to be detected, wherein the plurality of continuous key frames present a certain key scene in the video file to be detected; acquiring the average motion intensity of the shots in the key scene; judging whether the average motion intensity is greater than a preset intensity value; if the average motion intensity is greater than the preset intensity value, extracting image characteristic data and audio characteristic data from the plurality of continuous key frames; when the image characteristic data is in a preset range of objectionable image characteristic data and the audio characteristic data is in a preset range of objectionable audio characteristic data, starting a deleting program to delete the video file to be detected; and when the image characteristic data is not in the preset range of objectionable image characteristic data and the audio characteristic data is not in the preset range of objectionable audio characteristic data, determining the video file to be detected to be a video file with healthy content.
CN201710166928.0A 2017-03-20 2017-03-20 Method and device for detecting bad content in video Active CN106973305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710166928.0A CN106973305B (en) 2017-03-20 2017-03-20 Method and device for detecting bad content in video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710166928.0A CN106973305B (en) 2017-03-20 2017-03-20 Method and device for detecting bad content in video

Publications (2)

Publication Number Publication Date
CN106973305A CN106973305A (en) 2017-07-21
CN106973305B CN106973305B (en) 2020-02-07

Family

ID=59330133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710166928.0A Active CN106973305B (en) 2017-03-20 2017-03-20 Method and device for detecting bad content in video

Country Status (1)

Country Link
CN (1) CN106973305B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107396139A (en) * 2017-08-22 2017-11-24 无锡天脉聚源传媒科技有限公司 A kind of method for processing video frequency and device
CN110020256A (en) * 2017-12-30 2019-07-16 惠州学院 The method and system of the harmful video of identification based on User ID and trailer content
CN110020252B (en) * 2017-12-30 2022-04-22 惠州学院 Method and system for identifying harmful video based on trailer content
CN110020254A (en) * 2017-12-30 2019-07-16 惠州学院 The method and system of the harmful video of identification based on User IP and video copy
CN110020255A (en) * 2017-12-30 2019-07-16 惠州学院 A kind of method and its system identifying harmful video based on User IP
CN110020257A (en) * 2017-12-30 2019-07-16 惠州学院 The method and system of the harmful video of identification based on User ID and video copy
CN110020251A (en) * 2017-12-30 2019-07-16 惠州学院 The method and system of the harmful video of identification based on User IP and trailer content
CN110119735A (en) * 2018-02-06 2019-08-13 上海全土豆文化传播有限公司 The character detecting method and device of video
CN108256513A (en) * 2018-03-23 2018-07-06 中国科学院长春光学精密机械与物理研究所 A kind of intelligent video analysis method and intelligent video record system
CN108683629A (en) * 2018-04-02 2018-10-19 东方视界科技(北京)有限公司 Transmission of video, playback method, storage medium, processor and terminal
CN108805069A (en) * 2018-06-04 2018-11-13 上海东方报业有限公司 Image detection method and device
CN109040782A (en) * 2018-08-29 2018-12-18 百度在线网络技术(北京)有限公司 Video playing processing method, device and electronic equipment
CN110019817A (en) * 2018-12-04 2019-07-16 阿里巴巴集团控股有限公司 A kind of detection method, device and the electronic equipment of text in video information
CN109729383B (en) * 2019-01-04 2021-11-02 深圳壹账通智能科技有限公司 Double-recording video quality detection method and device, computer equipment and storage medium
CN111882371A (en) * 2019-04-15 2020-11-03 阿里巴巴集团控股有限公司 Content information processing method, image-text content processing method, computer device, and medium
CN110851605A (en) * 2019-11-14 2020-02-28 携程计算机技术(上海)有限公司 Detection method and system for image-text information matching of OTA hotel and electronic equipment
CN111126373A (en) * 2019-12-23 2020-05-08 北京中科神探科技有限公司 Internet short video violation judgment device and method based on cross-modal identification technology
CN111586421A (en) * 2020-01-20 2020-08-25 全息空间(深圳)智能科技有限公司 Method, system and storage medium for auditing live broadcast platform information
CN111835739A (en) * 2020-06-30 2020-10-27 北京小米松果电子有限公司 Video playing method and device and computer readable storage medium
CN111935541B (en) * 2020-08-12 2021-10-01 北京字节跳动网络技术有限公司 Video correction method and device, readable medium and electronic equipment
CN112529390A (en) * 2020-12-02 2021-03-19 平安医疗健康管理股份有限公司 Task allocation method and device, computer equipment and storage medium
CN113591111B (en) * 2021-07-27 2022-10-25 展讯半导体(南京)有限公司 Audio data processing method and device, computer readable storage medium and terminal
CN114245205B (en) * 2022-02-23 2022-05-24 达维信息技术(深圳)有限公司 Video data processing method and system based on digital asset management
CN114979594B (en) * 2022-05-13 2023-05-30 深圳市和天创科技有限公司 Intelligent ground color adjusting system of monolithic liquid crystal projector
CN116644212B (en) * 2023-07-24 2023-12-01 科大讯飞股份有限公司 Video detection method, device, equipment and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101035280A (en) * 2007-04-19 2007-09-12 鲍东山 Classified content auditing terminal system
CN101834982A (en) * 2010-05-28 2010-09-15 上海交通大学 Hierarchical screening method of violent videos based on multiplex mode
CN102014295A (en) * 2010-11-19 2011-04-13 嘉兴学院 Network sensitive video detection method
CN103218608A (en) * 2013-04-19 2013-07-24 中国科学院自动化研究所 Network violent video identification method
CN105812921A (en) * 2016-04-26 2016-07-27 Tcl海外电子(惠州)有限公司 Method and terminal for controlling media information play
CN105979359A (en) * 2016-06-24 2016-09-28 中国人民解放军63888部队 Video output control method and device based on content detection

Also Published As

Publication number Publication date
CN106973305A (en) 2017-07-21

Similar Documents

Publication Publication Date Title
CN106973305B (en) Method and device for detecting bad content in video
CN110147726B (en) Service quality inspection method and device, storage medium and electronic device
US9798934B2 (en) Method and apparatus for providing combined-summary in imaging apparatus
CN110148400B (en) Pronunciation type recognition method, model training method, device and equipment
CN104598644B (en) Favorite label mining method and device
JP4600828B2 (en) Document association apparatus and document association method
CN110503961B (en) Audio recognition method and device, storage medium and electronic equipment
CN106601243B (en) Video file identification method and device
CN110008378B (en) Corpus collection method, device, equipment and storage medium based on artificial intelligence
CN107609149B (en) Video positioning method and device
CN114465737B (en) Data processing method and device, computer equipment and storage medium
CN111785279A (en) Video speaker identification method and device, computer equipment and storage medium
EP3905084A1 (en) Method and device for detecting malware
CN109003600B (en) Message processing method and device
CN111488813B (en) Video emotion marking method and device, electronic equipment and storage medium
KR20060089922A (en) Data abstraction apparatus by using speech recognition and method thereof
CN110795597A (en) Video keyword determination method, video retrieval method, video keyword determination device, video retrieval device, storage medium and terminal
JP6344849B2 (en) Video classifier learning device and program
CN113761269B (en) Audio recognition method, apparatus and computer readable storage medium
CN114049898A (en) Audio extraction method, device, equipment and storage medium
CN115331703A (en) Song voice detection method and device
CN113628637A (en) Audio identification method, device, equipment and storage medium
CN108882033B (en) Character recognition method, device, equipment and medium based on video voice
CN109710735B (en) Reading content recommendation method based on multiple social channels and electronic equipment
CN113420178A (en) Data processing method and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant