CN110020257A - Method and system for identifying harmful video based on user ID and video copy detection - Google Patents
- Publication number
- CN110020257A (CN201711500073.7A)
- Authority
- CN
- China
- Prior art keywords
- video
- weight factor
- url
- database
- address
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
Abstract
A method and system for identifying harmful video. The method includes: when the page elements of a web page are judged to include the URL path of a video, identifying the user ID recorded in the page content of the web page, obtaining from the URL path of the video the domain name contained in the URL or the IP address to which the URL points, and outputting a first weight factor and a second weight factor based on queries related to the user ID, the IP address and the domain name; obtaining the video file at the lowest picture quality, performing video copy detection on that lowest-quality video file against a preset harmful-video database, and outputting a third weight factor according to the detection result; and combining the first, second and third weight factors to identify whether the video is a harmful video. By combining multiple modes with databases built from big data, the disclosure provides a scheme for identifying harmful video that uses as little image processing as possible.
Description
Technical field
The present disclosure belongs to the field of information security and relates, for example, to a method and a system for identifying harmful video.
Background art
In the information society, information flows are everywhere, including but not limited to text, video, audio and pictures. A video file usually carries both auditory and visual information, so its expressive power is more comprehensive. However, with the spread of the mobile Internet, the network is flooded with harmful video content. Because video is visually direct and has strong impact, it is more damaging than harmful text, harmful pictures or harmful audio. It is therefore necessary to identify such harmful videos so that they can be filtered and deleted and the harm eliminated.
Current techniques for identifying harmful network video fall into two broad classes. The first is the traditional approach, which itself has two variants. (1) Recognition based on single-modality features: these methods mainly extract visual features of the video and construct a classifier from them. In violent-video recognition, for example, common features include video motion vectors, color, texture and shape. (2) Recognition based on multi-modal feature fusion: these methods extract features from several modalities of the video and fuse them to construct a classifier. In violent-video recognition, for example, many methods extract audio features such as short-time energy and sound bursts in addition to video features. Some methods also consider the text surrounding a network video and extract further features from that text for fusion-based recognition. The second class is deep learning. (1) CNN: a convolutional neural network processes the sensitive harmful images in a data repository to obtain the intrinsic features of harmful video, and the learned harmful-video frames are used to judge whether a video frame contains harmful information. (2) RNN: video sequences from the data repository are fed directly into a recurrent neural network, which learns the frames of harmful videos; the learned frames are then used to judge whether a new video is harmful. (3) CNN+RNN: a CNN learns the spatial-domain information in the image frames of a video, an RNN captures the temporal information in the video sequence, and the two are finally combined to make the recognition decision, using the learned frames to identify the video.
Existing image processing likewise falls into two kinds of method: traditional methods and deep learning methods. The classic traditional method is the bag-of-words model, which consists of four parts: (1) low-level feature extraction; (2) feature coding; (3) feature pooling; and (4) classification with a suitable classifier. Deep learning models are the other kind of image processing model and mainly include autoencoders, restricted Boltzmann machines, deep belief networks, convolutional neural networks and recurrent neural networks. With the continuing progress of computer hardware and the improvement of databases, the computation of traditional methods remains simpler than that of deep learning, but deep learning methods can learn more meaningful data and keep adjusting their parameters according to the task, so deep learning models have stronger feature representation ability for image processing.
Existing recognition methods all fall short in recognition efficiency. With big data and artificial intelligence developing rapidly, how to identify harmful video efficiently has become a problem that needs to be addressed.
Summary of the invention
The present disclosure provides a method for identifying harmful video, comprising:
Step a): when the page elements of a web page are judged to include the URL path of a video, identifying the user ID recorded in the page content of the web page, querying whether the ID exists in a first database, and outputting a first weight factor according to the ID query result;
Step b): obtaining, from the URL path of the video, the domain name contained in the URL or the IP address to which the URL points; performing a whois query in a second database based on the domain name contained in the URL, and/or querying, based on the IP address to which the URL points, whether that IP address or an IP address in the same network segment exists in the second database; and outputting a second weight factor related to the URL path of the video according to the whois query result and/or the IP address query result;
Step c): obtaining the video file at the lowest picture quality, based on the URL path of the video and the lowest picture quality in the video's online playback settings; performing video copy detection on that lowest-quality video file against a preset harmful-video database, using a content-based video copy detection technique; and outputting a third weight factor according to the detection result;
Step d): combining the first weight factor, the second weight factor and the third weight factor to identify whether the video is a harmful video.
In addition, the disclosure further discloses a system for identifying harmful video, comprising:
A first weight factor generation module, configured to: when the page elements of a web page are judged to include the URL path of a video, identify the user ID recorded in the page content of the web page, query whether the ID exists in a first database, and output a first weight factor according to the ID query result;
A second weight factor generation module, configured to: obtain, from the URL path of the video, the domain name contained in the URL or the IP address to which the URL points; perform a whois query in a second database based on the domain name contained in the URL, and/or query, based on the IP address to which the URL points, whether that IP address or an IP address in the same network segment exists in the second database; and output a second weight factor related to the URL path of the video according to the whois query result and/or the IP address query result;
A third weight factor generation module, configured to: obtain the video file at the lowest picture quality, based on the URL path of the video and the lowest picture quality in the video's online playback settings; perform video copy detection on that lowest-quality video file against a preset harmful-video database, using a content-based video copy detection technique; and output a third weight factor according to the detection result;
An identification module, configured to combine the first weight factor, the second weight factor and the third weight factor to identify whether the video is a harmful video.
With this method and system, the disclosure can draw on databases built from big data and use as little image processing as possible, providing a more efficient scheme for identifying harmful video.
Brief description of the drawings
Fig. 1 is a schematic diagram of the method in one embodiment of the disclosure;
Fig. 2 is a schematic diagram of the system in one embodiment of the disclosure.
Detailed description of the embodiments
To help those skilled in the art understand the technical solutions disclosed herein, the technical solutions of the embodiments are described below with reference to the embodiments and the accompanying drawings. The described embodiments are some, not all, of the embodiments of the disclosure. The terms "first", "second" and the like are used to distinguish different objects, not to describe a particular order. In addition, "comprising" and "having" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, system, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of the disclosure. The appearance of the phrase at various places in the specification does not necessarily refer to the same embodiment, nor to a separate or alternative embodiment that is mutually exclusive with other embodiments. Those skilled in the art will understand that the embodiments described herein can be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a flow diagram of a method for identifying harmful video provided by one embodiment of the disclosure. As shown in the figure, the method comprises:
Step S100: when the page elements of a web page are judged to include the URL path of a video, identifying the user ID recorded in the page content of the web page, querying whether the ID exists in a first database, and outputting a first weight factor according to the ID query result;
It will be understood that the first database maintains a list of user IDs known to have published harmful video.
This is because harmful pictures generally attract a group of loyal users, some of whom take part in spreading the harmful pictures. Most of their IDs are relatively fixed, and a considerable number of these users even use the same ID on different websites or forums.
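The page inspection in step S100 can be illustrated with a short sketch. It is only an assumption about how a page might be scanned: the disclosure does not say how page elements or user IDs are located, so the `data-user-id` attribute and the names `VideoPageScanner` and `scan_page` are hypothetical.

```python
from html.parser import HTMLParser


class VideoPageScanner(HTMLParser):
    """Collects video URLs and user IDs while parsing a page.

    Assumption: user IDs are exposed in a `data-user-id` attribute.
    Real sites differ, and the disclosure does not specify this.
    """

    def __init__(self):
        super().__init__()
        self.video_urls = []
        self.user_ids = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # <video src=...> and <source src=...> carry the video URL path.
        if tag in ("video", "source") and attr.get("src"):
            self.video_urls.append(attr["src"])
        # Hypothetical marker for the posting user's ID.
        if attr.get("data-user-id"):
            self.user_ids.append(attr["data-user-id"])


def scan_page(html):
    """Return (video_urls, user_ids) found in the HTML text."""
    scanner = VideoPageScanner()
    scanner.feed(html)
    return scanner.video_urls, scanner.user_ids
```

Only pages for which `scan_page` returns at least one video URL would go on to the database queries of the later steps.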
For example, when the identified user ID is "tudou":
If a user ID named "tudou" is recorded in the first database, the first weight factor may, by way of example, be 1.0;
If the IDs recorded in the database include "tudou1", "tudou2", "tudou*" or similar IDs, then "tudou" is lightly suspected of being an alternate ID of the same user, and the first weight factor may, by way of example, be 0.3;
If the database records neither "tudou" nor any similar ID, the first weight factor may, by way of example, be 0.
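The three example cases above can be sketched as a lookup function. This is a minimal illustration, not the disclosed implementation: the rule that two IDs count as "similar" when they match after trailing digits are stripped, and the names `first_weight_factor` and `_stem`, are assumptions. The values 1.0, 0.3 and 0 are the exemplary values from the text.

```python
import re


def _stem(user_id):
    """Strip trailing digits so 'tudou1' and 'tudou' share a stem.

    Assumed similarity rule; IDs made only of digits are kept as-is.
    """
    lowered = user_id.lower()
    return re.sub(r"\d+$", "", lowered) or lowered


def first_weight_factor(user_id, known_ids):
    """Return the exemplary first weight factor for a user ID.

    known_ids stands in for the ID list kept in the first database.
    """
    uid = user_id.lower()
    known = {k.lower() for k in known_ids}
    if uid in known:
        return 1.0  # exact match with a recorded ID
    if _stem(uid) in {_stem(k) for k in known}:
        return 0.3  # similar ID: lightly suspected alternate account
    return 0.0      # neither the ID nor a similar one is recorded
```

A production system would likely need a more careful similarity measure, since stripping digits can also link unrelated users.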
Step S200: obtaining, from the URL path of the video, the domain name contained in the URL or the IP address to which the URL points; performing a whois query in a second database based on the domain name contained in the URL, and/or querying, based on the IP address to which the URL points, whether that IP address or an IP address in the same network segment exists in the second database; and outputting a second weight factor related to the URL path of the video according to the whois query result and/or the IP address query result;
It will be understood that the second database maintains a list of domain names known to have published harmful video and/or a list of the IP addresses and IP address ranges of websites known to have published harmful video.
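As an illustration of the first part of step S200, the following sketch extracts the host from a video URL and reports whether it is a domain name or a literal IP address. The function name `host_from_video_url` is hypothetical, and the sketch deliberately performs no DNS lookup so that it runs offline.

```python
import ipaddress
from urllib.parse import urlparse


def host_from_video_url(url):
    """Return ("ip", host) or ("domain", host) for a video URL."""
    host = urlparse(url).hostname  # lower-cased, without the port
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return ("domain", host)
    return ("ip", host)
```

When the host is a domain name, the IP address it points to could then be resolved, for instance with `socket.gethostbyname`, before the IP address query is made.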
The whois query investigates how the domain name registrant is associated with harmful video. The second database may maintain the following information: domain names, the registrants of domain names that have published large amounts of harmful video on the Internet, and the corresponding harmful-video marks.
For example, when the domain name is www.a.com:
If the second database records the domain name, the corresponding harmful-video mark and its whois information, the second weight factor may, by way of example, be 1.0;
If the second database records no harmful-video mark for the domain name www.a.com, but the registrant of the domain name and the domain names of other websites registered by that registrant can be found, and the second database contains marks showing that those other websites have published large amounts of harmful video on the Internet, then the website corresponding to www.a.com is still highly suspected of being a source of harmful video even though the second database records no harmful-video mark for www.a.com itself, and the second weight factor may, by way of example, be 0.9;
If the second database records no harmful-video mark for the domain name www.a.com, and the registrant of the domain name and the domain names of other websites registered by that registrant can be found, but the second database contains no mark of those other websites publishing harmful video, the second weight factor may, by way of example, be 0;
It is easy to see that if the second database records no harmful-video mark for the domain name www.a.com and no domain names of other websites registered by the registrant can be found, the second weight factor may likewise, by way of example, be 0.
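The whois cases above can be sketched as follows. The in-memory dictionaries and sets are stand-ins for the second database and for whois results, and the names `whois_factor`, `flagged_domains`, `registrant_of` and `domains_by_registrant` are hypothetical. The returned values are the exemplary ones from the text.

```python
def whois_factor(domain, flagged_domains, registrant_of, domains_by_registrant):
    """Return the exemplary whois-based second weight factor.

    flagged_domains: domains with a harmful-video mark.
    registrant_of: domain -> registrant (stand-in for a whois lookup).
    domains_by_registrant: registrant -> set of registered domains.
    """
    if domain in flagged_domains:
        return 1.0  # the domain itself carries a harmful-video mark
    registrant = registrant_of.get(domain)
    if registrant is None:
        return 0.0  # no registrant information can be found
    siblings = set(domains_by_registrant.get(registrant, ())) - {domain}
    if siblings & set(flagged_domains):
        return 0.9  # same registrant runs other flagged websites
    return 0.0      # registrant known, but nothing is flagged
```

For instance, if `www.a.com` is unflagged but its registrant also registered a flagged `www.b.com`, the function returns 0.9.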
By way of example, the IP address to which the URL points may also be obtained from the URL path of the video, and an IP address / IP address range query performed in order to output the second weight factor.
For example, when the IP address is 192.168.10.3:
If the IP address is recorded in the second database, the second weight factor may, by way of example, be 1.0;
If the only IP address recorded in the second database is 192.168.10.4, then 192.168.10.3 is moderately suspected of being a backup address or a recently changed address of the website to which the video belongs, and the second weight factor may, by way of example, be 0.6;
If the IP addresses recorded in the second database include 192.168.10.4 and 192.168.10.5, or even all IP addresses of the 192.168.10.X network segment, then 192.168.10.3 is strongly suspected of being a backup address or a recently changed address of the website to which the video belongs, and the second weight factor may, by way of example, be 0.9;
If the IP addresses recorded in the database include multiple 192.168.X.X network segments but not the 192.168.10.X segment, then 192.168.10.3 is cautiously suspected of being an address of a website to which harmful video belongs, and the second weight factor may, by way of example, be 0.4.
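The IP cases above can be sketched with the standard `ipaddress` module. The sketch makes several assumptions: it handles IPv4 only, it reads "same network segment" as the same /24 and the wider 192.168.X.X case as the same /16, and it returns 0.0 when nothing matches, a case the text does not state. The name `ip_factor` is hypothetical.

```python
import ipaddress


def ip_factor(ip, known_ips):
    """Return the exemplary IP-based second weight factor (IPv4 only)."""
    addr = ipaddress.ip_address(ip)
    known = {ipaddress.ip_address(k) for k in known_ips}
    if addr in known:
        return 1.0  # the address itself is recorded
    net24 = ipaddress.ip_network(f"{addr}/24", strict=False)
    net16 = ipaddress.ip_network(f"{addr}/16", strict=False)
    same24 = [k for k in known if k in net24]
    if len(same24) >= 2:
        return 0.9  # several recorded neighbours in the same /24
    if len(same24) == 1:
        return 0.6  # exactly one recorded neighbour in the same /24
    # Distinct /24 segments recorded inside the same /16.
    segments = {
        ipaddress.ip_network(f"{k}/24", strict=False)
        for k in known
        if k in net16
    }
    if len(segments) >= 2:
        return 0.4  # multiple nearby segments, but not this one
    return 0.0      # assumed default when nothing matches
```

The prefix lengths are a design choice; a deployment could instead use the actual allocation boundaries of the address ranges it tracks.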
In particular, the above step may also consider the IP list and the domain name list together, that is, the second weight factor is determined jointly by the IP query and the domain name whois query for the picture URL.
Suppose the IP query factor of the picture URL is i, the domain name whois query factor is j, and the second weight factor is y, where 0 ≤ i ≤ 1, 0 ≤ j ≤ 1 and 0 ≤ y ≤ 1. The second weight factor can then be determined by the following formula:
y = m × i + n × j, where m + n = 1, and m and n denote the weights of the IP query factor and the domain name whois query factor respectively.
For example, m = n = 1/2;
As a further example, m and n are unequal and are adjusted according to the actual weight of each query factor in determining the second weight factor.
It will be understood that the closer y is to 1, the heavier the second weight factor and the greater the probability that the picture concerned is a harmful picture.
The above formula for y is linear; in practical applications, however, a non-linear formula may also be used.
Furthermore, whether the formula is linear or non-linear, the formula and its parameters may be determined by training or fitting.
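The linear formula y = m × i + n × j can be written as a small function. The name `second_weight_factor` is hypothetical, and only the weight m is passed in because n follows from m + n = 1.

```python
def second_weight_factor(i, j, m=0.5):
    """Combine the IP factor i and the whois factor j linearly.

    m is the weight of the IP factor; n = 1 - m is the weight of the
    whois factor, so that m + n = 1 as the formula requires.
    """
    for name, value in (("i", i), ("j", j), ("m", m)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    n = 1.0 - m
    return m * i + n * j
```

With i = 0.6, j = 0.9 and m = n = 1/2, the formula gives y = 0.5 × 0.6 + 0.5 × 0.9 = 0.75.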
Step S300: obtaining the video file at the lowest picture quality, based on the URL path of the video and the lowest picture quality in the video's online playback settings; performing video copy detection on that lowest-quality video file against a preset harmful-video database, using a content-based video copy detection technique; and outputting a third weight factor according to the detection result;
Step S300 is content-based video copy detection, and the third weight factor is output according to the detection result. It will be understood that the preset harmful-video database includes conventional harmful videos and other decadent content, and that it can be built with big data techniques and continuously updated. If the detection result identifies the lowest-quality video file as a suspected copy of a video in the preset harmful-video database, the third weight factor reflects this. It will be understood that, when the corresponding threshold condition is met, the third weight factor may be 1.0, or 0.8 or 0.4, depending on the specific threshold condition.
It should also be emphasized that, to reduce the computing resources and time cost required by this embodiment, the video file at the lowest picture quality is obtained based on the URL path of the video and the lowest picture quality in the video's online playback settings. The inventors thus make full use of the video content corresponding to the lowest picture quality in current video playback settings to perform video copy detection efficiently. This does not mean, however, that the lowest-quality or low-quality picture must be obtained through the playback settings, because video content of low picture quality can also be obtained by various kinds of sampling and then used for video copy detection.
It will be understood that step S300 may perform video processing either with traditional methods or with deep learning models in order to identify harmful video.
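One way to picture step S300 is the sketch below. It swaps in a simple frame-level average hash as the copy detection technique, which is an assumption: the disclosure names no specific algorithm. The similarity thresholds 0.9, 0.7 and 0.5 are also hypothetical; only the factor values 1.0, 0.8 and 0.4 come from the text. Frames are plain 2D lists of grayscale values, assumed to be already downscaled.

```python
def average_hash(frame):
    """Hash a small grayscale frame: one bit per pixel, set if above mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(p > mean for p in pixels)


def frame_similarity(h1, h2):
    """Fraction of matching bits between two equal-length hashes."""
    return sum(a == b for a, b in zip(h1, h2)) / len(h1)


def copy_similarity(query_hashes, reference_hashes, frame_threshold=0.9):
    """Fraction of query frames that have a near-duplicate reference frame."""
    if not query_hashes:
        return 0.0
    hits = sum(
        1
        for q in query_hashes
        if any(frame_similarity(q, r) >= frame_threshold for r in reference_hashes)
    )
    return hits / len(query_hashes)


def third_weight_factor(similarity):
    """Map a similarity score to an exemplary third weight factor.

    The thresholds are illustrative assumptions, not disclosed values.
    """
    if similarity >= 0.9:
        return 1.0
    if similarity >= 0.7:
        return 0.8
    if similarity >= 0.5:
        return 0.4
    return 0.0
```

Working on the lowest-quality stream keeps the number of pixels per frame small, which is what makes a cheap per-frame comparison of this kind plausible.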
Step S400: combining the first weight factor, the second weight factor and the third weight factor to identify whether the video is a harmful video.
By way of example, let the first weight factor be x, the second weight factor be y and the third weight factor be z, where 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and 0 ≤ z ≤ 1. The harm coefficient W of the video can then be calculated from the weight factors by the following formula:
W = a × x + b × y + c × z, where a + b + c = 1, and a, b and c denote the weights of the respective weight factors.
For example, a = b = c = 1/3;
As a further example, a, b and c are unequal and are adjusted according to each weight factor and the actual conditions of identifying harmful content.
It will be understood that the closer W is to 1, the greater the probability that the video concerned is a harmful video.
The above formula for W is linear; in practical applications, however, a non-linear formula may also be used.
Furthermore, whether the formula is linear or non-linear, the formula and its parameters may be determined by training or fitting.
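The linear fusion W = a × x + b × y + c × z can be sketched as follows. The function names and the decision threshold of 0.5 are assumptions; the text only says that a larger W means a higher probability of harmful video and gives no cut-off.

```python
def harm_coefficient(x, y, z, a=1 / 3, b=1 / 3, c=1 / 3):
    """Linear fusion of the three weight factors into W."""
    for name, value in (("x", x), ("y", y), ("z", z)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights a, b, c must sum to 1")
    return a * x + b * y + c * z


def is_harmful(w, threshold=0.5):
    """Decide from W; the threshold is a hypothetical tuning parameter."""
    return w >= threshold
```

With x = 1.0, y = 0.9, z = 0.8 and equal weights, W = (1.0 + 0.9 + 0.8) / 3 = 0.9, which the assumed threshold would classify as harmful.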
In summary, in the above embodiment only step S300 performs image processing; the remaining steps take a different route and obtain their weight factors through related queries. Step S400 then combines (or fuses) the multiple weight factors to identify harmful video. Those skilled in the art know that processing and recognizing every frame of a video is very time-consuming, whereas queries are comparatively cheap in time. The above embodiment therefore clearly provides an efficient method for identifying harmful video. In addition, the above embodiment can obviously be further combined with big data and/or artificial intelligence to build and update the first database, the second database and other databases.
In another embodiment, the second database is a third-party database.
For example, the many websites that provide whois queries, a third-party-maintained database listing websites with harmful video, or a database that records lists of the IP addresses and IP address ranges of websites with harmful pictures.
In another embodiment, for a video determined to be harmful after identification, the IP address information of the publisher of the harmful video recorded at its source network address (such as a forum or web page) is collected and used to update the first database. This is because harmful video generally attracts a group of loyal users, some of whom take part in spreading harmful video, and most of their IP addresses are relatively fixed. If the source address itself records the IP address information of the publisher of the harmful video, the disclosure collects that IP address information and uses it to update the first database.
In another embodiment, step S200 further comprises:
querying the security of the domain name in a third-party domain name security list in order to output a security factor, and correcting the second weight factor related to the domain name by the security factor.
An example is virustotal.com, a third-party website for domain name security screening. It will be understood that if the third-party information indicates that the domain name contains a virus or Trojan, the second weight factor should be raised, because the website concerned is unsafe.
It will be understood that this embodiment focuses on correcting the second weight factor from a network security perspective, so as to protect users from unknown losses. Network security concerns users' privacy and property; if the website associated with a harmful video has network security vulnerabilities, then in addition to the harm of the harmful video it exposes users to privacy leakage or property loss.
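The correction described above might look like the following sketch. The disclosure only says that the second weight factor should be raised when the domain is reported unsafe; the particular formula, the convention that the security factor s lies in [0, 1] with 1 meaning "reported as malicious", and the name `corrected_second_factor` are all assumptions. No real third-party API is called here.

```python
def corrected_second_factor(y, s):
    """Raise the second weight factor y according to a security factor s.

    Assumed convention: s = 0 means no security report, s = 1 means the
    domain is reported as carrying a virus or Trojan. The correction
    moves y toward 1 in proportion to s and never lowers it.
    """
    for name, value in (("y", y), ("s", s)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return y + s * (1.0 - y)
```

This form keeps the result inside [0, 1], so the corrected factor can be fed into the same fusion formula as before.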
In another embodiment, obtaining the video file at the lowest picture quality in step S300 includes obtaining the video content of the ending portion of the video file.
In this embodiment, this means that when video content is obtained, the ending portion of the video file is preferentially selected in order to minimize the amount of video content obtained. For harmful video the ending portion is often the climax of the plot, and those who spread such videos, whether out of personal liking or some bad motive, are generally unlikely to cut the climax at the end. In this embodiment this greatly reduces the workload of video copy detection. It should be added that this is a preferred embodiment; it does not mean that content cannot be selected from the first 1/3 of the play time of the video, or from the middle 1/3 of the play time.
Preferably, the ending video content is the content in the last 1/3 of the play time of the video. More preferably, the ending video content is the content in the last few minutes of the video, such as 3, 5 or 10 minutes; whatever the number of minutes, if the last 1/3 of the play time is shorter, the content is naturally taken from the last 1/3 of the play time.
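The choice of ending segment can be expressed as a small calculation: take the last few minutes, unless the last 1/3 of the play time is shorter. The function name `tail_window` and the 5-minute default are illustrative assumptions; the text lists 3, 5 and 10 minutes only as examples.

```python
def tail_window(duration_s, tail_minutes=5.0):
    """Return (start_s, end_s) of the ending segment to fetch.

    The segment is the shorter of the last `tail_minutes` minutes and
    the last third of the total play time.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    length = min(duration_s / 3.0, tail_minutes * 60.0)
    return (duration_s - length, duration_s)
```

For a one-hour video this selects the last 5 minutes (3300 s to 3600 s); for a 5-minute video the last third is shorter, so it selects 200 s to 300 s.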
In another embodiment, obtaining the video file at the lowest picture quality in step S300 further comprises:
Step c1): extracting the audio from the video;
Step c2): identifying whether the audio contains harmful content and, if so, obtaining the video content within the start and end times of that audio.
In this embodiment, if harmful content is recognized in the audio, its time is located and the video content within the start and end times of the audio is obtained. The relevant harmful pictures can thus be found in a more targeted way.
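Steps c1) and c2) leave open how the audio is analysed. Assuming some audio classifier has already labelled consecutive audio segments as harmful or not, the sketch below merges the flagged segments into the time intervals whose video content would be fetched. The input format and the name `harmful_intervals` are hypothetical.

```python
def harmful_intervals(segments, gap=0.0):
    """Merge flagged audio segments into video time intervals.

    segments: iterable of (start_s, end_s, is_harmful) tuples, assumed
    to come from an audio classifier that is outside this sketch.
    gap: flagged segments separated by at most this many seconds are
    merged into one interval.
    """
    flagged = sorted((s, e) for s, e, bad in segments if bad)
    merged = []
    for start, end in flagged:
        if merged and start <= merged[-1][1] + gap:
            # Overlapping or adjacent: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Only the merged intervals then need to be passed to video copy detection, rather than the whole file.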
As described above, when combined with big data techniques, the disclosure can effectively combine multiple dimensions and multiple modes, using IP information, domain name information, video information and audio information together to identify harmful video quickly.
Furthermore, the above embodiments can be implemented on the router side or on the network provider side, so that the relevant videos are filtered in advance.
Corresponding to the method, and referring to Fig. 2, the disclosure discloses in another embodiment a system for identifying harmful video, comprising:
A first weight factor generation module, configured to: when the page elements of a web page are judged to include the URL path of a video, identify the user ID recorded in the page content of the web page, query whether the ID exists in a first database, and output a first weight factor according to the ID query result;
A second weight factor generation module, configured to: obtain, from the URL path of the video, the domain name contained in the URL or the IP address to which the URL points; perform a whois query in a second database based on the domain name contained in the URL, and/or query, based on the IP address to which the URL points, whether that IP address or an IP address in the same network segment exists in the second database; and output a second weight factor related to the URL path of the video according to the whois query result and/or the IP address query result;
A third weight factor generation module, configured to: obtain the video file at the lowest picture quality, based on the URL path of the video and the lowest picture quality in the video's online playback settings; perform video copy detection on that lowest-quality video file against a preset harmful-video database, using a content-based video copy detection technique; and output a third weight factor according to the detection result;
An identification module, configured to combine the first weight factor, the second weight factor and the third weight factor to identify whether the video is a harmful video.
Similar to the method embodiments above:
Preferably, the second database is a third-party database.
More preferably, the second weight factor generation module further comprises:
a correction unit, configured to query the security of the domain name in a third-party domain name security list in order to output a security factor, and to correct the second weight factor related to the domain name by the security factor.
More preferably, obtaining the video file at the lowest picture quality in the third weight factor generation module includes obtaining the video content of the ending portion of the video file.
More preferably, the third weight factor generation module also obtains the video file at the lowest picture quality through the following units:
an audio extraction unit, configured to extract the audio from the video;
an audio identification unit, configured to identify whether the audio contains harmful content and, if so, to obtain the video content within the start and end times of that audio.
In another embodiment, the disclosure provides a system for identifying harmful videos, comprising:
a processor and a memory, the memory storing executable instructions, the processor executing these instructions to perform the following operations:
Step a), when the page elements of a web page are judged to include the URL path of a video, identify the user ID recorded in the page content of the web page, query whether the ID exists in a first database, and output a first weight factor according to the ID query result;
Step b), according to the URL path of the video, obtain the domain name contained in the URL or the IP address to which the URL points; based on the domain name contained in the URL, perform a whois query in a second database, and/or, based on the IP address to which the URL points, query the second database for whether the IP address contained in the URL or an IP address in the same network segment exists; and, according to the whois query result and/or the IP address query result, output a second weight factor associated with the URL path of the video;
Step c), based on the URL path of the video and the lowest picture quality in the video's online playback settings, obtain the video file at the lowest picture quality, perform content-based video copy detection on that lowest-quality video file against a preset harmful-video database, and output a third weight factor according to the detection result;
Step d), combine the first, second, and third weight factors to identify whether the video is a harmful video.
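Step c) relies on content-based video copy detection but the text here names no specific algorithm. The sketch below substitutes a simple average-hash fingerprint, chosen only for brevity: each small grayscale frame becomes a bit string, and a frame counts as a copy when its bits lie within a small Hamming distance of a stored fingerprint. The frame representation (a 2D list of 0-255 integers), the `harmful_db` list, and the mapping from match rate to the third weight factor are all assumptions.

```python
def average_hash(frame):
    """Fingerprint a small grayscale frame (2D list of 0-255 ints) as a bit tuple.

    Each bit is 1 if the pixel is brighter than the frame's mean brightness.
    """
    pixels = [p for row in frame for p in row]
    if not pixels:
        raise ValueError("empty frame")
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


def third_weight_factor(query_frames, harmful_db, max_distance=2):
    """Fraction of query frames matching any fingerprint in harmful_db.

    harmful_db is a hypothetical stand-in for the preset harmful-video
    database: a list of fingerprints produced by average_hash.
    """
    if not query_frames:
        return 0.0
    hits = 0
    for frame in query_frames:
        h = average_hash(frame)
        if any(hamming(h, ref) <= max_distance for ref in harmful_db):
            hits += 1
    return hits / len(query_frames)
```

Because the fingerprint depends only on which pixels are brighter than the mean, it tolerates uniform brightness changes, and a low-resolution source frame needs little downscaling before hashing.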
In another embodiment, the disclosure further provides a computer storage medium storing executable instructions, the instructions being used to perform the following method for identifying harmful videos:
Step a), when the page elements of a web page are judged to include the URL path of a video, identify the user ID recorded in the page content of the web page, query whether the ID exists in a first database, and output a first weight factor according to the ID query result;
Step b), according to the URL path of the video, obtain the domain name contained in the URL or the IP address to which the URL points; based on the domain name contained in the URL, perform a whois query in a second database, and/or, based on the IP address to which the URL points, query the second database for whether the IP address contained in the URL or an IP address in the same network segment exists; and, according to the whois query result and/or the IP address query result, output a second weight factor associated with the URL path of the video;
Step c), based on the URL path of the video and the lowest picture quality in the video's online playback settings, obtain the video file at the lowest picture quality, perform content-based video copy detection on that lowest-quality video file against a preset harmful-video database, and output a third weight factor according to the detection result;
Step d), combine the first, second, and third weight factors to identify whether the video is a harmful video.
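The IP-address part of step b) can be sketched with the Python standard library. The sketch assumes the second database can be represented as a set of known IP address strings and that "same network segment" means sharing a /24 IPv4 prefix; both are assumptions, as are the function names. The whois query is omitted because it needs network access.

```python
import ipaddress
from urllib.parse import urlsplit


def host_from_url(url):
    """Return the host part (domain name or IP literal) of a video URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


def ip_query(ip, known_ips, prefix_len=24):
    """Classify ip against known addresses (hypothetical second database).

    Returns "exact", "same_segment", or "none". Treating a /24 prefix as the
    "same network segment" is an assumption of this sketch (IPv4 only).
    """
    addr = ipaddress.ip_address(ip)
    segment = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    known = [ipaddress.ip_address(k) for k in known_ips]
    if addr in known:
        return "exact"
    if any(k in segment for k in known):
        return "same_segment"
    return "none"
```

A caller could then map the three outcomes to a second weight factor, for example a higher value for an exact match than for a same-segment match; the source does not fix those values here.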
The above system may include: at least one processor (e.g., a CPU), at least one sensor (e.g., an accelerometer, gyroscope, GPS module, or other positioning module), at least one memory, and at least one communication bus, where the communication bus provides connection and communication among the components. The device may also include at least one receiver and at least one transmitter, where the receiver and transmitter may be wired transmission ports or wireless devices (for example, including an antenna device) used to exchange signaling or data with other node devices. The memory may be high-speed RAM or non-volatile memory, for example at least one magnetic disk storage device. Optionally, the memory may be at least one storage device located remotely from the aforementioned processor. A set of program code is stored in the memory, and the processor can call the code stored in the memory via the communication bus to perform the relevant functions.
Embodiments of the disclosure also provide a computer storage medium, where the computer storage medium may store a program that, when executed, performs some or all of the steps of any method for identifying harmful videos described in the above method embodiments.
The steps of the methods in the embodiments of the disclosure may be reordered, merged, or deleted according to actual needs.
The modules and units of the systems in the embodiments of the disclosure may be combined, divided, or deleted according to actual needs.
It should be noted that, for brevity, the foregoing method embodiments are each described as a series of combined actions; however, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions, modules, and units involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided by the disclosure, it should be understood that the disclosed system may be implemented in other ways. For example, the embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection between units or components may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical or in other forms.
Units described as separate components may or may not be physically separate; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the disclosure in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a smartphone, personal digital assistant, wearable device, laptop, or tablet computer) to execute all or part of the steps of the methods of the embodiments of the disclosure. The aforementioned storage medium includes: USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), removable hard disk, magnetic disk, optical disc, and other media that can store program code.
The above embodiments are intended only to illustrate the technical solutions of the disclosure, not to limit them. Although the disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the disclosure.
Claims (10)
1. A method for identifying harmful videos, comprising:
Step a), when the page elements of a web page are judged to include the URL path of a video, identifying the user ID recorded in the page content of the web page, querying whether the ID exists in a first database, and outputting a first weight factor according to the ID query result;
Step b), according to the URL path of the video, obtaining the domain name contained in the URL or the IP address to which the URL points; based on the domain name contained in the URL, performing a whois query in a second database, and/or, based on the IP address to which the URL points, querying the second database for whether the IP address contained in the URL or an IP address in the same network segment exists; and, according to the whois query result and/or the IP address query result, outputting a second weight factor associated with the URL path of the video;
Step c), based on the URL path of the video and the lowest picture quality in the video's online playback settings, obtaining the video file at the lowest picture quality, performing content-based video copy detection on that lowest-quality video file against a preset harmful-video database, and outputting a third weight factor according to the detection result;
Step d), combining the first, second, and third weight factors to identify whether the video is a harmful video.
2. The method according to claim 1, wherein, preferably, the second database is a third-party database.
3. The method according to claim 1, wherein step b) further comprises:
querying the security of the domain name in a third-party domain-name security list to output a safety factor, and correcting the second weight factor by means of the safety factor.
4. The method according to claim 1, wherein obtaining the video file at the lowest picture quality in step c) includes obtaining the video content of the ending segment of the video file.
5. The method according to claim 1, wherein obtaining the video file at the lowest picture quality in step c) further includes the following:
Step c1): extracting the audio from the video;
Step c2): identifying whether the audio contains harmful content and, if so, obtaining the video content within the start and end times of that audio.
6. A system for identifying harmful videos, comprising:
a first weight factor generation module, configured to: when the page elements of a web page are judged to include the URL path of a video, identify the user ID recorded in the page content of the web page, query whether the ID exists in a first database, and output a first weight factor according to the ID query result;
a second weight factor generation module, configured to: according to the URL path of the video, obtain the domain name contained in the URL or the IP address to which the URL points; based on the domain name contained in the URL, perform a whois query in a second database, and/or, based on the IP address to which the URL points, query the second database for whether the IP address contained in the URL or an IP address in the same network segment exists; and, according to the whois query result and/or the IP address query result, output a second weight factor associated with the URL path of the video;
a third weight factor generation module, configured to: based on the URL path of the video and the lowest picture quality in the video's online playback settings, obtain the video file at the lowest picture quality, perform content-based video copy detection on that lowest-quality video file against a preset harmful-video database, and output a third weight factor according to the detection result;
an identification module, configured to combine the first, second, and third weight factors to identify whether the video is a harmful video.
7. The system according to claim 6, wherein, preferably, the second database is a third-party database.
8. The system according to claim 6, wherein the second weight factor generation module further comprises:
a correction unit, configured to: query the security of the domain name in a third-party domain-name security list to output a safety factor, and correct the second weight factor by means of the safety factor.
9. The system according to claim 6, wherein obtaining the video file at the lowest picture quality in the third weight factor generation module includes obtaining the video content of the ending segment of the video file.
10. The system according to claim 6, wherein the third weight factor generation module also obtains the video file at the lowest picture quality through the following units:
an audio extraction unit, for extracting the audio from the video;
an audio identification unit, for identifying whether the audio contains harmful content and, if so, obtaining the video content within the start and end times of that audio.
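The combination of the three weight factors and the safety-factor correction recited in the claims above can be sketched as follows. The claims do not state the combination rule, so the weighted sum, the coefficients, the threshold, and the multiplicative correction are all assumptions of this sketch, as are the function names.

```python
def corrected_second_weight(w2, safety_factor):
    """Correct the second weight factor with a domain safety factor.

    Multiplication is an assumed correction rule: a factor below 1 lowers the
    weight for domains listed as safe, a factor above 1 raises it. The result
    is clamped to [0, 1].
    """
    return min(1.0, max(0.0, w2 * safety_factor))


def identify_harmful(w1, w2, w3, coeffs=(0.3, 0.3, 0.4), threshold=0.5):
    """Combine three weight factors in [0, 1] and compare against a threshold.

    Returns (score, is_harmful). Coefficients and threshold are illustrative.
    """
    for w in (w1, w2, w3):
        if not 0.0 <= w <= 1.0:
            raise ValueError("weight factors must lie in [0, 1]")
    score = coeffs[0] * w1 + coeffs[1] * w2 + coeffs[2] * w3
    return score, score >= threshold
```

For example, a known user ID (`w1 = 1.0`), a suspicious address whose domain appears on a safe list (`corrected_second_weight(0.8, 0.5)`), and a full copy match (`w3 = 1.0`) would still exceed the assumed threshold, because the copy-detection evidence carries the largest coefficient.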
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711500073.7A CN110020257A (en) | 2017-12-30 | 2017-12-30 | The method and system of the harmful video of identification based on User ID and video copy |
PCT/CN2018/072239 WO2019127655A1 (en) | 2017-12-30 | 2018-01-11 | Method and system for identifying harmful video on basis of user id and video copy |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711500073.7A CN110020257A (en) | 2017-12-30 | 2017-12-30 | The method and system of the harmful video of identification based on User ID and video copy |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110020257A true CN110020257A (en) | 2019-07-16 |
Family
ID=67064473
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711500073.7A Withdrawn CN110020257A (en) | 2017-12-30 | 2017-12-30 | The method and system of the harmful video of identification based on User ID and video copy |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110020257A (en) |
WO (1) | WO2019127655A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101853377A (en) * | 2010-05-13 | 2010-10-06 | 复旦大学 | Method for identifying content of digital video |
CN102208992A (en) * | 2010-06-13 | 2011-10-05 | 天津海量信息技术有限公司 | Internet-facing filtration system of unhealthy information and method thereof |
CN103905372A (en) * | 2012-12-24 | 2014-07-02 | 珠海市君天电子科技有限公司 | Method and device for removing false alarm of phishing website |
CN105654051A (en) * | 2015-12-30 | 2016-06-08 | 北京奇艺世纪科技有限公司 | Video detection method and system |
CN106055574A (en) * | 2016-05-19 | 2016-10-26 | 微梦创科网络科技(中国)有限公司 | Method and device for recognizing illegal URL |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7403929B1 (en) * | 2004-07-23 | 2008-07-22 | Ellis Robinson Giles | Apparatus and methods for evaluating hyperdocuments using a trained artificial neural network |
CN100361450C (en) * | 2005-11-18 | 2008-01-09 | 郑州金惠计算机系统工程有限公司 | System for blocking off erotic images and unhealthy information in internet |
CN104574547B (en) * | 2015-01-28 | 2018-02-09 | 广东铂亚信息技术有限公司 | A kind of highway for preventing fee evasion method based on face recognition technology |
CN106101740B (en) * | 2016-07-13 | 2019-12-24 | 百度在线网络技术(北京)有限公司 | Video content identification method and device |
CN206657367U (en) * | 2016-12-29 | 2017-11-21 | 池州职业技术学院 | A kind of imperfect picture filter |
CN106599937A (en) * | 2016-12-29 | 2017-04-26 | 池州职业技术学院 | Bad image filtering device |
CN106973305B (en) * | 2017-03-20 | 2020-02-07 | 广东小天才科技有限公司 | Method and device for detecting bad content in video |
2017-12-30: CN application CN201711500073.7A filed (published as CN110020257A; status: Withdrawn)
2018-01-11: PCT application PCT/CN2018/072239 filed (published as WO2019127655A1; status: Application Filing)
Non-Patent Citations (1)
Title |
---|
Sushma Nagesh Bannur et al.: "Judging a Site by its Content: Learning the Textual, Structural, and Visual Features of Malicious Web Pages", AISec'11 * |
Also Published As
Publication number | Publication date |
---|---|
WO2019127655A1 (en) | 2019-07-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20190716 |