CN110109952A - Method and system for identifying harmful pictures - Google Patents
Method and system for identifying harmful pictures
- Publication number
- CN110109952A CN201711499941.4A
- Authority
- CN
- China
- Prior art keywords
- picture
- weight factor
- domain name
- harmful
- address
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/5866—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A method and system for identifying harmful pictures. The method includes: obtaining the URL path of a picture, deriving the domain name and IP address from the URL path, and outputting a first weight factor and a second weight factor based on queries on the IP address and the domain name respectively; further obtaining the picture and extracting DC coefficients in its compressed domain so that the picture can be recognized after only partial decompression, and outputting a third weight factor according to the recognition result; and combining the first, second, and third weight factors to determine whether the picture is a harmful picture. By drawing on databases built from big data and using as little image processing as possible, the disclosure provides a scheme that identifies harmful pictures using multiple modes.
Description
Technical field
The disclosure belongs to the field of information security and relates, for example, to a method and system for identifying harmful pictures.
Background
In the information society, information flows are everywhere, including but not limited to text, video, audio, and pictures. Compared with video, a picture file carries a certain amount of visual information while placing relatively low demands on storage space and bandwidth. With the spread of the mobile Internet, the network is flooded with harmful picture content. Because pictures are visually direct and striking, they do more harm than harmful text or harmful audio, so it is necessary to identify harmful pictures and then filter or delete them to eliminate the harm.
Existing techniques for identifying harmful pictures on the network fall into two broad classes. One is the conventional approach, which relies mainly on various classifiers. The other applies deep learning, in particular convolutional neural networks. Both classes of methods, however, fall short in recognition efficiency.
With the development of big data and artificial intelligence, how to identify harmful pictures efficiently has become a problem that needs to be addressed.
Summary of the invention
The present disclosure provides a method for identifying harmful pictures, comprising:
Step a): obtaining the URL path of a picture, deriving the domain name and IP address from the URL path, querying a first database for whether that IP address or an IP address in the same network segment exists, and outputting a first weight factor related to the IP according to the result of the IP address query;
Step b): based on the domain name, performing a whois query in a second database, and outputting a second weight factor related to the domain name according to the whois query result;
Step c): extracting DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and outputting a third weight factor according to the recognition result;
Step d): combining the first, second, and third weight factors to determine whether the picture is a harmful picture.
In addition, the disclosure provides a system for identifying harmful pictures, comprising:
A first weight factor generation module, configured to: obtain the URL path of a picture, derive the domain name and IP address from the URL path, query a first database for whether that IP address or an IP address in the same network segment exists, and output a first weight factor related to the IP according to the result of the IP address query;
A second weight factor generation module, configured to: based on the domain name, perform a whois query in a second database, and output a second weight factor related to the domain name according to the whois query result;
A third weight factor generation module, configured to: extract DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and output a third weight factor according to the recognition result;
An identification module, configured to combine the first, second, and third weight factors to determine whether the picture is a harmful picture.
With this method and system, the disclosure can draw on databases built from big data and use as little image processing as possible, providing a more efficient scheme for identifying harmful pictures.
Brief description of the drawings
Fig. 1 is a schematic diagram of the method in one embodiment of the disclosure;
Fig. 2 is a schematic diagram of the system in one embodiment of the disclosure.
Detailed description
To help those skilled in the art understand the technical solution disclosed herein, the technical solutions of the embodiments are described below with reference to the embodiments and the accompanying drawings. The described embodiments are some, not all, of the embodiments of the disclosure. The terms "first", "second", and so on are used in the disclosure to distinguish different objects, not to describe a particular order. In addition, "comprising" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to that process, method, system, product, or device.
A reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the disclosure. Occurrences of the phrase at various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art will understand that the embodiments described herein can be combined with other embodiments.
Referring to Fig. 1, Fig. 1 is a flow diagram of a method for identifying harmful pictures provided by one embodiment of the disclosure. As shown in the figure, the method comprises:
Step S100: obtain the URL path of the picture, derive the domain name and IP address from the URL path, query the first database for whether that IP address or an IP address in the same network segment exists, and output a first weight factor related to the IP according to the result of the IP address query;
It will be understood that the first database maintains a list of IP addresses known to have published harmful pictures.
For example, when the IP address is 192.168.10.3:
If the first database records that IP address, the first weight factor may, for example, be 1.0;
If the only IP address recorded in the database is 192.168.10.4, then 192.168.10.3 is moderately suspected of being a standby address of the picture's website or an address it recently switched to, and the first weight factor may, for example, be 0.6;
If the database records both 192.168.10.4 and 192.168.10.5, or even records every IP address in the 192.168.10.X segment, then 192.168.10.3 is strongly suspected of being a standby address of the picture's website or an address it recently switched to, and the first weight factor may, for example, be 0.9;
If the IP addresses recorded in the database include multiple 192.168.X.X segments but not the 192.168.10.X segment, then 192.168.10.3 is cautiously suspected of being the address of a website hosting harmful pictures, and the first weight factor may, for example, be 0.4.
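The lookup in step S100 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `first_weight_factor`, the use of a plain list as the first database, the reading of "same network segment" as the same /24 network, and the reading of "192.168.X.X" as the enclosing /16 network are all assumptions. The returned values are the example weights from the text; returning 0.0 for cases the text does not cover is also an assumption.

```python
import ipaddress


def first_weight_factor(ip, known_bad_ips):
    """Return the example first weight factor for `ip` (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    known = {ipaddress.ip_address(k) for k in known_bad_ips}
    if addr in known:
        return 1.0  # the address itself is recorded

    # Assumption: "same network segment" means the same /24 network.
    seg24 = ipaddress.ip_network(f"{ip}/24", strict=False)
    neighbours = [k for k in known if k in seg24]
    if len(neighbours) >= 2:
        return 0.9  # several recorded addresses in the segment
    if len(neighbours) == 1:
        return 0.6  # a single recorded neighbour

    # Assumption: "192.168.X.X" means the enclosing /16 network.
    seg16 = ipaddress.ip_network(f"{ip}/16", strict=False)
    other_segments = {
        ipaddress.ip_network(f"{k}/24", strict=False) for k in known if k in seg16
    }
    if len(other_segments) >= 2:
        return 0.4  # multiple recorded segments nearby, but not this one
    return 0.0  # assumption: no evidence, no weight
```

In a deployment the list would be replaced by a query against the first database; the thresholds between the cases would likely need tuning.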
Step S200: based on the domain name, perform a whois query in the second database, and output a second weight factor related to the domain name according to the whois query result;
It will be understood that the second database maintains a list of domain names known to have published harmful pictures.
The whois query serves to examine how the domain name registrant is associated with harmful pictures. The second database may maintain the following information: domain names, information on domain name registrants who publish harmful pictures on the Internet in large numbers, and the corresponding harmful-picture marks.
For example, when the domain name is www.a.com:
If the second database records the domain name, the corresponding harmful-picture mark, and its whois information, the second weight factor may, for example, be 1.0;
If the second database records no harmful-picture mark for www.a.com, but the domain name's registrant and the domain names of other websites registered by that registrant can be found, and the second database contains marks showing that those other websites publish harmful pictures on the Internet in large numbers, then the website at www.a.com is still highly suspected of being a source of harmful pictures even though no mark is recorded for it, and the second weight factor may, for example, be 0.9;
If the second database records no harmful-picture mark for www.a.com, and the domain name's registrant and the domain names of other websites registered by that registrant can be found, but the second database contains no mark indicating that those other websites publish harmful pictures, the second weight factor may, for example, be 0;
Likewise, if the second database records no harmful-picture mark for www.a.com and no other websites registered by the domain name's registrant can be found, the second weight factor may also, for example, be 0.
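The four cases of step S200 can be sketched as a small decision function. The sketch assumes the second database has already been reduced to two in-memory structures, both hypothetical: `flagged_domains`, a set of domain names carrying a harmful-picture mark, and `registrant_of`, a mapping from domain name to registrant as obtained from whois data. The weights are the example values from the text.

```python
def second_weight_factor(domain, flagged_domains, registrant_of):
    """Return the example second weight factor for `domain` (hypothetical helper)."""
    if domain in flagged_domains:
        return 1.0  # the domain itself carries a harmful-picture mark

    registrant = registrant_of.get(domain)
    if registrant is None:
        return 0.0  # no whois information, so no sibling sites can be found

    # Other domains registered by the same registrant.
    siblings = [
        d for d, r in registrant_of.items() if r == registrant and d != domain
    ]
    if any(d in flagged_domains for d in siblings):
        return 0.9  # the registrant runs other flagged websites
    return 0.0  # sibling sites exist (or not), but none is flagged
```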
Step S300: obtain the picture, and extract DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and output a third weight factor according to the recognition result;
In step S300 the third weight factor is derived from the recognition result for the picture. If conventional harmful pictures or other unhealthy content are detected, the third weight factor reflects this. It will be understood that when the number of occurrences of conventional harmful pictures or other unhealthy content meets the corresponding threshold condition, the third weight factor may be 1.0, 0.8, or 0.4, depending on the specific threshold condition.
It should also be emphasized that, to reduce the computing resources and time this embodiment requires, recognition first extracts the DC coefficients from the picture's compressed domain, so that only a partial decompression of the picture is needed before image recognition. The inventors exploit the fact that image information is largely concentrated in the DC coefficient and the nearby low-frequency spectrum: the picture is partially decompressed via the DC coefficients, and image recognition is performed on the partially decompressed image information rather than on all the information in the complete picture, which reduces the workload. Typically, picture files that conform to the JPEG coding standard can be processed in this way.
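To illustrate why DC coefficients alone give a usable low-resolution image, the following sketch relies on a property of the 8×8 DCT used by JPEG: after the samples are level-shifted by 128, the DC coefficient of a block equals the sum of the shifted samples divided by 8, that is, 8 times the block mean. The function names are hypothetical, and the sketch assumes the DC coefficients have already been entropy-decoded and dequantized; in a real JPEG file they are quantized, differentially coded, and entropy-coded, and that decoding step is omitted here.

```python
def forward_dc(block):
    """DC coefficient of one 8x8 block of 8-bit samples (JPEG-style DCT).

    JPEG level-shifts samples by 128; the DC term of the 8x8 DCT is then
    the sum of the shifted samples divided by 8, i.e. 8 times their mean.
    """
    return sum(p - 128 for row in block for p in row) / 8.0


def dc_thumbnail(dc_coeffs):
    """Rebuild a 1/8-scale grayscale image from dequantized DC coefficients.

    Each 8x8 block collapses to one pixel holding the block's mean value.
    """
    def to_pixel(dc):
        return max(0, min(255, round(dc / 8.0 + 128)))

    return [[to_pixel(dc) for dc in row] for row in dc_coeffs]
```

The thumbnail has 1/64 of the original pixel count, which is where the saving in recognition workload would come from; whether that resolution is sufficient depends on the recognizer used.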
It will be understood that any technical means in the field for identifying harmful information in pictures can be applied to the pictures described in this disclosure. Step S300 may process the image with traditional methods, or with a deep learning model, and then identify harmful pictures.
More specifically, in one case, recognizing the picture after partial decompression in step S300 comprises:
After the picture is partially decompressed, comparing its features with those of known harmful pictures maintained in a third-party image database in order to identify the picture, and, when the picture is identified as harmful, adding it to the third-party image database; the third-party image database is built in advance by crawling pictures from known harmful websites.
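The text does not say which features are compared. As one possible reading, the sketch below uses a simple average hash over a small grayscale image (such as a partially decompressed thumbnail) and a Hamming-distance threshold. The hash, the threshold `max_distance`, and the use of a Python set as the third-party image database are all assumptions chosen for illustration.

```python
def average_hash(pixels):
    """Bit tuple: 1 where a pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(int(p > mean) for p in flat)


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def identify_and_update(pixels, known_hashes, max_distance=2):
    """Compare against known harmful hashes; record the picture on a match."""
    h = average_hash(pixels)
    harmful = any(
        len(k) == len(h) and hamming(h, k) <= max_distance for k in known_hashes
    )
    if harmful:
        known_hashes.add(h)  # mirrors "adding it to the image database"
    return harmful
```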
Step S400: combine the first, second, and third weight factors to determine whether the picture is a harmful picture.
For example, let the first weight factor be x, the second weight factor be y, and the third weight factor be z, where 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, and 0 ≤ z ≤ 1. The harm coefficient W of the picture can then be calculated from the weight factors by the following formula:
W = a × x + b × y + c × z, where a + b + c = 1, and a, b, and c are the weights assigned to the respective weight factors.
For example, a = b = c = 1/3;
As a further example, a, b, and c may be unequal, and can be adjusted according to each weight factor and the actual circumstances of identifying harmful content.
It will be understood that the closer W is to 1, the more likely the picture is a harmful picture.
The formula above for W is linear, but in practical applications a nonlinear formula may also be used.
Further, whether the formula is linear or nonlinear, the formula and its parameters can be determined by training or fitting.
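The linear formula for W can be written directly as code. The function name `harm_coefficient` and the input validation are additions for illustration; the formula and the constraint a + b + c = 1 come from the text.

```python
import math


def harm_coefficient(x, y, z, a=1 / 3, b=1 / 3, c=1 / 3):
    """W = a*x + b*y + c*z, with each factor in [0, 1] and a + b + c = 1."""
    if not all(0.0 <= v <= 1.0 for v in (x, y, z)):
        raise ValueError("weight factors must lie in [0, 1]")
    if not math.isclose(a + b + c, 1.0):
        raise ValueError("a + b + c must equal 1")
    return a * x + b * y + c * z
```

With the example weight factors x = 1.0, y = 0.9, z = 0.4 and equal weights, W is 2.3/3, roughly 0.77. The text gives no decision threshold for W, so any cut-off would have to be chosen or fitted for the application.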
In summary, in the above embodiment only step S300 performs image processing; the remaining steps take a different route and obtain their weight factors through queries. Step S400 then combines (or fuses) the weight factors to identify harmful pictures. Those skilled in the art know that image processing and recognition are relatively time-consuming, whereas queries cost far less time. The above embodiment therefore provides an efficient method for identifying harmful pictures. In addition, the above embodiment can further draw on big data and/or artificial intelligence to build and update the first database, the second database, and other databases.
In another embodiment, the second database is a third-party database.
For example, the databases of the many websites that provide whois queries, and lists of harmful-picture websites maintained by third parties.
In another embodiment, for a network address (such as a forum or web page) identified as hosting harmful pictures, the IP address information of the publishers of harmful pictures recorded at that address is collected and used to update the first database. This is because harmful pictures tend to attract a body of loyal users, some of whom take part in spreading harmful pictures, and most of whose IP addresses are relatively fixed. If the network address itself records the IP address information of the publishers of harmful pictures, the disclosure collects that information to update the first database.
In another embodiment, step S200 further comprises:
Querying the security of the domain name in a third-party domain name security list to output a security factor, and modifying the second weight factor related to the domain name by the security factor.
An example is virustotal.com, a third-party website for checking domain name security. It will be understood that if the third-party information indicates that the domain name involves viruses or trojans, the second weight factor should be raised, the reason being that the website is insecure.
It will be understood that this embodiment focuses on modifying the second weight factor from the perspective of network security, to prevent users from suffering other losses. Network security concerns users' privacy and property: if a website hosting harmful pictures has network security vulnerabilities, then in addition to the harm of the pictures themselves it exposes users to privacy leaks or property loss.
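The text says only that the second weight factor should be raised when the domain is reported as insecure; it gives no formula. One possible update rule, shown purely as an assumption, moves the weight toward 1 in proportion to an `insecurity` score in [0, 1] (0 meaning no report, 1 meaning a confirmed virus or trojan report). Both the function name and the scoring scale are hypothetical.

```python
def adjust_second_weight(y, insecurity):
    """Raise weight y toward 1 in proportion to the insecurity score.

    Assumed rule: y' = y + (1 - y) * insecurity, which keeps y' in [0, 1]
    and never lowers the original weight.
    """
    if not (0.0 <= y <= 1.0 and 0.0 <= insecurity <= 1.0):
        raise ValueError("y and insecurity must lie in [0, 1]")
    return y + (1.0 - y) * insecurity
```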
In another embodiment, step S300 further comprises:
Step c1): crawling audio from the website hosting the picture;
Step c2): identifying whether the audio contains harmful content, and if so, correcting the third weight factor.
In this embodiment, if the audio is found to contain harmful content, this indicates that the website is a threat, and the third weight factor is corrected, for example increased.
As described above, when combined with big data technology, the disclosure can effectively combine multiple dimensions and multiple modes, quickly identifying harmful pictures from IP information, domain name information, image information, and audio information.
Further, the above embodiments can be implemented on the router side or the network provider side to filter such pictures in advance.
Corresponding to the method, and referring to Fig. 2, the disclosure provides in another embodiment a system for identifying harmful pictures, comprising:
A first weight factor generation module, configured to: obtain the URL path of a picture, derive the domain name and IP address from the URL path, query a first database for whether that IP address or an IP address in the same network segment exists, and output a first weight factor related to the IP according to the result of the IP address query;
A second weight factor generation module, configured to: based on the domain name, perform a whois query in a second database, and output a second weight factor related to the domain name according to the whois query result;
A third weight factor generation module, configured to: obtain the picture, extract DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and output a third weight factor according to the recognition result;
An identification module, configured to combine the first, second, and third weight factors to determine whether the picture is a harmful picture.
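How the four modules fit together can be sketched as a small pipeline. The function names and the idea of passing each module in as a callable are assumptions made for illustration; `urlsplit` and `socket.gethostbyname` are standard-library calls used here for URL parsing and DNS resolution, and the resolver is injectable so the sketch can run without network access.

```python
import socket
from urllib.parse import urlsplit


def domain_and_ip(url, resolve=socket.gethostbyname):
    """Derive the domain name and IP address from a picture's URL path."""
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"no host in URL: {url!r}")
    return host, resolve(host)


def identify(url, picture, first, second, third, fuse,
             resolve=socket.gethostbyname):
    """Run the three weight-factor modules and fuse their outputs.

    first(ip), second(domain), and third(picture) stand in for the three
    generation modules; fuse(x, y, z) stands in for the identification module.
    """
    domain, ip = domain_and_ip(url, resolve)
    return fuse(first(ip), second(domain), third(picture))
```

Because the first two modules need only the URL, they could be evaluated before the picture is downloaded; the text does not prescribe an order.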
Similarly to the method embodiments above:
Preferably, the second database is a third-party database.
More preferably, the second weight factor generation module further comprises:
A correction unit, configured to: query the security of the domain name in a third-party domain name security list to output a security factor, and modify the second weight factor related to the domain name by the security factor.
More preferably, the third weight factor generation module is further configured to: after the picture is partially decompressed, compare its features with those of known harmful pictures maintained in a third-party image database in order to identify the picture, and, when the picture is identified as harmful, add it to the third-party image database; the third-party image database is built in advance by crawling pictures from known harmful websites.
More preferably, the third weight factor generation module also corrects the third weight factor by means of the following units:
An audio crawling unit, configured to crawl audio from the website hosting the picture;
An audio recognition unit, configured to identify whether the audio contains harmful content and, if so, to correct the third weight factor.
In another embodiment, the disclosure provides a system for identifying harmful pictures, comprising:
A processor and a memory, the memory storing executable instructions that the processor executes to perform the following operations:
Step a): obtaining the URL path of a picture, deriving the domain name and IP address from the URL path, querying a first database for whether that IP address or an IP address in the same network segment exists, and outputting a first weight factor related to the IP according to the result of the IP address query;
Step b): based on the domain name, performing a whois query in a second database, and outputting a second weight factor related to the domain name according to the whois query result;
Step c): obtaining the picture, extracting DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and outputting a third weight factor according to the recognition result;
Step d): combining the first, second, and third weight factors to determine whether the picture is a harmful picture.
In another embodiment, the disclosure provides a computer storage medium storing executable instructions for performing the following method for identifying harmful pictures:
Step a): obtaining the URL path of a picture, deriving the domain name and IP address from the URL path, querying a first database for whether that IP address or an IP address in the same network segment exists, and outputting a first weight factor related to the IP according to the result of the IP address query;
Step b): based on the domain name, performing a whois query in a second database, and outputting a second weight factor related to the domain name according to the whois query result;
Step c): obtaining the picture, extracting DC coefficients in the compressed domain of the picture so that the picture can be recognized after partial decompression, and outputting a third weight factor according to the recognition result;
Step d): combining the first, second, and third weight factors to determine whether the picture is a harmful picture.
The above system may include: at least one processor (such as a CPU), at least one sensor (such as an accelerometer, gyroscope, GPS module, or other positioning module), at least one memory, and at least one communication bus, where the communication bus provides the connections and communication between the components. The device may also include at least one receiver and at least one transmitter, where the receiver and transmitter may be wired transmission ports or wireless devices (for example, including an antenna device) for exchanging signaling or data with other node devices. The memory may be high-speed RAM or non-volatile memory, for example at least one disk storage device. Optionally, the memory may be at least one storage device located remotely from the processor. A set of program code is stored in the memory, and the processor can call the code stored in the memory over the communication bus to perform the relevant functions.
Embodiments of the disclosure also provide a computer storage medium that can store a program which, when executed, performs some or all of the steps of any of the methods for identifying harmful pictures described in the method embodiments above.
The steps of the methods in the embodiments of the disclosure may be reordered, merged, or deleted according to actual needs.
The modules and units of the systems in the embodiments of the disclosure may be combined, divided, or deleted according to actual needs.
It should be noted that, for simplicity of description, each of the method embodiments above is presented as a series of combined actions, but those skilled in the art should understand that the invention is not limited by the described order of actions, because according to the invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions, modules, and units involved are not necessarily required by the invention.
The descriptions of the embodiments above each have their own emphasis; for parts not detailed in one embodiment, see the related descriptions of other embodiments.
In the several embodiments provided by the disclosure, it should be understood that the disclosed system may be implemented in other ways. For example, the embodiments described above are only illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual coupling, direct coupling, or communication connection between units or components may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical or take other forms.
The units described as separate components may or may not be physically separate; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment's solution.
In addition, the functional units in the embodiments of the disclosure may be integrated in one processing unit, may each exist physically on their own, or two or more units may be integrated in one unit. The integrated unit may be implemented in hardware or as a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a smartphone, personal digital assistant, wearable device, laptop, or tablet computer) to execute all or part of the steps of the methods of the embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), removable hard disk, magnetic disk, or optical disc.
The above embodiments are intended only to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present disclosure.
Claims (10)
1. A method for identifying harmful pictures, comprising:
Step a): obtaining the URL path of a picture, obtaining a domain name and an IP address from the URL path, querying a first database, based on the IP address, for whether that IP address or an IP address in the same network segment exists, and outputting a first weight factor related to the IP according to the query result;
Step b): performing a whois query in a second database based on the domain name, and outputting a second weight factor related to the domain name according to the whois query result;
Step c): obtaining the picture, extracting DC coefficients in the compressed domain of the picture so that the picture is identified after being partially decompressed, and outputting a third weight factor according to the identification result;
Step d): combining the first weight factor, the second weight factor, and the third weight factor to identify whether the picture is a harmful picture.
2. The method according to claim 1, wherein, preferably, the second database is a third-party database.
3. The method according to claim 1, wherein step b) further comprises:
querying the security of the domain name in a third-party domain name security list to output a safety factor, and correcting the second weight factor related to the domain name by means of the safety factor.
4. The method according to claim 1, wherein identifying the picture after partially decompressing it in step c) specifically comprises:
after partially decompressing the picture, comparing features of the picture with known harmful pictures maintained in a third-party image database in order to identify the picture, and, when the picture is identified as harmful, further adding the picture to the third-party image database; wherein the third-party image database is established in advance by crawling pictures from known harmful websites.
5. The method according to claim 1, wherein step c) further comprises:
Step c1): crawling audio on the website where the picture is located;
Step c2): identifying whether the audio contains harmful content and, if so, correcting the third weight factor.
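As an illustration only, steps a) and d) of claim 1 and the correction of claim 3 could be sketched as follows in Python. The claims do not specify numeric values or a combination rule, so the weight values, the /24 segment size, the multiplicative correction, the linear coefficients, the threshold, and all function and variable names here are assumptions. DNS resolution and the whois lookup are left out so that the sketch runs offline.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical stand-in for the "first database" of known harmful IP addresses.
KNOWN_HARMFUL_IPS = {"203.0.113.7"}


def domain_from_url(url: str) -> str:
    """Step a): extract the domain name from the picture's URL path."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


def ip_weight(ip: str, known_ips=KNOWN_HARMFUL_IPS, prefix_len: int = 24) -> float:
    """Step a): first weight factor from an exact or same-segment IP match.

    The values 1.0 / 0.5 / 0.0 and the /24 segment size are assumptions.
    """
    if ip in known_ips:
        return 1.0
    addr = ipaddress.ip_address(ip)
    for known in known_ips:
        # strict=False lets a host address define its enclosing network.
        segment = ipaddress.ip_network(f"{known}/{prefix_len}", strict=False)
        if addr in segment:
            return 0.5
    return 0.0


def corrected_domain_weight(whois_weight: float, safety_factor: float) -> float:
    """Claim 3: correct the second weight factor with a safety factor.

    Multiplication clamped to [0, 1] is an assumed form of the correction.
    """
    return max(0.0, min(1.0, whois_weight * safety_factor))


def is_harmful(w_ip: float, w_domain: float, w_picture: float,
               coeffs=(0.3, 0.2, 0.5), threshold: float = 0.5) -> bool:
    """Step d): combine the three weight factors (assumed linear rule)."""
    score = coeffs[0] * w_ip + coeffs[1] * w_domain + coeffs[2] * w_picture
    return score >= threshold
```

With these assumed values, a picture hosted in the same /24 segment as a known harmful address receives a first weight factor of 0.5, and the final decision depends on how that is balanced against the domain and picture factors.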
6. A system for identifying harmful pictures, comprising:
a first weight factor generation module, configured to: obtain the URL path of a picture, obtain a domain name and an IP address from the URL path, query a first database, based on the IP address, for whether that IP address or an IP address in the same network segment exists, and output a first weight factor related to the IP according to the query result;
a second weight factor generation module, configured to: perform a whois query in a second database based on the domain name, and output a second weight factor related to the domain name according to the whois query result;
a third weight factor generation module, configured to: obtain the picture, extract DC coefficients in the compressed domain of the picture so that the picture is identified after being partially decompressed, and output a third weight factor according to the identification result;
an identification module, configured to combine the first weight factor, the second weight factor, and the third weight factor to identify whether the picture is a harmful picture.
7. The system according to claim 6, wherein, preferably, the second database is a third-party database.
8. The system according to claim 6, wherein the second weight factor generation module further comprises:
a correction unit, configured to: query the security of the domain name in a third-party domain name security list to output a safety factor, and correct the second weight factor related to the domain name by means of the safety factor.
9. The system according to claim 6, wherein the third weight factor generation module is further configured to: after partially decompressing the picture, compare features of the picture with known harmful pictures maintained in a third-party image database in order to identify the picture, and, when the picture is identified as harmful, further add the picture to the third-party image database; wherein the third-party image database is established in advance by crawling pictures from known harmful websites.
10. The system according to claim 6, wherein the third weight factor generation module further corrects the third weight factor by means of the following units:
an audio crawling unit, configured to crawl audio on the website where the picture is located;
an audio identification unit, configured to identify whether the audio contains harmful content and, if so, to correct the third weight factor.
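The DC-coefficient extraction and feature comparison recited in claims 1, 4, 6, and 9 can be illustrated with a small Python sketch. It relies on the fact that, for the orthonormal 2-D DCT used in JPEG-style coding, the DC term of an N×N block equals the sum of the block divided by N, so a map of DC terms is a coarse thumbnail of the picture. A real implementation would presumably read the DC terms from the entropy-decoded stream without running the inverse DCT; this sketch instead computes them from grayscale pixels, ignores level shift and quantization, and uses a mean absolute difference with an arbitrary tolerance as the comparison. All names and thresholds are hypothetical.

```python
def dc_coefficients(pixels, block=8):
    """Return the DC term of each block x block tile of a grayscale image.

    `pixels` is a list of rows of numbers. Partial tiles at the right and
    bottom edges are ignored in this sketch.
    """
    height, width = len(pixels), len(pixels[0])
    dc_map = []
    for top in range(0, height - block + 1, block):
        row = []
        for left in range(0, width - block + 1, block):
            total = sum(
                pixels[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            # Orthonormal DCT: DC = sum / N for an N x N block.
            row.append(total / block)
        dc_map.append(row)
    return dc_map


def mean_abs_difference(a, b):
    """Mean absolute difference between two DC maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("DC maps must have the same shape")
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    if not diffs:
        raise ValueError("DC maps are empty")
    return sum(diffs) / len(diffs)


def matches_known_picture(candidate, known_maps, tolerance=4.0):
    """True if the candidate DC map is close to any known harmful DC map."""
    return any(
        mean_abs_difference(candidate, known) <= tolerance
        for known in known_maps
    )
```

Comparing DC maps rather than full-resolution pixels is what makes the partial decompression cheap: an 8×8 block contributes a single number to the comparison.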
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711499941.4A CN110109952A (en) | 2017-12-30 | 2017-12-30 | A kind of method and its system identifying harmful picture |
PCT/CN2018/072247 WO2019127663A1 (en) | 2017-12-30 | 2018-01-11 | Harmful picture identification method and system therefor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711499941.4A CN110109952A (en) | 2017-12-30 | 2017-12-30 | A kind of method and its system identifying harmful picture |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110109952A true CN110109952A (en) | 2019-08-09 |
Family
ID=67062934
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711499941.4A Pending CN110109952A (en) | 2017-12-30 | 2017-12-30 | A kind of method and its system identifying harmful picture |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110109952A (en) |
WO (1) | WO2019127663A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103905372A (en) * | 2012-12-24 | 2014-07-02 | 珠海市君天电子科技有限公司 | Method and device for removing false alarm of phishing website |
CN104615760A (en) * | 2015-02-13 | 2015-05-13 | 北京瑞星信息技术有限公司 | Phishing website recognizing method and phishing website recognizing system |
CN106055574A (en) * | 2016-05-19 | 2016-10-26 | 微梦创科网络科技(中国)有限公司 | Method and device for recognizing illegal URL |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6751348B2 (en) * | 2001-03-29 | 2004-06-15 | Fotonation Holdings, Llc | Automated detection of pornographic images |
CN100490532C (en) * | 2006-04-30 | 2009-05-20 | 华为技术有限公司 | Video code stream filtering method and filtering node |
CN102880613A (en) * | 2011-07-14 | 2013-01-16 | 腾讯科技(深圳)有限公司 | Identification method of porno pictures and equipment thereof |
CN106354800A (en) * | 2016-08-26 | 2017-01-25 | 中国互联网络信息中心 | Undesirable website detection method based on multi-dimensional feature |
- 2017-12-30: CN application CN201711499941.4A, publication CN110109952A, status: Pending
- 2018-01-11: WO application PCT/CN2018/072247, publication WO2019127663A1, status: Application Filing
Non-Patent Citations (1)
Title |
---|
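Yang Hui et al.: "Research on the influence of compressed-domain DCT coefficients on image and video retrieval", Journal of Nanjing Institute of Posts and Telecommunications *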
Also Published As
Publication number | Publication date |
---|---|
WO2019127663A1 (en) | 2019-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10778702B1 (en) | Predictive modeling of domain names using web-linking characteristics | |
US11399288B2 (en) | Method for HTTP-based access point fingerprint and classification using machine learning | |
CN110380954B (en) | Data sharing method and device, storage medium and electronic device | |
CN110020122B (en) | Video recommendation method, system and computer readable storage medium | |
CN104933363A (en) | Method and device for detecting malicious file | |
CN103189836A (en) | Method for classification of objects in a graph data stream | |
CN103888480B (en) | Network information security authentication method and cloud device based on cloud monitoring | |
CN110019892A (en) | A kind of method and its system identifying harmful picture based on User ID | |
CN104765746A (en) | Data processing method and device for mobile communication terminal browser | |
Bijitha et al. | On the effectiveness of image processing based malware detection techniques | |
CN103886238A (en) | Account login method and device based on palm prints | |
KR20100120966A (en) | System for sorting phising site base on searching web site and method therefor | |
CN107665229B (en) | Information searching method, device and equipment | |
CN110020256A (en) | The method and system of the harmful video of identification based on User ID and trailer content | |
CN110363023B (en) | Anonymous network tracing method based on PHMM | |
CN110019946A (en) | A kind of method and its system identifying harmful video | |
CN110109952A (en) | A kind of method and its system identifying harmful picture | |
CN109993036A (en) | A kind of method and its system identifying harmful video based on User ID | |
CN110020259A (en) | A kind of method and its system identifying harmful picture based on User IP | |
CN109271706A (en) | Hair style generation method and device | |
CN110020252A (en) | The method and its system of the harmful video of identification based on trailer content | |
TW201626279A (en) | Protection method and computer system thereof | |
CN110020251A (en) | The method and system of the harmful video of identification based on User IP and trailer content | |
CN205427857U (en) | Identity identification system based on many biological characteristics combine equipment fingerprint | |
CN110020254A (en) | The method and system of the harmful video of identification based on User IP and video copy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||