CN114760124B - Big data based computer network security intelligent analysis system and method - Google Patents

Publication number
CN114760124B
Authority
CN
China
Prior art keywords
website
detected
size
module
current user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210364704.1A
Other languages
Chinese (zh)
Other versions
CN114760124A (en
Inventor
娄存恺
金旭佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yabang Management Technology Beijing Co ltd
Original Assignee
Yabang Management Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yabang Management Technology Beijing Co ltd
Priority to CN202210364704.1A
Publication of CN114760124A
Application granted
Publication of CN114760124B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/10 Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/554 Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Technology Law (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a big data-based intelligent analysis system and method for computer network security. The system comprises an authentication database, an operation information monitoring module, a website judging module and an access analysis module. The authentication database stores the web addresses of authenticated websites. The operation information monitoring module monitors the operation information of the current user of a computer; when it detects that the current user opens a new website, the new website is taken as the website to be detected and its web address as the web address to be detected. The website judging module judges whether the web address to be detected is a web address in the authentication database. If it is, the current user is allowed to access directly; if it is not, the access analysis module obtains the characteristic information of the website to be detected and the historical operation information of the current user, and judges accordingly whether to send access early-warning information.

Description

Big data based computer network security intelligent analysis system and method
Technical Field
The invention relates to the technical field of computers, in particular to a computer network security intelligent analysis system and method based on big data.
Background
With the development of Internet technology, more and more websites have come into operation, and people routinely handle everyday matters through them, such as transferring money through online banking or shopping on e-commerce websites, which makes daily life more convenient. At the same time, many illegal websites have appeared. These websites either steal private information that users submit, such as bank account numbers and passwords, or lure users into harmful behavior; a user who enters one by mistake may face a significant threat to physical and mental health and to property safety.
In the prior art, websites that a user visits for the first time cannot be monitored effectively, so the user can easily enter an illegal website by mistake.
Disclosure of Invention
The invention aims to provide a big data-based intelligent analysis system and method for computer network security, so as to solve the problems described in the Background.
To solve the above technical problems, the invention provides the following technical solution: a big data-based intelligent analysis system for computer network security comprises an authentication database, an operation information monitoring module, a website judging module and an access analysis module. The authentication database stores the web addresses of authenticated websites. The operation information monitoring module monitors the operation information of the current user of a computer; when it detects that the current user opens a new website, the new website is taken as the website to be detected and its web address as the web address to be detected. The website judging module judges whether the web address to be detected is a web address in the authentication database. If it is, the current user is allowed to access directly; if it is not, the access analysis module obtains the characteristic information of the website to be detected and the historical operation information of the current user, and judges accordingly whether to send access early-warning information.
Further, the access analysis module comprises a website analysis module and a user analysis module. The website analysis module comprises a font parameter acquisition module, a color parameter acquisition module, a matching parameter acquisition module, an in-doubt parameter calculation module and an in-doubt parameter comparison module. The font parameter acquisition module obtains the font parameter S of the website to be detected from the font information on its page. The color parameter acquisition module divides the homepage of the website to be detected into m detection regions according to the different background colors of the regions, sorts the background colors by the area each occupies on the homepage from largest to smallest, selects the first background color as the central color, and calculates the color parameter C of the website to be detected:
[Formula image: Figure GDA0003804066290000021]
where t is the number of detection regions, among the m detection regions, whose background color is the central color, and e is the ratio of the area of the detection regions whose background color is the central color to the total area of all detection regions of the website to be detected. The matching parameter acquisition module calculates the matching parameter Z of the homepage of the website to be detected from the font information and color matching information of the page. The in-doubt parameter calculation module calculates the in-doubt parameter of the website to be detected, R = 0.62*S + 0.22*C + 0.16*Z. The in-doubt parameter comparison module compares the in-doubt parameter with the in-doubt threshold: if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, the current user is allowed to access directly; if it is smaller than the in-doubt threshold, the historical operation information of the current user is analyzed.
Further, the font parameter acquisition module comprises a center size selection module, a size classification module, a ranking coefficient acquisition module and a font parameter calculation module. The center size selection module sorts the font sizes on the homepage of the website to be detected from largest to smallest to obtain the analysis ranking, obtains the number of fonts of each font size on the homepage, and selects the font size with the largest number of fonts as the center size. The size classification module takes the font sizes to the left of the center size in the analysis ranking as first sizes and the font sizes to the right of the center size as second sizes; the number of distinct font sizes among the first sizes is the first category number, and the number of distinct font sizes among the second sizes is the second category number. The ranking coefficient acquisition module calculates the ranking coefficient of the center size, b = d1/d2, where d1 is the smaller and d2 the larger of the first category number and the second category number. The font parameter calculation module calculates the font parameter S of the website to be detected:
[Formula image: Figure GDA0003804066290000022]
where h is the number of fonts corresponding to the center size, g1 is the sum of the numbers of fonts corresponding to the first sizes and the center size, and g2 is the sum of the numbers of fonts corresponding to the second sizes and the center size. The matching parameter acquisition module calculates the matching parameter Z = Ns/h of the homepage of the website to be detected, where Ns is the number of fonts on the homepage whose font size is the center size and whose background color is the central color.
Further, the user analysis module comprises a feature average value calculation module and an average value comparison module. The feature average value calculation module obtains the average of the feature index over every session in which the current user logged in to the computer and accessed websites during a recent period. The feature index of one session is w = Ys/Yz, where Yz is the total number of websites the current user accessed in that session and Ys is the number of illegal websites among them. The average value comparison module compares the average feature index of the current user with the feature threshold: if the average is smaller than the feature threshold, the current user is allowed to access directly; otherwise, early-warning information that the website to be detected is suspected to be dangerous is sent to the user.
A big data-based computer network security intelligent analysis method comprises the following steps:
pre-establishing an authentication database for storing the web addresses of authenticated websites,
monitoring the operation information of the current user of the computer and, when it is detected that the current user opens a new website, taking the new website as the website to be detected and its web address as the web address to be detected,
if the web address to be detected is a web address in the authentication database, allowing the current user to access directly,
and if the web address to be detected is not a web address in the authentication database, acquiring the characteristic information of the website to be detected and the historical operation information of the current user, and judging from them whether to send access early-warning information.
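The four steps above can be sketched as a small dispatch function. This is only an illustrative sketch, not the patented implementation: the names `check_access`, `authenticated_urls` and `analyze` are hypothetical, the authentication database is assumed to be an in-memory set of URL strings, and the access analysis is passed in as a callback that returns `True` when a warning should be sent.

```python
from typing import Callable, Set


def check_access(url: str,
                 authenticated_urls: Set[str],
                 analyze: Callable[[str], bool]) -> str:
    """Return "allow" or "warn" for a newly opened web address.

    `analyze` stands in for the access analysis (page features plus
    user history); it is a hypothetical callback, not part of the source.
    """
    # Web addresses already in the authentication database are allowed directly.
    if url in authenticated_urls:
        return "allow"
    # Any other web address is handed to the access analysis.
    return "warn" if analyze(url) else "allow"


authenticated = {"https://bank.example"}
print(check_access("https://bank.example", authenticated, lambda u: True))
print(check_access("https://unknown.example", authenticated, lambda u: True))
```

Passing the analysis in as a callback keeps the lookup step independent of how the page features and user history are actually evaluated.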
Further, the acquiring the characteristic information of the website to be detected includes:
collecting page information of a website to be detected,
sorting the font sizes on the homepage of the website to be detected from largest to smallest to obtain the analysis ranking, obtaining the number of fonts of each font size on the homepage, and selecting the font size with the largest number of fonts as the center size,
taking the font sizes to the left of the center size in the analysis ranking as first sizes and the font sizes to the right of the center size as second sizes, the number of distinct font sizes among the first sizes being the first category number and the number of distinct font sizes among the second sizes being the second category number,
then the ranking coefficient of the center size is b = d1/d2, where d1 is the smaller and d2 the larger of the first category number and the second category number,
calculating the font parameter S of the website to be detected:
[Formula image: Figure GDA0003804066290000031]
where h is the number of fonts corresponding to the center size, g1 is the sum of the numbers of fonts corresponding to the first sizes and the center size, and g2 is the sum of the numbers of fonts corresponding to the second sizes and the center size,
dividing the homepage of the website to be detected into m detection regions according to the different background colors of the regions, sorting the background colors by the area each occupies on the homepage from largest to smallest, and selecting the first background color as the central color,
calculating the color parameter C of the website to be detected:
[Formula image: Figure GDA0003804066290000032]
where t is the number of detection regions, among the m detection regions, whose background color is the central color, and e is the ratio of the area of the detection regions whose background color is the central color to the total area of all detection regions of the website to be detected,
calculating the matching parameter Z = Ns/h of the homepage of the website to be detected, where Ns is the number of fonts on the homepage whose font size is the center size and whose background color is the central color;
then the in-doubt parameter of the website to be detected is R = 0.62*S + 0.22*C + 0.16*Z,
if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, allowing the current user to access directly;
and if the in-doubt parameter of the website to be detected is smaller than the in-doubt threshold, analyzing the historical operation information of the current user.
Further, the analyzing the historical operation information of the current user includes:
acquiring the average of the feature index over every session in which the current user logged in to the computer and accessed websites during a recent period, where the feature index of one session is w = Ys/Yz, Yz is the total number of websites the current user accessed in that session, and Ys is the number of illegal websites among them,
if the average feature index of the current user is smaller than the feature threshold, allowing the current user to access directly,
and otherwise, sending the user early-warning information that the website to be detected is suspected to be dangerous.
Further, pre-establishing the authentication database includes:
when the number of times the user has accessed a website is greater than the count threshold, taking that website as an authenticated website.
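As a rough illustration of this rule, the sketch below builds the set of authenticated web addresses from a visit log. The function name `build_authentication_db`, the list-of-URLs log format and the example threshold are assumptions made for illustration; the source only states that a website becomes authenticated once its access count exceeds a count threshold.

```python
from collections import Counter
from typing import Iterable, Set


def build_authentication_db(visited_urls: Iterable[str],
                            count_threshold: int) -> Set[str]:
    """Collect every URL visited more than `count_threshold` times."""
    counts = Counter(visited_urls)
    # Strictly greater than the threshold, as stated in the text.
    return {url for url, n in counts.items() if n > count_threshold}


history = ["https://a.example"] * 4 + ["https://b.example"] * 3
print(build_authentication_db(history, count_threshold=3))
```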
Further, the illegal website includes a phishing website, a gambling website, and a marketing website.
Compared with the prior art, the invention has the following beneficial effects: when a user accesses a new website, the invention judges whether the website carries a potential risk from the distribution of font sizes on the website and the background colors of its regions; when a potential risk is found, it analyzes the user's historical operation information; and when it judges that the user may be deceived by the website, it sends reminder early-warning information. This reduces the probability that the user is deceived by a website and protects the user's personal and property safety while browsing the Internet.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a block diagram of a big data-based computer network security intelligent analysis system according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, the present invention provides the following technical solution: a big data-based intelligent analysis system for computer network security comprises an authentication database, an operation information monitoring module, a website judging module and an access analysis module. The authentication database stores the web addresses of authenticated websites. The operation information monitoring module monitors the operation information of the current user of a computer; when it detects that the current user opens a new website, the new website is taken as the website to be detected and its web address as the web address to be detected. The website judging module judges whether the web address to be detected is a web address in the authentication database. If it is, the current user is allowed to access directly; if it is not, the access analysis module obtains the characteristic information of the website to be detected and the historical operation information of the current user, and judges accordingly whether to send access early-warning information.
The access analysis module comprises a website analysis module and a user analysis module. The website analysis module comprises a font parameter acquisition module, a color parameter acquisition module, a matching parameter acquisition module, an in-doubt parameter calculation module and an in-doubt parameter comparison module. The font parameter acquisition module obtains the font parameter S of the website to be detected from the font information on its page. The color parameter acquisition module divides the homepage of the website to be detected into m detection regions according to the different background colors of the regions, sorts the background colors by the area each occupies on the homepage from largest to smallest, selects the first background color as the central color, and calculates the color parameter C of the website to be detected:
[Formula image: Figure GDA0003804066290000051]
where t is the number of detection regions, among the m detection regions, whose background color is the central color, and e is the ratio of the area of the detection regions whose background color is the central color to the total area of all detection regions of the website to be detected. The matching parameter acquisition module calculates the matching parameter Z of the homepage of the website to be detected from the font information and color matching information of the page. The in-doubt parameter calculation module calculates the in-doubt parameter of the website to be detected, R = 0.62*S + 0.22*C + 0.16*Z. The in-doubt parameter comparison module compares the in-doubt parameter with the in-doubt threshold: if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, the current user is allowed to access directly; if it is smaller than the in-doubt threshold, the historical operation information of the current user is analyzed.
The font parameter acquisition module comprises a center size selection module, a size classification module, a ranking coefficient acquisition module and a font parameter calculation module. The center size selection module sorts the font sizes on the homepage of the website to be detected from largest to smallest to obtain the analysis ranking, obtains the number of fonts of each font size on the homepage, and selects the font size with the largest number of fonts as the center size. The size classification module takes the font sizes to the left of the center size in the analysis ranking as first sizes and the font sizes to the right of the center size as second sizes; the number of distinct font sizes among the first sizes is the first category number, and the number of distinct font sizes among the second sizes is the second category number. The ranking coefficient acquisition module calculates the ranking coefficient of the center size, b = d1/d2, where d1 is the smaller and d2 the larger of the first category number and the second category number. The font parameter calculation module calculates the font parameter S of the website to be detected:
[Formula image: Figure GDA0003804066290000061]
where h is the number of fonts corresponding to the center size, g1 is the sum of the numbers of fonts corresponding to the first sizes and the center size, and g2 is the sum of the numbers of fonts corresponding to the second sizes and the center size. The matching parameter acquisition module calculates the matching parameter Z = Ns/h of the homepage of the website to be detected, where Ns is the number of fonts on the homepage whose font size is the center size and whose background color is the central color.
The user analysis module comprises a feature average value calculation module and an average value comparison module. The feature average value calculation module obtains the average of the feature index over every session in which the current user logged in to the computer and accessed websites during a recent period. The feature index of one session is w = Ys/Yz, where Yz is the total number of websites the current user accessed in that session and Ys is the number of illegal websites among them. The average value comparison module compares the average feature index of the current user with the feature threshold: if the average is smaller than the feature threshold, the current user is allowed to access directly; otherwise, early-warning information that the website to be detected is suspected to be dangerous is sent to the user.
A big data-based computer network security intelligent analysis method comprises the following steps:
pre-establishing an authentication database for storing the web addresses of authenticated websites, wherein a website becomes an authenticated website when the number of times the user has accessed it is greater than the count threshold,
monitoring the operation information of the current user of the computer and, when it is detected that the current user opens a new website, taking the new website as the website to be detected and its web address as the web address to be detected,
if the web address to be detected is a web address in the authentication database, allowing the current user to access directly,
if the web address to be detected is not a web address in the authentication database,
collecting page information of a website to be detected,
sorting the font sizes on the homepage of the website to be detected from largest to smallest to obtain the analysis ranking, obtaining the number of fonts of each font size on the homepage, and selecting the font size with the largest number of fonts as the center size,
taking the font sizes to the left of the center size in the analysis ranking as first sizes and the font sizes to the right of the center size as second sizes, the number of distinct font sizes among the first sizes being the first category number and the number of distinct font sizes among the second sizes being the second category number,
then the ranking coefficient of the center size is b = d1/d2, where d1 is the smaller and d2 the larger of the first category number and the second category number,
calculating the font parameter S of the website to be detected:
[Formula image: Figure GDA0003804066290000071]
where h is the number of fonts corresponding to the center size, g1 is the sum of the numbers of fonts corresponding to the first sizes and the center size, and g2 is the sum of the numbers of fonts corresponding to the second sizes and the center size,
for example, the analysis ranking is size 1, size 2, size 3, size 4, size 5, size 6, and the numbers of fonts corresponding to these sizes are 10, 25, 80, 50, 20 and 8 respectively,
then the center size is size 3; size 1 and size 2 are first sizes; size 4, size 5 and size 6 are second sizes; the first category number is 2 and the second category number is 3, so the ranking coefficient of the center size is b = 2/3, and the font parameter of the website to be detected is
[Formula image: Figure GDA0003804066290000072]
The method takes into account that many illegal websites set large font sizes on their homepage to attract clicks, so that large fonts are numerous, whereas on a normal website large and small fonts are relatively few and medium-size fonts dominate. The font parameter is therefore calculated from the count proportions and the ranking coefficient of the center size and is used to judge the probability that the website is an illegal website: the larger the font parameter, the smaller the probability that the website is an illegal website;
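The quantities that feed the font parameter can be computed mechanically from the per-size counts. The sketch below reproduces the worked example (counts 10, 25, 80, 50, 20, 8) and derives the center size, the two category numbers, b, h, g1 and g2. It deliberately stops short of S, because the formula for S exists only as an image in the source. The names `FontStats` and `font_stats` are hypothetical, and two edge cases are assumptions: ties for the largest count resolve to the first size in the ranking, and b is set to 0 when d2 is 0.

```python
from fractions import Fraction
from typing import List, NamedTuple


class FontStats(NamedTuple):
    center_index: int       # position of the center size in the analysis ranking
    first_categories: int   # number of distinct sizes left of the center size
    second_categories: int  # number of distinct sizes right of the center size
    b: Fraction             # ranking coefficient d1/d2
    h: int                  # number of fonts at the center size
    g1: int                 # fonts at the first sizes plus the center size
    g2: int                 # fonts at the second sizes plus the center size


def font_stats(counts: List[int]) -> FontStats:
    """`counts[i]` is the number of fonts at the i-th size, largest size first."""
    if not counts:
        raise ValueError("at least one font size is required")
    # Center size: the size with the most fonts (first one wins on ties; assumption).
    center = max(range(len(counts)), key=counts.__getitem__)
    first = center
    second = len(counts) - center - 1
    d1, d2 = sorted((first, second))
    # b is undefined when d2 == 0; 0 is used here as an assumption.
    b = Fraction(d1, d2) if d2 else Fraction(0)
    h = counts[center]
    g1 = sum(counts[:center]) + h
    g2 = sum(counts[center + 1:]) + h
    return FontStats(center, first, second, b, h, g1, g2)


stats = font_stats([10, 25, 80, 50, 20, 8])
print(stats.center_index + 1, stats.b)  # prints: 3 2/3
```

For the example data this also gives h = 80, g1 = 115 and g2 = 158, which are the inputs the font parameter formula is described as using.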
dividing the homepage of the website to be detected into m detection regions according to the different background colors of the regions, sorting the background colors by the area each occupies on the homepage from largest to smallest, and selecting the first background color as the central color, wherein the regions considered do not include the blank (white) margins on both sides of the web page;
calculating the color parameter C of the website to be detected:
[Formula image: Figure GDA0003804066290000073]
where t is the number of detection regions, among the m detection regions, whose background color is the central color, and e is the ratio of the area of the detection regions whose background color is the central color to the total area of all detection regions of the website to be detected. The method takes into account that many illegal websites, in order to attract clicks, fill their homepage with gaudy, mixed and scattered colors, whereas a normal website usually has one main background color that occupies a large area. The color parameter is therefore used to judge the probability that the website is an illegal website: the smaller the color parameter, the less the website has a main background color and the more mixed the colors of its regions, and the greater the probability that it is an illegal website;
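The inputs to the color parameter (the central color, m, t and e) can be sketched as follows. The function name `central_color_stats` and the representation of each detection region as a `(background_color, area)` pair are assumptions for illustration, and the side margins are assumed to have been excluded before the call. The color parameter C itself is not computed, since its formula is only available as an image in the source.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def central_color_stats(regions: List[Tuple[str, float]]) -> Tuple[str, int, int, float]:
    """Return (central_color, m, t, e) for a list of (color, area) regions."""
    if not regions:
        raise ValueError("at least one detection region is required")
    area_by_color: Dict[str, float] = defaultdict(float)
    for color, area in regions:
        area_by_color[color] += area
    # Central color: the background color covering the largest total area.
    central = max(area_by_color, key=area_by_color.__getitem__)
    m = len(regions)
    # t: number of regions whose background color is the central color.
    t = sum(1 for color, _ in regions if color == central)
    # e: share of the total region area covered by the central color.
    e = area_by_color[central] / sum(area_by_color.values())
    return central, m, t, e


regions = [("#ffffff", 600.0), ("#ff0000", 100.0),
           ("#ffffff", 200.0), ("#00ff00", 100.0)]
print(central_color_stats(regions))
```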
calculating the matching parameter Z = Ns/h of the homepage of the website to be detected, where Ns is the number of fonts on the homepage whose font size is the center size and whose background color is the central color; on a normal website, the fonts on the main background color are of the font size with the largest number of fonts on the whole page, so the larger the matching parameter, the lower the probability that the website to be detected is an illegal website;
then the in-doubt parameter of the website to be detected is R = 0.62*S + 0.22*C + 0.16*Z,
if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, allowing the current user to access directly;
if the in-doubt parameter of the website to be detected is smaller than the in-doubt threshold, analyzing the historical operation information of the current user; the smaller the in-doubt parameter, the higher the probability that the website to be detected is an illegal website;
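The matching parameter, the weighted sum and the threshold comparison are simple enough to sketch directly. The weights 0.62, 0.22 and 0.16 come from the text; the function names, the example inputs and the threshold value 0.5 are hypothetical. The source does not say what happens when R equals the threshold, so this sketch treats equality as "allow", which is an assumption.

```python
def matching_parameter(ns: int, h: int) -> float:
    """Z = Ns / h, with h the number of fonts at the center size."""
    if h <= 0:
        raise ValueError("h must be positive")
    return ns / h


def in_doubt_parameter(s: float, c: float, z: float) -> float:
    """R = 0.62*S + 0.22*C + 0.16*Z, using the weights given in the text."""
    return 0.62 * s + 0.22 * c + 0.16 * z


def needs_history_check(r: float, in_doubt_threshold: float) -> bool:
    """True when R is below the threshold, so the user history must be analyzed.

    Equality is treated as "allow" here; the source leaves that case open.
    """
    return r < in_doubt_threshold


z = matching_parameter(ns=60, h=80)          # 0.75
r = in_doubt_parameter(s=0.4, c=0.8, z=z)    # S and C are made-up inputs
print(needs_history_check(r, in_doubt_threshold=0.5))
```

Because S carries a weight of 0.62, the font distribution dominates the decision in this scheme, with the color and matching parameters acting as smaller corrections.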
the analyzing the historical operation information of the current user comprises:
acquiring an average value of characteristic indexes of websites accessed by a current user through logging in a computer every time in a recent period of time, wherein the characteristic index w = Ys/Yz of the websites accessed by the current user through logging in the computer every time, yz is the total number of the websites accessed by the current user through logging in the computer every time, ys is the number of illegal websites in the websites accessed by the current user through logging in the computer every time, and the illegal websites comprise phishing websites, gambling websites and marketing websites;
if the average value of the characteristic indexes of the current user is less than the characteristic threshold value, allowing the current user to directly access,
and if the average value of the characteristic indexes of the current user is larger than or equal to the characteristic threshold, sending early warning information of suspected danger of the website to be detected to the user. When the average value of the characteristic indexes of the user is larger, the probability that the user accesses an illegal website in the internet surfing process is higher, and therefore, the user needs to be reminded and early warned in advance.
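The user-history check above can be sketched as follows. This is an illustration only: the function names and the `(Yz, Ys)` tuple layout are hypothetical, and returning "no warning" when no login history exists is an assumption, since the description does not cover that case.

```python
def feature_index(total_sites, illegal_sites):
    """w = Ys / Yz for a single login session."""
    if total_sites <= 0:
        raise ValueError("Yz must be positive")
    return illegal_sites / total_sites


def should_warn(sessions, feature_threshold):
    """Decide whether to send the early warning.

    sessions: list of (Yz, Ys) pairs, one per login in the recent period.
    Returns True when the average w is at or above the threshold.
    """
    if not sessions:
        # Assumption: with no recorded history, no warning is sent.
        return False
    indexes = [feature_index(yz, ys) for yz, ys in sessions]
    average = sum(indexes) / len(indexes)
    return average >= feature_threshold
```

Note that the average is taken over per-login ratios rather than over pooled site counts, so a short session with many illegal sites weighs as much as a long one.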
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A big data based computer network security intelligent analysis system, characterized by comprising an authentication database, an operation information monitoring module, a website judging module and an access analysis module, wherein the authentication database is used for storing the website addresses of authentication websites; the operation information monitoring module is used for monitoring operation information of a current user of a computer, and when it is detected that the current user opens a new website, the new website is the website to be detected and its website address is the website address to be detected; the website judging module is used for judging whether the website address to be detected is a website address in the authentication database; if the website address to be detected is a website address in the authentication database, the current user is allowed to directly access; and if the website address to be detected is not a website address in the authentication database, the access analysis module obtains characteristic information of the website to be detected and historical operation information of the current user, and judges whether to send access early warning information according to the characteristic information and the historical operation information of the current user;
the access analysis module comprises a website analysis module and a user analysis module; the website analysis module comprises a font parameter acquisition module, a color parameter acquisition module, a matching parameter acquisition module, an in-doubt parameter calculation module and an in-doubt parameter comparison module; the font parameter acquisition module acquires the font parameter S of the website to be detected according to font information on the page of the website to be detected; the color parameter acquisition module divides the homepage of the website to be detected into m detection regions according to the differences in background color between regions, sorts the background colors of the homepage in descending order of the area each occupies on the homepage, selects the first background color as the central color, and calculates the color parameter C of the website to be detected
Figure FDA0003804066280000011
Wherein t is the number of detection areas whose background color is the central color among the m detection areas, and e is the ratio of the area of the detection areas whose background color is the central color to the total area of all detection areas of the website to be detected; the matching parameter acquisition module calculates the matching parameter Z of the homepage of the website to be detected according to the font information and the color matching information of the page to be detected; the in-doubt parameter calculation module calculates the in-doubt parameter R = 0.62S + 0.22C + 0.16Z of the website to be detected; the in-doubt parameter comparison module compares the in-doubt parameter of the website to be detected with the in-doubt threshold; if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, the current user is allowed to directly access; if the in-doubt parameter of the website to be detected is smaller than the in-doubt threshold, the historical operation information of the current user is analyzed;
the font parameter acquisition module comprises a center size selecting module, a size classifying module, a ranking coefficient obtaining module and a font parameter calculating module; the center size selecting module is used for sorting the font sizes of the homepage of the website to be detected from large to small to obtain an analysis sorting, obtaining the number of fonts of each font size on the homepage, and selecting the font size with the largest number of fonts as the center size; the size classifying module is used for taking the font sizes on the left side of the center size in the analysis sorting as a first size and the font sizes on the right side of the center size as a second size, the number of font sizes in the first size being a first number and the number of font sizes in the second size being a second number; the ranking coefficient obtaining module is used for calculating the ranking coefficient b = d1/d2 of the center size, wherein d1 is the smaller of the first number and the second number and d2 is the larger of the first number and the second number; and the font parameter calculating module is used for calculating the font parameter S of the website to be detected
Figure FDA0003804066280000021
Wherein h is the number of fonts corresponding to the center size, g1 is the sum of the numbers of fonts corresponding to the first size and the center size, and g2 is the sum of the numbers of fonts corresponding to the second size and the center size; the matching parameter acquisition module calculates the matching parameter Z = Ns/h of the homepage of the website to be detected, wherein Ns is the number of fonts on the homepage whose font size is the center size and whose background color is the central color.
2. The big data based computer network security intelligent analysis system of claim 1, wherein: the user analysis module comprises a characteristic average value calculation module and an average value comparison module; the characteristic average value calculation module obtains the average value of the characteristic index over the current user's logins to the computer in a recent period of time, wherein the characteristic index for each login is w = Ys/Yz, Yz is the total number of websites accessed by the current user during that login, and Ys is the number of illegal websites among the websites accessed during that login; the average value comparison module compares the average value of the characteristic index of the current user with a characteristic threshold; if the average value is smaller than the characteristic threshold, the current user is allowed to directly access; otherwise, early warning information that the website to be detected is suspected of being dangerous is sent to the user.
3. A computer network security intelligent analysis method based on big data is characterized in that: the intelligent analysis method comprises the following steps:
pre-establishing an authentication database for storing a website address of an authentication website,
monitoring the operation information of the current user of the computer, acquiring a new website as a website to be detected when detecting that the current user opens the new website,
if the web address to be detected is a web address in the authentication database, the current user is allowed to directly access,
if the website to be detected is a website other than the website in the authentication database, acquiring characteristic information of the website to be detected and historical operation information of a current user, and judging whether to send access early warning information or not according to the characteristic information;
the acquiring of the characteristic information of the website to be detected comprises the following steps:
collecting page information of a website to be detected,
sorting the font sizes of the website homepage to be detected according to the size from large to small to obtain analysis sorting, respectively obtaining the number of fonts of each font size in the website homepage to be detected, selecting the font size with the largest number of fonts as the center size,
obtaining the font sizes on the left side of the center size in the analysis sorting as a first size and the font sizes on the right side of the center size as a second size, the number of font size categories in the first size being a first category number and the number of font size categories in the second size being a second category number,
then the ranking coefficient of the center size is b = d1/d2, where d1 is the smaller of the first category number and the second category number and d2 is the larger of the first category number and the second category number,
calculating the font parameter S of the website to be detected
Figure FDA0003804066280000031
Wherein h is the number of the fonts corresponding to the center size, g1 is the sum of the number of the fonts corresponding to the first size and the center size, g2 is the sum of the number of the fonts corresponding to the second size and the center size,
dividing the homepage of the website to be detected into m detection regions according to the differences in background color between regions, sorting the background colors of the homepage in descending order of the area each occupies on the homepage, and selecting the first background color as the central color,
calculating the color parameter C of the website to be detected
Figure FDA0003804066280000032
Wherein t is the number of detection regions with the background color as the central color in the m detection regions, e is the ratio of the area of the detection region with the background color as the central color in the website to be detected to the total area of all the detection regions of the website to be detected,
calculating a matching parameter Z = Ns/h of the website homepage to be detected, wherein Ns is the number of fonts with the font size as the center size in the website homepage to be detected and the background color as the center color;
then the in-doubt parameter of the website to be detected is R = 0.62S + 0.22C + 0.16Z,
if the in-doubt parameter of the website to be detected is greater than the in-doubt threshold, allowing the current user to directly access;
and if the in-doubt parameter of the website to be detected is smaller than the in-doubt threshold, analyzing the historical operation information of the current user.
4. The big data-based computer network security intelligent analysis method according to claim 3, wherein: the analyzing the historical operation information of the current user comprises:
acquiring the average value of the characteristic index over the current user's logins to the computer in a recent period of time, wherein the characteristic index for each login is w = Ys/Yz, Yz is the total number of websites accessed by the current user during that login, and Ys is the number of illegal websites among the websites accessed during that login,
if the average value of the feature index of the current user is less than the feature threshold, allowing the current user direct access,
and otherwise, sending the user early warning information that the website to be detected is suspected of being dangerous.
5. The big data-based computer network security intelligent analysis method according to claim 3, wherein: the pre-established authentication database comprises:
when the number of times that the user accesses a certain website is larger than the threshold number of times, the website is an authentication website.
6. The big data-based computer network security intelligent analysis method according to claim 4, wherein: the illegal websites comprise phishing websites, gambling websites and marketing websites.
CN202210364704.1A 2022-04-07 2022-04-07 Big data based computer network security intelligent analysis system and method Active CN114760124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210364704.1A CN114760124B (en) 2022-04-07 2022-04-07 Big data based computer network security intelligent analysis system and method

Publications (2)

Publication Number Publication Date
CN114760124A CN114760124A (en) 2022-07-15
CN114760124B true CN114760124B (en) 2022-10-04

Family

ID=82329103

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210364704.1A Active CN114760124B (en) 2022-04-07 2022-04-07 Big data based computer network security intelligent analysis system and method

Country Status (1)

Country Link
CN (1) CN114760124B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102710645A (en) * 2012-06-06 2012-10-03 珠海市君天电子科技有限公司 Method and system for detecting phishing website
CN103856437A (en) * 2012-11-28 2014-06-11 深圳市金蝶中间件有限公司 Site security detection method and system
CN104935605A (en) * 2015-06-30 2015-09-23 北京奇虎科技有限公司 Detection method, device and system for fishing websites
CN108683666A (en) * 2018-05-16 2018-10-19 新华三信息安全技术有限公司 A kind of web page identification method and device
CN113242223A (en) * 2021-04-30 2021-08-10 刘厚泽 Website detection method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11411992B2 (en) * 2019-11-07 2022-08-09 Mcafee, Llc Visual detection of phishing websites via headless browser

Also Published As

Publication number Publication date
CN114760124A (en) 2022-07-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220914

Address after: Room 229, 2nd Floor, Building 8, Yard 2, Jinsui Road, Shunyi District, Beijing 101300

Applicant after: Yabang Management Technology (Beijing) Co.,Ltd.

Address before: Room 211-021, floor 2, building 7, Harbin Songbei (Shenzhen Longgang) science and Technology Innovation Industrial Park, No. 3043, Zhigu Second Street, Songbei District, Harbin, Heilongjiang 150028

Applicant before: Heilongjiang Mindong Sensing Technology Co.,Ltd.

GR01 Patent grant