KR101767589B1 - Web address extraction system for checking malicious code and method thereof - Google Patents
Web address extraction system for checking malicious code and method thereof
- Publication number
- KR101767589B1 (application KR1020150146240A)
- Authority
- KR
- South Korea
- Prior art keywords
- address
- web
- malicious code
- addresses
- extracting
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0236—Filtering by address, protocol, port number or service, e.g. IP-address or URL
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0281—Proxies
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The present invention relates to a system and method for automatically extracting web addresses for malicious code checking. The method comprises: (a) collecting connection logs from a relay device and extracting web addresses from the collected connection logs; (b) extracting check target addresses by filtering duplicate addresses and allowed addresses out of the web addresses; and (c) extracting sub addresses from the web page of each extracted check target address and generating a check target address list that includes the check target addresses and the sub addresses.
Description
The present invention relates to a system and method for automatically extracting web addresses for malicious code checking, and more particularly, to a system and method that automatically extract the web addresses to be checked for malicious code from the connection logs of a relay device, such as a web proxy, that clients access in order to use web services.
The web is convenient and is used daily by almost everyone, but it is frequently exploited as a channel for malicious code infection. If a website visited by a large number of users is exploited to spread malicious code, the damage may spread widely, so special attention is required. Preemptive detection of and action against malicious sites can minimize the spread of malicious code damage.
In recent years, attack techniques such as the exploitation of unknown vulnerabilities and detection evasion have evolved, so detection technology must be upgraded as well. Existing approaches include low-interaction web crawling detection, which depends on signatures, and high-interaction behavior-based detection, which covers a wide range and can detect unknown attacks but is slow.
The number of websites operated on the Internet is large, and when lower pages are taken into account the number of URLs to be checked grows to the order of millions or tens of millions.
However, companies that do not have an enterprise-wide IT infrastructure management system compile the list of websites (IPs, URLs) subject to malicious code checking by hand, so check targets are sometimes missed.
SUMMARY OF THE INVENTION The present invention has been made to solve the above problems, and an object of the present invention is to provide a system and method for automatically extracting web addresses for malicious code checking that can collect check targets in real time.
Another object of the present invention is to provide a system and method for automatically extracting web addresses for malicious code checking that can detect a website where malicious code is hidden by analyzing the web page source of a check target web address and checking the web pages of the sub web addresses connected to it.
Another object of the present invention is to provide a method for distinguishing between waypoint sites and distribution sites through website malicious code inspection.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
According to an aspect of the present invention, there is provided a method by which an inspection target extraction device automatically extracts web addresses for malicious code checking, the method comprising: (a) collecting access logs from a relay device and extracting web addresses from the collected access logs; (b) extracting inspection target addresses by filtering duplicate addresses and allowed addresses out of the web addresses; and (c) extracting lower addresses from the web page of each extracted inspection target address and generating an inspection target address list including the inspection target addresses and the lower addresses.
The method may further include, after step (c), a malicious code checking device confirming whether the inspection target addresses in the inspection target address list are connectable and checking for malicious code by accessing the connectable addresses.
Step (c) may include: crawling the web page corresponding to the inspection target address and extracting the portions of the web page source where links exist as lower addresses; and generating an inspection target address list including the inspection target address and the lower addresses.
According to another aspect of the present invention, there is provided an inspection target extraction device comprising: a collecting unit that collects connection logs from a relay device; and an inspection target extraction unit that extracts web addresses from the collected connection logs, extracts inspection target addresses by filtering duplicate addresses and allowed addresses out of the web addresses, extracts lower addresses from the web page of each extracted inspection target address, and generates an inspection target address list including the inspection target addresses and the lower addresses.
The inspection object extraction unit may crawl a web page corresponding to the inspection target address and extract a portion of the web page source in which a link exists, as a lower address.
In addition, the inspection object extracting unit may collect a lower address by analyzing a header part of an HTTP request and an HTTP response generated when a web page corresponding to the inspection target address is visited.
According to another aspect of the present invention, there is provided a web address automatic extraction system for malicious code checking, comprising: a relay device that relays access requests to websites; and an inspection target extraction device that collects connection logs from the relay device, extracts web addresses from the collected connection logs, extracts inspection target addresses by filtering duplicate addresses and allowed addresses out of the web addresses, extracts lower addresses from the web page of each extracted inspection target address, and generates an inspection target address list including the inspection target addresses and the lower addresses.
The relay device may be a web proxy server or an L7 switch.
The system for automatically extracting web addresses for malicious code checking may further comprise a malicious code checking device that receives the inspection target address list from the inspection target extraction device, confirms whether the inspection target addresses in the list are connectable, and performs malicious code checking by accessing the connectable addresses.
The inspection target extraction device may crawl a web page corresponding to the inspection target address and extract a portion of the web page source in which a link exists, as a lower address.
Meanwhile, the above-mentioned system and method for automatically extracting web addresses for malicious code checking can be implemented in the form of a program and then recorded on a recording medium readable by an electronic device, or distributed through a program download management device.
According to the present invention, malicious code check targets are collected by using a relay device, such as a web proxy or an L7 switch, that clients access in order to use web services, so the web addresses to be checked can be extracted by processing the logs according to the purpose of the check.
Also, by analyzing the web page source of the web address to be checked, the web page of the lower web address connected to the web page can be checked to detect the web site where the malicious code is hidden.
In addition, the malicious URLs within sites identified as malicious can be extracted through website visits and malicious code discrimination.
The effects of the present invention are not limited to the above-mentioned effects, and various effects can be included within the scope of what is well known to a person skilled in the art from the following description.
FIG. 1 is a diagram illustrating a web address automatic extraction system for malicious code checking according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically showing a configuration of a check target extraction apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram schematically showing the configuration of a malicious code checking apparatus according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a method for automatically extracting web addresses for malicious code checking according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a method of automatically extracting web addresses for malicious code checking according to another embodiment of the present invention.
FIG. 6 is a diagram illustrating a malicious code checking method of a malicious code checking apparatus according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, a system and method for automatically extracting web addresses for malicious code checking according to the present invention will be described in detail with reference to the accompanying drawings. The embodiments are provided so that those skilled in the art can easily understand the technical spirit of the present invention, and the present invention is not limited thereto. In addition, the attached drawings are schematic, drawn to make the embodiments easy to describe, and may differ from actual implementations.
In the meantime, each constituent unit described below is only an example for implementing the present invention. Thus, in other implementations of the present invention, other components may be used without departing from the spirit and scope of the present invention.
In addition, each component may be implemented solely by hardware or software configuration, but may be implemented by a combination of various hardware and software configurations performing the same function. Also, two or more components may be implemented together by one hardware or software.
Also, the expression "comprising" is an open-ended expression that merely denotes that the listed elements are present, and should not be understood to exclude additional elements.
FIG. 1 is a diagram illustrating a web address automatic extraction system for malicious code checking according to an embodiment of the present invention.
Referring to FIG. 1, a web address automatic extraction system for malicious code checking includes at least one client 110, a relay device, an inspection target extraction device, and a malicious code checking device.
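On the other hand, if the check target address list is composed of main addresses and sub addresses, the malicious code checking device visits the corresponding websites according to the main address and the sub address.
If the web site to be checked is a main address, the malicious code checking device executes a predetermined number of browsers and simultaneously visits each inspection target web site. For example, the malicious code checking device runs 30 browsers and visits 30 different inspection target websites at the same time, one per browser.
If the web site to be checked is a sub address, the malicious code checking device increases the speed by using multiple browsers and multiple frames at the same time. For example, if 20 browsers, each with 5 frames inserted, are opened at the same time to visit the websites to be checked, 100 (5 × 20) sites can be checked in a single pass.
If no malicious code infection attempt is detected while using multiple browsers and multiple frames at the same time, the next inspection target group is visited. If an infection attempt is confirmed, the site in question is traced; a tree search can be used to locate it quickly with a minimum number of checks.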
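When the malicious site is extracted, the malicious code checking device accesses the malicious site and traces the malicious URL that distributes the malicious code.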
FIG. 2 is a block diagram schematically showing a configuration of a check target extraction apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram schematically showing the configuration of a malicious code checking apparatus according to an embodiment of the present invention.
On the other hand, if the check target address list is composed of main addresses and sub addresses, the corresponding websites are visited according to the main address and the sub address.
If the web site to be checked is a main address, a predetermined number of browsers are executed and each inspection target web site is visited simultaneously.
If the web site to be checked is a sub address, the speed is increased by using multiple browsers and multiple frames at the same time.
If no malicious code infection attempt is detected while using multiple browsers and multiple frames at the same time, the next inspection target group is visited. If an infection attempt is confirmed, the site in question is traced; a tree search can be used to locate it quickly with a minimum number of checks.
When the malicious site is extracted, the malicious site is accessed and the malicious URL that distributes the malicious code is traced.
FIG. 4 is a diagram illustrating a method for automatically extracting web addresses for malicious code checking according to an embodiment of the present invention. It is to be understood that this is only one embodiment including preferred steps in achieving the object of the present invention, and some steps may be modified, added or deleted.
Referring to FIG. 4, the check target extraction apparatus collects connection logs from the relay device (S402), and extracts web addresses from the collected connection logs (S404).
Then, the check target extraction device filters duplicate addresses and allowed addresses out of the extracted web addresses (S406) and creates the check target address list (S408). At this time, the check target extraction device also filters out firewall public IPs when generating the list, and the resulting check target address list consists of website URLs or IPs.
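Steps S402 to S408 can be illustrated with a minimal sketch. The log line format, the function names, and the sample hosts are assumptions made for this example only; the patent does not specify a log format or an implementation.

```python
import re
from urllib.parse import urlsplit

# Assumed log format: "<client-ip> <method> <url> <status>". Real relay logs differ.
URL_PATTERN = re.compile(r"https?://[^\s\"']+")

def extract_web_addresses(log_lines):
    """Pull every http(s) URL out of raw connection-log lines (S402-S404)."""
    found = []
    for line in log_lines:
        found.extend(URL_PATTERN.findall(line))
    return found

def build_check_targets(urls, allowed_hosts, firewall_public_ips):
    """Drop duplicates, allowed hosts and firewall public IPs (S406-S408)."""
    seen = set()
    targets = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if host in allowed_hosts or host in firewall_public_ips:
            continue  # allowed address or firewall public IP: not a check target
        if url in seen:
            continue  # duplicate address
        seen.add(url)
        targets.append(url)
    return targets
```

Order of first appearance is preserved so that the resulting list can be compared against the original log when debugging the filters.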
When the step S408 is performed, the check target extracting device not only stores the check target address list but also transmits the check target address list to the malicious code checking device which performs malicious code checking (S410). At this time, the inspection target extraction device can transmit the check target address list to the malicious code checking device using various protocols such as FTP, SNMP, and syslog.
A method for automatically extracting web addresses for malicious code checking according to an embodiment of the present invention may be implemented in the form of a program. In that case, the program may be stored in a computer-readable recording medium or distributed via a program providing server.
FIG. 5 is a diagram illustrating a method of automatically extracting web addresses for malicious code checking according to another embodiment of the present invention. It is to be understood that this is only one embodiment including preferred steps in achieving the object of the present invention, and some steps may be modified, added or deleted.
Referring to FIG. 5, the check target extraction apparatus collects connection logs from the relay device (S502), and extracts web addresses from the collected connection logs (S504).
Then, the inspection target extraction device filters the duplicated address and the allowed address in the extracted web address (S506) and extracts the inspection target address (S508). At this time, the inspection target extraction device extracts the inspection target address by filtering the firewall public IP.
When step S508 has been performed, the check target extraction device extracts lower addresses from the web page of each extracted check target address (S510) and creates the check target address list including the check target addresses and the lower addresses (S512). That is, the inspection target extraction device crawls the web page of the inspection target URL and extracts the portions of the web page source where links exist as lower addresses. Here, the link portions include the src attribute of a script tag, the href URL of an a tag, a URL contained in a URL tag, the src attribute of an img tag, and the like. Collecting sub-URLs through web page source analysis in this way makes it possible to check sub-pages that are reached only through user actions such as clicking a link or moving between pages.
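The link extraction described above might look roughly like the following sketch, which uses Python's standard html.parser module. The class and function names are hypothetical, and only three of the link-bearing attributes named in the text (script src, a href, img src) are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    # (tag, attribute) pairs treated as link-bearing; an illustrative subset
    LINK_ATTRS = {("a", "href"), ("script", "src"), ("img", "src")}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in self.LINK_ATTRS:
                # Resolve relative links against the page's own address
                absolute = urljoin(self.base_url, value)
                if absolute not in self.links:
                    self.links.append(absolute)

def extract_lower_addresses(page_source, base_url):
    """Return the de-duplicated absolute URLs linked from one page source."""
    collector = LinkCollector(base_url)
    collector.feed(page_source)
    return collector.links
```

Fetching the page source is deliberately left out so that the sketch stays independent of any particular crawler or HTTP client.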
In addition, the inspection target extraction device can extract sub-URLs according to their link depth from the inspection target URL. This makes it possible for the malicious code checking device to check for malicious behavior not only at the target web address but also at each web address connected to it.
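Depth-based collection can be sketched as a breadth-first walk with a depth limit. The function name, the max_depth parameter, and the get_links callback (which stands in for "crawl this page and return its links") are assumptions for illustration.

```python
from collections import deque

def collect_by_depth(start_url, get_links, max_depth):
    """Breadth-first walk; returns {url: depth} for pages within max_depth."""
    depths = {start_url: 0}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if depths[url] >= max_depth:
            continue  # do not expand pages already at the depth limit
        for child in get_links(url):
            if child not in depths:  # first visit gives the shortest depth
                depths[child] = depths[url] + 1
                queue.append(child)
    return depths
```

Recording the depth of each sub-URL, rather than only the set of URLs, keeps the relationship to the inspection target URL available for later reporting.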
In addition, the inspection target extraction apparatus can collect sub-URLs by analyzing the header parts of the HTTP requests and HTTP responses generated when a web page is visited. The relationship between two URLs can be confirmed from the referrer URL information and the request URL (GET URL) information in the headers. The referrer URL identifies the website visited before the request URL, so the request URL becomes a subordinate URL of the referrer URL.
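The referrer relationship can be sketched as follows. The record layout (dictionaries with "url" and "referer" keys) is an assumption about how captured headers might be represented, not something the patent defines.

```python
from collections import defaultdict

def build_sub_url_map(header_records):
    """Map each referrer URL to the request URLs that were loaded from it."""
    children = defaultdict(list)
    for record in header_records:
        referrer = record.get("referer")
        request_url = record.get("url")
        # A request without a referrer is a top-level visit, not a sub-URL
        if referrer and request_url and request_url not in children[referrer]:
            children[referrer].append(request_url)
    return dict(children)
```

Each key of the returned mapping is a referrer URL and each value lists its subordinate URLs, so chains of redirections can be followed by looking up a subordinate URL as a key in turn.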
In this way, the inspection target extraction device can generate a check target address list including the sub address of the inspection target address.
Thereafter, the check target extraction apparatus not only stores the check target address list, but also transmits the check target address list to a check device that performs malicious code check (S514). At this time, the inspection target extraction device can transmit the check target address list to the malicious code checking device using various protocols such as FTP, SNMP, and syslog.
A method for automatically extracting web addresses for malicious code checking according to an embodiment of the present invention may be implemented in the form of a program. In that case, the program may be stored in a computer-readable recording medium or distributed via a program providing server.
FIG. 6 is a diagram illustrating a malicious code checking method of a malicious code checking apparatus according to an embodiment of the present invention. It is to be understood that this is only one embodiment including preferred steps in achieving the object of the present invention, and some steps may be modified, added or deleted.
Referring to FIG. 6, when the malicious code checking device receives the check target address list from the check target extraction device (S602), it accesses the websites corresponding to the check target addresses (S604). At this time, the malicious code checking device confirms whether each web site to be inspected can be accessed, and performs the visit inspection only for websites confirmed to be alive. To check availability at high speed, the malicious code checking device first transmits a DNS (domain name system) query and confirms whether a response is received. When a DNS response is received, the device transmits a synchronization (SYN) signal to TCP port 80 and, if an acknowledgment signal is received, determines that a web service is provided on TCP port 80. Here, the malicious code checking device can check the accessibility of a plurality of web sites simultaneously by using multiple threads.
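The availability check can be approximated with the standard socket module, as in the sketch below. Note the assumptions: socket.create_connection completes a full TCP handshake rather than sending a bare SYN, the function names and the timeout and worker defaults are illustrative, and the resolve and connect parameters exist only so the logic can be exercised without network access.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def is_alive(host, resolve=socket.gethostbyname,
             connect=socket.create_connection, timeout=3.0):
    """DNS lookup first, then a TCP connection attempt to port 80."""
    try:
        ip = resolve(host)
    except OSError:
        return False  # no DNS answer: skip the TCP step entirely
    try:
        conn = connect((ip, 80), timeout)
    except OSError:
        return False  # refused or timed out: no web service on port 80
    conn.close()
    return True

def check_many(hosts, workers=16, **kwargs):
    """Check several hosts in parallel threads; returns {host: bool}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda h: is_alive(h, **kwargs), hosts)
        return dict(zip(hosts, results))
```

Doing the DNS step first means hosts that no longer resolve are discarded without waiting for a TCP timeout, which is where most of the speed-up in a large list would be expected to come from.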
When the malicious code checking device receives the address list to be checked, it accesses a plurality of websites to be checked simultaneously using multiple browsers. Here, the list consists of the URLs of a large number of web sites to be inspected. The malicious code checking device runs a predetermined number of browsers that can be used simultaneously and visits the web sites to be checked through them. For example, if 100 browsers can run at the same time, the malicious code checking device accesses the web sites in the address list in units of 100.
On the other hand, if the address list to be checked consists of main addresses and sub addresses, the malicious code checking device visits the corresponding websites according to the main address and the sub address.
If the web site to be checked is a main address, the malicious code checking device executes a predetermined number of browsers and simultaneously visits each inspection target web site. For example, the malicious code checking device runs 30 browsers and visits 30 different inspection target websites at the same time, one per browser.
If the web site to be checked is a sub address, the malicious code checking device increases the speed by using multiple browsers and multiple frames at the same time. For example, if 20 browsers, each with 5 frames inserted, are opened at the same time to visit the websites to be checked, 100 (5 × 20) sites can be checked in a single pass.
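The 20-browser, 5-frame arrangement can be expressed as a small planning function. This is only a sketch of how addresses might be grouped; the function name and the nested-list layout are assumptions, and actually driving browsers is out of scope here.

```python
def plan_rounds(addresses, browsers=20, frames=5):
    """Split addresses into rounds; each round is a list of per-browser frame lists."""
    per_round = browsers * frames  # e.g. 20 x 5 = 100 sites per pass
    rounds = []
    for start in range(0, len(addresses), per_round):
        chunk = addresses[start:start + per_round]
        # One inner list per browser, holding the URLs loaded into its frames
        rounds.append([chunk[i:i + frames] for i in range(0, len(chunk), frames)])
    return rounds
```

With the default values, 230 addresses would be split into two full passes of 100 sites and a final pass of 30 sites spread over 6 browsers.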
If no malicious code infection attempt is detected while using multiple browsers and multiple frames at the same time, the next inspection target group is visited. If an infection attempt is confirmed, the site in question is traced; a tree search can be used to locate it quickly with a minimum number of checks.
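One way to read the tree search is as repeated halving of an infected group, sketched below. The group_is_infected callback is hypothetical and stands in for "visit this group together and report whether an infection attempt was observed"; the sketch narrows down to single sites, whereas the text only says the range is narrowed to a predetermined ratio.

```python
def find_malicious(sites, group_is_infected):
    """Recursively bisect a group until the infected sites are isolated."""
    if not sites or not group_is_infected(sites):
        return []  # clean group: no need to look inside it
    if len(sites) == 1:
        return list(sites)
    mid = len(sites) // 2
    return (find_malicious(sites[:mid], group_is_infected)
            + find_malicious(sites[mid:], group_is_infected))
```

For a group of 100 sites containing a single malicious one, this needs on the order of 2 x log2(100) group visits instead of 100 individual visits, which is the saving the tree search is meant to provide.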
When S604 has been performed, the malicious code checking device checks whether there is an attempt to infect malicious code among the plurality of web sites to be checked (S606). At this time, the malicious code checking device can determine whether an attack that infects malicious code has occurred by analyzing the correlation between the file, process, and registry events that occur after visiting the website to be inspected.
If a malicious code infection attempt is detected among a plurality of web sites to be inspected, the malicious code checking device extracts malicious sites (S608). At this time, the malicious code checking apparatus extracts malicious sites from a plurality of inspection target web sites when the inspection range is narrowed to a predetermined ratio using a tree search.
When the malicious site is extracted, the malicious code checking device accesses the malicious site and traces the malicious URL that distributes the malicious code (S610). Here, the malicious code checking device extracts an access URL in which a connection is generated when a malicious site is visited, blocks the extracted access URLs one by one, and tracks the vulnerability attack URL by revisiting the malicious site.
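The block-and-revisit procedure of S610 can be sketched as below. The visit callback is hypothetical: it is assumed to load the site with the given URLs blocked and to return whether an infection attempt occurred together with the URLs accessed during the visit.

```python
def trace_exploit_urls(site, visit):
    """visit(site, blocked) -> (infected: bool, accessed_urls: list)."""
    infected, accessed = visit(site, frozenset())
    if not infected:
        return []
    culprits = []
    for url in accessed:
        # Revisit with exactly one access URL blocked
        still_infected, _ = visit(site, frozenset([url]))
        if not still_infected:
            culprits.append(url)  # blocking this URL stopped the infection
    return culprits
```

Every URL whose blocking stops the infection lies on the path to the malicious code, so under this reading the result may contain both intermediate (waypoint) URLs and the final distribution URL.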
The method of automatically extracting Web addresses for checking malicious code can be written as a program, and the codes and code segments constituting the program can be easily deduced by a programmer in the field. In addition, a program related to a method for automatically extracting web addresses for malicious code checking can be stored in an information storage medium (readable medium) that can be read by an electronic device, and can be read and executed by an electronic device.
Thus, those skilled in the art will appreciate that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. It is therefore to be understood that the above-described embodiments are illustrative only and not restrictive of the scope of the invention. It is also to be understood that the flow charts shown in the figures are merely the sequential steps illustrated in order to achieve the most desirable results in practicing the present invention, and that other additional steps may be provided or some steps may be deleted.
The technical features and implementations described herein may be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures described herein and their structural equivalents, or in combinations of one or more of them. Also, implementations of the technical features described herein may be implemented as computer program products, that is, modules of computer program instructions encoded on a tangible program storage medium for execution by, or for controlling the operation of, a processing system.
The computer-readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter that affects a machine-readable propagated signal, or a combination of one or more of the foregoing.
In the present specification, the term "apparatus" or "system" includes all apparatuses, devices, and machines for processing data, including, for example, a processor, a computer, or multiple processors or computers. The processing system may include, in addition to the hardware, any code that forms an execution environment for a computer program upon request, such as code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of these.
A computer program, also known as a program, software, software application, script or code, may be written in any form of programming language, including compiled or interpreted languages and declarative or procedural languages, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computer environment.
On the other hand, a computer program does not necessarily correspond to a file in a file system; it may be stored in a single file provided to the requested program, in a plurality of interacting files (for example, files that store one or more modules, subprograms, or portions of code), or in a portion of a file that holds other programs or data (for example, one or more scripts stored in a markup language document).
A computer program may be embodied to run on multiple computers or on one or more computers located at one site or distributed across a plurality of sites and interconnected by a wired / wireless communication network.
On the other hand, computer readable media suitable for storing computer program instructions and data include all types of non-volatile memory, media and memory devices, including, for example, semiconductor memory devices such as EPROM, EEPROM, and flash memory devices, magnetic disks such as internal hard disks or external disks, and CD and DVD discs. The processor and memory may be supplemented by, or incorporated in, special purpose logic circuits.
Implementations of the technical features described herein may include back-end components such as a data server, middleware components such as an application server, front-end components such as a client computer having a graphical user interface, or any combination of one or more of such back-end, middleware or front-end components. The components of the system may be interconnected by any form or medium of digital data communication, for example, a communication network.
Hereinafter, a more specific embodiment capable of implementing the configurations including the system described herein and the web address automatic extraction method for malicious code checking will be described in detail.
The system described herein and the method for automatically extracting web addresses for malicious code checking may be executed on a client device or a server associated with a web based storage system or on one or more processors included in a server to execute computer software, Lt; RTI ID = 0.0 > and / or < / RTI > The processor may be part of a computing platform, such as a server, a client, a network infrastructure, a mobile computing platform, a fixed computing platform, and the like, and may specifically be a type of computer or processing device capable of executing program instructions, code, In addition, the processor may further include a method for automatically extracting a Web address for checking a malicious code, a memory for storing instructions, a code, and a program. If the memory does not include a memory, Access to storage devices such as CD-ROMs, DVDs, memories, hard disks, flash drives, RAMs, ROMs, caches, etc. in which the instructions, codes, and programs are stored.
In addition, the system described herein and the method for automatically extracting web addresses for malicious code checking can be used in part or in whole through a server, a client, a gateway, a hub, a router, or an apparatus executing computer software on network hardware. The software may be executed in various types of servers such as a file server, a print server, a domain server, an Internet server, an intranet server, a host server, a distributed server, A storage medium, a communication device, a port, a client, and other servers via a wired / wireless network.
In addition, the automatic method of extracting web addresses for malicious code checking, commands, and codes can also be executed by the server, and other devices required for executing the method of automatically extracting web addresses for malicious code checking can be classified into a hierarchical structure ≪ / RTI >
In addition, the server can provide an interface to other devices including, without limitation, clients, other servers, printers, database servers, print servers, file servers, communication servers, distributed servers, The remote execution of the program can be facilitated.
Further, any of the devices connected to the server via the interface may further include at least one storage device capable of storing a web address automatic extraction method, a command, and a code for malicious code checking, and the central processor of the server may be different Commands, codes, and the like to be executed on the device can be provided to the device and stored in the storage device.
Meanwhile, the system and the method for automatically extracting web addresses for malicious code checking described herein can be used, in part or in whole, through a network infrastructure. The network infrastructure may include devices such as computing devices, servers, routers, hubs, firewalls, clients, personal computers, communication devices, and routing devices, as well as separate modules capable of performing each of their functions. In addition to these devices and modules, it may further include storage media such as flash memory, buffers, stacks, RAM, and ROM. The method for automatically extracting web addresses for malicious code checking, together with its commands and code, can be executed and stored by any of the devices, modules, and storage media included in the network infrastructure. Other devices needed to carry out the extraction method may also be implemented as part of the network infrastructure.
In addition, the system and the method for automatically extracting web addresses for malicious code checking described in the present specification can be implemented in hardware, or in a combination of hardware and software, suited to a specific application. Here, the hardware includes both general-purpose computer devices, such as personal computers and mobile communication terminals, and enterprise-specific computer devices. The computer devices may be implemented with memory, a microprocessor, a microcontroller, a digital signal processor, an application-specific integrated circuit, a programmable gate array, or the like, or a combination thereof.
Computer software, instructions, code, and the like, as described above, may be stored on or accessed from a machine-readable medium. Examples include computer components holding digital data used for computation over a period of time; semiconductor storage such as RAM or ROM; permanent storage such as optical discs; mass storage such as hard disks, tapes, and drums; optical storage such as CDs or DVDs; flash memory, floppy disks, magnetic tape, and paper tape; memory such as dynamic memory, static memory, and variable storage; and network-attached storage such as the cloud. Here, the commands and code include data-oriented languages such as SQL and dBase, system languages such as C, Objective-C, C++, and assembly, architectural languages such as Java and .NET, and application languages such as PHP, Ruby, Perl, and Python. They are not limited to these, however, and may include any language well known to those skilled in the art.
In addition, "computer-readable media" as described herein includes all media that contribute to providing instructions to a processor for program execution. Specifically, this includes, but is not limited to, non-volatile media such as data storage devices, optical disks, and magnetic disks; volatile media such as dynamic memory; and transmission media that carry data, such as coaxial cables, copper wires, and optical fibers.
Meanwhile, the elements that implement the technical features of the present invention, as shown in the block diagrams and flowcharts of the accompanying drawings, represent logical boundaries between those elements.
However, depending on the software or hardware embodiment, the depicted elements and their functions may be implemented as a standalone software module, a monolithic software structure, code, a service, or a combination thereof. Because they can be stored on a medium executable by a computer having a processor capable of executing stored program code, instructions, and the like, and their functions can thereby be implemented, all such embodiments are to be regarded as within the scope of the present invention.
Accordingly, although the accompanying drawings and their description illustrate the technical features of the present invention, no specific arrangement of software for implementing those features should be inferred unless it is explicitly stated. That is, the various embodiments described above may exist, and some embodiments may be modified while retaining the same technical features as the present invention; these should also be considered to be within the scope of the present invention.
It should also be understood that, although the flowcharts depict operations in a particular order, that order is shown in order to obtain the most desirable results. It should not be construed as requiring that the operations be performed in the particular or sequential order shown, or that all of the illustrated operations be performed. In certain cases, multitasking and parallel processing may be advantageous. In addition, the separation of the various system components in the above-described embodiments should not be understood as requiring such separation in all embodiments; the described program components and systems can generally be integrated into a single software product or packaged into multiple software products.
As such, the specification is not intended to limit the invention to the precise form disclosed. While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, many alternatives, modifications, and variations will be apparent to those skilled in the art and can be made without departing from the spirit and scope of the present invention as defined by the appended claims.
The scope of the present invention is defined by the appended claims rather than the foregoing description, and all changes or modifications derived from the meaning and scope of the claims and equivalents thereof are deemed to be included in the scope of the present invention.
The present invention provides a system and method for automatically extracting web addresses for malicious code checking. By using a relay device, such as a web proxy or an L7 switch, through which clients connect in order to use web services, targets for malicious code checking can be collected in real time, and the URLs needed for real-time detection can be extracted by processing the relay device's logs to suit the purpose of the check.
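As a rough illustration of the log processing described above, the following Python sketch pulls URLs out of relay-device access-log lines and then drops duplicated and allowed addresses. The log line layout, the `URL_PATTERN` regular expression, the function names `extract_web_addresses` and `filter_targets`, and the choice to match allowed addresses by host name are all assumptions made for this example; the specification does not fix any of them.

```python
import re
from urllib.parse import urlsplit

# Assumption: any http(s) URL token in a log line counts as a web address.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_web_addresses(log_lines):
    """Return every URL found in the access-log lines, in order of appearance."""
    addresses = []
    for line in log_lines:
        addresses.extend(URL_PATTERN.findall(line))
    return addresses


def filter_targets(addresses, allowed_hosts):
    """Drop duplicated addresses and addresses whose host is on the allow list."""
    allowed = {host.lower() for host in allowed_hosts}
    seen = set()
    targets = []
    for address in addresses:
        try:
            host = (urlsplit(address).hostname or "").lower()
        except ValueError:
            host = ""  # malformed URL: no host to compare against the allow list
        if host in allowed or address in seen:
            continue
        seen.add(address)
        targets.append(address)
    return targets


if __name__ == "__main__":
    # Hypothetical proxy log lines, invented for this example.
    sample_log = [
        "1445300000.1 10.0.0.5 GET http://example.com/a.html 200",
        "1445300001.2 10.0.0.6 GET http://example.com/a.html 200",
        "1445300002.3 10.0.0.7 GET http://trusted.example.org/index.html 200",
    ]
    found = extract_web_addresses(sample_log)
    print(filter_targets(found, allowed_hosts=["trusted.example.org"]))
```

Keeping extraction and filtering as separate functions mirrors the two stages in the text: the first only reads the log, while the second applies the duplicate and allow-list policy, which could be swapped out without touching the parser.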
Also, by analyzing the web page source of a web address to be checked, the web pages of the lower addresses linked from that page can be checked as well, making it possible to detect websites in which malicious code is hidden.
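The page-source analysis mentioned above can be sketched in Python as follows. This is a minimal sketch that works on HTML already fetched by some crawler, using only the standard library. The `LinkCollector` class, the `build_inspection_list` function, and the set of tag and attribute pairs treated as links are assumptions for this example, not details taken from the specification.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin

# Assumption: these (tag, attribute) pairs are the places where a link can appear.
LINK_ATTRIBUTES = {
    ("a", "href"),
    ("link", "href"),
    ("iframe", "src"),
    ("frame", "src"),
    ("script", "src"),
}


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links found in a page source."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in LINK_ATTRIBUTES:
                # Resolve relative links against the page address, drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def build_inspection_list(target_address, page_source):
    """Return the target address followed by its distinct lower addresses."""
    collector = LinkCollector(target_address)
    collector.feed(page_source)
    collector.close()
    inspection_list = [target_address]
    for link in collector.links:
        if link not in inspection_list:
            inspection_list.append(link)
    return inspection_list


if __name__ == "__main__":
    # Hypothetical page source, invented for this example.
    html = '<a href="/news.html">news</a><iframe src="http://ads.example.net/f.html"></iframe>'
    print(build_inspection_list("http://example.com/index.html", html))
```

Including `iframe` and `script` sources alongside ordinary anchors is a design choice for this sketch, on the reasoning that hidden content is often loaded through such elements; a real deployment would tune that set.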
In addition, the malicious URLs within sites identified as malicious can be extracted by visiting the websites and determining whether malicious code is present.
100: Client
200: Relay device
300: Inspection target extraction device
310, 410:
320:
330: Inspection object extraction unit
340, 430:
350, 440:
400: malicious code checking device
420:
500: DNS server
600: service server
Claims (10)
(b) extracting inspection target addresses by filtering duplicated addresses and allowed addresses out of the extracted web addresses; and
(c) extracting lower addresses from the web pages of the extracted inspection target addresses, and generating an inspection target address list including the inspection target addresses and the lower addresses;
A method for automatically extracting web addresses for malicious code checking, performed by an inspection target extraction device and comprising the above steps.
After step (c),
the method for automatically extracting web addresses further comprises checking whether each inspection target address in the inspection target address list is accessible, and performing a malicious code check by accessing the addresses determined to be accessible.
The method of claim 1, wherein step (c) comprises:
crawling a web page corresponding to the inspection target address and extracting, as lower addresses, the portions of the web page source in which links exist; and
generating an inspection target address list including the inspection target address and the extracted lower addresses.
An inspection object extraction unit that extracts web addresses from collected access logs, extracts inspection target addresses by filtering duplicated addresses and allowed addresses out of the extracted web addresses, extracts lower addresses from the web pages of the extracted inspection target addresses, and generates an inspection target address list including the inspection target addresses and the lower addresses;
An inspection target extraction device comprising the inspection object extraction unit.
Wherein the inspection object extraction unit crawls a web page corresponding to the inspection target address and extracts, as lower addresses, the portions of the web page source in which links exist.
Wherein the inspection object extraction unit collects the lower addresses by analyzing the header portions of the HTTP request and the HTTP response generated when a web page corresponding to the inspection target address is visited.
An inspection target extraction device that collects access logs from a relay device, extracts web addresses from the collected access logs, extracts inspection target addresses by filtering duplicated addresses and allowed addresses out of the extracted web addresses, extracts lower addresses from the web pages of the extracted inspection target addresses, and generates an inspection target address list including the inspection target addresses and the lower addresses;
A web address automatic extraction system for malicious code checking comprising the above device.
Wherein the relay device is a web proxy server or an L7 switch.
The web address automatic extraction system for malicious code checking, further comprising a malicious code checking device that checks whether each inspection target address in the inspection target address list is accessible and performs a malicious code check by accessing the addresses determined to be accessible.
The web address automatic extraction system, wherein the inspection target extraction device crawls a web page corresponding to the inspection target address and extracts, as lower addresses, the portions of the web page source in which links exist.
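The accessibility check recited in the claims above is not spelled out in this section. One possible reading, sketched below in Python, is to test whether each address's host name resolves before handing the address to the malicious code check. Using name resolution as the accessibility test, and the names `resolves` and `split_by_accessibility`, are assumptions for this example; the check is passed in as a parameter so that a different test (for instance an actual HTTP request) could be substituted.

```python
import socket
from urllib.parse import urlsplit


def resolves(host):
    """Return True if the host name resolves through the system resolver."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def split_by_accessibility(inspection_list, is_reachable=resolves):
    """Split addresses into (accessible, inaccessible) using the supplied check."""
    accessible, inaccessible = [], []
    for address in inspection_list:
        host = urlsplit(address).hostname
        if host and is_reachable(host):
            accessible.append(address)
        else:
            inaccessible.append(address)
    return accessible, inaccessible


if __name__ == "__main__":
    # A stand-in check so the example runs without network access.
    def known_hosts_only(host):
        return host == "example.com"

    addresses = ["http://example.com/a.html", "http://gone.invalid/b.html"]
    print(split_by_accessibility(addresses, is_reachable=known_hosts_only))
```

Only the addresses in the first returned list would then be visited for the malicious code check, which avoids spending time on addresses that cannot be reached at all.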
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150146240A KR101767589B1 (en) | 2015-10-20 | 2015-10-20 | Web address extraction system for checking malicious code and method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150146240A KR101767589B1 (en) | 2015-10-20 | 2015-10-20 | Web address extraction system for checking malicious code and method thereof |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160142466A Division KR101767594B1 (en) | 2016-10-28 | 2016-10-28 | Web address extraction system for checking malicious code and method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170046000A (en) | 2017-04-28 |
KR101767589B1 (en) | 2017-08-11 |
Family
ID=58701933
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150146240A KR101767589B1 (en) | 2015-10-20 | 2015-10-20 | Web address extraction system for checking malicious code and method thereof |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101767589B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2021111802A (en) | 2020-01-06 | 2021-08-02 | 富士通株式会社 | Detection program, detection method, and information processing device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101281160B1 (en) * | 2006-02-03 | 2013-07-02 | 주식회사 엘지씨엔에스 | Intrusion Prevention System using extract of HTTP request information and Method URL cutoff using the same |
- 2015-10-20 KR KR1020150146240A patent/KR101767589B1/en active IP Right Grant
Also Published As
Publication number | Publication date |
---|---|
KR20170046000A (en) | 2017-04-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10193929B2 (en) | Methods and systems for improving analytics in distributed networks | |
Kumar et al. | Signature based intrusion detection system using SNORT | |
CN104125209B (en) | Malice website prompt method and router | |
CN103634306B (en) | The safety detection method and safety detection server of network data | |
Akiyama et al. | Searching structural neighborhood of malicious urls to improve blacklisting | |
CN108768921B (en) | Malicious webpage discovery method and system based on feature detection | |
CN106302512B (en) | Method, equipment and system for controlling access | |
CN105760379B (en) | Method and device for detecting webshell page based on intra-domain page association relation | |
CN103685294A (en) | Method and device for identifying attack sources of denial of service attack | |
US20150047038A1 (en) | Techniques for validating distributed denial of service attacks based on social media content | |
CN108573146A (en) | A kind of malice URL detection method and device | |
CN107528812B (en) | Attack detection method and device | |
WO2017063274A1 (en) | Method for automatically determining malicious-jumping and malicious-nesting offensive websites | |
CN110362992A (en) | Based on the method and apparatus for stopping in the environment of cloud or detecting computer attack | |
CN103631830A (en) | Method and device for detecting web spiders | |
JP5752642B2 (en) | Monitoring device and monitoring method | |
CN113518077A (en) | Malicious web crawler detection method, device, equipment and storage medium | |
CN108351941B (en) | Analysis device, analysis method, and computer-readable storage medium | |
CN102664872A (en) | System used for detecting and preventing attack to server in computer network and method thereof | |
RU2738337C1 (en) | Intelligent bots detection and protection system and method | |
CN103312692B (en) | Chained address safety detecting method and device | |
JP5791548B2 (en) | Address extraction device | |
Samarasinghe et al. | On cloaking behaviors of malicious websites | |
CN103440454B (en) | A kind of active honeypot detection method based on search engine keywords | |
KR101767594B1 (en) | Web address extraction system for checking malicious code and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
A107 | Divisional application of patent | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |