CN116366231A - Anti-crawler method and system for protecting website resources based on encryption and obfuscation - Google Patents

Anti-crawler method and system for protecting website resources based on encryption and obfuscation

Info

Publication number
CN116366231A
Authority
CN
China
Prior art keywords
request
encryption
confusion
ciphertext
aes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310187885.XA
Other languages
Chinese (zh)
Other versions
CN116366231B (en)
Inventor
田振
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Maxtech Co ltd
Original Assignee
Beijing Maxtech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Maxtech Co ltd filed Critical Beijing Maxtech Co ltd
Priority to CN202310187885.XA priority Critical patent/CN116366231B/en
Publication of CN116366231A publication Critical patent/CN116366231A/en
Application granted granted Critical
Publication of CN116366231B publication Critical patent/CN116366231B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10: Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12: Protecting executable software
    • G06F21/14: Protecting executable software against software analysis or reverse engineering, e.g. by obfuscation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/602: Providing cryptographic facilities or services
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/14: Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441: Countermeasures against malicious traffic
    • H04L63/145: Countermeasures against malicious traffic the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618: Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631: Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643: Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/065: Encryption by serially and continuously modifying data stream elements, e.g. stream cipher systems, RC4, SEAL or A5/3
    • H04L9/0656: Pseudorandom key sequence combined element-for-element with data sequence, e.g. one-time-pad [OTP] or Vernam's cipher
    • H04L9/0662: Pseudorandom key sequence combined element-for-element with data sequence, e.g. one-time-pad [OTP] or Vernam's cipher with particular pseudorandom sequence generator
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3297: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving time stamps, e.g. generation of time stamps
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Power Engineering (AREA)
  • Technology Law (AREA)
  • Multimedia (AREA)
  • Virology (AREA)
  • Computing Systems (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses an anti-crawler method and system for protecting website resources based on encryption and obfuscation. When a client sends an HTTP request to a server, an MD5 key is added to the request link address and an AES ciphertext is added to the request header; the JavaScript code that issues the HTTP request is obfuscated; after the server receives the request, it performs a consistency check on the extracted MD5 key and AES ciphertext, returning normal data to the client if the check passes and otherwise rejecting the request or returning dirty data. By obfuscating the code the client executes and encrypting the request parameters, the method makes it harder for a crawler to analyze, simulate, and reproduce the request and to analyze the API request-response process, which protects the service from being easily cracked and improves API security.

Description

Anti-crawler method and system for protecting website resources based on encryption and obfuscation
Technical Field
The invention relates to the technical field of anti-crawler techniques, and in particular to an anti-crawler method and system for protecting website resources based on encryption and obfuscation.
Background
In today's digital economy, crawler technology has penetrated many aspects of technological development and business innovation. However, behind its replacement of tedious, repetitive manual operations lies a large number of automated crawler threats. In particular, machine learning and artificial intelligence have made crawlers more efficient and intelligent, so the automated threats they pose are harder to detect. As crawlers become ubiquitous, enterprises gain considerable value from benign crawlers while facing an increasingly serious threat from malicious ones.
Common anti-crawler techniques currently include: 1. detecting the access frequency of an IP address; 2. detecting whether user behavior is periodic; 3. checking whether the access request contains the required parameters; 4. encrypting the returned content; 5. encrypting the JavaScript request parameters; 6. loading content dynamically through Ajax; 7. using a numeric or slider CAPTCHA. These methods can either be bypassed by common crawler techniques or degrade the user experience, as a CAPTCHA does by adding an extra verification step.
Therefore, an effective anti-crawler technique is needed to make up for the deficiencies of the prior art.
Disclosure of Invention
Accordingly, the invention provides an anti-crawler method and system for protecting website resources based on encryption and obfuscation, to solve the problems of the prior art: poor anti-crawler effectiveness, degraded user experience, and inability to ensure data security.
In order to achieve the above object, the present invention provides the following technical solutions:
According to a first aspect of an embodiment of the present invention, an anti-crawler method for protecting website resources based on encryption and obfuscation is provided, where the method includes:
when a client sends an HTTP request to a server, adding an MD5 key to the request link address, wherein the MD5 key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and adding an AES ciphertext to the request header, wherein the AES ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value with the AES encryption algorithm and then appending a random data string;
obfuscating the JavaScript code of the HTTP request;
after the server receives the request, performing a consistency check on the extracted MD5 key and AES ciphertext, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
Further, obfuscating the JavaScript code of the HTTP request specifically includes:
performing constant obfuscation and operation obfuscation on the JavaScript code;
performing control flow flattening on the JavaScript code;
performing syntax uglification on the JavaScript code;
and compressing the JavaScript code.
Further, performing the consistency check on the extracted MD5 key and AES ciphertext after the server receives the request specifically includes:
the website server extracts the MD5 key from the request link and the AES ciphertext from the request header, decrypts the extracted AES ciphertext with the symmetric key to recover the timestamp and random salt value used when the request was made, and judges the authenticity of the request by checking that the MD5 digest of the timestamp and random salt value is consistent with the MD5 key in the request link.
Further, the AES ciphertext is obtained as follows:
a. the plaintext is passed to the AES encryption algorithm as a parameter;
b. the key expansion algorithm is executed to obtain the round keys;
c. byte substitution (SubBytes) is executed;
d. row shifting (ShiftRows) is executed;
e. column mixing (MixColumns) is executed;
f. round key addition (AddRoundKey) is executed;
g. it is judged whether the required number of round iterations has been reached; if so, ciphertext C is obtained and the method proceeds to step h; otherwise, it returns to step c;
h. a pseudo-random number generation function is executed to obtain a random data string G;
i. the final ciphertext C+G is output.
According to a second aspect of an embodiment of the present invention, there is provided an anti-crawler system for protecting website resources based on encryption and obfuscation, the system comprising:
a parameter encryption module, used for adding an MD5 key to the request link address when the client sends an HTTP request to the server, wherein the MD5 key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and for adding an AES ciphertext to the request header, wherein the AES ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value with the AES encryption algorithm and then appending a random data string;
a code obfuscation module, used for obfuscating the JavaScript code of the HTTP request;
and a verification module, used for performing a consistency check on the extracted MD5 key and AES ciphertext after the server receives the request, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
Further, the code obfuscation module is specifically configured to:
perform constant obfuscation and operation obfuscation on the JavaScript code;
perform control flow flattening on the JavaScript code;
perform syntax uglification on the JavaScript code;
and compress the JavaScript code.
Further, the verification module is specifically configured so that:
the website server extracts the MD5 key from the request link and the AES ciphertext from the request header, decrypts the extracted AES ciphertext with the symmetric key to recover the timestamp and random salt value used when the request was made, and judges the authenticity of the request by checking that the MD5 digest of the timestamp and random salt value is consistent with the MD5 key in the request link.
According to a third aspect of an embodiment of the present invention, a computer storage medium is provided, containing one or more program instructions that are used by an anti-crawler system for protecting website resources based on encryption and obfuscation to perform the method described in any one of the above.
The invention has the following advantages:
According to the anti-crawler method and system for protecting website resources based on encryption and obfuscation, when a client sends an HTTP request to a server, an MD5 key is added to the request link address and an AES ciphertext is added to the request header; the JavaScript code of the HTTP request is obfuscated; after the server receives the request, it performs a consistency check on the extracted MD5 key and AES ciphertext, returning normal data to the client if the check passes and otherwise rejecting the request or returning dirty data. By obfuscating the code the client executes and encrypting the request parameters, the method makes it harder for a crawler to analyze, simulate, and reproduce the request and to analyze the API request-response process, which protects the service from being easily cracked and improves API security.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only and that other implementations can be obtained from the extensions of the drawings provided without inventive effort.
FIG. 1 is a flowchart of the anti-crawler method for protecting website resources based on encryption and obfuscation provided in Embodiment 1 of the present invention;
FIG. 2 is an implementation flowchart of the anti-crawler method for protecting website resources based on encryption and obfuscation provided in Embodiment 1 of the present invention.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following detailed description of specific embodiments. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.
Embodiment 1
As shown in FIG. 1 and FIG. 2, this embodiment proposes an anti-crawler method for protecting website resources based on encryption and obfuscation, where the method includes:
S100: when the client sends an HTTP request to the server, adding an MD5 key to the request link address, wherein the MD5 key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and adding an AES ciphertext to the request header, wherein the AES ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value with the AES encryption algorithm and then appending a random data string.
1. When the client sends an HTTP request to the server, a randomly generated MD5 key, derived from a timestamp and a salt value, must be added to the request link. The timestamp is a 13-digit number, and the salt value consists of digits, letters, and symbols and is at most 32 characters long. The combined content is passed through the MD5 algorithm to generate a message digest.
Example: target address:
http://www.localhost/channel/36F17C3939AC3E7B2FC9396FA8E953EA
where 36F17C3939AC3E7B2FC9396FA8E953EA is the MD5 digest of the request timestamp + random salt;
the back end receives it through the route: http://www.localhost/channel/{SecretKey}
where SecretKey is the message digest used for authentication.
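The key generation in step 1 can be sketched as follows. This is a minimal illustration in Node.js, not the patent's implementation: the helper names `makeMd5Key` and `randomSalt`, the salt alphabet, the 16-character default salt length, and the simple timestamp-then-salt concatenation are all assumptions made for the example.

```javascript
const crypto = require("crypto");

// Hypothetical helper: MD5 digest of timestamp + salt, as uppercase hex
// (matching the 32-character form shown in the example URL).
function makeMd5Key(timestamp, salt) {
  return crypto
    .createHash("md5")
    .update(`${timestamp}${salt}`, "utf8")
    .digest("hex")
    .toUpperCase();
}

// Hypothetical helper: random salt of digits, letters, and symbols
// (the text allows at most 32 characters; 16 is an arbitrary default).
function randomSalt(length = 16) {
  const alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!@#$%^&*";
  let salt = "";
  for (let i = 0; i < length; i++) {
    salt += alphabet[crypto.randomInt(alphabet.length)];
  }
  return salt;
}

const timestamp = Date.now(); // 13-digit millisecond timestamp
const salt = randomSalt();
const secretKey = makeMd5Key(timestamp, salt);
const url = `http://www.localhost/channel/${secretKey}`;
console.log(url);
```

Because the salt is random, the digest alone tells the server nothing; the timestamp and salt must also reach the server, which is the role of the encrypted header described in step 2.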
2. The parameters and user operation information that the client sends to the server are input into the AES encryption algorithm, and the resulting ciphertext is placed in the header of the request sent to the server.
The specific steps of the AES encryption method are as follows:
a. the plaintext is passed to the AES encryption algorithm as a parameter;
b. the key expansion algorithm is executed to obtain the round keys;
c. byte substitution (SubBytes) is executed;
d. row shifting (ShiftRows) is executed;
e. column mixing (MixColumns) is executed;
f. round key addition (AddRoundKey) is executed;
g. it is judged whether the required number of round iterations has been reached; if so, ciphertext C is obtained and the method proceeds to step h; otherwise, it returns to step c;
h. a pseudo-random number generation function is executed to obtain a random data string G;
i. the final ciphertext C+G is output.
In this AES-based encryption method, the AES algorithm encrypts the plaintext data to obtain ciphertext C, a pseudo-random number generator produces the random data string G, and the final ciphertext consists of C followed by G. The resulting ciphertext therefore varies randomly from request to request.
Example: AES encryption is applied to (request timestamp, salt value, request address from step 1), the data string G is randomly generated, and the resulting ciphertext is placed in the request headers.
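The C+G construction can be sketched with Node.js's built-in `crypto` module, which carries out the AES rounds (steps b to g) internally. The patent does not state a cipher mode, key size, or the length of G, so AES-128-CBC, the fixed demo key and IV, the JSON plaintext layout, and the 16-character hex string G below are assumptions for illustration only.

```javascript
const crypto = require("crypto");

// Illustrative assumptions: AES-128-CBC with a key and IV known to both
// client and server, and a fixed-length G so the server can strip it.
const KEY = Buffer.from("00112233445566778899aabbccddeeff", "hex");
const IV = Buffer.from("0102030405060708090a0b0c0d0e0f10", "hex");
const G_LENGTH = 16; // number of hex characters appended after C

// Hypothetical helper: returns the header value C + G.
function encryptRequestParams(params) {
  const cipher = crypto.createCipheriv("aes-128-cbc", KEY, IV);
  // Ciphertext C: AES encryption of the serialized request parameters.
  const c =
    cipher.update(JSON.stringify(params), "utf8", "hex") + cipher.final("hex");
  // Random data string G from a cryptographic random source.
  const g = crypto.randomBytes(G_LENGTH / 2).toString("hex");
  return c + g;
}

const headerValue = encryptRequestParams({
  timestamp: Date.now(),
  salt: "s@lt-Example",
  path: "/channel/36F17C3939AC3E7B2FC9396FA8E953EA",
});
console.log(headerValue);
```

With a fixed key and IV, C is identical for identical parameters; only the appended G differs between calls, which is what gives the header value its random appearance in this sketch.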
S200: obfuscating the JavaScript code of the HTTP request.
The JavaScript code is obfuscated so that the static methods of the encryption process are obscured.
The specific steps are as follows:
(1) Constant obfuscation. In JavaScript syntax, a method call such as a.b can also be written as a["b"], and the calls can further be wrapped as array-index calls, a[1^x, 2^x, 3^x, ...]: the method a.b is implemented as a[arr[n]] (n is a numeric index). Because n is a number, it can be encoded by XOR with a value x, so the call becomes a[arr[n1^x]] (n1 is the original method index). Lookup by name is thereby replaced with lookup by index, which obscures the code.
For example, the original calls a.b, a.c, and a.d are rewritten as a["b", "c", "d"], converted to index-based calls a[1, 2, 3], and then to XOR-encoded index calls a[1^x, 2^x, 3^x], so that the call a.b is correctly written as a[1^x]. The code is thus obfuscated and harder to crack.
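One way to read the constant obfuscation above is sketched below. The object `a`, the name table `arr`, and the mask value `x` are hypothetical, and JavaScript's zero-based indexing is used instead of the 1-based indexes in the text; a real obfuscator would also inline the encoded numbers rather than keep named constants.

```javascript
// Original form: methods are called by name.
const a = {
  b: () => "called b",
  c: () => "called c",
  d: () => "called d",
};

// Obfuscated form: method names live in a table, and call sites carry
// XOR-encoded indexes instead of names.
const x = 0x5a;              // XOR mask, an arbitrary illustrative value
const arr = ["b", "c", "d"]; // name table: index 0 is "b", 1 is "c", 2 is "d"

// Each call site stores (index ^ x); XOR-ing with x again recovers the index.
const encB = 0 ^ x;
const encD = 2 ^ x;

console.log(a[arr[encB ^ x]]()); // same effect as a.b()
console.log(a[arr[encD ^ x]]()); // same effect as a.d()
```

A reader of the obfuscated call site sees only numbers, and has to locate both the mask and the name table before the target method can be identified.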
(2) Control flow flattening. JavaScript statements execute sequentially. Each statement is extracted and placed in a case block of a switch-case, and the case labels then control the execution order, so that the execution order no longer matches the reading order of the original method. Specifically, an execution-order string is generated with "|" as the separator, and the execution order is obtained by splitting the string. The following is a pseudocode example:
[Figure SMS_1: pseudocode example of control flow flattening; image not reproduced in this text]
This operation flattens the code into a single level, which makes it harder to read; moreover, the real execution order can only be obtained through dynamic debugging, which makes the code harder to crack.
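Since the figure is not available here, the following is an independent sketch of the technique as the text describes it, not a reconstruction of the patent's pseudocode. The function name `orderTotalFlattened`, its three statements, and the order string "2|0|3|1" are invented for illustration.

```javascript
// Original, readable order of the three statements:
//   total = price * qty;  total += shipping;  return total.toFixed(2);

function orderTotalFlattened(price, qty, shipping) {
  // Execution order encoded as a "|"-separated string; the case labels
  // below are deliberately out of reading order.
  const order = "2|0|3|1".split("|");
  let step = 0;
  let total;
  let result;
  while (step < order.length) {
    switch (order[step++]) {
      case "0":
        total += shipping;
        break;
      case "1":
        return result;
      case "2":
        total = price * qty;
        break;
      case "3":
        result = total.toFixed(2);
        break;
    }
  }
}

console.log(orderTotalFlattened(10, 3, 5));
```

Reading the switch top to bottom suggests the shipping cost is added before the product is computed; only the order string reveals the real sequence, and in practice that string would itself be hidden by the constant obfuscation of step (1).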
(3) Syntax uglification. Without changing what the code does, common syntax is rewritten into uncommon syntax. For example, the original code fun(a) = return (c = a + b) can be rewritten as function fun(a) = return (sum(a + b)); after such rewriting, the JavaScript code is harder to read.
(4) Code compression. Redundant spaces and line breaks are removed and comments are deleted. A method-name dictionary is used to replace longer method names with shorter names unrelated to what the methods do, which both reduces code size and makes the code harder to read.
In this embodiment, the code obfuscation of step S200 and the parameter obfuscation of step S100 protect the request along two parallel dimensions, obscuring both the methods and the parameters and thereby making reverse engineering more difficult.
S300: after the server receives the request, performing a consistency check on the extracted MD5 key and AES ciphertext, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
The website server extracts the encrypted parameter from the request link and the encrypted parameters from the headers, decrypts the header parameters with the symmetric key to recover the timestamp and random salt used when the request was made, and judges the authenticity of the request by checking that the MD5 digest of the timestamp and salt is consistent with the MD5 value in the request link.
In the anti-crawler method for protecting website resources based on encryption and obfuscation provided by this embodiment, the code that requests data is written in JavaScript; an MD5 digest of the timestamp and salt value is added to the request link; the request parameters are AES-encrypted; and the JavaScript code is obfuscated. The invention uses the MD5 algorithm for message digest authentication, combines it with the symmetric AES algorithm to encrypt the request data, and finally obfuscates the JavaScript, thereby ensuring data security and timeliness.
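The server-side check can be sketched as below, under the same illustrative assumptions as a client using AES-128-CBC with a shared key and IV, a 16-character string G, and a JSON plaintext with `timestamp` and `salt` fields. The function name `verifyRequest` and the 60-second freshness window are additions for the example; the patent text only describes comparing the recomputed MD5 digest with the key in the link.

```javascript
const crypto = require("crypto");

// Illustrative assumptions, not values from the patent.
const KEY = Buffer.from("00112233445566778899aabbccddeeff", "hex");
const IV = Buffer.from("0102030405060708090a0b0c0d0e0f10", "hex");
const G_LENGTH = 16;      // hex characters of random string G after C
const MAX_AGE_MS = 60000; // assumed freshness window for the timestamp

// Hypothetical helper: true if the MD5 key from the link matches the
// timestamp and salt recovered from the AES ciphertext in the header.
function verifyRequest(secretKey, headerValue, now = Date.now()) {
  try {
    const c = headerValue.slice(0, -G_LENGTH); // strip G to recover C
    const decipher = crypto.createDecipheriv("aes-128-cbc", KEY, IV);
    const plain = decipher.update(c, "hex", "utf8") + decipher.final("utf8");
    const { timestamp, salt } = JSON.parse(plain);
    const expected = crypto
      .createHash("md5")
      .update(`${timestamp}${salt}`, "utf8")
      .digest("hex")
      .toUpperCase();
    const fresh = Math.abs(now - Number(timestamp)) <= MAX_AGE_MS;
    return fresh && expected === String(secretKey).toUpperCase();
  } catch (err) {
    return false; // malformed or tampered input: reject or serve dirty data
  }
}
```

A route handler would call `verifyRequest` with the `{SecretKey}` path segment and the header value, returning normal data on `true` and rejecting the request or returning dirty data on `false`.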
Embodiment 2
Corresponding to Embodiment 1 above, this embodiment proposes an anti-crawler system for protecting website resources based on encryption and obfuscation, where the system includes:
a parameter encryption module, used for adding an MD5 key to the request link address when the client sends an HTTP request to the server, wherein the MD5 key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and for adding an AES ciphertext to the request header, wherein the AES ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value with the AES encryption algorithm and then appending a random data string;
a code obfuscation module, used for obfuscating the JavaScript code of the HTTP request;
and a verification module, used for performing a consistency check on the extracted MD5 key and AES ciphertext after the server receives the request, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
Further, the code obfuscation module is specifically configured to:
perform constant obfuscation and operation obfuscation on the JavaScript code;
perform control flow flattening on the JavaScript code;
perform syntax uglification on the JavaScript code;
and compress the JavaScript code.
Further, the verification module is specifically configured so that:
the website server extracts the MD5 key from the request link and the AES ciphertext from the request header, decrypts the extracted AES ciphertext with the symmetric key to recover the timestamp and random salt value used when the request was made, and judges the authenticity of the request by checking that the MD5 digest of the timestamp and random salt value is consistent with the MD5 key in the request link.
The functions performed by each component of the anti-crawler system for protecting website resources based on encryption and obfuscation provided by this embodiment have been described in detail in Embodiment 1 above and are not repeated here.
Embodiment 3
Corresponding to the above embodiments, this embodiment proposes a computer storage medium containing one or more program instructions that are used by an anti-crawler system for protecting website resources based on encryption and obfuscation to execute the method of Embodiment 1.
While the invention has been described in detail in the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.

Claims (8)

1. An anti-crawler method for protecting website resources based on encryption and obfuscation, characterized by comprising the following steps:
when a client sends an HTTP request to a server, adding an MD5 key to the request link address, wherein the MD5 key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and adding an AES ciphertext to the request header, wherein the AES ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value with the AES encryption algorithm and then appending a random data string;
obfuscating the JavaScript code of the HTTP request;
after the server receives the request, performing a consistency check on the extracted MD5 key and AES ciphertext, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
2. The anti-crawler method for protecting website resources based on encryption and obfuscation according to claim 1, wherein obfuscating the JavaScript code of the HTTP request specifically comprises:
performing constant obfuscation and operation obfuscation on the JavaScript code;
performing control flow flattening on the JavaScript code;
performing syntax uglification on the JavaScript code;
and compressing the JavaScript code.
3. The anti-crawler method for protecting website resources based on encryption and obfuscation according to claim 1, wherein performing the consistency check on the extracted MD5 key and AES ciphertext after the server receives the request specifically comprises:
the website server extracts the MD5 key from the request link and the AES ciphertext from the request header, decrypts the extracted AES ciphertext with the symmetric key to recover the timestamp and random salt value used when the request was made, and judges the authenticity of the request by checking that the MD5 digest of the timestamp and random salt value is consistent with the MD5 key in the request link.
4. The method for protecting web site resources against crawlers based on encryption confusion as claimed in claim 1, wherein the AES encryption ciphertext is obtained by:
a. passing the plaintext into the AES encryption algorithm as a parameter;
b. executing the key expansion algorithm to obtain the round keys;
c. executing the byte substitution (SubBytes) step;
d. executing the row shifting (ShiftRows) step;
e. executing the column mixing (MixColumns) step;
f. executing the round key addition (AddRoundKey) step;
g. judging whether the required number of loop iterations has been reached; if so, obtaining the ciphertext C and going to step h; otherwise, returning to step c;
h. executing a pseudo-random number generating function to obtain a random data character string G;
i. outputting the final ciphertext C+G.
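Steps a to i can be sketched in pure Python as follows. This is a teaching sketch under stated assumptions, not the patented implementation: it handles AES-128 on a single 16-byte block with no padding or mode of operation, it follows the standard FIPS 197 structure (which adds an initial round key addition before the loop and omits the column mixing in the final round, two details the abbreviated steps above do not spell out), and the function names and the hex encoding of C and G are hypothetical. Production code would normally rely on a vetted cryptographic library instead.

```python
import secrets


def xtime(a: int) -> int:
    """Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11B."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r


def _sbox_entry(x: int) -> int:
    """Multiplicative inverse followed by the AES affine transformation."""
    inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
    s = inv
    for n in range(1, 5):
        s ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
    return s ^ 0x63


SBOX = [_sbox_entry(x) for x in range(256)]


def expand_key(key: bytes) -> list:
    """Step b: derive the 11 round keys of AES-128 from a 16-byte key."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        t = list(w[i - 1])
        if i % 4 == 0:
            t = [SBOX[b] for b in t[1:] + t[:1]]  # RotWord then SubWord
            t[0] ^= rcon
            rcon = xtime(rcon)
        w.append([a ^ b for a, b in zip(w[i - 4], t)])
    return [sum(w[4 * r:4 * r + 4], []) for r in range(11)]


def mix_columns(s: list) -> list:
    """Step e: mix each 4-byte column of the state."""
    out = []
    for c in range(4):
        a = s[4 * c:4 * c + 4]
        out += [gmul(2, a[r]) ^ gmul(3, a[(r + 1) % 4]) ^ a[(r + 2) % 4] ^ a[(r + 3) % 4]
                for r in range(4)]
    return out


def encrypt_block(block: bytes, key: bytes) -> bytes:
    """Steps a to g: encrypt one 16-byte block with AES-128."""
    rk = expand_key(key)
    s = [b ^ k for b, k in zip(block, rk[0])]  # initial round key addition
    for rnd in range(1, 11):
        s = [SBOX[b] for b in s]  # step c: byte substitution
        s = [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]  # step d
        if rnd < 10:
            s = mix_columns(s)  # step e, skipped in the final round
        s = [b ^ k for b, k in zip(s, rk[rnd])]  # step f
    return bytes(s)


def encrypt_with_suffix(block: bytes, key: bytes, suffix_bytes: int = 8) -> str:
    """Steps h and i: append a random string G to the hex ciphertext C."""
    c = encrypt_block(block, key).hex()
    g = secrets.token_hex(suffix_bytes)
    return c + g


print(encrypt_with_suffix(b"0123456789abcdef", bytes(range(16))))
```

The block cipher part can be checked against the AES-128 example vector in Appendix C.1 of FIPS 197. Because G is generated independently of C, the receiver only needs to know its length to strip it before decryption.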
5. An anti-crawler system for protecting web site resources based on encryption confusion, the system comprising:
the parameter encryption module is used for adding an MD5 encryption key to the request link address when the client sends an HTTP request to the server, wherein the MD5 encryption key is obtained by hashing a timestamp together with a random salt value using the MD5 message digest algorithm, and for adding an AES encryption ciphertext to the request header, wherein the AES encryption ciphertext is obtained by encrypting request parameters containing the timestamp and the random salt value using the AES encryption algorithm and then appending a random data character string;
the code confusion module is used for obfuscating the JavaScript code of the HTTP request;
and the verification module is used for performing a consistency check based on the extracted MD5 encryption key and AES encryption ciphertext after the server acquires the request, returning normal data to the client if the check passes, and otherwise rejecting the request or returning dirty data.
6. The anti-crawler system for protecting web site resources based on encryption confusion as recited in claim 5, wherein the code confusion module is specifically configured to:
perform constant obfuscation and operation obfuscation on the JavaScript code;
perform control flow flattening on the JavaScript code;
perform syntax uglification on the JavaScript code;
and compress the JavaScript code.
7. The anti-crawler system for protecting website resources based on encryption confusion as claimed in claim 5, wherein the verification module is specifically configured such that:
the website server extracts the MD5 encryption key from the request link, extracts the AES encryption ciphertext from the request header, symmetrically decrypts the extracted AES encryption ciphertext to recover the timestamp and the random salt value used when the request was made, and judges the authenticity of the request by verifying that the MD5 value computed from the recovered timestamp and random salt value is consistent with the MD5 encryption key in the request link.
8. A computer storage medium storing one or more program instructions which, when executed by an anti-crawler system for protecting website resources based on encryption confusion, perform the method of any one of claims 1-4.
CN202310187885.XA 2023-02-22 2023-02-22 Anti-crawler method and system for protecting website resources based on encryption confusion Active CN116366231B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310187885.XA CN116366231B (en) 2023-02-22 2023-02-22 Anti-crawler method and system for protecting website resources based on encryption confusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310187885.XA CN116366231B (en) 2023-02-22 2023-02-22 Anti-crawler method and system for protecting website resources based on encryption confusion

Publications (2)

Publication Number Publication Date
CN116366231A (en) 2023-06-30
CN116366231B (en) 2023-11-24

Family

ID=86917943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310187885.XA Active CN116366231B (en) 2023-02-22 2023-02-22 Anti-crawler method and system for protecting website resources based on encryption confusion

Country Status (1)

Country Link
CN (1) CN116366231B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105704149A (en) * 2016-03-24 2016-06-22 国网江苏省电力公司电力科学研究院 Safety protection method for power mobile application
CN110891065A (en) * 2019-12-03 2020-03-17 西安博达软件股份有限公司 Token-based user identity auxiliary encryption method
CN112653695A (en) * 2020-12-21 2021-04-13 浪潮卓数大数据产业发展有限公司 Method and system for realizing crawler resistance
US11218317B1 (en) * 2021-05-28 2022-01-04 Garantir LLC Secure enclave implementation of proxied cryptographic keys
CN113872992A (en) * 2021-11-03 2021-12-31 管芯微技术(上海)有限公司 Method for realizing strong security authentication of remote Web access in BMC system
CN115567297A (en) * 2022-09-26 2023-01-03 中国银行股份有限公司 Cross-site request data processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
不走小道: "javascript代码混淆的原理", 《HTTPS://BLOG.CSDN.NET/QQ_21531681/ARTICLE/DETAILS/108437907》, pages 1 - 2 *

Also Published As

Publication number Publication date
CN116366231B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
JP6924739B2 (en) Mitigation of offline ciphertext-only attacks
JP6257754B2 (en) Data protection
CN105049400B (en) S box is split in whitepack implementation to prevent from attacking
Ali et al. A novel improvement with an effective expansion to enhance the MD5 hash function for verification of a secure E-document
CN110008745B (en) Encryption method, computer equipment and computer storage medium
JP2004534333A (en) Integrated protection method and system for distributed data processing in computer networks
Ertaul et al. Novel obfuscation algorithms for software security
Sleem et al. TestU01 and Practrand: Tools for a randomness evaluation for famous multimedia ciphers
Bhandari et al. Enhancement of MD5 Algorithm for Secured Web Development.
Rajba et al. Information hiding using minification
Mohammed et al. Advancing cloud image security via AES algorithm enhancement techniques
Rajba et al. Data hiding using code obfuscation
CN116366231B (en) Anti-crawler method and system for protecting website resources based on encryption confusion
Ahvanooey et al. CovertSYS: A systematic covert communication approach for providing secure end-to-end conversation via social networks
CN114244518A (en) Digital signature confusion encryption method and device, computer equipment and storage medium
Ciobanu et al. SCONeP: Steganography and Cryptography approach for UDP and ICMP
Islam et al. Trojan bio-hacking of DNA-sequencing pipeline
CN112307519B (en) Hierarchical verifiable query system based on selective leakage
CN114978714B (en) RISC-V based lightweight data bus encryption safe transmission method
CN117077180B (en) Lesu encrypted data recovery feasibility assessment and processing device, method, electronic equipment and storage medium
CN113360859B (en) Python interpreter-based encrypted file security control method and device
CN116527236B (en) Information change verification method and system for encryption card
Schinzel Unintentional and Hidden Information Leaks in Networked Software Applications
JP6752347B1 (en) Information processing equipment, computer programs and information processing methods
Rojasree Performance Analysis of Intelligent Key Cryptography (IKC) System

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant