CN109543454A - Anti-crawler method and related device - Google Patents

Info

Publication number
CN109543454A
Authority
CN
China
Prior art keywords
web page
server
client
page contents
font file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910077327.1A
Other languages
Chinese (zh)
Other versions
CN109543454B (en)
Inventor
康铭海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910077327.1A
Publication of CN109543454A
Application granted
Publication of CN109543454B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

An embodiment of the invention discloses an anti-crawler method and a related device. The method includes: when a server detects a first information request sent by a client, the server first obtains first web page content and processes the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files; the server then sends the second web page content and a flag code corresponding to the multiple font files to the client; the server then receives a second information request sent by the client, the second information request carrying the flag code; finally, the server sends at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content. Embodiments of the invention can improve the effectiveness of anti-crawler measures and reduce their cost.

Description

Anti-crawler method and related device
Technical field
The present invention relates to the field of communication technology, and in particular to an anti-crawler method and a related device.
Background
Countless web crawlers are currently active on the network. A web crawler is a network robot that automatically browses the World Wide Web and can save the pages it visits. Criminals use crawlers to obtain large amounts of website content for profit, which poses a serious threat to the security of Internet users' private data. In existing anti-crawler techniques, the server encrypts or encodes the web page content before sending it to the client (such as a browser), and the client then has to decrypt or decode the received content. However, because the decoding algorithm is written in plaintext in the page script file, a crawler program can easily obtain it, so the anti-crawler measure cannot be effective. In addition, decoding at the front end imposes a performance cost on the client, and when a large amount of data has to be decoded the web page tends to stutter.
Summary of the invention
The present invention provides an anti-crawler method and a related device that can improve the effectiveness of anti-crawler measures and reduce their cost.
In one aspect, an embodiment of the present invention provides an anti-crawler method, including:
when detecting a first information request sent by a client, a server obtains first web page content;
the server processes the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
the server sends the second web page content and a flag code corresponding to the multiple font files to the client;
the server receives a second information request sent by the client, the second information request carrying the flag code;
the server sends at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content.
Wherein the second information request carries font format information;
before the server sends at least one of the multiple font files to the client, the method further includes:
the server looks up, in a database, the multiple font files corresponding to the flag code, the database containing the correspondence between the flag code and the multiple font files;
the server selects the at least one font file from the multiple font files according to the font format information.
Wherein the character mapping rule includes mapping relations between multiple first characters and multiple second characters;
before the server obtains the first web page content upon detecting the first information request sent by the client, the method further includes:
the server generates a scalable vector graphic corresponding to each of the multiple first characters;
the server generates the multiple font files according to the scalable vector graphics and the character mapping rule.
Wherein, before the server sends at least one of the multiple font files to the client, the method further includes:
the server determines whether the current time is within a preset validity period of the flag code;
when the current time is within the preset validity period, the server performs the operation of sending at least one of the multiple font files to the client.
Wherein, before the server sends at least one of the multiple font files to the client, the method further includes:
the server determines the cumulative number of times the flag code has been received;
when the cumulative number is zero, the server performs the operation of sending at least one of the multiple font files to the client.
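The single-use check described above can be sketched as follows. This is a minimal illustration only, assuming an in-memory counter and reading "cumulative number is zero" as "the flag code has not been received before the current request"; the names `TokenUsage` and `should_send_fonts` are hypothetical and do not come from the patent.

```python
from collections import Counter


class TokenUsage:
    """Tracks how many times each flag code (token) has been received."""

    def __init__(self):
        self._received = Counter()

    def should_send_fonts(self, token: str) -> bool:
        # Number of times this token was received before the current request.
        previous = self._received[token]
        self._received[token] += 1
        # Font files are sent only on the first receipt of the token.
        return previous == 0


usage = TokenUsage()
first_attempt = usage.should_send_fonts("4d3a7cf9")
second_attempt = usage.should_send_fonts("4d3a7cf9")
```

Under this reading, a crawler that replays a flag code already used by the page gets no font file. A real deployment would presumably keep the counter in the same database as the flag codes rather than in process memory.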
Wherein the server processing the first web page content according to the preset character mapping rule to obtain the second web page content includes:
the server determines the sensitive content in the first web page content;
the server transcodes the sensitive content according to the character mapping rule;
the server uses the first web page content in which the sensitive content has been transcoded as the second web page content.
In another aspect, an embodiment of the present invention provides another anti-crawler method, including:
a client sends a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
the client receives the second web page content and a flag code corresponding to the multiple font files, both sent by the server;
the client sends a second information request to the server, the second information request carrying the flag code;
the client receives at least one of the multiple font files sent by the server;
the client displays the first web page content according to the at least one font file and the second web page content.
Wherein the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the multiple font files.
Wherein the client displaying the first web page content according to the at least one font file and the second web page content includes:
the client selects, from the at least one font file, a target font file that matches a font format supported by the client;
the client displays the first web page content according to the target font file and the second web page content.
In another aspect, an embodiment of the present invention provides a server, including:
an obtaining module, configured to obtain first web page content when a first information request sent by a client is detected;
a transcoding module, configured to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
a sending module, configured to send the second web page content and a flag code corresponding to the multiple font files to the client;
a receiving module, configured to receive a second information request sent by the client, the second information request carrying the flag code;
the sending module being further configured to send at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content.
Wherein the second information request carries font format information;
the sending module is further configured to:
look up, in a database, the multiple font files corresponding to the flag code, the database containing the correspondence between the flag code and the multiple font files;
select the at least one font file from the multiple font files according to the font format information.
Wherein the character mapping rule includes mapping relations between multiple first characters and multiple second characters;
the server further includes a generation module, configured to:
generate a scalable vector graphic corresponding to each of the multiple first characters;
generate the multiple font files according to the scalable vector graphics and the character mapping rule.
Wherein the sending module is further configured to:
determine whether the current time is within a preset validity period of the flag code;
when the current time is within the preset validity period, perform the operation of sending at least one of the multiple font files to the client.
Wherein the sending module is further configured to:
determine the cumulative number of times the flag code has been received;
when the cumulative number is zero, perform the operation of sending at least one of the multiple font files to the client.
Wherein the transcoding module is further configured to:
determine the sensitive content in the first web page content;
transcode the sensitive content according to the character mapping rule;
use the first web page content in which the sensitive content has been transcoded as the second web page content.
In another aspect, an embodiment of the present invention provides a client, including:
a sending module, configured to send a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
a receiving module, configured to receive the second web page content and a flag code corresponding to the multiple font files, both sent by the server;
the sending module being further configured to send a second information request to the server, the second information request carrying the flag code;
the receiving module being further configured to receive at least one of the multiple font files sent by the server;
a display module, configured to display the first web page content according to the at least one font file and the second web page content.
Wherein the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the multiple font files.
Wherein the display module is further configured to:
select, from the at least one font file, a target font file that matches a font format supported by the client;
display the first web page content according to the target font file and the second web page content.
In another aspect, an embodiment of the present invention provides a server, including a processor, a memory, and a communication bus, where the communication bus implements connection and communication between the processor and the memory, and the processor executes a program stored in the memory to implement the steps of the anti-crawler method provided in the first aspect above.
In another aspect, an embodiment of the present invention provides a client, including a processor, a memory, and a communication bus, where the communication bus implements connection and communication between the processor and the memory, and the processor executes a program stored in the memory to implement the steps of the anti-crawler method provided in the second aspect above.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing multiple instructions, the instructions being suitable for being loaded by a processor to perform the methods described in the above aspects.
Another aspect of the embodiments of the present invention provides a computer program product containing instructions which, when run on a computer, causes the computer to perform the methods described in the above aspects.
In implementing the embodiments of the present invention, when the server detects the first information request sent by the client, it first obtains the first web page content and processes it according to the preset character mapping rule to obtain the second web page content, the character mapping rule corresponding to multiple font files; it then sends the second web page content and the flag code corresponding to the multiple font files to the client; it then receives the second information request sent by the client, the second information request carrying the flag code; finally, it sends at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content. This can improve the effectiveness of anti-crawler measures and reduce their cost, thereby improving the user experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the background more clearly, the drawings needed in the embodiments or the background are described below.
Fig. 1 is a schematic diagram of an information interaction system provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of an anti-crawler method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another anti-crawler method provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of displaying the first web page content according to the second web page content, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a client provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of another server provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another client provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of an information interaction system provided by an embodiment of the present invention. The information interaction system includes a client and a server. The client may be a browser. The server may be a Web server that stores a large number of web pages, data, and information. A Node.js middle layer may also be added between the server and the client. Typically, the Node.js middle layer runs on the server and assists the client and the server in processing services. The client may send an information request to the server; the request may carry address information or identification information of the information to be accessed (such as a web page), as well as the identity information and version information of the client. After receiving the information request, the server may first establish a connection with the client, for example but not limited to according to the TCP/IP protocol; it then looks up the information the client needs according to the address information or identification information carried in the request and sends the information found to the client. The client can then display the received information for the user to view. While the server is sending information to the client, a crawler may obtain that information at the same time, leaking users' private data. To solve this problem, embodiments of the present invention provide the following solution.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of an anti-crawler method provided by an embodiment of the present invention. The method includes but is not limited to the following steps:
S201, when detecting a first information request sent by a client, the server obtains first web page content.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a Uniform Resource Locator (URL). The server can then find the corresponding first web page content in a database according to the network address carried in the first information request.
Before detecting the first information request sent by the client, the server may predetermine the character mapping rule used to process the first web page content, as follows. First, the characters that need to be transcoded are determined; for ease of description, each such character is called a first character. In practice, the private information a user needs to protect usually includes mobile phone numbers, ID card numbers, bank card numbers, and important account names, so each of the 10 digits 0-9, the 26 uppercase English letters, and the 26 lowercase English letters may, but need not, be determined as a first character. Then, each first character is randomly mapped to a second character different from that first character; the second character may be a letter, a digit, a Chinese character, a character string, or a special character (such as #). Finally, the mapping relations between the multiple first characters and the multiple second characters are saved as the character mapping rule.
It should be noted that the multiple first characters can be adjusted according to the actual application scenario and user requirements. For example, if the user also requires names to be protected as private information, the multiple first characters may also include Chinese characters that occur frequently in names, such as the characters translated as "Lee", "bright", and "king".
To illustrate the mapping relations between the first characters and the second characters concisely, assume the multiple first characters are 1, 2, 3, a, A, and B. Then 1 may be mapped to 5, 2 to A, 3 to t, a to 3, A to #, and B to b; the corresponding character mapping rule is shown in Table 1.
As the example shows, each first character can be randomly mapped to a second character different from that first character, so many different character mapping rules can be determined. For instance, in the example above, 1 could instead be mapped to W and 2 to P, with the mappings of the other characters unchanged. To prevent a malicious crawler from cracking the character mapping rule, the server may randomly select one or more character mapping rules and save them as the preset character mapping rules for later use, and may also update the preset character mapping rules at a preset frequency (such as once per minute).
Table 1. Character mapping rule
First character Second character
1 5
2 A
3 t
a 3
A #
B b
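The random construction of a character mapping rule can be sketched as below. This is a minimal illustration, not the patent's implementation: it assumes single-character second characters and additionally requires the second characters to be distinct, so that a font can later show a unique first character for each second character. The function name `generate_mapping`, the candidate pool, and the seed are all hypothetical.

```python
import random
import string


def generate_mapping(first_chars, second_pool, rng, max_attempts=1000):
    """Map each first character to a distinct second character that differs from it."""
    first_chars = list(first_chars)
    pool = list(second_pool)
    if len(pool) < len(first_chars):
        raise ValueError("second_pool must be at least as large as first_chars")
    for _ in range(max_attempts):
        # Draw distinct targets, then reject the draw if any character maps to itself.
        targets = rng.sample(pool, len(first_chars))
        if all(f != s for f, s in zip(first_chars, targets)):
            return dict(zip(first_chars, targets))
    raise RuntimeError("no valid mapping found; enlarge second_pool")


# First characters as suggested in the text: digits plus upper- and lowercase letters.
first = string.digits + string.ascii_uppercase + string.ascii_lowercase
pool = first + "#@$%&"  # hypothetical pool of candidate second characters
rule = generate_mapping(first, pool, random.Random(2019))
```

Calling `generate_mapping` again with a different random state yields a different rule, which is one way the periodic update mentioned above could be realised.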
After the preset character mapping rules are determined, font files corresponding to each character mapping rule may also be generated. Because different clients support different font formats, one font file may be generated for each font format (such as the eot, woff, and ttf formats). Specifically, a scalable vector graphic (SVG) of each first character is generated first, giving an svg file for each first character; the svg file for a character can be produced by drawing the character's path in Adobe Illustrator CS6 or Sketch. Multiple font files are then generated from the scalable vector graphics and the character mapping rule: the character mapping rule is first expressed in JSON and saved as a json file, and the svg file of each first character and the json file are then fed into a font generation platform (such as iconfont.cn), which outputs the corresponding font files.
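The inputs to such a font generation step can be sketched as follows. This is an illustration under stated assumptions rather than the patent's procedure: it assumes single-character second characters, and the svg naming scheme and the function name `glyph_table` are hypothetical. The key point it shows is that the font must draw the shape of the first character at the code point of the second character.

```python
import json

# Example rule described above: first character -> second character.
rule = {"1": "5", "2": "A", "3": "t", "a": "3", "A": "#", "B": "b"}


def glyph_table(rule):
    """For each second character's code point, name the svg outline the font should draw."""
    table = {}
    for first_char, second_char in rule.items():
        code_point = f"U+{ord(second_char):04X}"
        # Hypothetical naming scheme: one svg file per first character, named by code point.
        table[code_point] = f"u{ord(first_char):04x}.svg"
    return table


# The rule itself can be stored as JSON, as the text describes.
rule_json = json.dumps(rule, ensure_ascii=False, indent=2)
table = glyph_table(rule)
```

For the rule above, the code point of "5" is assigned the outline of "1", so text containing "5" is displayed as "1" when rendered with the generated font.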
Optionally, the font files may be generated by a Node.js middle layer running on the server. The Node.js middle layer may, but need not, generate the font files by running the following code, where base-Charset and NewCharset respectively denote the arrays formed by the multiple first characters and the multiple second characters in the preset character mapping rule.
S202, the server processes the first web page content according to the preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files.
In a specific implementation, when there are multiple preset character mapping rules, one of them may be selected at random to process the first web page content and obtain the second web page content, which improves the effectiveness of the anti-crawler measure. In addition, to improve anti-crawler and information transmission efficiency, the server may first determine the sensitive content in the first web page content, such as ID card numbers, mobile phone numbers, and account names; it then transcodes the sensitive content according to the character mapping rule and uses the first web page content in which the sensitive content has been transcoded as the second web page content. The first web page content may be a Hypertext Markup Language (HTML) text.
For example, the first web page content is as follows:
Here the first web page content includes the user's telephone number "19926419137". The server maps "19926419137" character by character according to the character mapping rule and obtains "93375693920", thereby obtaining the second web page content:
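As a separate illustration of step S202 (it is not the patent's own page listing), the transcoding of sensitive content can be sketched as below. The mapping entries for the digits 1, 2, 3, 4, 6, 7, and 9 are implied by the telephone number example; the entries for 0, 5, and 8, the regular expression used to detect sensitive content, the sample HTML, and the function names are all assumptions.

```python
import re

# Entries for 1, 2, 3, 4, 6, 7, 9 follow from the example; 0, 5, 8 are hypothetical.
MAPPING = {"1": "9", "9": "3", "2": "7", "6": "5", "4": "6", "3": "2", "7": "0",
           "0": "8", "5": "4", "8": "1"}

# Assumption: sensitive content is any standalone run of exactly 11 digits.
SENSITIVE = re.compile(r"(?<!\d)\d{11}(?!\d)")


def transcode(text, mapping):
    """Map text character by character; unmapped characters are kept as they are."""
    return "".join(mapping.get(ch, ch) for ch in text)


def build_second_content(first_content, mapping=MAPPING):
    """Transcode only the sensitive spans and leave the rest of the page untouched."""
    return SENSITIVE.sub(lambda m: transcode(m.group(0), mapping), first_content)


first_content = "<p>Tel: 19926419137</p><p>Order 42</p>"  # hypothetical page
second_content = build_second_content(first_content)
```

Only the telephone number is rewritten; the markup and the non-sensitive text are unchanged, which keeps the amount of transcoded data small.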
S203, the server sends the second web page content and the flag code corresponding to the multiple font files to the client.
In a specific implementation, although a crawler can still obtain the second web page content at this point, sensitive content in it such as mobile phone numbers and ID card numbers is false, which achieves the purpose of preventing leakage of users' private data. Meanwhile, to prevent a crawler from obtaining the character mapping rule and reverse-transcoding the second web page content to recover the first web page content, the embodiment of the present invention embeds the character mapping rule in the font files and transmits the flag code of the font files to the client so that the client can request the font files from the server. The underlying principle is that a crawler cannot derive the character mapping rule from the font files. The flag code (denoted token) may be a character string of arbitrary length randomly generated by the server, such as bgu67st.
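Random generation of a flag code can be sketched as follows. This is a minimal sketch that assumes hexadecimal tokens and an in-memory dictionary standing in for the database; the function name `issue_token` and the font file names are hypothetical.

```python
import secrets

# In-memory stand-in for the database that relates tokens to font files.
token_to_fonts = {}


def issue_token(font_files, n_bytes=4):
    """Generate a random hexadecimal token and record the font files it refers to."""
    token = secrets.token_hex(n_bytes)  # yields 2 * n_bytes hexadecimal characters
    while token in token_to_fonts:      # regenerate in the unlikely case of a collision
        token = secrets.token_hex(n_bytes)
    token_to_fonts[token] = list(font_files)
    return token


# Hypothetical font files generated for one character mapping rule.
token = issue_token(["rule1.eot", "rule1.ttf", "rule1.woff"])
```

The token reveals nothing about the mapping rule itself; it only lets the server find the right font files when the client asks for them.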
S204, the server receives a second information request sent by the client, the second information request carrying the flag code.
S205, the server sends at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content.
In a specific implementation, the server may look up, in a database, the multiple font files corresponding to the flag code, the database containing the correspondence between the flag code and the multiple font files. After generating the multiple font files corresponding to each character mapping rule, the server may randomly generate a token, establish the correspondence between the multiple font files of each character mapping rule and one token, and store the correspondence in the database. The token may be a character string of arbitrary length.
For example, the server stores three preset character mapping rules: EoU2.json, kKMA.json, and ND84.json. The font files corresponding to EoU2.json are apple.eot, pear.ttf, and grape.woff, and the token corresponding to these three font files is 4d3a7cf9. The font files corresponding to kKMA.json are dog.eot, cat.ttf, and sheep.woff, and the token corresponding to these three font files is faac3db2. The font files corresponding to ND84.json are cake.eot, rice.ttf, and meat.woff, and the token corresponding to these three font files is 718ffc36. The mapping table shown in Table 2 can therefore be stored in the database.
Table 2. Mapping table 1 of font files and tokens
Font files Token
apple.eot, pear.ttf, grape.woff 4d3a7cf9
dog.eot, cat.ttf, sheep.woff faac3db2
cake.eot, rice.ttf, meat.woff 718ffc36
As another example, the server stores three preset character mapping rules: EoU2.json, kKMA.json, and ND84.json. The font files corresponding to EoU2.json are EoU2.eot, EoU2.ttf, and EoU2.woff, and the token corresponding to these three font files is 4d3a7cf9. The font files corresponding to kKMA.json are kKMA.eot, kKMA.ttf, and kKMA.woff, and the token corresponding to these three font files is faac3db2. The font files corresponding to ND84.json are ND84.eot, ND84.ttf, and ND84.woff, and the token corresponding to these three font files is 718ffc36. The mapping table shown in Table 3 can therefore be stored in the database. The file name corresponding to the font files is first looked up in the mapping table according to the token, and the font files are then located by file name in the storage space used by the database.
Table 3. Mapping table 2 of font files and tokens
File name Token
EoU2 4d3a7cf9
kKMA faac3db2
ND84 718ffc36
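The two-step lookup of Table 3 (token to file name, then file name to font files) can be sketched as follows. The storage directory and the function name `font_paths` are hypothetical, and a dictionary stands in for the database table.

```python
from pathlib import PurePosixPath

# Table 3: token -> base file name shared by all font formats of one mapping rule.
TOKEN_TO_NAME = {"4d3a7cf9": "EoU2", "faac3db2": "kKMA", "718ffc36": "ND84"}
FORMATS = ("eot", "ttf", "woff")
FONT_DIR = PurePosixPath("/data/fonts")  # hypothetical storage location


def font_paths(token):
    """Return the paths of all font files for a token, or an empty list if it is unknown."""
    name = TOKEN_TO_NAME.get(token)
    if name is None:
        return []
    return [str(FONT_DIR / f"{name}.{fmt}") for fmt in FORMATS]


paths = font_paths("faac3db2")
```

Because all formats share one base name, the table needs only one entry per mapping rule instead of one entry per font file.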
The server may send all of the multiple font files to the client. The client can then select one font file according to the font formats it supports, parse the second web page content, and render the second web page content with the selected font file, so that the web page content finally displayed is identical to the first web page content.
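Why rendering with the font file recovers the first web page content can be illustrated with a small simulation. This is a conceptual sketch only: a real client relies on the browser's font rendering rather than string substitution, and the function names are hypothetical. The rule used is the one implied by the telephone number example.

```python
def build_glyphs(rule):
    """Invert the rule: second character -> first character drawn on screen."""
    glyphs = {second: first for first, second in rule.items()}
    if len(glyphs) != len(rule):
        raise ValueError("rule must map first characters to distinct second characters")
    return glyphs


def displayed_text(second_content, glyphs):
    """Simulate rendering; characters without a custom glyph are shown unchanged."""
    return "".join(glyphs.get(ch, ch) for ch in second_content)


# Mapping implied by the telephone number example (first -> second).
rule = {"1": "9", "9": "3", "2": "7", "6": "5", "4": "6", "3": "2", "7": "0"}
glyphs = build_glyphs(rule)
shown = displayed_text("93375693920", glyphs)
```

The page source still contains only the transcoded number; the original digits exist only as glyph shapes on screen, which is what a text-scraping crawler cannot read.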
Optionally, the second information request may also include font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff. The server can then first look up, in the database, the multiple font files corresponding to the flag code, and then select the at least one font file from the multiple font files according to the font format information. For example, suppose the second information request includes the font format information eot and the flag code 4d3a7cf9. According to Table 2, the server finds the font files apple.eot, pear.ttf, and grape.woff corresponding to 4d3a7cf9 and then, according to the font format information eot, sends apple.eot to the client.
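The selection in this example can be sketched as follows, with a dictionary standing in for the database table of Table 2. The function name `select_fonts` is hypothetical, and matching the format against the file extension is an assumption.

```python
# Table 2: token -> font files generated for one character mapping rule.
TOKEN_TO_FONTS = {
    "4d3a7cf9": ["apple.eot", "pear.ttf", "grape.woff"],
    "faac3db2": ["dog.eot", "cat.ttf", "sheep.woff"],
    "718ffc36": ["cake.eot", "rice.ttf", "meat.woff"],
}


def select_fonts(token, formats=None):
    """Return the font files for a token, filtered by the client's supported formats."""
    files = TOKEN_TO_FONTS.get(token, [])
    if not formats:
        # No font format information in the request: send every font file.
        return list(files)
    wanted = {fmt.lower() for fmt in formats}
    return [f for f in files if f.rsplit(".", 1)[-1].lower() in wanted]


chosen = select_fonts("4d3a7cf9", ["eot"])
```

Sending only the matching file saves bandwidth compared with sending all three formats and letting the client choose.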
In this embodiment of the present invention, when the server detects the first information request sent by the client, it first obtains the first web page content and processes it according to the preset character mapping rule to obtain the second web page content, the character mapping rule corresponding to multiple font files; it then sends the second web page content and the flag code corresponding to the multiple font files to the client; it then receives the second information request sent by the client, the second information request carrying the flag code; finally, it sends at least one of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content. This can improve the effectiveness of anti-crawler measures and reduce their cost, thereby improving the user experience.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of another anti-crawler method provided by an embodiment of the present invention. The method includes but is not limited to the following steps:
S301, the client sends a first information request to the server.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a URL. It may also include configuration information and version information of the client.
S302, the server obtains the first web page content.
In a specific implementation, the server may look up the corresponding first web page content in a database according to the network address or identification information carried in the received first information request.
S303, the server processes the first web page content according to the preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files. This step is the same as S202 in the previous embodiment and is not described again here.
S304, the server sends the second web page content and the flag code of the multiple font files to the client. This step is the same as S203 in the previous embodiment and is not described again here.
S305, the client sends a second information request to the server, the second information request carrying the flag code.
Optionally, the second information request may also carry font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff.
S306, the server verifies the second information request. If the verification succeeds, S307 is performed; if the verification fails, the procedure ends at this step and the subsequent steps are not performed.
In a specific implementation, to enhance the effectiveness of the anti-crawler measure, the server can make the token valid only for a certain period of time. The server can therefore determine whether the current time is within the preset validity period of the flag code, where the current time may be the time at which the server receives the second information request. If the current time is within the preset validity period, the second information request is determined to have been verified successfully; if the current time is not within the preset validity period, the verification is determined to have failed. The preset validity period of each flag code can be stored in the database.
For example, the mapping table stored in the database is shown in Table 4, where 4d3a7cf9, faac3db2 and 718ffc36 are all valid until 2018-07-19 18:33:12. The server receives the second information request at 2018-07-19 18:33:01, and the token carried by the request is faac3db2. Because 2018-07-19 18:33:01 is earlier than the expiry time 2018-07-19 18:33:12 of faac3db2, the second information request is determined to have been verified successfully.
Table 4. Mapping table 3 of font files and tokens
Font files | Token | Validity period
apple.eot, pear.ttf, grape.woff | 4d3a7cf9 | 2018-07-19 18:33:12
dog.eot, cat.ttf, sheep.woff | faac3db2 | 2018-07-19 18:33:12
cake.eot, rice.ttf, meat.woff | 718ffc36 | 2018-07-19 18:33:12
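The validity-period check on the rows of Table 4 can be sketched as below. The dictionary `VALID_UNTIL` and the function `is_within_validity` are hypothetical names; rejecting an unknown token is an assumption, since the text only covers tokens that are present in the table.

```python
from datetime import datetime

# Hypothetical stand-in for Table 4: token -> end of its validity period.
VALID_UNTIL = {
    "4d3a7cf9": datetime(2018, 7, 19, 18, 33, 12),
    "faac3db2": datetime(2018, 7, 19, 18, 33, 12),
    "718ffc36": datetime(2018, 7, 19, 18, 33, 12),
}


def is_within_validity(token, received_at):
    """Check whether the request arrived before the token's validity period ended."""
    expires_at = VALID_UNTIL.get(token)
    if expires_at is None:
        # Assumption: a token that is not in the table fails verification.
        return False
    # The example in the text compares with "earlier than", so the check is strict.
    return received_at < expires_at
```

For the example above, a request carrying faac3db2 that arrives at 2018-07-19 18:33:01 passes the check, while the same request arriving one second after the expiry time would fail.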
Optionally, the server can set the flag code to be valid only once. That is, the server responds accordingly only when the client uses the token to request information from the server for the first time; when the token is used again to request information, the server treats the information request as invalid. The server can therefore first determine the cumulative number of times the flag code has been received, where the count does not include the current receipt. When the cumulative count is zero, the current receipt is determined to be the first receipt of the flag code, and the second information request is determined to have been verified successfully. When the cumulative count is not zero, the verification of the second information request is determined to have failed.
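The one-time check based on the cumulative count can be sketched as follows. The class name `OneTimeTokenVerifier` is hypothetical, and keeping the counts in process memory is an assumption made for brevity; a real server would more likely keep them in the database mentioned in the text.

```python
from collections import Counter


class OneTimeTokenVerifier:
    """Accept a token only the first time it is received."""

    def __init__(self):
        # token -> number of times it has been received so far
        self._received = Counter()

    def verify(self, token):
        # Cumulative count before this receipt (the current one is excluded).
        previous_count = self._received[token]
        self._received[token] += 1
        # Verification succeeds only when the token has never been seen before.
        return previous_count == 0
```

A first call to `verify("faac3db2")` succeeds and any later call with the same token fails. If several requests can be handled concurrently, the read and the increment would need to be made atomic, for example with a lock or a database transaction.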
Optionally, after completing the response to the client's second information request, the server can update each flag code and/or the character mapping rule.
S307: The server sends at least one of the multiple font files to the client.
In a specific implementation, if the second information request does not include font format information, all of the font files corresponding to the flag code carried in the second information request are sent to the client. If the second information request includes font format information, the font file matching the font format information is chosen from the font files corresponding to the flag code carried in the request and is sent to the client.
S308: The client displays the first web page contents according to the at least one font file and the second web page contents.
In a specific implementation, if the at least one font file includes two or more font files, that is, the client did not carry font format information in the second information request sent to the server, the client first chooses, from the received font files, the target font file that matches a font format supported by the client. If the at least one font file includes only one font file, that is, the client carried font format information in the second information request sent to the server, the client uses that font file as the target font file. The client then displays the first web page contents according to the target font file and the second web page contents. Specifically, the client can first parse the second web page contents and then use Cascading Style Sheets (CSS) and the target font file to render the parsed second web page contents, so that the web page contents displayed by the client are identical to the first web page contents.
For example, as shown in Fig. 4, the web page contents received by the client are the transcoded result of the web page contents requested by the client: the value of the sensitive field "phone" is "93375693920" in the received web page contents, whereas its value in the requested web page contents is "19926419137". During actual rendering, the client can display "93375693920" as the real information "19926419137" according to the font file corresponding to the character mapping rule used for transcoding, so that the web page contents actually displayed are identical to the web page contents requested by the client.
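The phone number pair in this example is consistent with a single character substitution, which the sketch below derives and checks. The function names are hypothetical. Note that in a browser no decoding of the text takes place: the font file simply draws the glyph of the real character at the code point of the transcoded character. The `inverse` dictionary here only models that glyph lookup.

```python
REAL = "19926419137"        # value in the requested (first) web page contents
TRANSCODED = "93375693920"  # value in the received (second) web page contents


def derive_partial_rule(plain, encoded):
    """Collect the first-character -> second-character pairs implied by one example."""
    rule = {}
    for first, second in zip(plain, encoded):
        # A substitution rule must map each first character consistently.
        if rule.setdefault(first, second) != second:
            raise ValueError(f"inconsistent mapping for {first!r}")
    return rule


def transcode(text, rule):
    """Replace every character that has an entry in the rule; keep the others."""
    return "".join(rule.get(ch, ch) for ch in text)


# Pairs implied by the example: 1->9, 9->3, 2->7, 6->5, 4->6, 3->2, 7->0.
rule = derive_partial_rule(REAL, TRANSCODED)

# Models what the font file does visually: transcoded character -> real glyph.
inverse = {second: first for first, second in rule.items()}
```

Applying `transcode(REAL, rule)` reproduces the transmitted value, and `transcode(TRANSCODED, inverse)` recovers the displayed value. A crawler that reads only the page text sees the transcoded digits.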
In this embodiment of the present invention, the server first obtains the first web page contents when it detects the first information request sent by the client. It then processes the first web page contents according to a preset character mapping rule to obtain the second web page contents, where the character mapping rule corresponds to multiple font files. Next, the server sends the second web page contents and the flag code corresponding to the multiple font files to the client. The client then sends the second information request, which carries the flag code, to the server. Finally, the server verifies the second information request and, if the verification succeeds, sends at least one of the multiple font files to the client. The client receives the at least one font file sent by the server and displays the first web page contents according to the received font file and the second web page contents. Verifying the second information request guards against the flag code being stolen and against the font files being leaked, which could otherwise allow the character mapping rule to be cracked by brute force. This further improves the effectiveness of the anti-crawler measure.
The method of the embodiments of the present invention has been described above; the related devices of the embodiments of the present invention are described below.
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a server provided by an embodiment of the present invention. The server may include:
An obtaining module 501, configured to obtain the first web page contents when a first information request sent by the client is detected.
In a specific implementation, the first information request may carry the network address of the first web page contents, such as a URL. The obtaining module 501 can look up the corresponding first web page contents in the database according to the network address carried in the first information request.
Before detecting the first information request sent by the client, the server can determine in advance the character mapping rule used to process the first web page contents. The server may therefore also include a generation module, configured to do the following. First, determine the characters that need to be transcoded; for convenience of description, each such character is called a first character. In practice, the private information that users need protected usually includes mobile phone numbers, ID card numbers, bank card numbers and important account names, so each of the 10 digits 0-9, the 26 uppercase English letters and the 26 lowercase English letters can be, but is not limited to being, determined as a first character. Then, randomly map each first character of the multiple first characters to a second character that is different from that first character; the second character can be a letter, a digit, a Chinese character, a character string or a special character (such as #). Finally, save the mapping relations between the multiple first characters and the multiple second characters as the character mapping rule.
It should be noted that the multiple first characters can be adjusted according to the actual application scenario and user requirements. For example, if the user also requires names to be protected as private information, the multiple first characters may also include Chinese characters that occur frequently in names, such as "李" (Li), "明" (Ming) and "王" (Wang).
According to the above mapping method, each first character of the multiple first characters can be randomly mapped to a second character different from itself, so many different character mapping rules can be determined. For example, in the above example, 1 could instead be mapped to A and 2 to 5, with the mapping relations of the other characters unchanged. To prevent malicious crawlers from cracking the character mapping rule, one or more character mapping rules can be randomly selected and saved as preset character mapping rules for later use, and the preset character mapping rules can also be updated at a preset frequency (for example, once per minute).
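One possible way to generate such a rule is sketched below. It makes two assumptions that go beyond the text: the second characters are drawn from the same set as the first characters, and no two first characters share a second character (so that each code point in the font file needs only one glyph). The function name `generate_mapping_rule` is hypothetical.

```python
import random
import string

# The 10 digits, 26 uppercase letters and 26 lowercase letters named in the text.
FIRST_CHARACTERS = string.digits + string.ascii_uppercase + string.ascii_lowercase


def generate_mapping_rule(first_characters=FIRST_CHARACTERS, rng=None):
    """Randomly map every first character to a different second character."""
    rng = rng or random.SystemRandom()
    firsts = list(first_characters)
    seconds = firsts[:]
    while True:
        rng.shuffle(seconds)
        # Retry until no character is mapped to itself. Roughly one shuffle in
        # three qualifies, so only a few attempts are needed on average.
        if all(first != second for first, second in zip(firsts, seconds)):
            return dict(zip(firsts, seconds))
```

Calling the function again at the preset update frequency yields a fresh rule. The rule is a plain dictionary, so it can be serialized to JSON before being handed to a font generation step.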
After the preset character mapping rules are determined, the generation module can also generate the corresponding font files according to each character mapping rule. Because different clients support different font formats, one font file can be generated for each font format (such as the eot, woff and ttf formats). Specifically, an svg file can first be generated for each first character; Adobe Illustrator CS6 or Sketch can be used to produce the svg file by drawing the character's path. Multiple font files are then generated from the scalable vector graphics and the character mapping rule: the character mapping rule is first expressed and saved in JSON to obtain a corresponding json file, and the svg file of each first character and the json file are then input into a font generation platform (such as iconfont.cn), which outputs the corresponding font files. The generation module can also pass the svg file of each first character, the multiple first characters, the multiple second characters and the character mapping rule to a Node.js middle layer running on the server, to instruct the Node.js middle layer to generate the font files.
A transcoding module 502, configured to process the first web page contents according to a preset character mapping rule to obtain second web page contents, where the character mapping rule corresponds to multiple font files.
In a specific implementation, when there are multiple preset character mapping rules, one of them can be randomly selected to process the first web page contents and obtain the second web page contents, in order to improve the effectiveness of the anti-crawler measure. In addition, to improve anti-crawler and information transmission efficiency, the sensitive content in the first web page contents, such as ID card numbers, mobile phone numbers and account names, can be determined first. The sensitive content is then transcoded according to the character mapping rule, and the first web page contents whose sensitive content has been transcoded are used as the second web page contents.
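Transcoding only the sensitive content can be sketched as follows. The names `DIGIT_RULE`, `PHONE_PATTERN` and `transcode_sensitive` are hypothetical. Detecting sensitive content as an 11-digit run is an assumption, because the text does not say how sensitive content is located. The first seven pairs of `DIGIT_RULE` are the ones implied by the Fig. 4 example; the last three are made up to complete the rule.

```python
import re

# First seven pairs follow the Fig. 4 example; "0", "5" and "8" are
# hypothetical completions so that every digit has a mapping.
DIGIT_RULE = {
    "1": "9", "9": "3", "2": "7", "6": "5", "4": "6", "3": "2", "7": "0",
    "0": "1", "5": "8", "8": "4",
}

# Assumed detector for sensitive content: a standalone run of exactly 11 digits.
PHONE_PATTERN = re.compile(r"(?<!\d)\d{11}(?!\d)")


def transcode_sensitive(page, rule=DIGIT_RULE):
    """Transcode only the matched sensitive spans and leave the rest untouched."""
    def replace(match):
        return "".join(rule.get(ch, ch) for ch in match.group())
    return PHONE_PATTERN.sub(replace, page)


first_page = "<p>Order 42 shipped. phone: 19926419137</p>"
second_page = transcode_sensitive(first_page)
```

Here `second_page` keeps the markup and the order number as they are and replaces only the phone number, so most of the page is transmitted and rendered without any special font handling.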
A sending module 503, configured to send the second web page contents and the flag code corresponding to the multiple font files to the client.
In a specific implementation, the flag code (denoted token) can be a randomly generated character string of arbitrary length, such as bgu67st.
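A token of this kind can be produced with Python's standard `secrets` module, as sketched below. The function name `generate_token` and the default length are assumptions; the text only requires a random string of arbitrary length.

```python
import secrets


def generate_token(num_bytes=4):
    """Return a random hexadecimal token; 4 bytes give 8 characters."""
    # secrets draws from the operating system's cryptographic random source,
    # which makes tokens hard for a crawler to predict.
    return secrets.token_hex(num_bytes)
```

With the default argument the result has the same shape as the tokens in the tables, for example a string such as 4d3a7cf9.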
A receiving module 504, configured to receive the second information request sent by the client, where the second information request carries the flag code.
The sending module 503 is further configured to send at least one of the multiple font files to the client, where the at least one font file is used to instruct the client to display the first web page contents according to the second web page contents.
In a specific implementation, the server can look up, in the database, the multiple font files corresponding to the flag code, where the database includes the correspondence between the flag code and the multiple font files. After generating the multiple font files corresponding to each character mapping rule, the server can randomly generate a token, establish the correspondence between the font files of each character mapping rule and one token, and store the correspondence in the database. All of the multiple font files can be sent to the client.
Optionally, the second information request may also carry font format information, which indicates the font formats supported by the client, such as eot, ttf and woff. The sending module 503 can then first look up, in the database, the multiple font files corresponding to the flag code, and then choose the at least one font file from the multiple font files according to the font format information. For example, suppose the second information request includes the font format information eot and the flag code 4d3a7cf9. According to Table 2, the server finds the font files apple.eot, pear.ttf and grape.woff corresponding to 4d3a7cf9, and then, according to the font format information eot, sends apple.eot to the client.
Optionally, before sending at least one of the multiple font files to the client, the sending module 503 can also verify the second information request, and perform the operation of sending at least one of the multiple font files to the client only if the verification succeeds. The token can be set to be valid only for a certain period of time, in which case the sending module 503 first determines whether the current time is within the preset validity period of the flag code, where the current time may be the time at which the server receives the second information request. If the current time is within the preset validity period, the second information request is determined to have been verified successfully; if the current time is not within the preset validity period, the verification is determined to have failed. The preset validity period of each flag code can be stored in the database.
Optionally, the flag code can be set to be valid only once. That is, the server responds accordingly only when the client uses the token to request information from the server for the first time; when the token is used again to request information, the server treats the information request as invalid. The sending module 503 can therefore first determine the cumulative number of times the flag code has been received, where the count does not include the current receipt. When the cumulative count is zero, the current receipt is determined to be the first receipt of the flag code, and the second information request is determined to have been verified successfully. When the cumulative count is not zero, the verification of the second information request is determined to have failed.
Optionally, after the response to the client's second information request is completed, the sending module 503 can also update the flag code.
Optionally, after the response to the client's second information request is completed, the generation module can update the preset character mapping rule.
In this embodiment of the present invention, the server first obtains the first web page contents when it detects the first information request sent by the client, and processes the first web page contents according to a preset character mapping rule to obtain the second web page contents, where the character mapping rule corresponds to multiple font files. The server then sends the second web page contents and the flag code corresponding to the multiple font files to the client, and subsequently receives the second information request sent by the client, which carries the flag code. Finally, the server verifies the second information request and, if the verification succeeds, sends at least one of the multiple font files to the client; the at least one font file is used to instruct the client to display the first web page contents according to the second web page contents. This can improve the effectiveness of the anti-crawler measure and reduce its cost, thereby improving the user experience.
Referring to Fig. 6, Fig. 6 is a schematic structural diagram of a client provided by an embodiment of the present invention. The client may include:
A sending module 601, configured to send a first information request to the server, where the first information request is used to instruct the server to obtain the first web page contents and to process the first web page contents according to a preset character mapping rule to obtain second web page contents, and the character mapping rule corresponds to multiple font files.
In a specific implementation, the first information request may carry the network address of the first web page contents, such as a URL. It may also include configuration information and version information of the client, among other things.
A receiving module 602, configured to receive the second web page contents and the flag code corresponding to the multiple font files sent by the server.
The sending module 601 is further configured to send a second information request to the server, where the second information request carries the flag code.
The receiving module 602 is further configured to receive at least one of the multiple font files sent by the server.
Optionally, the second information request may also include font format information, which indicates the font formats supported by the client, such as eot, ttf and woff, and is used to instruct the server to choose the at least one font file from the multiple font files.
A display module 603, configured to display the first web page contents according to the at least one font file and the second web page contents.
In a specific implementation, if the at least one font file includes two or more font files, that is, the sending module 601 did not carry font format information in the second information request sent to the server, the display module 603 first chooses, from the received font files, the target font file that matches a font format supported by the client. If the at least one font file includes only one font file, that is, the sending module 601 carried font format information in the second information request sent to the server, the display module 603 uses that font file as the target font file. The display module 603 then displays the first web page contents according to the target font file and the second web page contents. Specifically, the display module 603 can first parse the second web page contents and then use CSS and the target font file to render the parsed second web page contents, so that the web page contents displayed by the client are identical to the first web page contents.
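The client-side choice of the target font file can be sketched as follows. The function name `choose_target_font` is hypothetical. Treating the order of `supported_formats` as a preference order and identifying a format by file extension are both assumptions.

```python
from pathlib import PurePosixPath


def choose_target_font(received_files, supported_formats):
    """Pick the font file the client will render with, or None if none fits."""
    if len(received_files) == 1:
        # The server already filtered by the requested format: use the file as is.
        return received_files[0]
    # Index the received files by extension, e.g. "woff" -> "grape.woff".
    by_format = {
        PurePosixPath(name).suffix.lstrip(".").lower(): name
        for name in received_files
    }
    # Assumption: supported_formats is listed in the client's order of preference.
    for font_format in supported_formats:
        match = by_format.get(font_format.lower())
        if match is not None:
            return match
    return None
```

For instance, a client that supports woff and ttf and receives apple.eot, pear.ttf and grape.woff would pick grape.woff. The chosen file would then be referenced from a CSS `@font-face` rule for rendering.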
In this embodiment of the present invention, the client first sends a first information request to the server, to instruct the server to obtain the first web page contents and to process the first web page contents according to a preset character mapping rule to obtain the second web page contents, where the character mapping rule corresponds to multiple font files. The client then receives the second web page contents and the flag code corresponding to the multiple font files sent by the server, and sends a second information request carrying the flag code to the server. Finally, the client receives at least one of the multiple font files sent by the server and displays the first web page contents according to the at least one font file and the second web page contents. This can effectively enhance the effectiveness of the anti-crawler measure and reduce its cost.
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of another server provided by an embodiment of the present invention. As shown in the figure, the server may include at least one processor 701, at least one communication interface 702, at least one memory 703 and at least one communication bus 704.
The processor 701 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It can implement or execute the various illustrative logic blocks, modules and circuits described in connection with the disclosure of the present invention. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The communication bus 704 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one thick line is used in Fig. 7, but this does not mean that there is only one bus or one type of bus. The communication bus 704 is used to implement connection and communication between these components. The communication interface 702 of the device in this embodiment of the present invention is used for signaling or data communication with other node devices. The memory 703 may include volatile memory, such as nonvolatile random access memory (Nonvolatile Random Access Memory, NVRAM), phase-change random access memory (Phase Change RAM, PRAM) and magnetoresistive random access memory (Magnetoresistive RAM, MRAM), and may also include non-volatile memory, such as at least one magnetic disk storage device, electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), flash memory devices such as NOR flash memory or NAND flash memory, and semiconductor devices such as a solid state disk (Solid State Disk, SSD). Optionally, the memory 703 may also be at least one storage device located remotely from the aforementioned processor 701. A set of program code is stored in the memory 703, and the processor 701 executes the program in the memory 703 to perform the following operations:
Obtain the first web page contents when a first information request sent by the client is detected;
Process the first web page contents according to a preset character mapping rule to obtain second web page contents, where the character mapping rule corresponds to multiple font files;
Send the second web page contents and the flag code corresponding to the multiple font files to the client;
Receive a second information request sent by the client, where the second information request carries the flag code;
Send at least one of the multiple font files to the client, where the at least one font file is used to instruct the client to display the first web page contents according to the second web page contents.
Optionally, the second information request carries font format information;
The processor 701 is further configured to perform the following operations:
Look up, in the database, the multiple font files corresponding to the flag code, where the database includes the correspondence between the flag code and the multiple font files;
Choose the at least one font file from the multiple font files according to the font format information.
Optionally, the character mapping rule includes mapping relations between multiple first characters and multiple second characters;
The processor 701 is further configured to perform the following operations:
Generate the scalable vector graphic corresponding to each first character of the multiple first characters;
Generate the multiple font files according to the scalable vector graphics and the character mapping rule.
Optionally, the processor 701 is further configured to perform the following operations:
Determine whether the current time is within the preset validity period of the flag code;
When the current time is within the preset validity period, perform the operation of sending at least one of the multiple font files to the client.
Optionally, the processor 701 is further configured to perform the following operations:
Determine the cumulative number of times the flag code has been received;
When the cumulative count is zero, perform the operation of sending at least one of the multiple font files to the client.
Optionally, the processor 701 is further configured to perform the following operations:
Determine the sensitive content in the first web page contents;
Transcode the sensitive content according to the character mapping rule;
Use the first web page contents whose sensitive content has been transcoded as the second web page contents.
Further, the processor can also cooperate with the memory and the communication interface to perform the operations of the server in the foregoing embodiments of the invention.
Referring to Fig. 8, Fig. 8 is a schematic structural diagram of another client provided by an embodiment of the present invention. The client includes a processor 801, a communication interface 802, a memory 803 and a communication bus 804.
The processor 801 may be any of the types of processor mentioned above. The communication bus 804 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of representation, only one thick line is used in Fig. 8, but this does not mean that there is only one bus or one type of bus. The communication bus 804 is used to implement connection and communication between these components. The communication interface 802 of the device in this embodiment of the present application is used for signaling or data communication with other node devices. The memory 803 may be any of the types of memory mentioned above. Optionally, the memory 803 may also be at least one storage device located remotely from the aforementioned processor 801. A set of program code is stored in the memory 803, and the processor 801 executes the program in the memory 803 to perform the following operations of the client:
Send a first information request to the server, where the first information request is used to instruct the server to obtain the first web page contents and to process the first web page contents according to a preset character mapping rule to obtain second web page contents, and the character mapping rule corresponds to multiple font files;
Receive the second web page contents and the flag code corresponding to the multiple font files sent by the server;
Send a second information request to the server, where the second information request carries the flag code;
Receive at least one of the multiple font files sent by the server;
Display the first web page contents according to the at least one font file and the second web page contents.
Optionally, the processor 801 is further configured to perform the following operations:
Choose, from the at least one font file, the target font file that matches a font format supported by the client;
Display the first web page contents according to the target font file and the second web page contents.
Further, the processor can also cooperate with the memory and the communication interface to perform the operations of the client in the foregoing embodiments of the invention.
The above embodiments can be implemented wholly or partly by software, hardware, firmware or any combination thereof. When implemented in software, they can be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are wholly or partly produced. The computer may be a general-purpose computer, a special-purpose computer, a computer network or another programmable device. The computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD) or a semiconductor medium (for example, a solid state disk (SSD)).
The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the present invention in detail. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (15)

1. a kind of anti-crawler method, which is characterized in that the described method includes:
Server obtains the first web page contents when detecting the first information request that client is sent;
The server is handled to obtain in the second webpage according to preset character mapping ruler to first web page contents Hold, the character mapping ruler corresponds to multiple font files;
The server sends second web page contents and the corresponding flag code of the multiple font file to the client;
The server receives the second information request that the client is sent, and second information request carries the label Code;
The server sends at least one font file in the multiple font file to the client, and described at least one A font file is used to indicate the client and shows first web page contents according to second web page contents.
2. the method as described in claim 1, which is characterized in that second information request carries font format information;
Before the server sends at least one font file in the multiple font file to the client, also wrap It includes:
The server searches multiple font files corresponding with the flag code from database, and the database includes described The corresponding relationship of flag code and the multiple font file;
The server chooses at least one font text according to the font format information from the multiple font file Part.
3. The method according to claim 1, characterized in that the character mapping rule includes mapping relationships between a plurality of first characters and a plurality of second characters;
Before the server obtains the first web page content upon detecting the first information request sent by the client, the method further comprises:
The server generates a scalable vector graphic (SVG) corresponding to each first character of the plurality of first characters;
The server generates the plurality of font files according to the scalable vector graphics and the character mapping rule.
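The preparation step in claim 3 can be illustrated as follows. This sketch stops short of compiling an actual font file: it only builds a mapping rule and a glyph table that pairs each second character with the SVG of the first character it stands for. The functions `make_mapping_rule` and `make_glyph_table`, the choice of the Unicode Private Use Area for second characters, and the `svg_for` callback are assumptions, not details taken from the claims.

```python
import random


def make_mapping_rule(first_chars, seed=None):
    """Map each first character to a distinct private-use second character.

    first_chars is assumed to contain no repeated characters.
    """
    rng = random.Random(seed)
    # U+E000..U+F8FF is the Basic Multilingual Plane Private Use Area.
    code_points = rng.sample(range(0xE000, 0xF8FF + 1), len(first_chars))
    return {ch: chr(cp) for ch, cp in zip(first_chars, code_points)}


def make_glyph_table(mapping_rule, svg_for):
    """Pair each second character with the SVG of its first character."""
    # A font compiled from this table draws the second character with the
    # shape of the first one, which is what lets the client display the
    # original text.
    return {second: svg_for(first) for first, second in mapping_rule.items()}
```

Generating several mapping rules (for example, with different seeds) and compiling one set of font files per rule would give the server multiple interchangeable encodings of the same characters.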
4. The method according to claim 1, characterized in that before the server sends at least one font file of the plurality of font files to the client, the method further comprises:
The server determines whether the current time is within a preset validity period of the flag code;
When the current time is within the preset validity period, the server performs the operation of sending at least one font file of the plurality of font files to the client.
5. The method according to claim 1, characterized in that before the server sends at least one font file of the plurality of font files to the client, the method further comprises:
The server determines the cumulative number of times the flag code has been received;
When the cumulative number of times is zero, the server performs the operation of sending at least one font file of the plurality of font files to the client.
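The checks in claims 4 and 5 can be sketched together, although the claims recite them as separate options. In this hypothetical `FlagCodeRegistry`, a flag code is honoured only within a validity period and only the first time it is received. The 60-second default, the class and method names, and the injectable `clock` parameter are assumptions made for illustration.

```python
import time

VALIDITY_SECONDS = 60  # hypothetical preset validity period


class FlagCodeRegistry:
    """Tracks issue time and receive count for each flag code."""

    def __init__(self, validity_seconds=VALIDITY_SECONDS, clock=time.time):
        self._validity = validity_seconds
        self._clock = clock
        self._issued_at = {}
        self._times_received = {}

    def issue(self, flag_code):
        """Record a newly issued flag code."""
        self._issued_at[flag_code] = self._clock()
        self._times_received[flag_code] = 0

    def may_send_fonts(self, flag_code):
        """Return True only for a fresh, not-yet-used, known flag code."""
        if flag_code not in self._issued_at:
            return False
        # Claim 5: the count of earlier receptions must be zero.
        previously_received = self._times_received[flag_code]
        self._times_received[flag_code] += 1
        # Claim 4: the current time must fall within the validity period.
        age = self._clock() - self._issued_at[flag_code]
        return age <= self._validity and previously_received == 0
```

Under these assumptions, a crawler that captures a flag code from one page load cannot replay it later or reuse it to fetch the font files a second time.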
6. The method according to any one of claims 1 to 5, characterized in that the server processing the first web page content according to the preset character mapping rule to obtain the second web page content comprises:
The server determines sensitive content in the first web page content;
The server transcodes the sensitive content according to the character mapping rule;
The server takes the first web page content, with the sensitive content transcoded, as the second web page content.
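Claim 6 limits transcoding to the sensitive part of the page. The sketch below assumes, purely for illustration, that sensitive content means runs of digits (such as prices) found with a regular expression; the claims do not say how sensitive content is identified. `MAPPING_RULE`, `SENSITIVE`, and `transcode_sensitive` are hypothetical names.

```python
import re

# Hypothetical mapping rule: digit d -> private-use code point U+E000 + d.
MAPPING_RULE = {str(d): chr(0xE000 + d) for d in range(10)}

# Assumption: sensitive content is any integer or decimal number.
SENSITIVE = re.compile(r"\d+(?:\.\d+)?")


def transcode_sensitive(first_page):
    """Transcode only the sensitive spans; leave the rest of the page as-is."""

    def _transcode(match):
        # Characters without a mapping (such as ".") pass through unchanged.
        return "".join(MAPPING_RULE.get(ch, ch) for ch in match.group(0))

    return SENSITIVE.sub(_transcode, first_page)
```

Transcoding only the sensitive spans keeps the bulk of the page as ordinary text, so the custom font needs glyphs only for the characters that actually appear in sensitive content.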
7. An anti-crawler method, characterized in that the method comprises:
The client sends a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to a plurality of font files;
The client receives the second web page content and a flag code corresponding to the plurality of font files, both sent by the server;
The client sends a second information request to the server, the second information request carrying the flag code;
The client receives at least one font file of the plurality of font files sent by the server;
The client displays the first web page content according to the at least one font file and the second web page content.
8. The method according to claim 7, characterized in that the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the plurality of font files.
9. The method according to claim 7, characterized in that the client displaying the first web page content according to the at least one font file and the second web page content comprises:
The client selects, from the at least one font file, a target font file that matches a font format supported by the client;
The client displays the first web page content according to the target font file and the second web page content.
10. A server, characterized in that the server comprises:
An obtaining module, configured to obtain first web page content upon detecting a first information request sent by a client;
A transcoding module, configured to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to a plurality of font files;
A sending module, configured to send the second web page content and a flag code corresponding to the plurality of font files to the client;
A receiving module, configured to receive a second information request sent by the client, the second information request carrying the flag code;
The sending module is further configured to send at least one font file of the plurality of font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content.
11. The server according to claim 10, characterized in that the sending module is further configured to:
Determine whether the current time is within a preset validity period of the flag code;
When the current time is within the preset validity period, perform the operation of sending at least one font file of the plurality of font files to the client.
12. The server according to claim 10, characterized in that the sending module is further configured to:
Determine the cumulative number of times the flag code has been received;
When the cumulative number of times is zero, perform the operation of sending at least one font file of the plurality of font files to the client.
13. The server according to any one of claims 10 to 12, characterized in that the transcoding module is further configured to:
Determine sensitive content in the first web page content;
Transcode the sensitive content according to the character mapping rule;
Take the first web page content, with the sensitive content transcoded, as the second web page content.
14. A client, characterized in that the client comprises:
A sending module, configured to send a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to a plurality of font files;
A receiving module, configured to receive the second web page content and a flag code corresponding to the plurality of font files, both sent by the server;
The sending module is further configured to send a second information request to the server, the second information request carrying the flag code;
The receiving module is further configured to receive at least one font file of the plurality of font files sent by the server;
A display module, configured to display the first web page content according to the at least one font file and the second web page content.
15. The client according to claim 14, characterized in that the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the plurality of font files.
CN201910077327.1A 2019-01-25 2019-01-25 Anti-crawler method and related equipment Active CN109543454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910077327.1A CN109543454B (en) 2019-01-25 2019-01-25 Anti-crawler method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910077327.1A CN109543454B (en) 2019-01-25 2019-01-25 Anti-crawler method and related equipment

Publications (2)

Publication Number Publication Date
CN109543454A (en) 2019-03-29
CN109543454B CN109543454B (en) 2022-07-12

Family

ID=65838481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910077327.1A Active CN109543454B (en) 2019-01-25 2019-01-25 Anti-crawler method and related equipment

Country Status (1)

Country Link
CN (1) CN109543454B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110191664A1 (en) * 2010-02-04 2011-08-04 At&T Intellectual Property I, L.P. Systems for and methods for detecting url web tracking and consumer opt-out cookies
WO2012051370A1 (en) * 2010-10-13 2012-04-19 Bitstream, Inc. System and method for displaying complex scripts with a cloud computing architecture
CN104899212A (en) * 2014-03-05 2015-09-09 腾讯科技(深圳)有限公司 Webpage display method, server and system
CN103955632A (en) * 2014-05-07 2014-07-30 百度在线网络技术(北京)有限公司 Encryption display method and device for webpage words
CN106095918A (en) * 2016-06-06 2016-11-09 山东科技大学 A kind of acquisition methods of the protected exponent data of network based on OCR technique
CN106027564A (en) * 2016-07-08 2016-10-12 携程计算机技术(上海)有限公司 Method and device for detecting security of anti-crawler strategy
CN107818108A (en) * 2016-09-13 2018-03-20 阿里巴巴集团控股有限公司 A kind of webpage rendering intent, apparatus and system
CN109241391A (en) * 2018-09-20 2019-01-18 四川长虹电器股份有限公司 A kind of anti-crawler method climbed of solution font

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110166465B (en) * 2019-05-27 2022-01-25 北京达佳互联信息技术有限公司 Access request processing method, device, server and storage medium
CN110166465A (en) * 2019-05-27 2019-08-23 北京达佳互联信息技术有限公司 Processing method, device, server and the storage medium of access request
CN110414221A (en) * 2019-07-11 2019-11-05 东软集团股份有限公司 Data processing method, device, storage medium and electronic equipment
CN110399737A (en) * 2019-07-26 2019-11-01 博雅创智(天津)科技有限公司 A kind of web site contents guard method of non-intrusion type
CN110399737B (en) * 2019-07-26 2023-05-02 博雅创智(天津)科技有限公司 Non-invasive website content protection method
CN110620657A (en) * 2019-08-23 2019-12-27 上海科技发展有限公司 Webpage word processing method, system and device
CN110851682A (en) * 2019-10-17 2020-02-28 上海易点时空网络有限公司 Text anti-crawler method, server and display terminal
CN111008348A (en) * 2019-11-28 2020-04-14 盛业信息科技服务(深圳)有限公司 Anti-crawler method, terminal, server and computer readable storage medium
CN111291397A (en) * 2020-02-09 2020-06-16 成都神殿科技有限责任公司 Webpage data anti-crawling encryption method
CN111723263A (en) * 2020-06-19 2020-09-29 北京同邦卓益科技有限公司 Webpage data processing method, device, equipment and storage medium
CN111723263B (en) * 2020-06-19 2024-04-05 北京同邦卓益科技有限公司 Webpage data processing method, device, equipment and storage medium
CN111901332A (en) * 2020-07-27 2020-11-06 北京百川盈孚科技有限公司 Webpage content reverse crawling method and system
CN112084388A (en) * 2020-08-07 2020-12-15 广州力挚网络科技有限公司 Data encryption method and device, electronic equipment and storage medium
CN112084388B (en) * 2020-08-07 2024-04-30 广州力挚网络科技有限公司 Data encryption method and device, electronic equipment and storage medium
CN114650164A (en) * 2022-01-21 2022-06-21 企知道网络技术有限公司 Website data anti-stealing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN109543454B (en) 2022-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant