CN110069688A - Anti-crawler page display method, server, storage medium and device - Google Patents


Info

Publication number
CN110069688A
Authority
CN
China
Prior art keywords
house information
crawler
converted
page
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910205704.5A
Other languages
Chinese (zh)
Inventor
刘晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Urban Construction Technology Shenzhen Co Ltd
Original Assignee
Ping An Urban Construction Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Urban Construction Technology Shenzhen Co Ltd
Priority to CN201910205704.5A
Publication of CN110069688A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9538 - Presentation of query results

Abstract

The invention discloses an anti-crawler page display method, server, storage medium and device. In the invention, target stream data is parsed to obtain a node to be displayed, and it is judged whether the node to be displayed contains a preset conversion identifier. When it does, the corresponding first house information stored in text form is queried based on the node to be displayed, the first house information is converted into second house information stored in picture form, and the second house information is sent to the client so that the client displays it in an anti-crawler manner. The invention is suited to big-data application scenarios. Because the house information to be displayed is converted from text form to picture form, and picture information is harder than text information for a web crawler to crawl directly, particularly a general-purpose web crawler, the anti-crawler display effect is achieved and the technical problem that displayed information is easily crawled by web crawlers is solved.

Description

Anti-crawler page display method, server, storage medium and device
Technical field
The present invention relates to the technical field of web page display, and in particular to an anti-crawler page display method, server, storage medium and device.
Background
Housing websites often display various types of house information. For example, a housing agency website used to inquire about or handle house rental and sale matters often displays house information such as real estate information, building details and pictures of the surrounding area. However, most housing agency websites display house information directly in text form.
Web crawlers can easily crawl information that exists in text form, so the house information on a housing website is easily crawled and cannot be well protected. Displaying information therefore suffers from the technical problem of being easily crawled by web crawlers.
The above content is only intended to assist in understanding the technical solution of the present invention and does not constitute an admission that the above content is prior art.
Summary of the invention
The main purpose of the present invention is to provide an anti-crawler page display method, server, storage medium and device, aiming to solve the technical problem that information is easily crawled by web crawlers when it is displayed.
To achieve the above object, the present invention provides an anti-crawler page display method, which comprises the following steps:
When a page display instruction sent by a client is received, determining target stream data to be output according to the page display instruction;
Parsing the target stream data to obtain the node to be displayed contained in the target stream data;
Judging whether the node to be displayed contains a preset conversion identifier;
When the node to be displayed contains the preset conversion identifier, querying, based on the node to be displayed, the corresponding first house information stored in text form;
Converting the first house information into second house information stored in picture form;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner.
Preferably, after determining the target stream data to be output according to the page display instruction when the page display instruction sent by the client is received, the anti-crawler page display method further comprises:
Intercepting the target stream data;
After converting the first house information into the second house information stored in picture form, the anti-crawler page display method further comprises:
Replacing the first house information with the second house information;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner, comprises:
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, extracts the second house information from the target stream data and displays the second house information in an anti-crawler manner.
Preferably, after determining the target stream data to be output according to the page display instruction when the page display instruction sent by the client is received, the anti-crawler page display method further comprises:
Intercepting the target stream data;
After converting the first house information into the second house information stored in picture form, the anti-crawler page display method further comprises:
Saving the second house information into a preset session object;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner, comprises:
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and displays the second house information in an anti-crawler manner.
Preferably, after saving the second house information into the preset session object, the anti-crawler page display method further comprises:
Creating a target node, and writing into the target node an object identifier corresponding to the preset session object;
Replacing the node to be displayed in the target stream data with the target node;
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and displays the second house information in an anti-crawler manner, comprises:
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, extracts the object identifier from the target node, generates a house information request according to the object identifier, and sends the house information request back to the server;
When the house information request is received, querying the corresponding preset session object according to the object identifier, and extracting the second house information from the preset session object;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner.
Preferably, after querying, based on the node to be displayed, the corresponding first house information stored in text form when the node to be displayed contains the preset conversion identifier, the anti-crawler page display method further comprises:
Judging whether the page display instruction contains a preset to-be-converted template identifier;
When the page display instruction contains the preset to-be-converted template identifier, determining the corresponding to-be-converted display template according to the preset to-be-converted template identifier;
Converting the first house information into the second house information stored in picture form comprises:
Converting the first house information into the second house information stored in picture form based on the to-be-converted display template.
Preferably, after judging whether the page display instruction contains the preset to-be-converted template identifier, the anti-crawler page display method further comprises:
When the page display instruction does not contain the preset to-be-converted template identifier, converting the first house information into the second house information stored in picture form based on a preset initial display template, and executing the step of sending the second house information to the client so that the client displays the second house information in an anti-crawler manner.
Preferably, converting the first house information into the second house information stored in picture form based on the to-be-converted display template comprises:
Dividing the page area in which the first house information is located into regions to obtain page sub-regions;
Querying, in the to-be-converted display template, the to-be-converted display item corresponding to each page sub-region, and reading from the to-be-converted display item the to-be-converted font style, font size, font color and background style;
Converting the first house information in the page sub-region based on the to-be-converted font style, font size, font color and background style, to obtain second house sub-information stored in picture form;
Splicing the second house sub-information to obtain the spliced second house information stored in picture form.
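As a rough illustration of the final splicing step only, the sketch below stacks per-region bitmaps vertically. The representation of a picture as a list of strings of `0`/`1` pixels, the function name `splice_vertically` and the one-row gap are assumptions made for this example; the patent does not specify an image format or a splicing layout.

```python
def splice_vertically(blocks, gap=1):
    """Stack bitmap blocks top to bottom, padding every row to a common width.

    Each block is a list of strings made of '0' and '1' pixels
    (an illustrative representation, not one given in the source).
    """
    width = max(len(row) for block in blocks for row in block)
    spliced = []
    for index, block in enumerate(blocks):
        if index > 0:
            # Insert blank separator rows between consecutive sub-pictures.
            spliced.extend(["0" * width] * gap)
        # Right-pad narrower rows with background pixels.
        spliced.extend(row.ljust(width, "0") for row in block)
    return spliced


# Two hypothetical sub-pictures rendered from two page sub-regions.
title_block = ["11", "11"]
price_block = ["101"]
picture = splice_vertically([title_block, price_block])
```

A real implementation would more likely splice decoded images with an imaging library; the list-of-strings form is used here only to keep the sketch dependency-free.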
In addition, to achieve the above object, the present invention further provides a server comprising a memory, a processor, and an anti-crawler page display program stored in the memory and executable on the processor, the anti-crawler page display program being configured to implement the steps of the anti-crawler page display method described above.
In addition, to achieve the above object, the present invention further provides a storage medium on which an anti-crawler page display program is stored, the anti-crawler page display program implementing the steps of the anti-crawler page display method described above when executed by a processor.
In addition, to achieve the above object, the present invention further provides an anti-crawler page display device, which comprises:
An instruction receiving module, configured to determine target stream data to be output according to a page display instruction when the page display instruction sent by a client is received;
A stream data parsing module, configured to parse the target stream data to obtain the node to be displayed contained in the target stream data;
A node judgment module, configured to judge whether the node to be displayed contains a preset conversion identifier;
A house information query module, configured to query, based on the node to be displayed, the corresponding first house information stored in text form when the node to be displayed contains the preset conversion identifier;
An information conversion module, configured to convert the first house information into second house information stored in picture form;
A house information display module, configured to send the second house information to the client, so that the client displays the second house information in an anti-crawler manner.
In the present invention, when a page is displayed, it is first judged whether the node to be displayed in the target stream data to be output contains a preset conversion identifier. When the node to be displayed is determined to contain the preset conversion identifier, the corresponding first house information stored in text form is converted to obtain second house information stored in picture form, and the second house information is displayed on the client. Because the house information to be displayed is converted from text form to picture form, and picture information is harder than text information for a web crawler to crawl directly, the anti-crawler display effect is achieved and the technical problem that information is easily crawled by web crawlers when displayed is solved.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the server in the hardware operating environment involved in the embodiments of the present invention;
Fig. 2 is a schematic flowchart of the first embodiment of the anti-crawler page display method of the present invention;
Fig. 3 is a schematic flowchart of the second embodiment of the anti-crawler page display method of the present invention;
Fig. 4 is a schematic flowchart of the third embodiment of the anti-crawler page display method of the present invention;
Fig. 5 is a structural block diagram of the first embodiment of the anti-crawler page display device of the present invention.
The realization of the object, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of the server in the hardware operating environment involved in the embodiments of the present invention.
As shown in Fig. 1, the server may include a processor 1001 (such as a CPU), a communication bus 1002, a user interface 1003, a network interface 1004 and a memory 1005. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface, and in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory, or a non-volatile memory such as a magnetic disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the structure shown in Fig. 1 does not constitute a limitation on the server, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and an anti-crawler page display program.
In the server shown in Fig. 1, the network interface 1004 is mainly used to connect to a background server and exchange data with it; the user interface 1003 is mainly used to connect peripherals; and the server calls, through the processor 1001, the anti-crawler page display program stored in the memory 1005 and executes the anti-crawler page display method provided by the embodiments of the present invention.
Based on the above hardware structure, embodiments of the anti-crawler page display method of the present invention are proposed.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of the first embodiment of the anti-crawler page display method of the present invention.
In the first embodiment, the anti-crawler page display method comprises the following steps:
Step S10: when a page display instruction sent by a client is received, determine target stream data to be output according to the page display instruction.
It is understandable that text information is highly convenient for a web crawler to crawl. To reduce the probability that information is crawled by a web crawler, the text information to be displayed may be converted into picture information when it is displayed; since picture information is harder to read directly than text information, the security of the information is improved.
It should be understood that the executing subject of this embodiment is a server. The client is operated by a user and may be a personal computer, or a browser running on a personal computer. For example, when the user operates the browser to open a page, the browser exchanges information with the server so that the page is successfully displayed on the browser side.
In a specific implementation, for example, when the server receives the page display instruction sent by the user from the browser side, it first determines the page data that the page display instruction points to and that is to be shown in the browser, i.e., the target stream data.
Step S20: parse the target stream data to obtain the node to be displayed contained in the target stream data.
It is understandable that the format of the target stream data may be JavaScript Object Notation (JSON), a lightweight data interchange format. Specifically, JSON is a node-based format similar to Extensible Markup Language (XML), so the target stream data will contain multiple nodes.
Step S30: judge whether the node to be displayed contains a preset conversion identifier.
It should be understood that, to make it easy to define, and to quickly find in the target stream data, the information on which the text-to-picture operation can be performed, a special node may be added to identify the information whose display format is to be converted. An identifier may be preset in the special node to distinguish it from the other nodes in the target stream data.
In a specific implementation, the preset conversion identifier exists in the form of a web page tag.
Step S40: when the node to be displayed contains the preset conversion identifier, query, based on the node to be displayed, the corresponding first house information stored in text form.
In a specific implementation, if the node currently read from the target stream data is node A, it is judged whether the predefined preset conversion identifier exists in node A. If the preset conversion identifier exists, the text-to-picture operation is performed on the corresponding information; if it does not exist, no text-to-picture operation is needed.
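The check on node A might look like the sketch below. The source says the identifier takes the form of a web page tag; modeling it here as a boolean field named `convertToImage` is purely an assumption for illustration, as are the sample nodes.

```python
CONVERT_FLAG = "convertToImage"  # hypothetical identifier name, not from the source


def needs_conversion(node):
    """Return True when the node carries the preset conversion identifier."""
    return isinstance(node, dict) and node.get(CONVERT_FLAG) is True


def split_nodes(nodes):
    """Separate nodes that require text-to-picture conversion from the rest."""
    to_convert = [node for node in nodes if needs_conversion(node)]
    passthrough = [node for node in nodes if not needs_conversion(node)]
    return to_convert, passthrough


# Hypothetical nodes: only node A is marked for conversion.
node_a = {"id": "A", "text": "3 rooms, 90 sqm", CONVERT_FLAG: True}
node_c = {"id": "C", "text": "Site footer"}
to_convert, passthrough = split_nodes([node_a, node_c])
```

Nodes without the identifier are left untouched, matching the branch in which no text-to-picture operation is performed.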
Step S50: convert the first house information into second house information stored in picture form.
It is understandable that, when node A is determined to contain the preset conversion identifier, the first house information associated with node A can be determined as the information to be converted, and the conversion operation is performed on the first house information stored in text form to produce the second house information stored in picture form.
It should be understood that the information content recorded in the first house information and in the second house information is identical; the two differ only in display format, the former being in text form and the latter in picture form. In addition, the first house information may include different types of house information such as real estate information, building details, building pictures and pictures of the area surrounding the building.
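To make the text-to-picture idea concrete without depending on an imaging library, the sketch below rasterizes a digit string (for example a rent figure) with a tiny 3x5 bitmap font and serializes it as a plain PBM image. The font table, the function names and the choice of PBM are all assumptions for illustration; the patent does not say how the picture is rendered, and a production system would normally use a full text renderer.

```python
# Hypothetical 3x5 bitmap font covering digits only.
FONT = {
    "0": ["111", "101", "101", "101", "111"],
    "1": ["010", "110", "010", "010", "111"],
    "2": ["111", "001", "111", "100", "111"],
    "3": ["111", "001", "111", "001", "111"],
    "4": ["101", "101", "111", "001", "001"],
    "5": ["111", "100", "111", "001", "111"],
    "6": ["111", "100", "111", "101", "111"],
    "7": ["111", "001", "001", "001", "001"],
    "8": ["111", "101", "111", "101", "111"],
    "9": ["111", "101", "111", "001", "111"],
}


def render_digits(text):
    """Rasterize a digit string into 5 pixel rows, one blank column between glyphs."""
    return ["0".join(FONT[ch][y] for ch in text) for y in range(5)]


def to_pbm(rows):
    """Serialize pixel rows as a plain-text PBM (P1) image."""
    width, height = len(rows[0]), len(rows)
    body = "\n".join(" ".join(row) for row in rows)
    return f"P1\n{width} {height}\n{body}\n"


rows = render_digits("35")
image_text = to_pbm(rows)
```

The resulting payload carries the same information as the text but no longer contains the characters themselves, which is the property the method relies on.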
Step S60: send the second house information to the client, so that the client displays the second house information in an anti-crawler manner.
It is understandable that, once the conversion operation is complete, the converted picture can be sent to the browser, so that the browser displays the house information in picture form rather than in text form. Because the displayed house information is in picture form rather than text form, it is harder for a web crawler to crawl the house information, which improves the security of the information.
In this embodiment, when a page is displayed, it is first judged whether the node to be displayed in the target stream data to be output contains a preset conversion identifier. When the node to be displayed is determined to contain the preset conversion identifier, the corresponding first house information stored in text form is converted to obtain second house information stored in picture form, and the second house information is displayed on the client. Because the house information to be displayed is converted from text form to picture form, and picture information is harder than text information for a web crawler to crawl directly, the anti-crawler display effect is achieved and the technical problem that information is easily crawled by web crawlers when displayed is solved.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of the second embodiment of the anti-crawler page display method of the present invention. Based on the first embodiment shown in Fig. 2 above, the second embodiment of the anti-crawler page display method of the present invention is proposed.
In the second embodiment, after determining the target stream data to be output according to the page display instruction when the page display instruction sent by the client is received, the anti-crawler page display method further comprises:
Intercepting the target stream data;
After converting the first house information into the second house information stored in picture form, the anti-crawler page display method further comprises:
Replacing the first house information with the second house information;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner, comprises:
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, extracts the second house information from the target stream data and displays the second house information in an anti-crawler manner.
It is understandable that one way for the browser to obtain the converted second house information is to modify the target stream data.
In a specific implementation, after the server side has determined the target stream data to be output, it first intercepts the target stream data rather than outputting it. The target stream data can then be modified a second time; specifically, the first house information in the target stream data is changed to the second house information. Finally, when the target stream data is formally output to the browser, the modified target stream data is output, so that the browser can read the second house information in picture form from it.
It should be understood that the second house information may be encoded in base64.
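The modification of the intercepted stream data can be sketched as below: the text field of a node is dropped and a base64-encoded picture is written in its place. The field names `text` and `image` and the placeholder image bytes are assumptions for this example; only the use of base64 comes from the source.

```python
import base64
import json


def replace_with_image(node, image_bytes, text_key="text", image_key="image"):
    """Return a copy of the node with its text replaced by a base64 picture."""
    converted = {key: value for key, value in node.items() if key != text_key}
    converted[image_key] = base64.b64encode(image_bytes).decode("ascii")
    return converted


# Hypothetical node and placeholder picture bytes (not a real image file).
node = {"id": 7, "text": "3 rooms, 3500 per month"}
fake_picture = b"\x89PNG placeholder bytes"
modified = replace_with_image(node, fake_picture)
output = json.dumps(modified)  # what would be written back to the client
```

On the browser side, a base64 string of this kind is commonly embedded in a `data:` URI so the picture can be shown without a second request.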
Further, after step S10, the anti-crawler page display method further comprises:
Step S101: intercept the target stream data.
It is understandable that, besides modifying the target stream data, the browser can also obtain the converted second house information through a session (Session) object, without the target stream data being modified directly.
After step S50, the anti-crawler page display method further comprises:
Step S501: save the second house information into a preset session object.
In a specific implementation, after the server side has determined the target stream data to be output, it first intercepts the target stream data rather than outputting it. Then, on the server side, the converted second house information can be saved into the Session object.
Step S60 comprises:
Step S601: output the intercepted target stream data to the client, so that the client, on receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and displays the second house information in an anti-crawler manner.
It is understandable that, finally, when the target stream data is formally output to the browser, the Session object can first be determined, the second house information is then retrieved from the Session object, and the second house information is displayed in the web interface of the browser.
Further, after saving the second house information into the preset session object, the anti-crawler page display method further comprises:
Creating a target node, and writing into the target node an object identifier corresponding to the preset session object;
Replacing the node to be displayed in the target stream data with the target node;
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and displays the second house information in an anti-crawler manner, comprises:
Outputting the intercepted target stream data to the client, so that the client, on receiving the target stream data, extracts the object identifier from the target node, generates a house information request according to the object identifier, and sends the house information request back to the server;
When the house information request is received, querying the corresponding preset session object according to the object identifier, and extracting the second house information from the preset session object;
Sending the second house information to the client, so that the client displays the second house information in an anti-crawler manner.
It is understandable that, if the second house information is obtained by introducing a Session object, then after the second house information is saved to the Session object, a target node different from the node to be displayed can first be created; this target node is referred to as node B. The object identifier of the Session object, i.e., the SessionId, is then written into node B so that the Session object can be found according to the object identifier. Moreover, both the object identifier in node B and the second house information saved in the Session object may be encoded in base64.
It should be understood that, after node B is created, the target stream data can be modified; however, instead of writing the converted second house information directly into it, the node to be displayed is replaced with node B. Finally, after the target stream data is output to the browser, the browser can read the object identifier directly from node B and write it into a house information request to request the second house information from the server. After the server side obtains the object identifier from the house information request, it can query the corresponding Session object according to the object identifier and extract the second house information from the Session object; the server side then feeds the second house information back to the browser for display. Because this approach does not write the second house information directly into the target stream data, it reduces the amount of operations on the target stream data and also improves display efficiency. Moreover, since the information in the Session object can be retrieved at any time, response efficiency is also improved.
This embodiment gives two specific implementations of how the browser obtains the converted house information: modifying the target stream data, or introducing a session (Session) object.
Referring to Fig. 4, Fig. 4 is a schematic flowchart of the third embodiment of the anti-crawler page display method of the present invention. Based on the first embodiment shown in Fig. 2 above, the third embodiment of the anti-crawler page display method of the present invention is proposed.
In the third embodiment, after step S40, the anti-crawler page display method further comprises:
Step S401: judge whether the page display instruction contains a preset to-be-converted template identifier.
It is understandable that a conversion template may additionally be loaded during the text-to-picture conversion operation, so that introducing different conversion templates makes the converted house information differ in appearance, catering to differentiated picture display needs.
It should be understood that a preset to-be-converted template identifier and a to-be-converted display template corresponding to it can be added. Specifically, a preset template mapping relationship can be set on the server side, containing the correspondence between preset to-be-converted template identifiers and to-be-converted display templates.
Step S402: when the page display instruction contains the preset to-be-converted template identifier, determine the corresponding to-be-converted display template according to the preset to-be-converted template identifier.
It is understandable that, after the preset to-be-converted template identifier is read, the corresponding to-be-converted display template can be determined from the preset template mapping relationship according to that identifier.
Step S50 includes:
Step S502: converting the first house information into second house information stored in picture form based on the to-be-converted display template.
In a specific implementation, when text information is converted into picture information, the to-be-converted display template can be loaded so that the conversion operation is carried out on the basis of that template. The to-be-converted display template may include sub-items such as font style, font size, font color and background style, which the conversion operation uses to carry out the conversion.
Further, after judging whether the page display instruction includes the preset to-be-converted template identifier, the anti-crawler page display method further includes:
When the page display instruction does not include the preset to-be-converted template identifier, converting the first house information into second house information stored in picture form based on a preset initial display template, and executing the step of sending the second house information to the client so that the client performs anti-crawler display of the second house information.
It can be understood that, if no preset to-be-converted template identifier has been selected, a default display mode, namely the preset initial display template, can also be set. The picture conversion of the text information is then based on the sub-items of the preset initial display template, such as the default font style, default font size, default font color and default transparent background style.
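Steps S401 and S402, together with the fallback to the preset initial display template, can be sketched as a simple lookup. The field name `template_id`, the identifiers `tpl-song` and `tpl-kai`, and the concrete template values are assumptions made for illustration; falling back to the default for an unknown identifier is also an assumption, since the text only covers the case where no identifier is present.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DisplayTemplate:
    """Sub-items named in the text: font style, size, color and background style."""
    font_style: str
    font_size: str
    font_color: str
    background_style: str


# Preset initial display template (default values are placeholders).
DEFAULT_TEMPLATE = DisplayTemplate("default", "default", "default", "transparent")

# Preset template mapping relationship: identifier -> display template (hypothetical entries).
TEMPLATE_MAP: Dict[str, DisplayTemplate] = {
    "tpl-song": DisplayTemplate("Song typeface", "No. 4", "black", "transparent"),
    "tpl-kai": DisplayTemplate("regular script", "Small No. 4", "black", "green"),
}


def resolve_template(instruction: dict) -> DisplayTemplate:
    """S401/S402: pick the mapped template, or the initial template if none is named."""
    template_id = instruction.get("template_id")
    if template_id is None:
        return DEFAULT_TEMPLATE
    # Assumption: an unknown identifier also falls back to the initial template.
    return TEMPLATE_MAP.get(template_id, DEFAULT_TEMPLATE)
```

For example, `resolve_template({"page": "/listing/42"})` returns the initial template, while an instruction carrying `"template_id": "tpl-kai"` selects the green-background template.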
Further, converting the first house information into second house information stored in picture form based on the to-be-converted display template includes:
Dividing the page area in which the first house information is located into regions to obtain page subregions;
Querying, in the to-be-converted display template, the to-be-converted display item corresponding to each page subregion, and reading from that display item the to-be-converted font style, to-be-converted font size, to-be-converted font color and to-be-converted background style;
Converting the first house information in the page subregion based on the to-be-converted font style, the to-be-converted font size, the to-be-converted font color and the to-be-converted background style, to obtain second house sub-information stored in picture form;
Splicing the second house sub-information to obtain the spliced second house information stored in picture form.
It can be understood that the first house information may contain text information with different typesetting in different regions. Therefore, the text information of a given region can first be located and then converted, so that text information in different regions of the same page is displayed as pictures in different forms.
In a specific implementation, suppose the web page in which the first house information is located has three page subregions, referred to as subregion A, subregion B and subregion C, and the to-be-converted display template sets a different display item for each of them. For example, for subregion A the font style is Song typeface, the font size is No. 4, the font color is black and the background style is a transparent background; for subregion B the font style is regular script, the font size is Small No. 4, the font color is black and the background style is a green background; for subregion C the font style is imitation Song typeface, the font size is No. 4, the font color is gray and the background style is a transparent background.
It can be understood that, since different subregions correspond to different display items, the text-to-picture conversion can be carried out separately for each subregion. Finally, the converted picture information is spliced together according to the page position information of the page subregions to obtain the complete second house information.
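The per-subregion conversion and splicing can be sketched as below, using the display items from the three-subregion example. The `SubRegion` and `Tile` types, the `top` coordinate and the `render_stub` function are hypothetical; in particular, `render_stub` returns a labelled row of text rather than real pixels, standing in for whatever rasterizer an implementation would use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubRegion:
    name: str
    top: int   # vertical page position, used as the splicing order
    text: str  # first house information located in this subregion


@dataclass
class Tile:
    top: int
    rows: List[str]  # stand-in for the pixel rows of a converted picture


# Display items per subregion, taken from the example in the text.
DISPLAY_ITEMS: Dict[str, Dict[str, str]] = {
    "A": {"font": "Song typeface", "size": "No. 4", "color": "black", "background": "transparent"},
    "B": {"font": "regular script", "size": "Small No. 4", "color": "black", "background": "green"},
    "C": {"font": "imitation Song typeface", "size": "No. 4", "color": "gray", "background": "transparent"},
}


def render_stub(text: str, item: Dict[str, str]) -> List[str]:
    """Placeholder for a real rasterizer: one labelled row instead of pixels."""
    label = "/".join(item[key] for key in ("font", "size", "color", "background"))
    return [f"[{label}] {text}"]


def convert_regions(regions: List[SubRegion]) -> List[Tile]:
    """Convert each subregion with its own display item."""
    return [Tile(r.top, render_stub(r.text, DISPLAY_ITEMS[r.name])) for r in regions]


def stitch(tiles: List[Tile]) -> List[str]:
    """Splice the converted tiles top to bottom according to page position."""
    rows: List[str] = []
    for tile in sorted(tiles, key=lambda t: t.top):
        rows.extend(tile.rows)
    return rows
```

Sorting by `top` before splicing means the subregions can be converted in any order, or in parallel, without affecting the final layout.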
By introducing display templates in this embodiment, the text-to-picture operation is completed on the basis of a display template, which enables differentiated picture display.
In addition, an embodiment of the present invention further provides a storage medium storing an anti-crawler page display program; when executed by a processor, the anti-crawler page display program implements the steps of the anti-crawler page display method described above.
In addition, referring to Fig. 5, an embodiment of the present invention further provides an anti-crawler page display device, which includes:
Instruction receiving module 10, configured to, upon receiving a page display instruction sent by a client, determine the target stream data to be output according to the page display instruction.
It can be understood that text information is highly convenient for a web crawler to crawl. To reduce the probability that information is crawled by a web crawler, the text information to be displayed can be converted into picture information when it is shown. Because picture information is harder to read directly than text information, the security of the information improves.
It should be understood that the client is operated by a user; the client may be a PC or a browser running on a PC. For example, when the user operates the browser to open a page, the browser exchanges information with the anti-crawler page display device so that the page is successfully displayed on the browser side.
In a specific implementation, for example, when the anti-crawler page display device receives a page display instruction sent by the user on the browser side, it first determines the page data, to be shown in the browser, that the instruction points to, namely the target stream data.
Stream data parsing module 20, configured to parse the target stream data to obtain the nodes to be displayed that it contains.
It can be understood that the format of the target stream data may be JavaScript Object Notation (JSON), a lightweight data interchange format. Like Extensible Markup Language (XML), JSON organizes data as nested nodes, so the target stream data will contain multiple nodes.
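As a minimal sketch of parsing JSON stream data into nodes, the following walks a parsed document and yields every nested JSON object together with its path. Treating each JSON object as one node, and the sample payload itself, are assumptions for illustration; the text does not define the node layout.

```python
import json
from typing import Any, Iterator, Tuple


def iter_nodes(value: Any, path: str = "$") -> Iterator[Tuple[str, dict]]:
    """Yield (path, node) for every JSON object nested inside value."""
    if isinstance(value, dict):
        yield path, value
        for key, child in value.items():
            yield from iter_nodes(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from iter_nodes(child, f"{path}[{index}]")


# Hypothetical target stream data in JSON format.
raw = '{"page": {"title": {"text": "Listing 42"}, "items": [{"text": "3 rooms"}, {"text": "89 m2"}]}}'
nodes = list(iter_nodes(json.loads(raw)))
paths = [path for path, _ in nodes]
```

Each yielded node can then be handed to the node judgment step, with the path serving as a stable way to locate the node again when it has to be replaced.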
Node judgment module 30, configured to judge whether the node to be displayed includes a preset conversion identifier.
It should be understood that, in order to define clearly and find quickly the information in the target stream data on which the text-to-picture operation can be performed, a special node can be added to identify the information whose display form is to be converted. An identifier can be preset in this special node to distinguish it from the other nodes in the target stream data.
In a specific implementation, the preset conversion identifier may exist in the form of a web page tag.
House information query module 40, configured to, when the node to be displayed includes the preset conversion identifier, query the corresponding first house information stored in text form based on the node to be displayed.
In a specific implementation, if the node currently read from the target stream data is node A, it is judged whether the predefined preset conversion identifier exists in node A. If it does, the text-to-picture operation is performed on the corresponding information; if it does not, no text-to-picture operation is needed.
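The judgment on node A and the subsequent query can be sketched as follows. The key name `data-convert`, the node `id` field and the `FIRST_INFO_STORE` dictionary are hypothetical stand-ins for the preset conversion identifier and for wherever the text-form house information is actually stored.

```python
from typing import Dict, Optional

# Hypothetical name of the preset conversion identifier carried by a node.
CONVERT_FLAG = "data-convert"

# Hypothetical store of first house information (text form), keyed by node id.
FIRST_INFO_STORE: Dict[str, str] = {
    "node-A": "3 bedrooms, 89 square metres",
}


def has_conversion_flag(node: dict) -> bool:
    """Judge whether the node carries the preset conversion identifier."""
    return bool(node.get(CONVERT_FLAG))


def first_info_for(node: dict) -> Optional[str]:
    """Query the text-form information only for nodes that are marked for conversion."""
    if not has_conversion_flag(node):
        return None  # unmarked nodes are passed through without conversion
    return FIRST_INFO_STORE.get(node["id"])
```

A node without the identifier returns `None` here, which corresponds to the branch in which no text-to-picture operation is performed.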
Information conversion module 50, configured to convert the first house information into second house information stored in picture form.
It can be understood that, when node A is determined to include the preset conversion identifier, the first house information associated with node A can be taken as the information to be converted, and the conversion operation is performed on this first house information stored in text form to produce second house information stored in picture form.
It should be understood that the first house information and the second house information record the same content and differ only in display form: the former is in text form and the latter in picture form. In addition, the content of the first house information may include different types of house information, such as real estate information, building details, building pictures and pictures of the building's surroundings.
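To make the idea of "same content, different form" concrete, the sketch below rasterizes a digit string (for example a floor area) into a plain PBM (P1) bitmap using only the standard library. It is a deliberately tiny stand-in: the 3x5 glyph table, the function name and the choice of PBM are all assumptions, only digits are supported, and a real implementation would more likely use an imaging library with proper fonts.

```python
from typing import Dict, List

# 3x5 glyphs for the digits; '#' is an inked pixel, '.' is background.
GLYPHS: Dict[str, List[str]] = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "2": ["###", "..#", "###", "#..", "###"],
    "3": ["###", "..#", "###", "..#", "###"],
    "4": ["#.#", "#.#", "###", "..#", "..#"],
    "5": ["###", "#..", "###", "..#", "###"],
    "6": ["###", "#..", "###", "#.#", "###"],
    "7": ["###", "..#", "..#", "..#", "..#"],
    "8": ["###", "#.#", "###", "#.#", "###"],
    "9": ["###", "#.#", "###", "..#", "###"],
}
GLYPH_HEIGHT = 5


def text_to_pbm(text: str) -> str:
    """Rasterize a non-empty digit string into a plain PBM (P1) image."""
    if not text:
        raise ValueError("text must not be empty")
    rows = [""] * GLYPH_HEIGHT
    for ch in text:
        glyph = GLYPHS[ch]  # raises KeyError for unsupported characters
        for y in range(GLYPH_HEIGHT):
            rows[y] += glyph[y] + "."  # one blank column after each glyph
    width = len(rows[0])
    body = "\n".join(
        " ".join("1" if px == "#" else "0" for px in row) for row in rows
    )
    return f"P1\n{width} {GLYPH_HEIGHT}\n{body}\n"


pbm = text_to_pbm("89")
```

The resulting image data encodes the shapes of the characters as pixels, so the original string no longer appears in what is sent to the client.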
House information display module 60, configured to send the second house information to the client so that the client performs anti-crawler display of the second house information.
It can be understood that, once the conversion operation is complete, the converted picture can be sent to the browser so that the browser displays the house information in picture form rather than in text form. Precisely because the displayed house information is in picture form rather than text form, it is harder for a web crawler to crawl, which improves information security.
In this embodiment, when a page is displayed, it is first judged whether the node to be displayed in the target stream data to be output includes the preset conversion identifier. When it does, the corresponding first house information stored in text form is converted to obtain second house information stored in picture form, and the second house information is displayed on the client. Because the house information to be displayed is converted from text form into picture form, and picture information is harder for a web crawler to crawl directly than text information, the anti-crawler display effect is achieved, which solves the technical problem that displayed information is easily crawled by web crawlers.
In one embodiment, the anti-crawler page display device further includes:
First stream data interception module, configured to intercept the target stream data;
Information replacement module, configured to replace the first house information with the second house information;
The house information display module 60 is further configured to output the intercepted target stream data to the client, so that the client, upon receiving the target stream data, extracts the second house information from the target stream data and performs anti-crawler display of the second house information.
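The replacement variant, in which the picture travels inside the stream data itself, can be sketched as follows. The `text` and `image_base64` field names and the Base64 encoding are assumptions chosen so that binary picture data fits into JSON-style stream data; the embodiment does not prescribe an encoding.

```python
import base64
import copy


def replace_first_with_second(stream: dict, node_key: str, picture: bytes) -> dict:
    """Return intercepted stream data with the node's text replaced by the picture."""
    patched = copy.deepcopy(stream)  # leave the original stream data untouched
    node = patched[node_key]
    node.pop("text", None)  # remove the first house information (text form)
    node["image_base64"] = base64.b64encode(picture).decode("ascii")
    return patched


def client_extract(stream: dict, node_key: str) -> bytes:
    """Client side: extract the second house information from the stream data."""
    return base64.b64decode(stream[node_key]["image_base64"])


original = {"price": {"text": "3200000"}}
picture = b"picture-bytes-placeholder"  # stands in for the converted picture
outgoing = replace_first_with_second(original, "price", picture)
shown = client_extract(outgoing, "price")
```

Compared with the Session-based variant, this needs no second request from the client, at the cost of a larger stream data payload.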
In one embodiment, the anti-crawler page display device further includes:
Second stream data interception module, configured to intercept the target stream data;
Information saving module, configured to save the second house information into a preset session object;
The house information display module 60 is further configured to output the intercepted target stream data to the client, so that the client, upon receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and performs anti-crawler display of the second house information.
In one embodiment, the anti-crawler page display device further includes:
Node replacement module, configured to create a target node and write into the target node the object identifier corresponding to the preset session object, and to replace the node to be displayed in the target stream data with the target node;
The house information display module 60 is further configured to output the intercepted target stream data to the client, so that the client, upon receiving the target stream data, extracts the object identifier from the target node, generates a house information request according to the object identifier, and sends the house information request back to the server; upon receiving the house information request, to look up the corresponding preset session object according to the object identifier and extract the second house information from the preset session object; and to send the second house information to the client so that the client performs anti-crawler display of the second house information.
In one embodiment, the anti-crawler page display device further includes:
Template query module, configured to judge whether the page display instruction includes a preset to-be-converted template identifier and, when the page display instruction includes the preset to-be-converted template identifier, to determine the corresponding to-be-converted display template according to the preset to-be-converted template identifier;
The information conversion module 50 is further configured to convert the first house information into second house information stored in picture form based on the to-be-converted display template.
In one embodiment, the anti-crawler page display device further includes:
Template conversion module, configured to, when the page display instruction does not include the preset to-be-converted template identifier, convert the first house information into second house information stored in picture form based on a preset initial display template, and to execute the step of sending the second house information to the client so that the client performs anti-crawler display of the second house information.
In one embodiment, the information conversion module 50 is further configured to divide the page area in which the first house information is located into regions to obtain page subregions; to query, in the to-be-converted display template, the to-be-converted display item corresponding to each page subregion, and read from that display item the to-be-converted font style, to-be-converted font size, to-be-converted font color and to-be-converted background style; to convert the first house information in the page subregion based on the to-be-converted font style, the to-be-converted font size, the to-be-converted font color and the to-be-converted background style, to obtain second house sub-information stored in picture form; and to splice the second house sub-information to obtain the spliced second house information stored in picture form.
For other embodiments or specific implementations of the anti-crawler page display device of the present invention, reference may be made to the method embodiments above; details are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or system that includes it.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, though in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. An anti-crawler page display method, characterized in that the anti-crawler page display method comprises the following steps:
upon receiving a page display instruction sent by a client, determining target stream data to be output according to the page display instruction;
parsing the target stream data to obtain a node to be displayed contained in the target stream data;
judging whether the node to be displayed includes a preset conversion identifier;
when the node to be displayed includes the preset conversion identifier, querying, based on the node to be displayed, corresponding first house information stored in text form;
converting the first house information into second house information stored in picture form;
sending the second house information to the client so that the client performs anti-crawler display of the second house information.
2. The anti-crawler page display method according to claim 1, characterized in that, after determining the target stream data to be output according to the page display instruction upon receiving the page display instruction sent by the client, the anti-crawler page display method further comprises:
intercepting the target stream data;
after converting the first house information into the second house information stored in picture form, the anti-crawler page display method further comprises:
replacing the first house information with the second house information;
sending the second house information to the client so that the client performs anti-crawler display of the second house information comprises:
outputting the intercepted target stream data to the client, so that the client, upon receiving the target stream data, extracts the second house information from the target stream data and performs anti-crawler display of the second house information.
3. The anti-crawler page display method according to claim 1, characterized in that, after determining the target stream data to be output according to the page display instruction upon receiving the page display instruction sent by the client, the anti-crawler page display method further comprises:
intercepting the target stream data;
after converting the first house information into the second house information stored in picture form, the anti-crawler page display method further comprises:
saving the second house information into a preset session object;
sending the second house information to the client so that the client performs anti-crawler display of the second house information comprises:
outputting the intercepted target stream data to the client, so that the client, upon receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and performs anti-crawler display of the second house information.
4. The anti-crawler page display method according to claim 3, characterized in that, after saving the second house information into the preset session object, the anti-crawler page display method further comprises:
creating a target node, and writing into the target node an object identifier corresponding to the preset session object;
replacing the node to be displayed in the target stream data with the target node;
outputting the intercepted target stream data to the client, so that the client, upon receiving the target stream data, determines the preset session object according to the target stream data, obtains the second house information saved in the preset session object, and performs anti-crawler display of the second house information comprises:
outputting the intercepted target stream data to the client, so that the client, upon receiving the target stream data, extracts the object identifier from the target node, generates a house information request according to the object identifier, and sends the house information request back to the server;
upon receiving the house information request, looking up the corresponding preset session object according to the object identifier, and extracting the second house information from the preset session object;
sending the second house information to the client so that the client performs anti-crawler display of the second house information.
5. The anti-crawler page display method according to claim 1, characterized in that, after querying, based on the node to be displayed, the corresponding first house information stored in text form when the node to be displayed includes the preset conversion identifier, the anti-crawler page display method further comprises:
judging whether the page display instruction includes a preset to-be-converted template identifier;
when the page display instruction includes the preset to-be-converted template identifier, determining the corresponding to-be-converted display template according to the preset to-be-converted template identifier;
converting the first house information into the second house information stored in picture form comprises:
converting the first house information into second house information stored in picture form based on the to-be-converted display template.
6. The anti-crawler page display method according to claim 5, characterized in that, after judging whether the page display instruction includes the preset to-be-converted template identifier, the anti-crawler page display method further comprises:
when the page display instruction does not include the preset to-be-converted template identifier, converting the first house information into second house information stored in picture form based on a preset initial display template, and executing the step of sending the second house information to the client so that the client performs anti-crawler display of the second house information.
7. The anti-crawler page display method according to claim 5, characterized in that converting the first house information into second house information stored in picture form based on the to-be-converted display template comprises:
dividing the page area in which the first house information is located into regions to obtain page subregions;
querying, in the to-be-converted display template, the to-be-converted display item corresponding to each page subregion, and reading from that display item the to-be-converted font style, to-be-converted font size, to-be-converted font color and to-be-converted background style;
converting the first house information in the page subregion based on the to-be-converted font style, the to-be-converted font size, the to-be-converted font color and the to-be-converted background style, to obtain second house sub-information stored in picture form;
splicing the second house sub-information to obtain the spliced second house information stored in picture form.
8. A server, characterized in that the server comprises: a memory, a processor, and an anti-crawler page display program stored on the memory and executable on the processor, wherein the anti-crawler page display program, when executed by the processor, implements the steps of the anti-crawler page display method according to any one of claims 1 to 7.
9. A storage medium, characterized in that an anti-crawler page display program is stored on the storage medium, and the anti-crawler page display program, when executed by a processor, implements the steps of the anti-crawler page display method according to any one of claims 1 to 7.
10. An anti-crawler page display device, characterized in that the anti-crawler page display device comprises:
an instruction receiving module, configured to, upon receiving a page display instruction sent by a client, determine target stream data to be output according to the page display instruction;
a stream data parsing module, configured to parse the target stream data to obtain a node to be displayed contained in the target stream data;
a node judgment module, configured to judge whether the node to be displayed includes a preset conversion identifier;
a house information query module, configured to, when the node to be displayed includes the preset conversion identifier, query, based on the node to be displayed, corresponding first house information stored in text form;
an information conversion module, configured to convert the first house information into second house information stored in picture form;
a house information display module, configured to send the second house information to the client so that the client performs anti-crawler display of the second house information.
CN110069688A (en): Page display method, server, storage medium and the device of anti-crawler
Application number: CN201910205704.5A
Priority date: 2019-03-16
Filing date: 2019-03-16
Publication date: 2019-07-30
Legal status: Pending
Family ID: 67365313
Country: CN

Similar Documents

Publication Publication Date Title
CN110069688A (en) Page display method, server, storage medium and the device of anti-crawler
Billi et al. A unified methodology for the evaluation of accessibility and usability of mobile applications
US8997081B1 (en) Analytics for mobile applications
JP4913777B2 (en) Web page distribution system
CN105843646B (en) Method and apparatus for starting an application
US20030224339A1 (en) Method and system for presenting online courses
CN103631865B (en) Webpage generating method and equipment
KR102187053B1 (en) Server and method for providing book information
CN110297980A (en) Material display method, device and server
CN107315682A (en) Method, device, storage medium and electronic equipment for testing browser security
CN111596852B (en) Content editing method, system, computer readable storage medium and terminal device
CN109246138A (en) Resource access method and device, VPN terminal and medium based on Virtual Private Network
CN103678343A (en) Method and device for prompting webpage loading progress
CN109271160A (en) Active rule construction method, device and computer system, medium
CN112287255A (en) Page construction method and device, computing equipment and computer readable storage medium
CN107330087A (en) Pagefile generation method and device
CN110889069A (en) Resource access platform based on web online learning
CN111241435B (en) Method and device for loading picture elements
Taraldsvik Exploring the Future: is HTML5 the solution for GIS Applications on the World Wide Web
CN110784543B (en) Application widget module and access and push method thereof
JP2005157880A (en) Information processor, information processing method, its recording medium and its program
CN107027056A (en) Desktop configuration method, server and client
CN112328940A (en) Method and device for embedding transition page into webpage, computer equipment and storage medium
Mendez The Missing Link
KR20100010533A (en) Method and system for providing design of signboard

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190730