CN110348182A - Method and apparatus for web document watermark embedding - Google Patents
Method and apparatus for web document watermark embedding
- Publication number
- CN110348182A (application number CN201910434936.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- content
- page
- watermark
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/16—Program or content traceability, e.g. by watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/0021—Image watermarking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0062—Embedding of the watermark in text images, e.g. watermarking text documents using letter skew, letter distance or row distance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2201/00—General purpose image data processing
- G06T2201/005—Image watermarking
- G06T2201/0065—Extraction of an embedded watermark; Reliable detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Technology Law (AREA)
- Computer Security & Cryptography (AREA)
- Multimedia (AREA)
- General Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Processing (AREA)
Abstract
The present invention relates to a method and apparatus for embedding watermarks in web documents. The method comprises: step 1, when a user accesses a Web application system page through a web browser, intercepting the HTTP data packets of the Web application; step 2, parsing the HTTP response header in the intercepted HTTP data packets to obtain the HTTP response content data; step 3, parsing the HTTP response content data and embedding watermark information by modifying the page content data; step 4, returning the watermarked page file to the browser for normal page content parsing and page content rendering. The text tracing information embedded by the invention is visually imperceptible, which avoids the visual interference caused by existing visible on-screen watermarking techniques. In addition, because the watermark information is embedded using the text content of the web document as its carrier, it is hard to erase by hand and can withstand screen photography and screenshots, so security is higher.
Description
Technical field
The invention belongs to the field of digital watermarking and information hiding technology, and relates to a method and apparatus for embedding watermarks in web documents.
Background art
With the continuous development and rapid adoption of e-government, e-commerce and paperless office technology, network application systems of all kinds have come into widespread use. In mobile office work, for example, smart terminals now integrate staff seamlessly with an enterprise's existing applications such as the OA office system, asset system, production management system and marketing management system. Enterprise work can be carried out anytime and anywhere from a mobile terminal device, for example browsing notices, news, materials and files, sending and receiving short messages, and making mobile queries. Network application systems greatly improve the efficiency with which information circulates and is processed, but they also carry many risks of information leakage, such as printing, copying, screenshotting or photographing sensitive information in electronic documents and application system pages.
At present, research at home and abroad on protecting against data leakage through screenshots, screen photography and similar means mainly covers the following methods:
1. Visible screen watermark. Visible watermark information is displayed on the screen to warn and deter. A visible watermark presents the watermark as text, for example by explicitly displaying "No screen photography" in the web document or by displaying user information in plain text on the computer screen. This is currently the most widely used method of preventing screen information leakage.
2. Screen QR code. The watermark information is presented as a QR code. By default the watermark is hidden in a QR code image block in the lower right corner of the screen. The block can be scanned with WeChat or another QR code scanning tool, which then displays the configured digital watermark content.
3. Screen vector dot-matrix watermark. A faint dot matrix is displayed on the screen, and the arrangement of the dots represents the watermark information. In effect the watermark is shown as a "lightweight" mark that is almost equivalent to an "invisible watermark". If the screen is photographed or captured and a leak occurs, the leaker can be identified quickly from the vector watermark information on the leaked photo. Watermark audit information is also added to printouts, so the leaker can be identified afterwards from the irregular watermark on the printed paper, and "non-compliant" print audit information can be discovered beforehand so that the leak is prevented.
4. Superimposed image watermark. An additional visible image file is overlaid on the web document image shown on the screen, and all watermark information is embedded in that image file. The visual impact can be minimized by adjusting the size, color and position of the image. The transparency of the image, the stretch mode of the image watermark and the position of the image watermark can all be adjusted freely.
In conclusion some research institutions and product manufacturer both domestic and external are mainly based upon the bright watermark of screen and vector at present
The mode of dot matrix realizes the insertion and extraction of web document screen watermark information, but all there is following technologies for all technologies
Problem:
1. watermark is visible, visual effect is poor.Whether the mode of the bright watermark of screen or vector dot matrix is all view
Feel visible, one layer of additional " light " shading has been covered on similar computer screen.If it is desired to stronger attack operation is resisted, document view
Feel that the modification degree of effect can be bigger.
2. safety is lower, easily wiped by hand.Since watermark information is visible, can be easy to using PS tool
It gets rid of, therefore the safety of watermark information is relatively low.
3. secondary or multiple reproduction cannot be resisted.Be not highly resistant to any angle, exposure it is uneven, it is remote and
Attack Digital Watermarking mode under the shooting conditions such as moire fringes interference.
For carry a large amount of sensitive personal informations, trade secret information, being related to the application system of state secret information, urgently
More advanced technological means is needed to be protected, in the case where not influencing existing user experience and application system is handled, effectively
The behavior of ground deterrence and tracing information leakage.
Summary of the invention
The present invention provides a method and apparatus for embedding watermarks in web documents. While remaining visually imperceptible, it obtains the necessary page text content by parsing the content of the web document and modifies that text content with a digital watermarking algorithm to embed watermark information, thereby addressing the poor visual effect, low security and weak resistance to recapture of page watermarks in the prior art.
The concept of the invention is as follows. First, when the user enters a URL in the browser to access a network application system page, the HTTP data of the specified Web application is intercepted according to the policy issued by the server side. The intercepted HTTP response header is parsed in order to filter out the data files in HTML and CSS (Cascading Style Sheets) format. Parsing the HTML and CSS page data yields the text attribute information used to display the page content, such as font type and font size. The HTML page file data is then modified by a text watermarking algorithm based on vector font library replacement to embed the watermark information. Finally, the modified HTML page file is returned to the browser for normal content parsing and page rendering. A method and apparatus for web document watermark embedding are thus obtained.
The technical solution of the invention is a method for embedding a watermark in a web document page, comprising the following steps:
Step 1: when the user accesses a Web application system page through a web browser, intercept the HTTP data packets of the Web application by means of a data interception mode;
Step 2: parse the HTTP response header in the intercepted HTTP data packets to obtain the HTTP response content data;
Step 3: parse the HTTP response content data and embed watermark information by modifying the page content data;
Step 4: return the watermarked page file to the web browser for normal page content parsing and page content rendering.
Preferably, the HTTP data packet comprises a response line, response headers and a response body.
Preferably, the data interception mode includes a client-side HOOK mode and a server-side proxy interception mode.
Preferably, the timing of data interception is divided, according to the page loading process, into before page load, during page load and after page load.
Preferably, the HTTP response content data is obtained by reading the Content-Type field of the HTTP response headers (Response Headers) to determine the MIME (Multipurpose Internet Mail Extensions) type of the current page content, and filtering out the data in HTML and CSS format.
Preferably, parsing the HTTP response content data comprises parsing the CSS page data to obtain the text attribute information used to display the page content, and parsing the HTML page data to obtain the text content of the page document.
Preferably, the HTML page data is parsed by unstructured content parsing, DOM (Document Object Model) node-based parsing or block-based content parsing.
Preferably, obtaining the text attribute information used to display the page content specifically comprises:
traversing the HTML nodes and determining from the content of each HTML tag whether it contains text information;
if a tag contains text information, checking its CSS style information;
checking whether the CSS style contains a font-family attribute, and reading the font type and font size attributes used to display the page.
Preferably, embedding watermark information by modifying the page content data specifically comprises:
Step1: generate the watermark information bit string;
Step2: parse the session data packets of the page request to obtain attribute information, and determine from the attributes which HTML and CSS files need to be modified;
Step3: traverse the content of the HTML and CSS documents, look up the font-family and font attributes, replace the font with the watermark font, and then save the result over the intercepted session data packet;
Step4: traverse the tag objects of the page HTML to obtain the text displayed on the page, and embed the watermark information by watermark character replacement;
Step5: replace the original document with the watermarked HTML page file, and finally update the data packet.
Based on the same inventive concept, the invention also provides a web document watermark embedding apparatus, comprising:
Data packet interception module: responsible for intercepting the HTTP data packets of the Web application by means of a data interception mode when the user accesses a Web application system page through a web browser;
Data packet parsing module: responsible for parsing the HTTP response header in the HTTP data packets obtained by the data packet interception module to obtain the HTTP response content data;
Watermark information embedding module: responsible for parsing the HTTP response content data obtained by the data packet parsing module and embedding watermark information by modifying the page content data;
Browser processing module: responsible for normal page content parsing and page content rendering of the page file processed by the watermark information embedding module.
The beneficial effects of the invention are as follows:
The invention embeds text tracing information, such as identifying (fingerprint) information about the reader and time information, in the page documents displayed by the Web application system. If a reader leaks sensitive web document information shown on the terminal screen through a screen output route such as screen copying, screen recording, screenshots, or photographing or filming the screen with an external mobile phone or camera, the watermark information can be extracted from the captured image file of the screen output. This identifies the source of the leaked screen image and ultimately provides security control over screen capture behavior.
Because the embedded text tracing information is visually imperceptible, the invention avoids the visual interference caused by existing visible on-screen watermarking techniques. In addition, because the watermark information is embedded using the text content of the web document as its carrier, it is hard to erase by hand and can withstand screen photography and screenshots, so security is higher.
Brief description of the drawings
Fig. 1 is a flow diagram of the web document watermark embedding method described in the embodiment;
Fig. 2 is a schematic diagram of the HTTP request and response process under normal circumstances;
Fig. 3 is a schematic diagram of the request and response process when a proxy server is added;
Fig. 4 is a schematic diagram of a web document as normally displayed;
Fig. 5 is a schematic diagram of the web document of Fig. 4 with the watermark embedded;
Fig. 6 is a structural diagram of the web document watermark embedding apparatus described in the embodiment.
Detailed description of the embodiments
Fig. 1 is a flow diagram of the web document page watermark embedding method described in the embodiment.
S101: when the user accesses a Web application system page through a web browser, intercept the HTTP data packets of the Web application by means of a data interception mode.
To embed document tracing information in real time as the web browser opens a Web application system page, the corresponding content data must be intercepted before the browser displays the page, and then modified to embed the watermark information. Because most internal office Web application systems currently use the HTTP protocol, the HTTP data of the Web application must be intercepted first. This data comprises a response line, response headers and a response body.
The response line generally consists of the protocol version, the status code and its description. For example, in "HTTP/1.1 200 OK" the protocol version is HTTP/1.1 (it could also be HTTP/1.0), 200 is the status code and OK is its description.
HTTP-Version: the version of the HTTP protocol used by the server;
Status-Code: the response status code returned by the server;
Reason-Phrase CRLF: the textual description of the status code.
The response headers describe basic information about the server and about the data. Through this descriptive information the server can tell the client how to handle the data it is about to return. Common response header fields and their meanings:
Content-Encoding: the encoding applied to the response resource;
Content-Type: the MIME type of the current content;
Date: the date of the response;
Server: the Web server in use;
Transfer-Encoding: the transfer encoding, such as chunked transfer encoding, one of the HTTP data transfer mechanisms.
The response body is the message body of the response. If the response is plain data, plain data is returned; if an HTML page was requested, HTML code is returned; if JS was requested, JS code is returned; and so on.
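The three-part structure described above can be illustrated with a minimal sketch. It assumes the intercepted response is available as raw bytes and is not chunked or compressed; the function name parse_http_response and the sample response are hypothetical and do not come from the patent.

```python
def parse_http_response(raw: bytes) -> dict:
    """Split a raw HTTP/1.x response into response line, headers and body."""
    # The header section and the body are separated by an empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Response line: HTTP-Version SP Status-Code SP Reason-Phrase
    parts = lines[0].split(" ", 2)
    version, status_code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    # Response headers: "Name: value" pairs; names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return {"version": version, "status": status_code,
            "reason": reason, "headers": headers, "body": body}


# Hypothetical sample response used only for illustration.
raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=utf-8\r\n"
       b"Server: ExampleServer\r\n"
       b"\r\n"
       b"<html><body>hello</body></html>")
resp = parse_http_response(raw)
```

A real interception module would also need to handle Transfer-Encoding and Content-Encoding before the body can be modified; the sketch leaves that out.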
The HTTP data packets of the Web application can be intercepted in client-side HOOK mode or in server-side proxy interception mode.
Client-side HOOK mode hooks the network access API of the web browser process and intercepts the HTTP data of the specified Web application according to the policy issued by the server side. Its advantage is that the interception work is performed on each client, which reduces server overhead. It also has drawbacks: 1) many clients must be deployed, which is a heavy workload; 2) the interception process consumes processing time and reduces client system performance, which places certain requirements on the client machine's configuration; 3) compatibility requirements are high: the interception module must remain compatible with common antivirus and security protection software, otherwise it may be disabled by them.
Server-side proxy interception mode is based on proxy server technology and places a checkpoint between client and server. After the client sends request data, the proxy server intercepts the packet and then sends the data to the server on the client's behalf. Likewise, when the server returns response data, the proxy server intercepts it before returning it to the client. Once a proxy server is in use, every HTTP request is first sent to the proxy server, repackaged there and then forwarded to the target (real) server. Responses are handled the same way: the proxy server first receives the response from the target (real) server, repackages it and sends it to the client. Fig. 2 is a schematic diagram of the HTTP request and response process under normal circumstances, and Fig. 3 is a schematic diagram of the request and response process when a proxy server is added.
The advantages of server-side proxy interception are: 1) no hook program needs to be installed on each client, only the proxy server address needs to be configured on the client; 2) processing is centralized on the proxy server, so faults are easy to troubleshoot. It also has drawbacks: 1) all client requests pass through the proxy server, so concurrency is high and the server needs a high-end configuration to maintain user response speed; 2) if the proxy server fails or goes down, the web browser cannot load and display page data normally.
In this scheme, client-side HOOK mode is used to intercept the HTTP data packets of the Web application.
In addition, the timing of data interception is divided, according to the page loading process, into before page load, during page load and after page load.
Before page load means that interception takes place before the page content is interpreted by the web browser. Interception and modification are completed before the user sees the rendered result. This approach is more complex: the intercepted data packet must be parsed independently, then modified, and the modified packet returned to the web browser for loading.
During page load usually means placing code at a suitable position in the page; this approach depends on the insertion point of the code.
After page load is usually simpler to handle: the web browser has finished parsing and loading the required DOM tree, so DOM operations can be injected to change the page.
Because the invention aims to provide a general solution that is independent of the web browser, data interception takes place before page load.
S102: parse the intercepted HTTP response header to obtain the page content data.
The page content data is obtained by reading the Content-Type field of the HTTP response headers (Response Headers) to determine the MIME type of the current page content, and filtering out the data in HTML and CSS format.
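The Content-Type filtering in S102 might look like the following minimal sketch. It assumes the response headers are already available as a dictionary; the names WATERMARK_TARGETS, mime_type and should_process are hypothetical.

```python
# MIME types the scheme is interested in: HTML pages and CSS style sheets.
WATERMARK_TARGETS = {"text/html", "text/css"}


def mime_type(content_type: str) -> str:
    """Return the bare MIME type, dropping parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def should_process(headers: dict) -> bool:
    """True if the response carries HTML or CSS and is a watermark candidate."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "content-type":
            return mime_type(value) in WATERMARK_TARGETS
    return False
```

Responses of any other type (images, scripts, downloads) would simply be passed through to the browser unchanged.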
S103: parse the HTTP response content data and embed watermark information by modifying the page content data.
When the HTTP response content data is parsed, the CSS page data is parsed to obtain the text attribute information used to display the page content, and the HTML page data is parsed to obtain the text content of the page document.
The HTML page data can be parsed by unstructured content parsing, DOM node-based parsing or block-based content parsing.
The text attribute information used to display the page content is obtained as follows:
traverse the HTML nodes and determine from the content of each HTML tag whether it contains text information;
if a tag contains text information, check its CSS style information;
check whether the CSS style contains a font-family attribute, and read the font type and font size attributes used to display the page.
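The traversal above can be sketched with Python's standard html.parser module. This is only an illustration under simplifying assumptions: it looks at inline style attributes of the tag that directly encloses the text, and ignores external style sheets and CSS inheritance. The class name TextStyleCollector is hypothetical.

```python
import re
from html.parser import HTMLParser

FONT_FAMILY_RE = re.compile(r"font-family\s*:\s*([^;]+)", re.IGNORECASE)
FONT_SIZE_RE = re.compile(r"font-size\s*:\s*([^;]+)", re.IGNORECASE)
# Void elements have no end tag, so they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TextStyleCollector(HTMLParser):
    """Collect (text, font-family, font-size) for every text-bearing tag."""

    def __init__(self):
        super().__init__()
        self._styles = []  # inline style of each currently open tag
        self.found = []    # list of (text, font_family, font_size)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._styles.append(dict(attrs).get("style") or "")

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._styles:
            self._styles.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._styles:
            return  # whitespace only, or text outside any tag
        style = self._styles[-1]
        family = FONT_FAMILY_RE.search(style)
        size = FONT_SIZE_RE.search(style)
        self.found.append((text,
                           family.group(1).strip() if family else None,
                           size.group(1).strip() if size else None))


collector = TextStyleCollector()
collector.feed('<div><p style="font-family: SimSun, Arial; font-size: 14px">'
               'Hello</p><span>World</span></div>')
```

After the call, collector.found lists each piece of visible text together with the font information found on its enclosing tag, or None where no inline font declaration exists.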
Watermark information is embedded by modifying the page content data as follows:
Step1: generate the watermark information bit string.
The watermark bit string should contain key items of information, such as: 1) computer name; 2) computer MAC address; 3) computer CPU serial number; 4) time.
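As an illustration of Step1, the sketch below packs the four items into a bit string. The patent does not specify the encoding, so the field separator, the timestamp format and the plain UTF-8 bit expansion are all assumptions, as are the function names. The values are passed in as arguments because reading a CPU serial number is platform specific.

```python
from datetime import datetime


def build_watermark_bits(computer_name: str, mac_address: str,
                         cpu_serial: str, timestamp: datetime) -> str:
    """Encode the tracing fields as a string of '0' and '1' characters.

    Assumed layout: fields joined with '|' and expanded byte by byte.
    """
    payload = "|".join([computer_name, mac_address, cpu_serial,
                        timestamp.strftime("%Y%m%d%H%M%S")])
    return "".join(f"{byte:08b}" for byte in payload.encode("utf-8"))


def decode_watermark_bits(bits: str) -> list:
    """Inverse of build_watermark_bits, used when tracing a leak."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8").split("|")


# Hypothetical sample values.
bits = build_watermark_bits("PC-01", "00:11:22:33:44:55", "CPU123",
                            datetime(2019, 5, 23, 12, 0, 0))
```

A production scheme would more likely use a compact identifier with error correction, since a bit string recovered from a photographed screen may contain errors.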
Step2: parse the session data packets of the page request to obtain attribute information, and determine from the attributes which HTML and CSS files need to be modified.
Step3: traverse the content of the HTML and CSS documents, look up the font-family and font attributes, replace the font with the watermark font, and then save the result over the intercepted session data packet.
After the font-family and font attributes are obtained, the default preferred font type is read and checked to determine whether it is a watermark font type. If so, the corresponding watermark font name is inserted at the head of the attribute value. With this replacement scheme, the page text style is still displayed correctly when the client lacks the watermark font. For example:
font-family:'Microsoft YaHei',SimSun,SimHei,"STHeiti Light",STHeiti,"Lucida Grande",Tahoma,Arial,Helvetica,sans-serif
is replaced with:
font-family:'WM FontName','Microsoft YaHei',SimSun,SimHei,"STHeiti Light",STHeiti,"Lucida Grande",Tahoma,Arial,Helvetica,sans-serif
After the replacement is complete, the file corresponding to the data packet is saved over the original.
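The font-family rewrite in Step3 can be sketched with a regular expression. The mapping WATERMARK_FONTS is hypothetical: it assumes a watermark font exists for each listed preferred font, and 'WM FontName' is the placeholder name used in the example above.

```python
import re

# Hypothetical mapping: preferred font (lower case) -> its watermark font.
WATERMARK_FONTS = {"microsoft yahei": "WM FontName"}

# Captures "font-family:" and the value up to the end of the declaration.
FONT_FAMILY_DECL = re.compile(r"(font-family\s*:\s*)([^;}]+)", re.IGNORECASE)


def _first_font(value: str) -> str:
    """Return the first (preferred) font of a font-family value, unquoted."""
    return value.split(",", 1)[0].strip().strip("'\"").lower()


def add_watermark_font(css: str) -> str:
    """Prepend the watermark font wherever the preferred font supports one."""
    def repl(match):
        prefix, value = match.group(1), match.group(2)
        wm_font = WATERMARK_FONTS.get(_first_font(value))
        if wm_font is None:
            return match.group(0)  # no watermark font for this family
        # The original list stays in place as the fallback chain.
        return f"{prefix}'{wm_font}',{value}"
    return FONT_FAMILY_DECL.sub(repl, css)
```

Because the original font list is kept after the watermark font, a client without the watermark font falls back to the page's normal fonts. Running the function a second time leaves the declaration unchanged, since the preferred font is then the watermark font itself.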
Step4: traverse the tag objects of the page HTML to obtain the text displayed on the page, and embed the watermark information by watermark character replacement. Fig. 4 is a schematic diagram of a web document as normally displayed. Fig. 5 is a schematic diagram of the web document of Fig. 4 with the watermark embedded.
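The patent does not spell out the character replacement rule, so the following is only one possible reading, stated as an assumption: the watermark font draws a slightly altered glyph for a substitute code point, a bit value of 1 swaps a carrier character for its substitute, and a bit value of 0 leaves it unchanged. The carrier table, the private-use code points and the function names are all hypothetical.

```python
from itertools import cycle

# Hypothetical carrier table: ordinary character -> private-use code point
# that the (assumed) watermark font renders with a subtly different glyph.
CARRIER_TO_VARIANT = {ch: chr(0xE000 + i) for i, ch in enumerate("aeiou")}
VARIANT_TO_CARRIER = {v: c for c, v in CARRIER_TO_VARIANT.items()}


def embed_bits(text: str, bits: str) -> str:
    """Embed bits into text: 1 swaps a carrier for its variant, 0 keeps it."""
    if not bits:
        return text
    stream = cycle(bits)  # repeat the bit string across long texts
    out = []
    for ch in text:
        if ch in CARRIER_TO_VARIANT:
            out.append(CARRIER_TO_VARIANT[ch] if next(stream) == "1" else ch)
        else:
            out.append(ch)  # non-carrier characters carry no information
    return "".join(out)


def extract_bits(text: str) -> str:
    """Read the embedded bits back from a marked string."""
    bits = []
    for ch in text:
        if ch in VARIANT_TO_CARRIER:
            bits.append("1")
        elif ch in CARRIER_TO_VARIANT:
            bits.append("0")
    return "".join(bits)


marked = embed_bits("a quiet idea", "101")
```

In the scenario the patent describes, extraction works on a photograph or screenshot, so the bits would have to be recovered by recognizing glyph shapes in the image; extract_bits here reads the string directly purely to show that the mapping is reversible.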
To ensure the continuity and integrity of the watermark information embedded in the text displayed by the web browser, once all HTML tag information has been traversed, the attributes of the tags that contain text are checked for validity. Watermark information is embedded only in the content of valid tags. The checks are as follows:
√ check whether the tag attribute is hidden (style="display:none");
√ check whether the tag information is modified or hidden by a JavaScript script.
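The first check can be sketched as a simple test on a tag's inline style. The function name is_embeddable is hypothetical, and the sketch covers only the display:none case named above. The second check depends on script behavior at run time and cannot be decided from the static attributes alone, so it is not attempted here.

```python
import re

HIDDEN_RE = re.compile(r"display\s*:\s*none", re.IGNORECASE)


def is_embeddable(attrs: dict) -> bool:
    """Return False for tags hidden through an inline display:none style."""
    style = attrs.get("style") or ""
    return HIDDEN_RE.search(style) is None


# Hypothetical (attributes, text) pairs produced by the HTML traversal.
candidates = [({"style": "display:none"}, "secret"),
              ({"style": "color: red"}, "visible"),
              ({}, "plain")]
valid_texts = [text for attrs, text in candidates if is_embeddable(attrs)]
```

Skipping hidden tags keeps watermark bits from being spent on text that never appears on screen, which would leave gaps in the bit sequence recovered from a photograph.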
Step5: replace the original document with the watermarked HTML page file, and finally update the data packet.
S104: return the watermarked page file to the web browser for normal page content parsing and page content rendering.
The web browser parses the HTML file in order and builds the DOM tree. When it encounters external CSS and JS (JavaScript) files, it requests the resources from the server. While a CSS file is downloading, the parser continues parsing the following HTML to build the DOM tree; while a JS file is being downloaded and executed, the parser stops parsing the HTML.
Finally, the web browser computes the exact size and position of each render object in the visible area, draws the computed pixel information on the screen, and presents the page in the browser window.
As shown in Fig. 6, based on the same inventive concept, the invention also provides a web document watermark embedding apparatus, comprising:
Data packet interception module 1: responsible for intercepting the HTTP data packets of the Web application by means of a data interception mode when the user accesses a Web application system page through a web browser;
Data packet parsing module 2: responsible for parsing the HTTP response header in the HTTP data packets obtained by the data packet interception module to obtain the HTTP response content data;
Watermark information embedding module 3: responsible for parsing the HTTP response content data obtained by the data packet parsing module and embedding watermark information by modifying the page content data;
Browser processing module 4: responsible for normal page content parsing and page content rendering of the page file processed by the watermark information embedding module.
Due in the present invention, being embedded in text tracing information in the page documents that Web application system is shown, such as
The finger print information of reader and temporal information etc..When the web document sensitive information in terminal screen is copied by reader by screen
Mobile phone/camera outside shellfish, screen recording, the screen shot or screen screens way of output such as take pictures, record a video is divulged a secret thing
After part, by extracting watermark information from the picture file after the output of the screen of intercepting and capturing, reaches and source of divulging a secret is exported to screen picture
The purpose of head verification, the final security management and control realized to screen shot behavior.
Because the text tracing information embedded in the present invention is visually imperceptible, the visual interference caused by existing visible on-screen watermarking techniques is avoided. In addition, the watermark information is embedded using the text content of the web document as the carrier, so it is not easily erased manually and can survive screen photographing and screenshots; the security performance is therefore higher.
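Using text as the watermark carrier can be sketched in Python as follows. The mapping is purely an assumption for illustration: each carrier character consumes one watermark bit, and a 1 bit swaps the character for a code point in the Unicode Private Use Area, which a hypothetical watermark font would draw as a slightly altered glyph of the same letter. The names CARRIERS, PUA_BASE, embed_bits, and extract_bits are not taken from the patent.

```python
CARRIERS = "abcdefghijklmnopqrstuvwxyz"  # hypothetical set of carrier characters
PUA_BASE = 0xE000  # start of the Unicode Private Use Area


def embed_bits(text: str, bits: str) -> str:
    """Embed one bit per carrier character; a 1 bit swaps in the variant."""
    out = []
    used = 0
    for ch in text:
        index = CARRIERS.find(ch)
        if index >= 0 and used < len(bits):
            if bits[used] == "1":
                ch = chr(PUA_BASE + index)
            used += 1
        out.append(ch)
    if used < len(bits):
        raise ValueError("text has too few carrier characters for the payload")
    return "".join(out)


def extract_bits(text: str, n_bits: int) -> str:
    """Read the first n_bits back from marked text by inspecting code points."""
    bits = []
    for ch in text:
        if len(bits) == n_bits:
            break
        if PUA_BASE <= ord(ch) < PUA_BASE + len(CARRIERS):
            bits.append("1")
        elif ch in CARRIERS:
            bits.append("0")
    return "".join(bits)
```

Note that extract_bits only checks the round trip at the text level. Recovering the bits from a photograph or screenshot, as the description envisages, would additionally require recognizing the altered glyph shapes in the image, which this sketch does not attempt.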
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
For example, based on the method of the present invention, the HTTP data packets of the Web application may be intercepted by a web browser plug-in mode instead of the client HOOK mode or the server proxy interception mode. The browser plug-in mode is an active data interception mode whose process is as follows: a dedicated functional module is developed and loaded in the browser; when a user logs in to the web browser and accesses a page of the Web application system, the browser plug-in directly obtains the HTTP data packets of the Web application; the HTTP response header in the obtained HTTP data packets is parsed to obtain the HTTP response content data (the page content data); the HTTP response content data is parsed, and watermark information is embedded by modifying the page content data; the page file with the embedded watermark then undergoes normal page content parsing and page content rendering. The advantage of this mode is that the cost of data acquisition is the lowest and the stability is the highest; the disadvantage is that it must be bound to specific web browser software and is therefore not universal.
For example, based on the method of the present invention, the HTTPS protocol used in internal office Web application systems is also supported. Because all data communication is encrypted under HTTPS, the corresponding decryption must be performed at the web browser end before the operations of the present invention are executed.
For example, based on the method of the present invention, when the HTTP response content data is parsed and watermark information is embedded by modifying the page content data, embedding is supported not only in text content data but also in object files such as pictures, video, and graphics. In addition, the watermark information may be either invisible information or a visible watermark.
Claims (10)
1. A web document watermark embedding method, characterized by comprising the following steps:
when a user logs in to a web browser and accesses a page of a Web application system, intercepting the HTTP protocol data packets of the Web application by a data interception mode;
parsing the HTTP response header in the intercepted HTTP data packets to obtain HTTP response content data;
parsing the HTTP response content data, and embedding watermark information by modifying the page content data;
returning the page file with the embedded watermark to the web browser for normal page content parsing and page content rendering.
2. The method according to claim 1, characterized in that the HTTP data packet comprises: a response line, a response header, and a response body.
3. The method according to claim 1, characterized in that the data interception mode comprises a client HOOK mode and a server proxy interception mode.
4. The method according to claim 1, characterized in that the timing of the data interception is divided, according to the page loading process, into before page loading, during loading, and after loading.
5. The method according to claim 1, characterized in that obtaining the HTTP response content data comprises obtaining the MIME type of the current page content from the Content-Type field in the HTTP response header and selecting from it the data in HTML and CSS formats.
6. The method according to claim 1, characterized in that parsing the HTTP response content data comprises: parsing the page data of CSS type to obtain the text attribute information with which the page content is displayed; and parsing the page data of HTML type to obtain the text content information of the page document.
7. The method according to claim 6, characterized in that parsing the page data of HTML type comprises: unstructured content parsing, DOM-node-based parsing, and block-based content parsing.
8. The method according to claim 6, characterized in that obtaining the text attribute information with which the page content is displayed comprises:
traversing the HTML nodes and judging, according to the HTML tag content, whether a node contains text information;
if text information is contained, checking the CSS style information in the tag;
checking whether the CSS style contains a font-family attribute, and reading the font type and font size attribute information used for page display.
9. The method according to claim 1, characterized in that embedding watermark information by modifying the page content data comprises:
generating a watermark information bit string;
parsing the session data packets of the page request to obtain attribute information, and determining from the attributes the HTML and CSS files that need to be modified;
traversing the content of the HTML and CSS documents, searching for the font-family and font attributes, performing watermark font replacement, and then saving the result over the intercepted session data packets;
traversing the tag objects of the page HTML to obtain the text information displayed by the page, and embedding the watermark information by watermark character replacement;
replacing the original document with the HTML page file after watermark embedding, and finally updating the data packet.
10. A web document watermark embedding device, characterized by comprising:
a data packet interception module, responsible for intercepting the HTTP data packets of the Web application, by a data interception mode, when a user logs in to a web browser and accesses a page of the Web application system;
a data packet parsing module, responsible for parsing the HTTP response header in the HTTP data packets obtained by the data packet interception module to obtain HTTP response content data;
a watermark information embedding module, responsible for parsing the HTTP response content data obtained by the data packet parsing module and embedding watermark information by modifying the page content data;
a browser processing module, responsible for performing normal page content parsing and page content rendering on the page file processed by the watermark information embedding module.
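The traversal recited in claim 8 (visit HTML nodes, detect text, inspect the tag's CSS style for font-family, and read the font type and size) can be illustrated with a short Python sketch built on the standard html.parser module. This is only one possible reading: the class name TextFontCollector and the helper parse_style are hypothetical, and the sketch inspects only the inline style attribute of the element that directly encloses the text, ignoring external style sheets and inheritance.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no end tag


def parse_style(style: str) -> Dict[str, str]:
    """Turn an inline style string into a property dictionary."""
    props: Dict[str, str] = {}
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if sep:
            props[name.strip().lower()] = value.strip()
    return props


class TextFontCollector(HTMLParser):
    """Collect (text, font-family, font-size) for text nodes with a font-family."""

    def __init__(self) -> None:
        super().__init__()
        self._stack: List[Tuple[str, Dict[str, str]]] = []
        self.found: List[Tuple[str, str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            style = dict(attrs).get("style") or ""
            self._stack.append((tag, parse_style(style)))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating sloppy markup.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        tag, style = self._stack[-1]
        if tag in ("script", "style"):
            return  # not text that is displayed on the page
        if "font-family" in style:
            self.found.append((text, style["font-family"], style.get("font-size")))
```

After feeding a page to the collector with feed(), the found list holds the text runs whose enclosing tag declares a font-family, together with the declared font size if present; these would be the candidates for the font and character replacement described in claim 9.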
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910434936.8A CN110348182A (en) | 2019-05-23 | 2019-05-23 | A kind of method and apparatus of web document watermark insertion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910434936.8A CN110348182A (en) | 2019-05-23 | 2019-05-23 | A kind of method and apparatus of web document watermark insertion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110348182A true CN110348182A (en) | 2019-10-18 |
Family
ID=68174307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910434936.8A Pending CN110348182A (en) | 2019-05-23 | 2019-05-23 | A kind of method and apparatus of web document watermark insertion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110348182A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111641701A (en) * | 2020-05-25 | 2020-09-08 | 深信服科技股份有限公司 | Data protection method and device, equipment and storage medium |
CN111698236A (en) * | 2020-06-05 | 2020-09-22 | 浙江华途信息安全技术股份有限公司 | Method and system for preventing leakage of browser |
CN111698237A (en) * | 2020-06-05 | 2020-09-22 | 浙江华途信息安全技术股份有限公司 | Method and system for adding watermark to WEB page |
CN112003873A (en) * | 2020-08-31 | 2020-11-27 | 成都安恒信息技术有限公司 | HTTP (hyper text transport protocol) traffic defense method and system for resisting DDoS (distributed denial of service) attack |
CN112003873B (en) * | 2020-08-31 | 2022-04-19 | 成都安恒信息技术有限公司 | HTTP (hyper text transport protocol) traffic defense method and system for resisting DDoS (distributed denial of service) attack |
CN112100583A (en) * | 2020-09-23 | 2020-12-18 | 上海英方软件股份有限公司 | Method and device for generating Web visual watermark |
CN112688858A (en) * | 2020-12-18 | 2021-04-20 | 合肥高维数据技术有限公司 | Mail sending method and device |
CN112954019A (en) * | 2021-01-28 | 2021-06-11 | 浙江华途信息安全技术股份有限公司 | Watermark method and system based on reverse proxy technology |
CN112954019B (en) * | 2021-01-28 | 2023-04-28 | 浙江华途信息安全技术股份有限公司 | Watermarking method and system based on reverse proxy technology |
CN113296773A (en) * | 2021-05-28 | 2021-08-24 | 北京思特奇信息技术股份有限公司 | Copyright marking method and system for cascading style sheet |
CN113806697A (en) * | 2021-09-22 | 2021-12-17 | 北京明朝万达科技股份有限公司 | Watermark adding method and system under proxy mode |
CN113806697B (en) * | 2021-09-22 | 2023-09-01 | 北京明朝万达科技股份有限公司 | Watermark adding method and system in proxy mode |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110348182A (en) | A kind of method and apparatus of web document watermark insertion | |
CN105631355B (en) | A kind of data processing method and device | |
US11593484B2 (en) | Proactive browser content analysis | |
US11403373B2 (en) | Systems and methods for adding watermarks using an embedded browser | |
US7281272B1 (en) | Method and system for copyright protection of digital images | |
US9241004B1 (en) | Alteration of web documents for protection against web-injection attacks | |
US9118712B2 (en) | Network communication system with improved security | |
US11797636B2 (en) | Intermediary server for providing secure access to web-based services | |
US9596132B1 (en) | Virtual sandboxing for supplemental content | |
US20070220599A1 (en) | Client-side extensions for use in connection with HTTP proxy policy enforcement | |
US20100257354A1 (en) | Software based multi-channel polymorphic data obfuscation | |
US20030028801A1 (en) | System and method for preventing unauthorized copying of electronic documents | |
EP2847686A1 (en) | Enhanced document and event mirroring for accessing content | |
KR100843450B1 (en) | System and method for digital rights management using a standard rendering engine | |
CN105631359A (en) | Control method and device of webpage operation | |
Rauti et al. | Browser extension-based man-in-the-browser attacks against Ajax applications with countermeasures | |
US20120102541A1 (en) | Method and System for Generating an Enforceable Security Policy Based on Application Sitemap | |
Bao et al. | Cross-site scripting attacks on android hybrid applications | |
CN111698237A (en) | Method and system for adding watermark to WEB page | |
CN112954019B (en) | Watermarking method and system based on reverse proxy technology | |
CN117313759A (en) | Method, device, equipment and storage medium for data security transmission | |
Kerschbaumer et al. | Towards precise and efficient information flow control in web browsers | |
CN116028901A (en) | Watermark embedding method, device, equipment and storage medium | |
US11240267B1 (en) | Identifying and blocking fraudulent websites | |
Oh et al. | Secret Web Browser Structural Designs for Web Service Capsulation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||