FR92003 0036 1 METHOD AND SYSTEM FOR ANALYSING AND FILTERING HTTPS TRAFFIC IN CORPORATE NETWORKS
Technical field of the invention
The present invention is directed to computer networks and more particularly to a method, system and computer program for transparently analysing and filtering, in an Internet Protocol (IP) network, HyperText Transfer Protocol Secure (HTTPS) traffic as if it were HyperText Transfer Protocol (HTTP) traffic.
Background of the invention
Internet
The Internet is a global network of computers and computer networks (the "Net"). The Internet connects computers that use a variety of different operating systems or languages, including UNIX, DOS, Windows, Macintosh, and others. To facilitate and allow the communication among these various systems and languages, the Internet uses a language referred to as TCP/IP ("Transmission Control Protocol/Internet Protocol"). The TCP/IP protocol supports three basic applications on the Internet:
• transmitting and receiving electronic mail,
• logging into remote computers ("Telnet"), and
• transferring files and programs from one computer to another ("FTP" or "File Transfer Protocol").
TCP/IP
The TCP/IP protocol suite is named for two of the most important protocols:
• a Transmission Control Protocol (TCP), and
• an Internet Protocol (IP).
Another name for it is the Internet Protocol Suite. The more common term TCP/IP is used to refer to the entire protocol suite. The first design goal of TCP/IP is to build an interconnection of networks that provide universal communication services: an internetwork, or internet. Each physical network has its own technology-dependent communication interface, in the form of a programming interface that provides basic communication functions running between the physical network and the user applications. The architecture of the physical networks is hidden from the user. The second goal of TCP/IP is to interconnect different physical networks to form what appears to the user to be one large network.
TCP is a transport layer protocol providing end-to-end data transfer. It is responsible for providing a reliable exchange of information between two computer systems. Multiple applications can be supported simultaneously over one TCP connection between two computer systems.
IP is an internetwork layer protocol hiding the physical network architecture below it. Part of communicating messages between computers is a routing function that ensures that messages are correctly directed within the network and delivered to their destination. IP provides this routing function. An IP message is called an IP Datagram.
Application Level protocols are used on top of TCP/IP to transfer user and application data from one origin computer system to one destination computer system. Such Application Level protocols are for instance File Transfer Protocol (FTP), Telnet, Gopher, HyperText Transfer Protocol (HTTP).
World Wide Web
With the increasing size and complexity of the Internet, tools have been developed to help find information on the network, often called navigators or navigation systems. Navigation systems that have been developed include standards such as Archie, Gopher and WAIS. The World Wide Web ("WWW" or "the Web") is a recent, superior navigation system. The Web is:
• an Internet-based navigation system,
• an information distribution and management system for the Internet, and
• a dynamic format for communicating on the Web.
The Web seamlessly integrates, for the user, different formats of information, including still images, text, audio and video. A user on the Web using a graphical user interface (GUI) may transparently communicate with different host computers on the system, different system applications (including FTP and Telnet), and different information formats for files and documents including, for example, text, sound and graphics.
Browser
After receipt, the Web client formats and presents the data or activates an ancillary application such as a sound player to present the data. To do this, the server or the client determines the various types of data received. The Web Client is also referred to as the Web Browser, since it in fact browses documents retrieved from the Web Server.
HyperText Transfer Protocol (HTTP)
HTTP is the protocol used in most systems on the Internet to transfer Web pages. In particular, HTTP is the standard protocol for non-secure access to the Web. HTTP is described in the Request For Comment (RFC) 1945 (HTTP Version 1.0) and RFC 2616 (HTTP Version 1.1).
Each Web page that appears on client monitors of the Web, may appear as a complex document that integrates, for example, text, images, sounds and animation. Each such page may also contain hyperlinks to other Web documents so that a user at a client computer using a mouse may click on icons and may activate hyperlink jumps to a new page (which is a graphical representation of another document file) on the same or a different Web server.
A Web server is a software program on a Web host computer that answers requests from Web clients, typically over the Internet. All Web servers use a language or protocol to communicate with Web clients, which is called HyperText Transfer Protocol ("HTTP"). All types of data can be exchanged among Web servers and clients using this protocol, including HyperText Markup Language ("HTML"), graphics, sound and video. HTML describes the layout, contents and hyperlinks of the documents and pages. Web clients, when browsing:
• convert user-specified commands into HTTP requests,
• connect to the appropriate Web server to get information, and
• wait for a response. The response from the server can be the requested document or an error message.
After the document or an error message is returned, the connection between the Web client and the Web server is closed. The first version of HTTP is a stateless protocol. That is, with HTTP, there is no continuous connection between each client and each server. The Web client using HTTP receives a response as HTML data or other data. This description applies to version 1.0 of the HTTP protocol, while the newer version 1.1 breaks this barrier of stateless protocol by keeping the connection between the server and client alive under certain conditions.
HyperText Transfer Protocol Secure (HTTPS)
HTTPS is the extension to the HTTP protocol used in most systems on the Internet to transfer Web pages in a secure way. In particular, HTTPS is the standard protocol for secure access to the Web. HTTPS is an encapsulation of the HTTP protocol within a TLS / SSL (Transport Layer Security / Secure Sockets Layer) connection.
The primary goal of the Secure Sockets Layer (SSL) protocol is to provide a private channel between communicating applications, which ensures privacy of data, authentication of the partners, and integrity. SSL is composed of two layers:
• At the lower layer, a protocol for transferring data using a variety of predefined cipher and authentication combinations, called the SSL Record Protocol.
• At the upper layer, a protocol for initial authentication and transfer of encryption keys, called the SSL Handshake Protocol.
An SSL session is initiated as follows:
• On the client (browser) the user requests a document with a special URL that commences "https:" instead of "http:".
• The client code recognizes the SSL request and establishes a connection through TCP to the SSL code on the server.
• The client then initiates the SSL handshake phase, using the SSL Record Protocol as a carrier.
SSL addresses the following security issues:
• Privacy: After the symmetric key is established in the initial handshake, the messages are encrypted using this key.
• Integrity: Messages contain a message authentication code (MAC) ensuring the message integrity.
• Authentication: During the handshake, the client authenticates the server using an asymmetric or public key. It can also be based on certificates.
SSL requires each message to be encrypted and decrypted and therefore has a high performance and resource overhead.
The SSL protocol is located at the top of the transport layer. SSL is also a layered protocol itself. It simply takes the data from the application layer, reformats it and retransmits it to the transport layer. SSL handles a message as follows:
• The sender performs the following tasks:
  • takes the message from the upper layer;
  • fragments the data to manageable blocks;
  • optionally compresses the data;
  • applies a Message Authentication Code (MAC);
  • encrypts the data;
  • transmits the result to the lower layer.
• The receiver performs the following tasks:
  • takes the data from the lower layer;
  • decrypts;
  • verifies the data with the negotiated MAC key;
  • decompresses the data if compression was used;
  • reassembles the message;
  • transmits the message to the upper layer.
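By way of illustration only, the sender and receiver tasks listed above can be sketched in Python. This is not the SSL Record Protocol itself: the XOR keystream derived from SHA-256 is a toy stand-in for a negotiated cipher and offers no real security, and the names send_records, receive_records, BLOCK_SIZE and the two keys are assumptions introduced for this sketch.

```python
import hashlib
import hmac
import zlib

BLOCK_SIZE = 16384  # assumed fragment size for this sketch
MAC_LEN = 32        # length of an HMAC-SHA256 tag


def _toy_cipher(key: bytes, seq: int, data: bytes) -> bytes:
    """Toy cipher (NOT secure): XOR data with a SHA-256 derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = key + seq.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream.extend(hashlib.sha256(block).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def _mac(mac_key: bytes, seq: int, data: bytes) -> bytes:
    return hmac.new(mac_key, seq.to_bytes(8, "big") + data, hashlib.sha256).digest()


def send_records(message: bytes, enc_key: bytes, mac_key: bytes) -> list:
    """Sender side: fragment, compress, apply a MAC, encrypt."""
    records = []
    for seq, start in enumerate(range(0, len(message), BLOCK_SIZE)):
        fragment = message[start:start + BLOCK_SIZE]
        compressed = zlib.compress(fragment)
        tag = _mac(mac_key, seq, compressed)
        records.append(_toy_cipher(enc_key, seq, compressed + tag))
    return records


def receive_records(records: list, enc_key: bytes, mac_key: bytes) -> bytes:
    """Receiver side: decrypt, verify the MAC, decompress, reassemble."""
    parts = []
    for seq, record in enumerate(records):
        plain = _toy_cipher(enc_key, seq, record)
        compressed, tag = plain[:-MAC_LEN], plain[-MAC_LEN:]
        if not hmac.compare_digest(tag, _mac(mac_key, seq, compressed)):
            raise ValueError("MAC check failed")
        parts.append(zlib.decompress(compressed))
    return b"".join(parts)
```

The MAC is verified before decompression, so a record altered in transit is rejected before its payload is interpreted.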
The Transport Layer Security (TLS) is the evolution of the SSL protocol and is described in RFC 2246. TLS is a secure communication protocol to be used between the application and transport layers. In particular, TLS is used to secure HTTP protocol (among others) communications.
Intranet
Some companies use the same mechanism as the Web to communicate inside their own corporation. In this case, this mechanism is called an "Intranet". These companies use the same networking/transport protocols and locally based Web servers to provide access to vast amounts of corporate information in a cohesive fashion. This data may be private to the corporation, yet the members of the company still need access to public Web information. To prevent people who do not belong to the company from accessing the private Intranet from the public Internet, these companies protect access to their network with special equipment called a Firewall.
Firewall
A Firewall protects one or more computers with Internet connections from access by external computers connected to the Internet. A Firewall is a network configuration, usually created by hardware and software, that forms a boundary between networked computers within the Firewall and those outside the Firewall. The computers within the Firewall form a secure sub-network with internal access capabilities and shared resources not available from the outside computers.
Often, the access to both internal and external computers is controlled by a single machine, said machine comprising the Firewall. Since the computer hosting the Firewall interacts directly with the Internet, strict security measures against unwanted access from external computers are required.
A Firewall is commonly used to protect information such as electronic mail and data files within a physical building or organisation site. A Firewall reduces the risk of intrusion by unauthorised people from the Internet. The same security measures can limit or require special software for people inside the Firewall who wish to access information on the outside. A Firewall can be configured using "Proxies" or "Socks" to control the access to information from each side of the Firewall.
Proxy Server
A Proxy, also called Application Level Gateway, provides higher level control on the traffic between two networks in that the contents of a particular service can be monitored and filtered according to the network security policy. Therefore, for any desired application, corresponding code must be installed on the Proxy in order to manage that specific service passing through said Proxy. A Proxy acts as a server to the client and as a client to the destination server. A virtual connection is established between the client and the server. Though the Proxy seems to be transparent from the point of view of the client and the server, the Proxy is capable of monitoring and filtering any specific type of data, such as commands, before sending it to the destination. For example, an FTP (File Transfer Protocol) server is permitted to be accessed from outside. In order to protect the server from any possible attacks the FTP Proxy in the Firewall can be configured to deny PUT and MPUT commands.
A Proxy server is an application-specific relay server that runs on the host that connects a secure and a non-secure network. The purpose of a Proxy server is to control the exchange of data between the two networks at an application level instead of an IP level. By using a Proxy server, it is possible to disable IP routing between the secure and the non-secure network for the application protocol the Proxy server is able to handle, but still be able to exchange data between the networks by relaying it in the Proxy server. In order for any client to be able to access the Proxy server, the client software must be specifically modified. In other words, the client and server software must support the proxy connection.
Compared with IP filtering, Proxies provide much more comprehensive logging based on the application data of the connections. For example, an HTTP Proxy can log the URLs (Uniform Resource Locators) visited by users. Another feature of Proxies is that they can use strong user authentication.
HTTP Proxy Server
HTTP and HTTPS Proxies are described in HTTP RFCs. A HTTP Proxy is used to:
• access the Internet from a network which does not have direct access to it.
• accelerate access to the Internet by caching contents.
Several other functions can be performed in the Proxy when accessing HTTP contents (virus scanning, content filtering, etc.). Many of these functions can also be performed without a Proxy Server, but it is not possible (whether using a proxy or not) to perform any filtering or scanning of HTTPS traffic in the current state of the art.
A HTTP Proxy typically runs in conjunction with Firewall software and allows access to the Internet. The Proxy Server:
• waits for a request (for example a HTTP request) from inside the Firewall,
• forwards the request to the remote server outside the Firewall,
• reads the response, and
• sends the response back to the client.
A single computer can run multiple servers, each server connection identified with a port number. A Proxy server, like an HTTP Proxy server or a FTP Proxy server, occupies a port. Typically, a connection uses standardised port numbers for each protocol (for example, HTTP = 80 and FTP = 21). That is why an end user has to select a specific port number for each defined Proxy server. Web Browsers usually let the end user set the host name and port number of the Proxy servers in a customisable panel. Protocols such as HTTP, FTP, Gopher, WAIS, and Security can usually have designated Proxies. Proxies are generally preferred over Socks for their ability to perform caching, high-level logging, and access control, because they provide a specific connection for each network service protocol.
HTTPS Traffic Filtering
There is a need in corporate networks and more particularly in Intranet networks to analyze incoming traffic in order to:
• prevent viruses and malicious data from entering into the corporate network.
• filter and log contents accessible from the corporate network.
In the particular case of Web pages, there are commercial and non commercial products that analyze HTTP/FTP traffic, but at the moment there is no known solution for analyzing the HTTPS traffic on connections. The main reason is that HTTPS has been designed precisely to prevent such analysis.
Filtering of HTTPS traffic is required by many corporations (banks, big and small companies, ...) and governments, and there is no real solution available at the present time to satisfy such demand.
Objects of the invention
It is an object of the present invention to analyze HTTPS traffic in corporate networks and more particularly in Intranet networks.
It is another object of the present invention to intercept and decrypt, in a first step, the HTTPS traffic on a connection and to scan and filter, in a second step, said traffic.
It is another object of the present invention to make said scanning and filtering method and system available to systems supporting HTTP traffic.
It is another object of the present invention to enable a Proxy to determine which Certificate Authorities can be considered as valid within the corporate network, regardless of the configuration of the clients (browsers) within said Intranet network.
It is a further object of the present invention to transparently scan and filter HTTPS traffic without modifying the communication protocols and while preserving the security of the traffic.
Summary of the invention
The present invention as claimed in independent claims, discloses a system, method and computer program, in a proxy system connecting a secure and a non secure IP (Internet Protocol) network, for analysing and/or filtering HTTPS (HyperText Transfer Protocol Secure) traffic. The method comprises the steps of:
• receiving from a client within a secure IP network, a connection request for establishing a SSL session with a server within a non secure IP network;
• establishing a SSL session with the client; said step of establishing a SSL session comprising the step of sending to the client a fake certificate for identifying itself as the server;
• receiving HTTPS traffic from the client;
• converting said HTTPS traffic into HTTP traffic.
Further embodiments of the invention are provided in the appended dependent claims.
The foregoing, together with other objects, features, and advantages of this invention can be better appreciated with reference to the following specification, claims and drawings.
Brief description of the drawings
The novel and inventive features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objects and advantages thereof, will best be understood by reference to the following detailed description of an illustrative detailed embodiment when read in conjunction with the accompanying drawings, wherein:
• Figure 1 shows a corporate Local Area Network (LAN) accessible from the Internet network through a system comprising a Firewall and a HTTPS Proxy server according to prior art.
• Figure 2 is a flow chart describing the method of handling HTTPS traffic according to prior art.
• Figure 3 shows a corporate Local Area Network (LAN) accessible from the Internet network through a system comprising a Firewall and a HTTPS Proxy server according to the present invention.
• Figure 4 is a flow chart describing the method of intercepting HTTPS traffic according to the present invention.
• Figure 5 shows a corporate Local Area Network (LAN) accessible from the Internet network through a system comprising a Firewall and a Chained Proxy according to the present invention.
• Figure 6 is a flow chart describing the method of intercepting HTTPS traffic using a chained Proxy according to the present invention.
Preferred embodiment of the invention
The present invention is directed to a method and system for filtering HTTPS traffic and more particularly to a method and system for translating (or converting) this traffic from HTTPS into HTTP so that said traffic can be filtered in a standard manner by already existing HTTP products.
Current proxy servers just forward the HTTPS traffic to its destination. They simply act as forwarders for the lower level TCP connection. Common Proxy servers cannot decrypt the data they forward. The present invention adds a new function to these Proxy servers for analysing HTTPS traffic on connections.
The method, according to the present invention, consists, for the Proxy server, of forwarding the content of the HTTPS requests in an unusual way, by automatically generating a new certificate for the requested destination server. This new certificate is faked. It is signed by a corporate internal Certificate Authority (CA). The new certificate will be included in the response sent by the proxy to the client during the SSL session establishment (according to the SSL protocol, the destination server, which in this case will be the proxy server, always identifies itself using a certificate). The request is then transparently forwarded to the destination server as a normal or standard HTTP Proxy server does.
To prevent clients (Browsers) from detecting this "man-in-the-middle attack", the internal corporate Certificate Authority (CA) used to sign the "fake" certificates must be included in the list of CAs recognized by the clients (Browsers) located in the corporate network (said corporate network is preferably an Intranet network).
The present invention provides the following additional benefits:
• Virus and malicious code in HTTPS traffic can be detected.
• HTTPS traffic and URL (Uniform Resource Locator) on connections can be scanned and filtered by the Proxy server.
• Web pages accessed from the Intranet network using HTTPS can be logged in the Proxy server. In conventional proxies the HTTPS URL is encrypted and can not be seen by the proxy.
• Only valid HTTPS requests are allowed across the Proxy server, and not any kind of traffic, as is the case with current HTTPS Proxy servers. With conventional Proxies, once the connection is established by means of a "Connect" request, the traffic can no longer be controlled since said traffic is encrypted according to the SSL protocol. The advantage of the proposed scenario is that the Proxy server can decrypt the traffic and check whether encrypted HTTP requests are valid or not.
• Certificate Authorities (that sign the Internet Web pages accessed from the Intranet network) can be checked by the Proxy server, since the Proxy server establishes the final SSL connection to the destination server and receives the certificate of the destination server. This checking prevents the clients (Browsers) from accepting invalid Certificate Authorities (CAs) and avoids attacks against clients within the Intranet network. Such attacks may consist, for instance, in inviting an end user with little knowledge of HTTPS security implications to press a button in a message in order to access a service. This message may be, for instance, a Browser alert about a self-signed certificate presented by the destination server. This kind of attack is based on the following principle: the hacker misleads the "victim" into doing the job that the hacker cannot do himself.
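By way of illustration only, the checking of Certificate Authorities by the Proxy server can be sketched with the standard Python ssl module. The function name make_upstream_context and the parameter trusted_ca_file are hypothetical. The ssl module negotiates TLS, the successor of SSL, so this is an analogy under stated assumptions rather than the implementation of the invention.

```python
import ssl
from typing import Optional


def make_upstream_context(trusted_ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build the context the proxy would use towards destination servers.

    trusted_ca_file is a hypothetical PEM bundle holding the Certificate
    Authorities that corporate policy accepts; when it is omitted, the
    platform default trust store is used instead.
    """
    # PROTOCOL_TLS_CLIENT enables certificate verification and host name
    # checking by default, so an untrusted or self-signed server
    # certificate makes the handshake fail at the proxy.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if trusted_ca_file is not None:
        context.load_verify_locations(cafile=trusted_ca_file)
    else:
        context.load_default_certs()
    return context
```

Because the decision is taken in the proxy, the list of accepted CAs does not depend on how each Browser is configured.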
Client authentication using Secure Sockets Layer (SSL) certificates cannot be performed with the method and system according to the present invention, as the private key of the client is not known by the Proxy server. Consequently the Proxy server cannot present the client's certificate to the destination server.
HTTPS Proxy Server according to prior art
Figure 1 shows a typical implementation of a corporate Local Area Network (LAN) (103) (Secure Internal Network) accessible from the Internet network (100) (Untrusted External Network) through a system comprising a Firewall (101) and a HTTP/HTTPS Proxy server (102). The Proxy server (102), also called Application Level Gateway, is physically connected to:
• the Firewall (101). Datagrams (all traffic) coming to and from the Internet (100) are received after having been filtered by the Firewall (101) on an IP/port basis.
• the corporate LAN network (103). As the corporate servers and clients (104) are not directly connected to the Internet network (100), it is possible to define the corporate LAN network (103) by a range of private IP addresses, although this is not a requirement.
The client (104) performs a "Connect" request to the HTTPS Proxy server, which establishes a TCP connection to the requested server/port (105). The HTTPS Proxy then responds to confirm the establishment of the TCP connection and, from this time, data exchanged between the client (104) and the server (105) are simply forwarded. There is no way to analyze the HTTP traffic, as said HTTP traffic is encrypted using the SSL protocol. SSL being a layer 5 protocol, only application data (HTTP requests, in this case) is encrypted (the IP and TCP headers are not encrypted).
Method for forwarding HTTPS traffic according to prior art
As described in Figure 2, the method of forwarding HTTPS traffic across a conventional Proxy server comprises the following steps :
• Step 201. The client (Browser) sends to the Proxy server a connection request (CONNECT request) for establishing a TCP connection with the destination server. This connection request typically has the following syntax: "CONNECT www.ibm.com:443 HTTP/1.0".
• Step 202. The Proxy server establishes a TCP connection with the destination server.
• Step 203. The Proxy server replies to the client that the TCP connection with the destination server is established. The reply has typically the following syntax: "HTTP/1.1 200 Connection established". Once the TCP connection is established, the Proxy server will just forward the traffic exchanged between the client and the destination server.
• Step 204. The client sends a SSL session request for establishing a SSL session between itself and the destination server across the newly created TCP connection.
• Steps 205-206. Once the SSL session is established between the client and the destination server, the Proxy server can no longer control (scan and filter) the traffic exchanged between the client and the destination server since this traffic is encrypted using the HTTPS protocol. Typical HTTP requests have the following syntax: "GET / HTTP/1.0".
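By way of illustration only, the blind forwarding performed by a conventional Proxy server after step 203 can be sketched in Python. The function name relay and the constant CONNECT_REPLY are assumptions introduced for this sketch; a real proxy would run one such relay in each direction, typically in separate threads.

```python
import socket

# Reply sent to the client once the TCP connection to the destination exists.
CONNECT_REPLY = b"HTTP/1.1 200 Connection established\r\n\r\n"


def relay(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> int:
    """Copy bytes from src to dst until src reports end of stream.

    The bytes are SSL records that the conventional proxy cannot read,
    so they are forwarded untouched. Returns the number of bytes relayed.
    """
    total = 0
    while True:
        chunk = src.recv(bufsize)
        if not chunk:          # peer closed its sending side
            break
        dst.sendall(chunk)     # no scanning or filtering is possible here
        total += len(chunk)
    return total
```

The loop never inspects the payload, which is why the traffic of steps 205-206 escapes any scanning or filtering.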
HTTPS Proxy Server according to the present invention
Figure 3 shows a corporate Local Area Network (LAN) (Secure Internal Network) (303) accessible from the Internet network (Untrusted External Network) (300) through a system comprising a Firewall (301) and a HTTPS Proxy system (302) according to the present invention. The HTTPS Proxy system (302) is physically connected to :
• a Firewall (301). Datagrams (all traffic) coming to and from the Internet (300) are received after having been filtered by the Firewall (301).
• a corporate LAN network (303). As the corporate servers and clients (304) are not directly connected to the Internet network (300), it is possible to define the corporate LAN network (303) by a range of private IP addresses, although this is not a requirement for the present invention.
The HTTPS Proxy system (302) comprises :
• an internal corporate Certificate Authority (CA) (305) for generating "fake" certificates.
• means (306) for translating (or decrypting) HTTPS traffic into HTTP traffic without being detected as a "man in the middle" by clients (Browsers).
• a conventional HTTP Filter (307) for filtering HTTP traffic.
The HTTPS Proxy system forwards the content of HTTPS requests in an unusual way: for each connection request, a new certificate for the destination server (308), signed by a corporate internal Certificate Authority (CA) (305), is automatically generated (a certificate previously cached can also be used). Once the SSL session is established between the client (304) and the proxy using this "fake" certificate, and the client has sent the HTTP request encrypted using this SSL session, this HTTP request is then transparently forwarded to the destination server (308) in the same manner as a normal or standard HTTP Proxy server would do.
Method for filtering HTTPS traffic according to the present invention
As described in Figure 4, the method of filtering HTTPS traffic according to the present invention, comprises the following steps :
• Step 401. The client within the corporate LAN network sends a connection request (CONNECT request) to the HTTPS Proxy system in order to access a secure (HTTPS) Web page in a destination server connected to the Internet. The CONNECT request is a HTTP command used to request the HTTPS Proxy system to establish a TCP connection to a destination server. Typically, a CONNECT request has the following syntax and comprises the following parameters: "CONNECT www.ibm.com:443 HTTP/1.0". In this particular case, the CONNECT request is sent by the client to create a TCP connection with the destination server www.ibm.com, port 443. The request specifies that the client can talk (can exchange information) with the Proxy using the HTTP Version 1.0 protocol.
The HTTP protocol specifies the way that Browsers interact with Proxy servers (apart from the protocol used to directly retrieve Web pages). Clients and Proxy servers always exchange information in clear text. That means that when a Browser sends a request to a Proxy server, this request is not encrypted. The way requests for HTTP and HTTPS Web pages are handled by Browsers and Proxy servers is completely different, but both requests use methods defined in the HTTP protocol:
• When a Browser wants to access a HTTP URL (Uniform Resource Locator), that is, a HTTP Web page on a server, it sends to the Proxy server a request (the request typically has the following syntax: "GET http://www.ibm.com/ HTTP/1.0"). The Proxy server retrieves the HTTP Web page from the server and then forwards this page to the Browser.
• When a Browser wants to access an HTTPS URL, it sends a request to the Proxy server (the request typically has the following syntax: "CONNECT www.ibm.com:443 HTTP/1.0"). The Proxy server first establishes a TCP connection with the server/port and, once this TCP connection is established, just forwards the data exchanged between the Browser and the destination server. This TCP connection is then used by the client to send a direct HTTPS request (a HTTPS request is a HTTP request encrypted according to the SSL protocol) to the destination server.
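By way of illustration only, the two clear-text request forms described above can be told apart with a short Python sketch. The names ProxyRequest and parse_proxy_request_line are hypothetical, and the default port 80 for "GET" requests is an assumption based on the standard HTTP port.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class ProxyRequest(NamedTuple):
    method: str    # "GET", "CONNECT", ...
    host: str      # destination server name
    port: int      # destination TCP port
    version: str   # e.g. "HTTP/1.0"


def parse_proxy_request_line(line: str) -> ProxyRequest:
    """Parse the first line of a request sent by a Browser to a Proxy."""
    method, target, version = line.split()
    method = method.upper()
    if method == "CONNECT":
        # HTTPS case: the target is "host:port", no URL is visible yet.
        host, _, port = target.rpartition(":")
        return ProxyRequest(method, host, int(port), version)
    # HTTP case: the target is an absolute URL that the proxy can read.
    parts = urlsplit(target)
    return ProxyRequest(method, parts.hostname or "", parts.port or 80, version)
```

For a CONNECT request only the host name and port are visible to the Proxy server at this stage; the requested path travels later, inside the SSL session.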
• Step 402. The HTTPS Proxy system tests whether or not the Proxy system cache contains a certificate for the requested connection.
• Step 403. If the cache doesn't contain any certificate for the requested connection, the HTTPS Proxy system creates a certificate for said connection (in the example, 'www.ibm.com') and forwards it for signature to an internal Certificate Authority (CA). If the cache already contains a certificate for the requested connection, the process goes on directly with step 404.
• Step 404. The HTTPS Proxy system establishes a TCP connection and, over said TCP connection, a SSL session to the destination server connected to the Internet (a TCP connection with a SSL session over said TCP connection is also called "SSL connection").
• Step 405. The HTTPS Proxy system responds to the client that a TCP connection has been established with the destination server (i.e., the HTTPS Proxy system responds "HTTP/1.0 200 Connection Established"). The client expects this response, although this response does not need to be true at this point. Note that "200" is the HTTP code for "OK". In response to a "Proxy connect request", current Proxy servers just establish a TCP connection to the destination server and then forward the HTTPS traffic exchanged between the client and the server. In the present invention, in response to the "CONNECT request", the HTTPS Proxy system establishes a SSL connection over said TCP connection. In fact, because it knows nothing about the man-in-the-middle, the client expects the establishment of a TCP connection and believes that it will itself establish the SSL session with the destination server.
• Step 406. The HTTPS Proxy system:
  • establishes a SSL session with the client across the TCP connection already established with the client, and
  • identifies itself to the client as the destination server by presenting to the client, during the SSL session establishment, the newly generated certificate. The client accepts this certificate provided that all the clients connected to the corporate network accept the internal Certificate Authority (CA) as a Trusted Certificate Authority.
• Step 407. The client sends an HTTP request, embedded in the SSL session previously established with the HTTPS Proxy system (e.g., "GET / HTTP/1.0"), assuming that the HTTP request is sent to the destination server instead of the HTTPS Proxy system.
• Step 408. The proxy receives the HTTP request from the client and sees the content of this HTTP request, as the SSL session has been established between the client and the proxy. At this point the proxy can apply access rules per URL the same way as a HTTP proxy would do.
• Step 409. The HTTP request is sent over the SSL session established between the HTTPS Proxy system and the destination server in step (404).
• Step 410. The destination server replies with the requested information.
• Step 411. The proxy receives the HTTP response from the destination server and sees the content of this HTTP response, as the SSL session has been established between the proxy and the destination server. At this point the proxy can filter the content of the HTTP response the same way as a HTTP proxy would do.
• Step 412. The HTTPS Proxy system forwards the reply of the destination server to the client originator of the request.
The difference with the method and system according to the prior art described in Figure 2, is that the HTTPS Proxy system sees the (unencrypted) content of the secure request (407-409) and the secure response (410-412), so any HTTP filter product can be used. It is important to note that some variations in the order of the steps (in particular steps 402, 404, 405, and 406) give the same result.
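As an illustration only, the handling of the CONNECT request (steps 401 and 405) and the per-URL access rules of step 408 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names `parse_connect` and `is_allowed`, the host 'blocked.example' and the path prefix '/downloads/' are hypothetical and do not appear in the description; only the strings "CONNECT", "HTTP/1.0 200 Connection Established" and "GET / HTTP/1.0" are taken from the text.

```python
from typing import Tuple

# Reply sent to the client in step 405 (text taken from the description).
CONNECT_OK = b"HTTP/1.0 200 Connection Established\r\n\r\n"

# Hypothetical access rules, chosen for illustration only.
BLOCKED_HOSTS = {"blocked.example"}
BLOCKED_PATH_PREFIXES = ("/downloads/",)


def parse_connect(request_line: str) -> Tuple[str, int]:
    """Extract (host, port) from a line such as 'CONNECT www.ibm.com:443 HTTP/1.0'."""
    method, target, _version = request_line.strip().split(" ", 2)
    if method.upper() != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    if not host:
        # No explicit port in the target: assume the default HTTPS port.
        host, port = target, "443"
    return host, int(port)


def is_allowed(host: str, request_line: str) -> bool:
    """Apply per-URL rules to the decrypted request line seen in step 408."""
    _method, path, _version = request_line.strip().split(" ", 2)
    if host.lower() in BLOCKED_HOSTS:
        return False
    return not path.startswith(BLOCKED_PATH_PREFIXES)


if __name__ == "__main__":
    host, port = parse_connect("CONNECT www.ibm.com:443 HTTP/1.0")
    print(host, port)
    print(is_allowed(host, "GET / HTTP/1.0"))
```

The rule check only becomes possible because the request line is available in clear text at the proxy; a conventional Proxy server that merely tunnels the CONNECT traffic would see the host and port but never the path.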
Second Proxy according to the present invention
Alternatively, the request can be chained in clear text to another Proxy server (e.g., a virus scanning Proxy). Figure 5 shows a corporate Local Area Network (LAN) (503) accessible from the Internet network (500) through a system comprising a Firewall (501), a Chained Proxy server (507) and a HTTPS Proxy system (502). The HTTPS Proxy system (502) according to the present invention is physically connected to :
• the Chained Proxy (507). Datagrams (all traffic) coming to and from the Internet (500) are received by this Chained Proxy server (507) after having been filtered by the Firewall (501).
• a corporate LAN network (503). As the corporate servers and clients (504) are not directly connected to the Internet network (500), it is possible to define the corporate LAN network (503) by a range of private IP addresses, although this is not a requirement for the present invention.
The HTTPS Proxy system (502) comprises :
• an internal corporate Certificate Authority (CA) (505) for generating "fake" certificates.
• means (506) for translating (or decrypting) HTTPS traffic into HTTP traffic without being detected by the clients (Browsers), as a "man in the middle".
The HTTPS Proxy system (502) forwards the content of HTTPS requests in an unusual way : for each connection request, a new certificate for the destination server (508), signed by a corporate internal Certificate Authority (CA) (505), is automatically generated (a certificate previously cached can also be used). The request is then transparently forwarded to the Chained Proxy (507) in the same manner as a normal or standard HTTP Proxy server would do.
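The reuse of previously cached certificates can be illustrated by the following minimal sketch. It is an assumption-laden illustration rather than the described system: the class `CertificateCache` and the function `fake_internal_ca` are hypothetical names, and the "certificate" is a placeholder byte string. A real system would build and sign an X.509 certificate with the internal Certificate Authority (505) at the point where the placeholder signer is called.

```python
from typing import Callable, Dict


class CertificateCache:
    """Per-hostname certificate cache; signs a new certificate only on a miss."""

    def __init__(self, sign: Callable[[str], bytes]) -> None:
        # 'sign' stands in for the internal CA; it is a hypothetical callable.
        self._sign = sign
        self._certs: Dict[str, bytes] = {}

    def get(self, hostname: str) -> bytes:
        key = hostname.lower()
        cert = self._certs.get(key)
        if cert is None:
            # Cache miss: create a certificate and have it signed by the CA.
            cert = self._sign(key)
            self._certs[key] = cert
        return cert


def fake_internal_ca(hostname: str) -> bytes:
    """Placeholder signer: returns a label instead of a real X.509 certificate."""
    return ("certificate for " + hostname + " signed by internal CA").encode()


if __name__ == "__main__":
    cache = CertificateCache(fake_internal_ca)
    first = cache.get("www.ibm.com")
    second = cache.get("www.ibm.com")
    print(first is second)
```

Keying the cache on the lower-cased hostname is a design choice of this sketch; it avoids requesting a second signature for the same destination server when only the letter case of the requested name differs.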
Method for filtering HTTPS traffic using a second Proxy according to the present invention
As shown in Figure 6, step 404 of the method previously described in Figure 4, is replaced by a non-SSL connection to the Chained Proxy server and step 408 is modified. The result is the following :
• Step 601. The client sends a connection request (CONNECT request) to the HTTPS Proxy system (502).
• Step 602. The HTTPS Proxy system (502) tests whether or not the Proxy cache contains a certificate for the requested connection.
• Step 603. If the cache doesn't contain any certificate for the requested connection, the HTTPS Proxy system creates a certificate for the requested connection (in the example, 'www.ibm.com') and forwards this certificate for signature to an internal Certificate Authority (CA). If the cache already contains a certificate for the requested connection, the process goes on directly with step 604.
• Step 604. The HTTPS Proxy system replies to the client that a TCP connection has been established with the destination server.
• Step 604. The HTTPS Proxy system establishes a TCP (non-SSL) connection to the Chained Proxy.
• Step 605. The HTTPS Proxy system replies to the client that a TCP connection has been established with the destination server (i.e, the HTTPS Proxy system replies "HTTP/1.0 200 Connection Established"). The client expects this response, although this response does not need to be true at this point. Note that "200" is the HTTP code for "OK".
• Step 606. The HTTPS Proxy system :
  • establishes a SSL session with the client across the TCP connection already established with the client, and
  • identifies itself to the client as the destination server by presenting to the client, during the SSL session establishment, the newly generated certificate.
  The client accepts this certificate provided that all the clients connected to the corporate network accept the internal Certificate Authority (CA) as a Trusted Certificate Authority.
• Step 607. The client sends the HTTP request, embedded in the SSL session previously established with the HTTPS Proxy system (i.e, queries "GET / HTTP/1.0"), assuming that the HTTP request is sent to the destination server instead of the HTTPS Proxy system.
• Step 608. The HTTPS proxy system receives the HTTP request from the client and sees the content of this HTTP request, as the SSL session has been established between the client and the HTTPS proxy system. At this point the HTTPS proxy system can apply access rules per URL the same way as a HTTP proxy would do. The HTTP request is then sent by the HTTPS Proxy system to the Chained Proxy in a standard way (i.e., the HTTPS Proxy system sends "GET https://www.ibm.com/ HTTP/1.1" to the Chained Proxy).
• Step 609. The Chained Proxy sends a standard HTTPS request to the destination server.
• Step 610. The Chained Proxy receives the HTTP response from the destination server and sees the content of this HTTP response, as the SSL session has been established between the Chained Proxy and the destination server. At this point the Chained Proxy can filter the HTTP response. The Chained Proxy then forwards to the HTTPS Proxy system the reply received from the destination server.
• Step 611. The HTTPS proxy system receives the HTTP response from the Chained Proxy in clear text (as there is no SSL between the HTTPS proxy system and the Chained Proxy) and sees the content of this HTTP response, so it can filter the content of the HTTP response the same way as a HTTP proxy would do. The HTTPS proxy system then forwards the reply received from the Chained Proxy to the client that originated the request.
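The rewriting performed in step 608, where the request received inside the SSL session is sent in clear text to the Chained Proxy, can be sketched as follows. This is a hedged illustration, not the described implementation: the function name `to_chained_proxy_request` is hypothetical, and the sketch assumes that the client request uses an origin-form target beginning with "/" and that the host and port are those recorded from the earlier CONNECT request.

```python
def to_chained_proxy_request(host: str, port: int, request_line: str) -> str:
    """Turn 'GET / HTTP/1.0' into the absolute form sent to the Chained Proxy.

    host and port are assumed to come from the CONNECT request of step 601.
    """
    method, path, _version = request_line.strip().split(" ", 2)
    # Omit the port when it is the default HTTPS port, as in the example
    # "GET https://www.ibm.com/ HTTP/1.1" given in step 608.
    authority = host if port == 443 else "{}:{}".format(host, port)
    return "{} https://{}{} HTTP/1.1".format(method, authority, path)


if __name__ == "__main__":
    print(to_chained_proxy_request("www.ibm.com", 443, "GET / HTTP/1.0"))
```

With the values used in the description, the sketch produces the request line quoted in step 608, so that the Chained Proxy itself opens the SSL connection to the destination server in step 609.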
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.