VERIFYING DIGITAL CONTENT INTEGRITY
Technical Field
The following disclosure relates generally to the field of data security and more particularly to verifying the integrity of content in a server environment.
Background
Various mechanisms exist to enhance the security of digital content associated with networks such as the World Wide Web and the Internet. These mechanisms include the use of passwords, firewalls, access and intrusion control devices, encryption techniques, and others that address the ease of accessing and altering information on the World Wide Web. To protect data during transit, most web sites or server computers use some form of secure protocol such as Secure Sockets Layer (SSL).
The use of session keys known only to a client computer and a corresponding web server secures data while it is in transit by uniquely encrypting the data. However, the security of the data before or after the transmission is not guaranteed. While the data resides on the Web server or in computers throughout the server environment, the data is subject to attacks and intrusions from a number of avenues. Web servers have many access points. Data housed on a Web server or computers in a server environment can be accessed by any number of entry means including administrative functions, e-mail, publishing, and diagnostics. Furthermore, backdoors and other vulnerabilities exist that make protecting web content extremely difficult.
As technology has advanced, so has the ability for unwanted intrusions into a company's network. While not all intrusions are malicious in nature, the resulting interruptions frequently result in loss of revenue and damage to reputation. Users demand Internet sites to be available 24 hours per day, 7 days a week. Web site downtime causes immediate sales losses and often damages public confidence longer term. Compounding the nature of competition in e-business, in which alternatives are only a mouse click away, are hackers who often invade and alter web sites as a matter
of pride. Invasions into a web server can corrupt web sites, defacing the sites aesthetically and/or conveying bogus information. Such invasions often cost revenue and good will. Encountering false data is an experience that can mean a permanent loss of a customer. Companies invest intensively to provide a highly available and positive user experience. It is therefore necessary to protect digital content while the content is in transit and while the content is stored. A company must be able to rely on the integrity of the information being posted on the company's web server. The challenge has been to arrive at a balance between security and accessibility. It is easier to build an impenetrable safe than it is to build a secure repository of information that delivers requested data reliably. It is also desirable to safeguard against intrusion in a manner that is transparent to the server environment, the web server, and the web user. These and other challenges are addressed in the following disclosure.
Brief Description of the Drawings
Figure 1 is a block diagram showing one embodiment of a system architecture for verifying content integrity.
Figure 2 is a block diagram showing one embodiment of a hardware architecture for a content integrity device.
Figure 3 is a block diagram showing one embodiment of a software architecture for a content integrity device.
Figure 4 is a flow diagram of one embodiment of a method for verifying content integrity using cryptographic operations.
Figure 5 is a flow diagram of one embodiment of a method for verifying content integrity using digital signatures.
Figure 6 is a flow diagram of one embodiment of a method for verifying content integrity using encryption and digital signatures.
Summary of the Invention
Systems and methods for ensuring the integrity of digital content are described. In embodiments of the invention, the integrity of digital content, as it is transmitted from one computing environment to another computing environment via a communications network, is verified by comparing the results of cryptographic operations performed on the content. In particular, a first cryptographic operation may be performed on digital content, upon which the content may be made available for transmission over the communications network. In some embodiments of the invention, content may be stored for future distribution after the performance of the first cryptographic operation. Prior to the release of the digital content, a second, corresponding cryptographic operation may be performed upon the digital content. As a precondition to releasing the content to a requestor, the integrity of the content is verified to ensure it has not been altered. In embodiments of the invention, these verification protocols can be performed by use of a content integrity device; this enables the operations to be performed transparently, without necessitating any modifications to existing software architectures.
In embodiments of the invention, two content integrity devices may be coupled to a content distribution device in communication with the external network; in some such embodiments, the content distribution device is a content distribution server. Digital content which is intended for publishing is intercepted by one of the content integrity devices as the content travels to the content distribution server. Upon arrival at the first content integrity device, the device performs a cryptographic operation such as a digital signature of the content. Once the digital signature has been created, the digital content and signature are passed to the content distribution server. Upon receiving a request for the digital content from outside the server environment, the content distribution server identifies the requested digital content and passes the digital content and its associated signature to a second content integrity device.
The second content integrity device executes the corresponding cryptographic operations performed by the first content integrity device upon the digital content. In some embodiments of the invention, the signature prepared by the second content integrity device is compared to the original in order to verify the content. If, and only if, the digital content is verified, the content integrity device concludes that the integrity of
the digital content is intact and, accordingly, forwards the digital content to the requester outside of the server environment.
Another aspect of the invention includes ensuring the privacy of the digital content while the content resides on a content distribution server, storage medium, or isolated network. Content released for publication by one computing environment is intercepted by a content integrity device. The content integrity device performs cryptographic operations on the content, operations which may include, but are not limited to, the creation of a digital signature. Other cryptographic operations which may also be performed include encryption and compression operations. In some such embodiments, content which has been encrypted, along with any associated digital signatures, is passed to a content distribution server where the encrypted content and signature reside until requested by a user outside of the server environment. Upon receiving an external request for the digital content, the content distribution server identifies the encrypted content as the content requested by the user and forwards the encrypted content and associated signature to another content integrity device. The second content integrity device proceeds to (1) decrypt the content and (2) verify the integrity of the decrypted content. The decrypted digital content is released to the requesting user only after verifying the digital signature.
Other embodiments of the invention include caching the content on a content integrity device so that if altered content is discovered, previously verified content can be forwarded to the requesting user without having to reacquire content from its source. Other aspects of the content integrity device include other cryptographic operations and combinations of cryptographic operations to verify the integrity of the content and ensure the security of the content while residing on the content distribution server. These and other embodiments are described in greater detail herein.
Detailed Description of the Illustrated Embodiments
A. Overview
The invention described herein includes systems and methods for securing and verifying the integrity of digital content. In embodiments of the invention, digital content produced within a secure environment may be conveyed to an intermediate device, such as a publishing system, and subsequently made available for access by users
external to the secure environment. In some such embodiments, digital content which is ready for publishing may be forwarded to a content distribution device, such as a web server, via a device which verifies the integrity of the content. Upon receiving the content from the secure, authorized source, the content integrity device performs one or more cryptographic operations on the content, and subsequently makes it accessible to users outside of the secure environment. In embodiments of the invention, when the content distribution device receives an external request for the content, the content is identified and forwarded to a second content integrity device. This second device performs one or more cryptographic operations, such as examining the content by verifying the associated digital signature created by the first content integrity device.
In embodiments of the invention, content integrity is verified by a dedicated content integrity device, which is separate from the devices which house the digital content prior to its external publication. In such embodiments, the content integrity devices are coupled to the remaining devices, such as the content dispersal systems and data networks, in a manner that obviates modification to the remaining devices. This enables web site operators or other managers of digital content to ensure the integrity of digital content transparently, without altering current software or modifying existing hardware on their systems.
B. Network Architecture of the Content Integrity System
A network architecture 100 for verifying content integrity is illustrated in Figure 1. The system 100 includes two content integrity devices 102, 104. A first content integrity device 102 is coupled between a publishing system 106 and a content distribution server 110. The publishing system 106 may be further coupled to one or more server computers 112. The intranet established by the publishing system 106 and server computers 112 is, for purposes of this discussion, isolated from outside intervention and can be considered a secure environment. The second content integrity device 104 is coupled between the content distribution server 110 and a network 120, such as the Internet or an intranet, which is in turn coupled to several other computers 130 from which content requests may originate; by way of non-limiting example, the requestors may be client programs such as web browsers. The requestors 130 can each possess software for accessing network resources, such as, by way of non-limiting example, a
web browser that, when directed by a user, requests content from the content distribution server. The protocol for exchange and transport of this information can be any protocol known to one skilled in the relevant art and includes but is not limited to HyperText Transfer Protocol (HTTP), File Transfer Protocol (FTP), HyperText Transfer Protocol Secure (HTTPS), Common Internet File System (CIFS), and Network File System (NFS). Other suitable protocols shall be apparent to those skilled in the art.
In one embodiment of the claimed invention, content developed on the server computers 112 is communicated to the publishing system 106 and passed to the content distribution server 110 via the first content integrity device 102. The first content integrity device 102 generates a first digital signature and digitally signs the content before the content reaches the content distribution server 110. Upon receiving a request for the content from a client computer 130 via network 120, the content is passed from the content distribution server 110 to the second content integrity device 104 where the digital signature is verified. If the content is successfully verified, the content is released to the requesting client computer 130. If the content is not verified, its release is blocked, preventing unauthorized or modified content from being released. In an embodiment of the claimed invention, the content integrity device 104 stores a trusted version of the digital content. This cache may be updated either periodically or each time content is verified prior to transmittal. If the verification process fails, the trusted, cached content can be forwarded to the requesting user in place of the content provided by the content distribution server. In some embodiments of the invention, logs may be maintained which record verification failures. Other responses undertaken in case the content integrity is compromised shall be apparent to those skilled in the art. The functionality provided by the content integrity devices 102 and 104 can be hosted on dedicated network appliances as shown in Figure 1, but is not so limited. The content integrity functionality can also be performed by, or distributed among, any combination of the publishing system 106 and server computers 112, numerous client processing devices and browsers 130 coupled to the network 120, and any of the associated network components.
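By way of illustration only, the sign-on-publish and verify-on-request behavior of devices 102 and 104 might be sketched as follows. The sketch assumes a keyed hash (HMAC-SHA256) in place of whatever signature algorithm an embodiment actually uses; the key value and the function names `sign` and `verify_and_release` are hypothetical and do not appear in the disclosure.

```python
import hashlib
import hmac

# Hypothetical key shared by devices 102 and 104. In the disclosure, keys
# would be held by the devices themselves, not written into source code.
SIGNING_KEY = b"example-key-shared-by-devices-102-and-104"


def sign(content: bytes, key: bytes = SIGNING_KEY) -> bytes:
    """First device (102): compute a keyed digest that travels with the content."""
    return hmac.new(key, content, hashlib.sha256).digest()


def verify_and_release(content: bytes, signature: bytes, key: bytes = SIGNING_KEY):
    """Second device (104): return the content only if its signature matches."""
    expected = hmac.new(key, content, hashlib.sha256).digest()
    if hmac.compare_digest(expected, signature):
        return content  # verified: forward to the requesting client 130
    return None         # not verified: block the release


page = b"<html>Quarterly results</html>"
tag = sign(page)                                              # on publication
released = verify_and_release(page, tag)                      # unaltered content
blocked = verify_and_release(b"<html>Defaced</html>", tag)    # altered content
```

In this sketch `released` equals the original page and `blocked` is `None`, mirroring the release and block outcomes described above.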
Typically each content integrity device can include at least one processor capable of executing computer executable instructions and at least one storage medium for retention of data and software. In some embodiments of the invention, protection and verification operations can be performed on a single physical
device. Many hardware and/or network architectures which support the content integrity functions will be apparent to those skilled in the art.
The cryptographic operations which may be performed by a content integrity device are multifarious. Encryption operations may be symmetric, such as, by way of non-limiting example, any of the variants of the Data Encryption Standard (DES).
Alternatively, asymmetric encryption may be employed, such as, by way of non-limiting example, public-private key algorithms, such as any of the variants of Rivest-Shamir-Adleman (RSA), Pretty Good Privacy (PGP), or other examples which will be apparent to those skilled in the art. The cryptographic operations may include one-way functions, such as one-way hash functions. These one-way hash functions may include, by way of non-limiting example, any of the variants of Secure Hash Algorithm (SHA), Message Authentication Code (MAC), and Message Digest (MD) functions. In embodiments of the invention, a content integrity device may perform multiple encryption functions on digital content. These and other permutations of cryptographic functions in the invention shall be apparent to those skilled in the art.
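As a non-limiting sketch of the difference between an unkeyed one-way hash and a keyed hash, the following uses SHA-256 and HMAC-SHA256 from the Python standard library. The sample content, the key, and the helper names are assumptions made for illustration. The point shown is that anyone who can replace the content can also recompute an unkeyed digest, whereas recomputing a keyed digest requires the secret key.

```python
import hashlib
import hmac

content = b"Published page body"
tampered = b"Published page b0dy"

# Hypothetical secret key used only for the keyed hash.
MAC_KEY = b"hypothetical-secret-key"


def plain_digest(data: bytes) -> str:
    """Unkeyed one-way hash (a SHA variant)."""
    return hashlib.sha256(data).hexdigest()


def keyed_digest(data: bytes, key: bytes = MAC_KEY) -> str:
    """Keyed hash: the digest depends on both the data and the key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


# Any alteration of the content changes both kinds of digest.
digest_changed = plain_digest(content) != plain_digest(tampered)
mac_changed = keyed_digest(content) != keyed_digest(tampered)

# An intruder without MAC_KEY cannot produce a matching keyed digest;
# a digest computed under a guessed key does not match.
forged = keyed_digest(tampered, key=b"attacker-guess")
forgery_accepted = hmac.compare_digest(forged, keyed_digest(tampered))
```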
C. Hardware Architectures of the Content Integrity System
Figure 2 illustrates a hardware architecture 200 for verifying the integrity of digital content. The hardware architecture 200 includes at least one processor 208, a memory system 210, and I/O. Inherent to the architecture 200 is a system bus 206 that operatively couples the various components together. The processing unit may be any logic processing unit, such as one or more central processing units (CPUs), digital signal processors (DSPs), application-specific integrated circuits (ASIC), etc. Unless described otherwise, the construction and operation of the various blocks shown in Figure 2 are of conventional design. As a result, such blocks need not be described in further detail herein, as they will be readily understood by one skilled in the relevant art.
The operating system 202 contains the basic routines that help transfer information amongst the elements within the architecture 200. In non-limiting embodiments of the invention, the operating system is based on a version of the Linux operating system. The system bus 206 can employ any known bus structures or architectures including a memory bus with memory controller, a peripheral bus, and a local bus. The memory 210 includes read-only memory (ROM) and random access memory (RAM). The input / output system 216 contains basic routines that help transfer information between elements within the content integrity device. Non-limiting examples of input / output 216 include various forms of Ethernet. The content integrity device can also include secondary storage media, non-limiting examples of which include a hard disk drive for reading from and writing to a hard disk, and an optical disk drive and a magnetic disk drive for reading from and writing to removable optical disks and magnetic disks, respectively. The optical disk can be a CD-ROM, while the magnetic disk can be a magnetic floppy disk. The hard disk drive, optical disk drive and magnetic disk drive communicate with the processing unit 208 via the bus 206. The hard disk drive, optical disk drive and magnetic disk drive may include interfaces or controllers coupled between such drives and the bus 206, as is known by those skilled in the art. The drives and their associated computer-readable media provide nonvolatile storage of computer readable instructions, data structures, program modules and other data for the content integrity device. Those skilled in the relevant art will appreciate that other types of computer-readable media that can store data accessible by a processor may be employed, such as magnetic cassettes, flash memory cards, digital video disks ("DVD"), Bernoulli cartridges, RAMs, ROMs, smart cards, etc.
Program modules can be stored in the system memory 210, such as an operating system, one or more application programs, other programs or modules, and program data. In embodiments of the invention, the system memory 210 may also include software for permitting the content integrity device to access and exchange data with web sites in the World Wide Web of the Internet.
In one embodiment the system memory 210 stores private and public keys that enable the processing unit 208 to create digital signatures of content received through the input and output ports 216. Furthermore, various algorithms and other computer cryptographic executable codes are retained in the system memory 210 for the encryption and decryption of the digital content. The operating system 202 can also direct the caching of the digital content in both the encrypted and clear text state as well as associating a digital signature generated in the content integrity device with a particular piece of digital content.
A further aspect of the content integrity device is the inclusion of a hardware security module 220 with a smart card. The hardware security module 220 comprises a
tamper resistant device that stores private keys in a secure format. The private keys are encrypted using a separate group key known to a select, predefined group of ancillary network devices. This encrypted group of keys can be transported between the various devices using a smart card. The smart card can also be used to back up the encrypted key data. As the data contained on the smart cards is encrypted with a separate key, the encrypted group of private keys can only be accessed by one of the devices in the predefined group.
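The group key that protects these private keys may itself be divided among several smart cards under the "k out of n" protocol discussed in this section. One well-known way to realize such a protocol is Shamir's secret sharing over a prime field; the disclosure does not name a particular scheme, so the choice of Shamir's scheme, the prime modulus, and the function names below are all assumptions made for illustration.

```python
import secrets

# Prime modulus for the finite field (2**127 - 1 is a Mersenne prime).
PRIME = 2**127 - 1


def split_secret(secret: int, k: int, n: int):
    """Split `secret` into n shares such that any k of them reconstruct it."""
    if not (0 <= secret < PRIME and 1 <= k <= n):
        raise ValueError("invalid parameters")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse of den via Fermat's little theorem.
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


group_key = secrets.randbelow(PRIME)           # stand-in for the group key
cards = split_secret(group_key, k=3, n=5)      # one share per smart card
restored = recover_secret([cards[0], cards[2], cards[4]])
```

With k=3 and n=5, any three of the five shares reproduce `group_key`, while fewer than three shares reveal nothing about it, which corresponds to the three-of-five smart card example in this section.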
Another aspect of the hardware security module is a protocol that supports "k out of n" secret sharing of the separate group key. Such a protocol enhances security by requiring each device to use multiple smart cards for backing up and restoring the private keys. For example, if the private key information is distributed across a group of five smart cards (n=5), preferences can be established requiring that three smart cards (k=3) be inserted into a smart card reader before the group data can be accessed. Any attempt to access the data with fewer than three smart cards will fail. Using a "k out of n" schema enhances data security. If a single card is stolen, misplaced or unaccounted for, an unauthorized user of the card will not be able to access the key data stored on the hardware security module.
D. Software Architecture of the Content Integrity System
Figure 3 shows one embodiment of a software architecture 300 for verifying the integrity of digital content by use of a content integrity device. The architecture 300 includes a caching engine 315, a content identification engine 320, a cryptographic engine 325, a digital signature engine 330, and a process manager 332. As content is received, the process manager 332 uses the digital signature engine 330 and cryptographic engine 325 to perform cryptographic operations on the content. Once completed, the content may optionally be cached using the caching engine 315 for subsequent access and dispersal. In addition to directing cryptographic operations on the content, the process manager 332 uses the content identification engine 320 to associate the content with the digital signature and encrypted data. In one embodiment, the cryptographic engine 325 may embed a numerical representation, such as a digital signature, in the content itself. In other embodiments the numerical representation may or may not be part of the content. Upon receiving a request for the content, the process manager 332, using the content identification engine 320,
identifies and retrieves the content and associated signatures from the cache for dispersal.
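The cooperation of these engines might be sketched, purely for illustration, as a single class in which each engine is reduced to a dictionary or a helper method. The class name, method names, and the use of HMAC-SHA256 as the signature operation are assumptions and are not taken from Figure 3.

```python
import hashlib
import hmac


class ContentIntegrityDevice:
    """Toy stand-in for architecture 300; all names here are hypothetical."""

    def __init__(self, key: bytes):
        self._key = key
        self._cache = {}       # caching engine 315: content_id -> content
        self._signatures = {}  # identification engine 320: content_id -> signature

    def _sign(self, content: bytes) -> bytes:
        # Digital signature engine 330, approximated here by HMAC-SHA256.
        return hmac.new(self._key, content, hashlib.sha256).digest()

    def receive(self, content_id: str, content: bytes) -> bytes:
        """Process manager 332: sign incoming content, associate it, cache it."""
        signature = self._sign(content)
        self._signatures[content_id] = signature
        self._cache[content_id] = content
        return signature

    def retrieve(self, content_id: str):
        """Identify the requested content and return it with its signature."""
        return self._cache[content_id], self._signatures[content_id]


device = ContentIntegrityDevice(key=b"hypothetical-device-key")
device.receive("/index.html", b"<html>Home</html>")
stored_content, stored_signature = device.retrieve("/index.html")
```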
In the embodiment shown in Figure 3, the process manager 332 manages content received from a server environment 340 that includes a publishing system 350 and other server computers 360. The process manager 332 conveys the content to entities outside the server environment such as a content distribution server 375 and a network 385 such as the Internet. While this embodiment illustrates the software architecture 300 of a content integrity device which is positioned between a server environment 340 and a content distribution server 375, in alternate embodiments an identical software architecture 300 may be placed between the content distribution server 375 and a network 385 to facilitate the verification of the content's integrity as a condition to its distribution over the network 385.
E. Integrity Verification Process
The invention supports several techniques for verifying the integrity of digital content. A method 400 for verifying the integrity of content in a network environment used in embodiments of the invention is illustrated in the flow diagram of Figure 4.
Content is published 405 and intercepted by a content integrity device. The content integrity device performs at least one cryptographic operation on the content 410.
These operations can include encryption operations, decryption operations, hash operations, keyed hash operations, keyed hash verifications, digital signatures, signature verification, checksums, and other like operations known to those skilled in the relevant art.
Once the desired cryptographic operation has been completed 410, the content is transferred 415 to a content distribution server, along with any associated result from the cryptographic operation, for dissemination over the Internet or like network.
The content remains on the content distribution server until an updated version is received from the publishing server. By way of non-limiting example, a client computer may request content from a web page resident on the content distribution server. The content distribution server 420 receives the request for the content and identifies the desired object with the appropriate embedded objects to send. As the content is directed to the client's IP address, the content is intercepted 425 by the second content integrity device.
The second content integrity device performs additional cryptographic operations on the content. Before the content is released to requests originating outside the system, the signature is verified 435. If the verification succeeds, indicating that the content has not been altered, the content integrity device releases the content 440 to the requesting client.
Figure 5 depicts a flow diagram for verifying content integrity 500 by use of digital signatures. Content is received at the first integrity device from a trusted source such as the publishing server 505. Transparent to the publishing server and the content distribution server, the content integrity device creates a digital signature 510. The digital signature can be created using various algorithms known to one skilled in the relevant art. The digital signature of the content is formed in one embodiment by using secret information such as a private key which can be later verified by using public information such as a public key. Other techniques which may be used to create the digital signature include, but are not limited to, one-way hashing, keyed hashes such as Keyed Hash Message Authentication Code (HMAC), timestamps, and other techniques that will be apparent to those skilled in the art. Once created, the digital signature is associated 515 with the content and then transferred to the content distribution server, or other storage medium 520.
In embodiments of the invention, the content remains on the content distribution server until it is either requested by a client or replaced by the publishing server 525.
Upon receipt of the request, the appropriate content is identified 530 and ultimately forwarded for integrity verification 535.
The content and the associated signature arrive at the second integrity device where the signature is verified 540. As the algorithms and keys used correspond to the original signature, the signature verification will be successful if the content has not been altered. Likewise the signature verification will fail if false data has been placed on the content distribution server lacking the proper signature. If the signatures are verified 545, the content is released 550 to the requestor. In an alternative embodiment, the content integrity device maintains a cache of the verified content. Upon detecting a discrepancy in the digital signatures, the content integrity device releases the cached content and alerts the network manager of the presence of false data 560. Various other protocols in response to a verification failure can be established that are aligned with the use of the methodology and techniques for
content integrity verification described herein; these protocols shall be apparent to those skilled in the art.
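The cached-content response to a verification failure could look roughly like the sketch below. It assumes an HMAC-SHA256 signature, a cache that is refreshed on every successful verification, and a log message standing in for the alert to the network manager; the class name `VerifyingDevice`, the method `handle`, and the logger name are hypothetical.

```python
import hashlib
import hmac
import logging

logger = logging.getLogger("content_integrity")


class VerifyingDevice:
    """Second integrity device with a cache of previously verified content."""

    def __init__(self, key: bytes):
        self._key = key
        self._trusted = {}  # content_id -> most recently verified content

    def handle(self, content_id: str, content: bytes, signature: bytes):
        expected = hmac.new(self._key, content, hashlib.sha256).digest()
        if hmac.compare_digest(expected, signature):
            self._trusted[content_id] = content  # refresh cache on success
            return content
        # Discrepancy detected: record it and fall back to the trusted copy.
        logger.warning("signature mismatch for %s", content_id)
        return self._trusted.get(content_id)  # None if nothing was cached


key = b"hypothetical-device-key"
good = b"<html>Price list</html>"
good_sig = hmac.new(key, good, hashlib.sha256).digest()

device = VerifyingDevice(key)
first = device.handle("/prices", good, good_sig)                 # verified
second = device.handle("/prices", b"<html>Bogus</html>", good_sig)  # falls back
```

Here `second` is the previously verified page rather than the altered one. Returning `None` when no trusted copy exists corresponds to blocking the release.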
An alternative method for verifying the integrity of digital content is shown in the flow diagram of Figure 6. Continuing with the theme of ensuring the integrity of the content prior to its dispersal, this method 600 also protects the privacy of the content as it rests on the content distribution server. As described herein, content is published by the publishing server 605 to a first integrity device. Upon arrival, a digital signature of the content is formed 610 using methodology described herein and known to one skilled in the relevant art. The content is subsequently encrypted 615 using, by way of non-limiting example, an algorithm such as Data Encryption Standard (DES), Rivest-Shamir-Adleman (RSA), or another cipher commonly known to one skilled in the relevant art. In an alternative embodiment, the content is signed using a private key and encrypted using a distinct public key.
The encrypted content is associated 620 with the digital signature and transferred 625 to the content distribution server. The content and digital signature reside on the content distribution server in an encrypted state until updated by new authorized data. The content remains encrypted until a request for the content 630 is received from a requestor. Upon receiving the request, the content distribution server 635 identifies the content and associated signature. The encrypted content and signature are then forwarded 640 to a second content integrity device.
The second content integrity device decrypts the content 645 using, in one embodiment, the corresponding private key of the public/private key pair. With the content decrypted, the digital signature is verified. In one embodiment, verifying the digital signature includes using a public key corresponding to the public/private key pair utilized by the first content integrity device. Other techniques for verifying integrity readily known to those skilled in the relevant art can also be used without affecting the functionality of the invention. Having verified 655 the digital signature, the decrypted content is released 660 to the Internet and ultimately to the requestor. In embodiments of the invention, slightly different operations may be used for the encryption and the signatures.
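The ordering of operations in method 600 (sign, encrypt, store, decrypt, verify, release) can be sketched as follows. Because the Python standard library provides neither DES nor RSA, this sketch substitutes an illustrative XOR stream cipher keyed by SHAKE-256 and an HMAC-SHA256 signature. Both substitutions, the key values, and the function names are assumptions made only to show the sequence of steps, and the toy cipher is not intended for real use.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Illustrative cipher only: XOR the data with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def protect(content: bytes, sign_key: bytes, enc_key: bytes):
    """First device: form the signature (610), then encrypt (615)."""
    signature = hmac.new(sign_key, content, hashlib.sha256).digest()
    nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(enc_key, nonce, content)
    return nonce, ciphertext, signature


def release(nonce, ciphertext, signature, sign_key: bytes, enc_key: bytes):
    """Second device: decrypt (645), verify, and release only on success."""
    content = _keystream_xor(enc_key, nonce, ciphertext)
    expected = hmac.new(sign_key, content, hashlib.sha256).digest()
    return content if hmac.compare_digest(expected, signature) else None


sign_key = b"hypothetical-signing-key"
enc_key = b"hypothetical-encryption-key"
page = b"<html>Members only</html>"

nonce, stored, sig = protect(page, sign_key, enc_key)   # what the server holds
served = release(nonce, stored, sig, sign_key, enc_key)

# Simulate an intruder flipping one bit of the stored ciphertext.
altered = bytes([stored[0] ^ 0x01]) + stored[1:]
rejected = release(nonce, altered, sig, sign_key, enc_key)
```

In this sketch the content distribution server only ever holds `stored`, which differs from the clear text, and any alteration of it causes `release` to return `None`.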
The system and methodology described herein protect the content on a content distribution server from being stolen and altered by unauthorized users. They further verify that the content being served to the Internet from a content distribution server is
the content that was intended to be published to the content distribution server. The content is signed as it is being published to the content distribution server. If the signature associated with the content upon transmission from the content distribution server is different from the one when the content was initially published, the transmission is blocked ensuring the client is not exposed to false or misleading data.
Alternative Embodiments
Though many of the embodiments described herein involve the deployment of two content integrity devices, many alternative embodiments shall be apparent to those skilled in the art. For instance, the verification procedures may be performed on a single device. By way of non-limiting example, a single content integrity device may include separate processes which perform an initial digital signature on digital content, and then, prior to release by a content distribution server, verify the integrity of the digital content by performing and comparing a second digital signature on the content. In some embodiments of the invention, the content integrity device may itself be incorporated into a content distribution server, or may comprise discrete processes within a content distribution server. In some embodiments, the encryption and verification processes described supra may be at least partially performed on line cards within networking devices. Additionally, a content integrity device may perform multiple cryptographic operations on digital content.
Those skilled in the relevant art will appreciate that the routines and other functions and methods described herein can be performed by or distributed among any of the components described herein. While many of the embodiments are shown and described as being implemented in hardware (e.g. one or more devices designed specifically for a task), such embodiments could equally be implemented in software and be performed by one or more processors. Such software can be stored on any suitable computer-readable medium, such as micro-code stored in a semiconductor chip, on a computer-readable disk, or downloaded from a server and stored locally at a client. The embodiments described herein are for illustrative purposes only; many equivalents and alternatives shall be apparent to those skilled in the art.