DATA ENCRYPTION AND DISTRIBUTION METHOD AND APPARATUS
The present invention relates to a data distribution method and apparatus, in particular for distributing data content securely. Companies which own rights in works, such as film studios and record companies, and also companies which transmit the data representing such works, such as cable companies and telecommunications companies, will collectively be referred to as content providers. There is a desire among content providers to begin to deliver content via electronic means, both to reduce their costs and to take a share of a growing market. So far they have tried to control distribution of their content by copyright and other legal means, but this is proving to be difficult. They are reluctant to start distribution over the internet because they see widespread piracy, for example mass-produced DVDs which are substantially identical to the authentic product, and because of the ease of file-sharing over the internet. A technical solution to this problem is desired, rather than resorting to legal action after the copyright material has already been copied. Two previous approaches are as follows. In the first, all consumers or subscribers are given the same key. Data to be distributed is encrypted by the content provider and can be broadcast to the consumers, who can then decrypt it. However, this has the problem that once an unauthorised person has obtained the key they can access all the data until the key for all of the consumers is changed. The second approach is to provide each consumer with a unique key and for the content provider to encrypt the content differently for each consumer. However, this approach prevents broadcasting and also requires a large computing resource to perform the unique encryption for each consumer. The present invention seeks to alleviate any or all of the above problems.
According to the invention there is provided a data distribution method for distribution of data comprising: dividing the data into a plurality of portions, wherein the division is independent of the underlying content of the data; encrypting at least a first portion of the data independently of the other portion or portions; and
distributing the data portions. This has the advantage of significantly reducing the computing resources needed by a content provider for distributing data content, because only the first portion need be independently encrypted, such as for each consumer. In practice, it can be virtually as secure as encrypting the whole of the data for each consumer because the other portion or portions of the data are generally useless unless the first portion can be decrypted. This is especially true because the division of the data is independent of the underlying content of the data, i.e. the data is not specifically divided at data boundaries nor according to particular structures of the format in which the data has been encoded, so it is difficult to patch up the data without obtaining and decrypting the first portion. Preferably at least said first portion is independently encrypted a plurality of times according to a different key each time, wherein each key is specific to a particular receiver or group of receivers. Accordingly, a unique key exists for each consumer or group of consumers, having particular receivers, so that even if that key is fraudulently obtained or cracked, the data streams for other consumers will still be secure. The key for each consumer may be changed remotely, for example by transmitting a new key using secure encryption. Preferably at least one of the data portions is broadcast for reception by many receivers. Broadcasting is efficient in terms of bandwidth used and can also be secure because of the difficulty of patching up the data without the decrypted first portion. Preferably at least one of the broadcast portions is encrypted according to a key such that it is receivable and decryptable by a single receiver or group of receivers. This can add a further level of security. Preferably the step of dividing the data into a plurality of portions comprises separating portions of the data according to a defined sequence.
More preferably, the sequence is defined according to a pseudo-random number generator. This enables the sender and receiver or receivers to pass data securely because only they know the sequence and so know which portions of data comprise the encrypted first portion. Preferably the first portion of the data comprises not more than 25% of the total data. Preferably the first portion of the data comprises not more than 10% of
the total data. Preferably the first portion of the data comprises not more than 1% of the total data. The lower the proportion of the data that is independently encrypted, the less the computing overhead. Preferably the first portion of the data comprises a variable proportion of the total data. This gives flexibility, and for example enables the size of the first portion of the data for independent encryption to be reduced at times of high demand. Preferably the first portion of the data comprises a plurality of individual bytes of data selected from the total data. Selecting individual bytes of data for independent encryption provides a convenient method of dividing the data. Preferably the selected bytes comprise, on average, less than one byte in every 100 bytes. This reduces the computational resources required to less than 1% of that which would be required if the whole data content were being encrypted uniquely for each consumer. Preferably the number of bytes between the selected bytes is fixed or variable. Again this provides flexibility and can assist in making it more difficult for a pirate to crack the encryption. Preferably the data comprises a variable length data type. The method is especially advantageous with a variable length encoded data type, such as MPEG, JPEG, H263, H264, because the portion or portions of the data which are not encrypted in the same way as the first portion would be very difficult to patch up. Decoding variable length data types correctly requires that the data stream is understood by the decoder in its entirety, because the decoder has no advance knowledge of each data item or its length. An error in such a data stream, which is what an encrypted portion or byte would appear to be, will cause the decoder to lose synchronization of the decode process until the next synchronizing character is correctly received.
Because the data from real video is fairly random, regaining synchronization after an error by trial and error is very difficult, and synchronization will only hold until the next error, i.e. the next encrypted portion. Another aspect of the invention provides a corresponding method for receiving the data distributed by this method and decrypting at least the first portion of the data.
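By way of illustration only, the division of a byte stream into a first portion (selected bytes) and a second portion (the remainder), and the corresponding reassembly at the receiver, might be sketched as follows. The function names `split_stream` and `merge_stream` and the fixed interval `stride` are assumptions made for this sketch and do not appear in the specification; no encryption is shown here.

```python
def split_stream(data: bytes, stride: int = 100):
    """Divide data without regard to its content.

    Every stride-th byte forms the first portion; all other bytes
    form the second portion.
    """
    first = data[stride - 1::stride]      # bytes at offsets stride-1, 2*stride-1, ...
    second = bytearray(data)
    del second[stride - 1::stride]        # remove the selected bytes
    return bytes(first), bytes(second)


def merge_stream(first: bytes, second: bytes, stride: int = 100) -> bytes:
    """Receiver side: interleave the two portions back into the original order."""
    a, b = iter(first), iter(second)
    total = len(first) + len(second)
    return bytes(
        next(a) if (i + 1) % stride == 0 else next(b)
        for i in range(total)
    )


stream = bytes(range(256)) * 4            # 1024 bytes of sample data
first, second = split_stream(stream)
restored = merge_stream(first, second)
```

With an interval of 100 the first portion here is 10 bytes out of 1024, i.e. a little under 1% of the total.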
Further aspects of the invention provide apparatus, integrated circuits, computer programs for implementing the above methods, and a data stream comprising a plurality of portions of data, at least a first portion of the data being encrypted independently of the other portion or portions of data, and wherein the division of the data into portions is independent of the content of the data.
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings in which: Fig. 1 is a schematic diagram of a server and terminal according to the invention for operating according to the data distribution method of the invention; and Fig. 2 is a schematic illustration of a data stream according to the invention.
Figure 1 illustrates schematically a server 10, which constitutes the data distribution apparatus of a content provider, and a terminal apparatus 30 of a data consumer, also called a receiver. There is a connection 50 between the server 10 and terminal 30 by means of which data can be transmitted between the server 10 and terminal 30. In the preferred embodiment the connection 50 comprises the internet, but it could be any other suitable connection, for example a local network, public telecommunications system, broadcast system and so on, using any suitable technology, for example cable connection, optical fibre, wireless links using radio waves, microwaves or infrared, satellite transmissions and so forth. The connection 50 does not have to be the same in both directions, for example the server 10 could send transmissions by broadcasting to a plurality of terminals 30, and each terminal 30 could send data to the server 10 by a telephone line. The term "connection" does not imply any fixed physical link; it could be intermittent and it could involve e.g. wireless technology. It is also not necessary for the connection to be two-way; for example the terminal 30 does not have to be able to communicate with the server 10, provided the server 10 knows in advance any encryption keys necessary for use with a specific terminal 30. One embodiment which is envisaged is for the server 10 to record data on a storage medium, such as a DVD, which is then distributed to the consumer for use in their terminal 30.
The server 10 comprises a content database 12, such as a large amount of storage containing desired data such as video and/or audio data in digital form; an encoder 14, the operation of which will be discussed below; a communication controller 16 for sending and receiving data over the connection 50; a server processing manager 18 for controlling the operation of the items within the server 10; and an optional key database 20 to be described below. The terminal apparatus 30 comprises the following elements: a communication controller 32 for sending and receiving data via the connection 50; a decoder 34, the operation of which will be described below; a storage unit 36; an input/output unit 38; and a processor 40 for controlling the elements of the terminal 30, which are connected to each other, in this embodiment, by means of a bus 42. The terminal apparatus 30 could comprise a personal computer (PC), or could be a specific card or chip for inclusion in a PC. Equally, the terminal apparatus 30 could be a dedicated unit, such as a so-called set-top box for use with a television or other audio/video equipment. The storage unit 36 in this embodiment includes a storage medium such as a magnetic hard disc or a recordable optical disc. The storage unit 36 can be integral with the terminal apparatus 30, or can be external to it, such as a disc drive of a PC or a DVD recorder. The storage medium within the storage unit 36 can be fixed or removable. The input/output unit 38 communicates with items that provide a user interface, such as a keyboard, remote controller, television or computer screen. One example of operation of the above-described apparatus will now be explained. The content provider may be offering, for example, video on demand. The user of the terminal apparatus 30 chooses a video that he wishes to receive and conveys this information via a user interface to the input/output unit 38.
This data request information is sent to the server 10 via the communication controller 32, connection 50 and communication controller 16. The decoder 34 in the terminal apparatus 30 includes a chip (integrated circuit), which stores a unique machine identification (ID) for the terminal apparatus 30, and a private machine key for the terminal apparatus 30. It may also contain a public key for the server 10, and a public key for the terminal apparatus corresponding to the private key. Communications from the terminal 30 to the server 10 can be encrypted using the
public key of the server 10, and all such terminal apparatus can use that same public key for communications, which can be decrypted by the server processing manager 18 using its private key. Even if these communications are intercepted they cannot be decrypted because knowledge of the public key is useless for that purpose; only the private key retained by the server processing manager 18 can decrypt them. In addition to the data request information, the terminal apparatus 30 also transmits its machine ID to the server 10. The server processing manager 18 uses the machine ID to look up a corresponding machine key for that particular terminal apparatus 30 in the key database 20 and the key is passed to the encoder 14. In this example the machine key is never transmitted so the key database 20 could simply contain the same machine key as in the decoder 34. In an alternative embodiment, the terminal apparatus 30 transmits its public key to the server 10, preferably in encrypted form. The server processing manager 18 decrypts this public key and passes it to the encoder 14, without the need for a key database. The machine ID can also be used in the server 10 for verifying that the terminal apparatus 30 belongs to an authorised user, and for billing that user. The server processing manager 18 passes the data request information to the content database 12 which supplies the requested data to the encoder 14. The data supplied from the content database may already be in encoded form, for example using MPEG encoding. The output from the content database 12 will be in the form of a data stream as shown schematically in the upper half of figure 2. The output of the encoder 14 is illustrated in the lower half of figure 2 and comprises two portions of data: the first portion comprises the parts marked "a" and the second portion comprises the parts marked "b".
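Purely as a sketch of the key lookup described above, the role of the key database 20 can be modelled as a mapping from machine ID to machine key. The class `KeyDatabase`, the function `handle_request`, the sample machine ID and the returned dictionary layout are all hypothetical names chosen for this illustration; the specification does not prescribe any particular data structure.

```python
class KeyDatabase:
    """Toy stand-in for key database 20: maps a machine ID to its machine key."""

    def __init__(self):
        self._keys = {}

    def register(self, machine_id: str, machine_key: bytes) -> None:
        # In practice the key would be installed when the decoder chip is issued.
        self._keys[machine_id] = machine_key

    def lookup(self, machine_id: str):
        # Returns None for an unknown (unauthorised) machine ID.
        return self._keys.get(machine_id)


def handle_request(db: KeyDatabase, machine_id: str, title: str):
    """Server-side handling of a data request accompanied by a machine ID.

    The machine key itself is never transmitted; it is only looked up
    and handed to the encoder.
    """
    key = db.lookup(machine_id)
    if key is None:
        return None                       # not an authorised terminal
    return {"title": title, "machine_key": key}


db = KeyDatabase()
db.register("terminal-0001", b"example machine key")
job = handle_request(db, "terminal-0001", "requested video")
```

The same lookup point could also serve the authorisation and billing checks mentioned above, since an unknown machine ID yields no key.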
In this embodiment the first portion comprises a plurality of individual bytes of the data stream, such as every hundredth byte, which have been encrypted using the machine key for the corresponding terminal apparatus 30. This data stream is then sent to the terminal apparatus 30 via communication controller 16, connection 50, and communication controller 32. In the terminal apparatus 30, the unique machine key in the decoder 34 is used to decrypt the first portion (i.e. bytes "a") of the data stream to reconstruct the data stream, which can then be sent for viewing via the input/output unit 38. Although the unit 34 is called a decoder, in this example it is more specifically involved in decryption; any further
decoding, for example to convert MPEG into viewable data could be done outside the terminal apparatus 30. The data stream is divided into the portions independently of the underlying content of the data, i.e. the data is not specifically divided at data boundaries or according to particular structures of the format in which the data has been encoded. In the present embodiment the data stream is treated as a stream of bytes, and within that stream individual bytes or groups of bytes are selected as the first portion "a" for independent encryption. The selection can be performed according to a defined sequence, such as every hundredth byte, or a sequence determined, for example, by a pseudo-random number generator. The second portion of the data stream, indicated by "b" in the bottom portion of figure 2, may also be encrypted, for example using a key common to all terminal apparatuses, but the point is that the first portion of the data stream is independently encrypted because it is encrypted using a key specific to one terminal apparatus 30, or optionally using a key specific to a subset of all terminal apparatuses. The second portion of the data can be broadcast to many terminal apparatuses because it is the same for each destination. Only the first portion (such as every hundredth byte) needs to be uniquely encrypted and sent to a specific terminal apparatus 30. Thus the server load is hugely reduced compared with encrypting each data stream separately for every consumer, and also the router packet load is greatly reduced for the packets for multiple destinations comprising the second portion of the data stream. If the same data is to be sent to more than one terminal, then the first portion of that data is independently encrypted a plurality of times according to a different key each time, wherein each key is specific to a particular terminal or group of terminals. 
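The saving in server load described above can be illustrated with a minimal sketch in which the second portion is produced once for all terminals and only the first portion is encrypted per terminal. The specification does not name a cipher; the SHA-256 based XOR keystream below is merely a stand-in chosen so that the sketch runs with the Python standard library, and it is not proposed as a secure design. The names `keystream`, `xor_cipher` and `encode_for_terminals` are assumptions of this sketch.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Stand-in keystream: SHA-256 of key plus a counter (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the data."""
    return bytes(x ^ k for x, k in zip(data, keystream(key, len(data))))


def encode_for_terminals(data: bytes, machine_keys: dict, stride: int = 100):
    """Return (shared second portion, {machine_id: encrypted first portion})."""
    first = data[stride - 1::stride]          # e.g. every hundredth byte
    second = bytearray(data)
    del second[stride - 1::stride]            # identical for every destination
    per_terminal = {
        machine_id: xor_cipher(key, first)    # unique work per terminal
        for machine_id, key in machine_keys.items()
    }
    return bytes(second), per_terminal


content = bytes(i % 251 for i in range(10_000))
keys = {"terminal-A": b"key-a", "terminal-B": b"key-b"}
shared, unique = encode_for_terminals(content, keys)
```

In this example the server encrypts 100 bytes per terminal rather than 10,000, while the 9,900 shared bytes are produced once and could be broadcast.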
As explained previously, the data stream encrypted according to this method is valueless to unauthorised users because the part of the data stream that is missing the first portion, i.e. the second portion indicated by parts "b" in the bottom portion of figure 2, would be very difficult to patch up and the result would be of very inferior quality because MPEG is very sensitive to data errors, which propagate across frames as well as within a frame. This means that only a small portion of the data stream needs to have individual encryption applied to make the result valueless.
In the terminal apparatus 30, according to the preferred embodiment of the invention, the data stream is stored in the storage unit 36 as it is received by the communication controller 32 and without any decryption being applied, and is only decrypted, using the information in the decoder 34, and output as it is actually being played. Thus, according to this embodiment of the invention, most of the data stream is broadcast to many consumers and only a small portion is independently encrypted for each consumer. The independently encrypted portion could be sent over the internet, and the broadcast portion sent via other means, such as satellite or cable transmission. The broadcast data is useless without the individually encrypted portion, but there are as many individually encrypted versions as there are consumers. Even if a pirate breaks the encryption for one stream, all the other streams are still secure and the broken stream key can be disabled from further use to limit the potential loss. Needless to say, the present invention is not limited to the above-described embodiments, and various modifications may be performed without departing from the scope of the present invention. For example, the proportion of individually encrypted data could be 10% or 1%, such as every tenth byte or every hundredth byte, or indeed, the proportion could vary continuously, to inhibit cracking. The proportion could also be reduced to cope with times of high demand when the processing power of the server's encoders is reaching its limit. The number of bytes between encrypted bytes need not be fixed, but could be varied, so that on average the proportion of encrypted bytes is 10% or 1% or any desired percentage. A pseudo-random number generator in the terminal 30 can be used to generate the same predetermined sequence as in the server 10 for determining which bytes are independently encrypted, and therefore require decryption.
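As a hedged illustration of the variable spacing just described, the server and the terminal could each run the same seeded pseudo-random number generator to derive identical byte positions. The function `select_positions`, the parameter `mean_gap` and the use of a shared integer seed are assumptions of this sketch. Python's `random.Random` is used only for convenience: it is not cryptographically secure, and both ends would need the same generator implementation to obtain the same sequence.

```python
import random


def select_positions(seed: int, length: int, mean_gap: int = 100) -> list:
    """Return offsets of the bytes chosen for independent encryption.

    Gaps are drawn uniformly from 1 to 2*mean_gap - 1, so the average gap
    is mean_gap and roughly 1/mean_gap of the bytes are selected.
    """
    rng = random.Random(seed)             # same seed gives the same sequence
    positions = []
    pos = -1
    while True:
        pos += rng.randint(1, 2 * mean_gap - 1)
        if pos >= length:
            break
        positions.append(pos)
    return positions


shared_seed = 1234                        # hypothetical value known to both ends
server_positions = select_positions(shared_seed, 100_000)
terminal_positions = select_positions(shared_seed, 100_000)
proportion = len(server_positions) / 100_000
```

Reducing the selected proportion at times of high demand would correspond, in this sketch, to increasing `mean_gap`.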
The pseudo-random number generator sequence can be changed at any time using a key to pass the information in encrypted form. In practice there can be more than two data portions, and different encryption or the same encryption or no encryption may be applied to the portions, provided there is at least one portion which is independently encrypted, as explained
previously according to the invention. The data of the data portions may, of course, already be encoded, for example in a format such as MPEG. The term "first portion" in no way implies the relative position or sequence of the data portion in the data stream. As illustrated in figure 2, the first portion is not necessarily at the beginning of the data stream, and the first portion is not necessarily contiguous, but may be composed of a plurality of individual parts "a". The illustration of the server 10 in figure 1 is purely schematic. In practice, most of the content would be streamed from standard server racks, with the encrypted parts coming from dedicated hardware which takes in the data stream and uniquely encrypts the first portion for each consumer. The server 10 would have to have sufficient encrypting capability for the supported number of consumers with their terminal apparatuses 30, and this would be provided in secure chips.