WO2013006919A1 - Cryptographic processes - Google Patents

Cryptographic processes

Info

Publication number
WO2013006919A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
audio
data
encryption key
visual
Prior art date
Application number
PCT/AU2012/000843
Other languages
French (fr)
Inventor
Simon LOCKE
Gautam Tendulkar
Original Assignee
Commonwealth Scientific And Industrial Research Organisation
Priority date
Filing date
Publication date
Priority claimed from AU2011902809A (AU2011902809A0)
Application filed by Commonwealth Scientific And Industrial Research Organisation
Priority to EP12811527.6A (EP2732577A4)
Priority to US14/232,795 (US20150256334A1)
Priority to AU2012283683A (AU2012283683A1)
Publication of WO2013006919A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816 Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0819 Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s)
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/30 Public key, i.e. encryption algorithm being computationally infeasible to invert or user's encryption keys not requiring secrecy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231 Biological data, e.g. fingerprint, voice or retina
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/24 Key scheduling, i.e. generating round keys or sub-keys for block encryption

Definitions

  • the present invention relates to processes and systems for secure communication, including processes for receiving and sending encryption keys and for establishing a secure communications channel, such as may be used for secure video conferencing.
  • Asymmetric encryption systems are known (e.g., public key encryption systems such as RSA) in which a party A (conventionally known as 'Alice') has a pair of keys: a public key, which a counterparty B (conventionally known as 'Bob') uses to encrypt messages intended for Alice; and a corresponding private or secret key that Alice can use to decrypt messages encrypted using the public key.
  • a message encrypted using Alice's public key cannot in theory feasibly be decrypted other than with Alice's private key.
  • counterparty Bob will also have a key pair, one of which he makes public for Alice, for example, to encrypt messages to him, and one he keeps private for decryption of messages encrypted with his public key.
  • Alice and Bob may proceed to exchange encrypted messages (for example, by email) which may, theoretically, only be decrypted by the intended recipient.
  • TTP trusted third parties
  • PKI public key infrastructures
  • Alice may obtain it from the TTP.
  • the TTP may provide to both Alice and Bob a session key, encrypted with their respective public keys. Once decrypted, this session key can be used both to encrypt and decrypt communications between Alice and Bob, and is therefore referred to as a symmetric encryption key.
  • TTP provides a session key
  • the TTP itself or a further party who manages to breach the TTP's security
  • a process for sending, via a communications network, a public encryption key of a first node of said network to a second node of said network, the process being executed by the first node, and including the steps of:
  • first audio-visual data representing at least one of an audio and a visual environment of the first node
  • the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node. In other embodiments, the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node and from a public encryption key of the second node. In some embodiments, the process includes generating the public encryption key and a corresponding private encryption key at the beginning of a communications session, and disposing of at least the private encryption key at the end of the communications session. In some embodiments, the generated private encryption key is only stored in volatile memory.
  • the process includes:
  • the encrypted data includes encrypted audio-visual data representing at least one of an audio and a visual environment of the second node, including at least one of streaming audio and streaming video of one or more participants in a teleconference or a video conference.
  • the process includes the steps of:
  • third digest data from the communications network, the third digest data purportedly being sent from the second node and generated by applying a one way function to the public encryption key of the second node and third audio-visual data representing at least one of an audio environment and a visual environment of the second node;
  • the third digest data provides the confirmation that the first digest data was received by the second node, and the public encryption key and the first audiovisual data sent to the second node provide the confirmation of receipt of the third digest data.
  • a process for receiving, via a communications network, a public encryption key of a first node of the network, the process being executed by a second node of the network and including the steps of:
  • first digest data from the communications network, the first digest data purportedly being sent from the first node and generated by applying a one way function to the public encryption key of the first node and first audio-visual data representing at least one of an audio environment and a visual environment of the first node;
  • a process for establishing secure communications between first and second nodes of a communications network, the process being executed by the first node, and including the steps of:
  • a process for establishing secure communications between a first party and a second party wherein each party sends its public encryption key to the other party using any one of the above processes for sending, and each party receives the public encryption key of the other party using any one of the above processes for receiving, and wherein the sending steps are performed by the parties antiphonally.
  • the audio-visual data being assessed for congruence are contiguous portions of an audio-visual data stream.
  • the first audio-visual data represents a request to communicate via the communications network.
  • each said audio-visual data represents an audio and a visual environment of the corresponding node.
  • a communications system configured to execute any one of the above processes.
  • at least one computer-readable medium storing computer-executable instructions that, when executed by at least one processor of a computer system, cause the processor to execute any one of the above processes.
  • a communications system including:
  • a second node of a communications network for use by a second party
  • first node is configured to send its public encryption key to the second party by executing any one of the above processes for sending
  • second node is configured to receive the public encryption key of the first node by executing any one of the above processes for receiving.
  • a communications device configured for secure communications with at least one other communications device of the same type over a communications network, the communications device including:
  • an audio-visual input module configured to receive captured audio and/or video of the environment of the communications device
  • a hash component configured to generate first digest data by applying a one-way function to a public encryption key of the communications device and first audio-visual data representing the captured audio and/or visual environment of the communications device;
  • a transmission component configured to send the first digest data to the other communications device, and, responsive to receipt of a confirmation that the first digest data was received by the other communications device, to send the public encryption key and the first audio-visual data to the other communications device to allow it to determine that the public encryption key and the first audio-visual data were used to generate the first digest data; and to send second audio-visual data to the other communications device, the second audio-visual data being different to but congruent with the first audio-visual data to allow the other communications device to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the other communications device is that of the communications device.
  • Also described herein is a method for receiving an encryption key via a communications link, comprising the steps of:
  • step b) subsequent to step a), receiving an encryption key and a first audio-visual data item via the communications link;
  • step c) subsequent to step b), performing a one-way function on the encryption key and the audio-visual data item to generate a second digest data item and comparing the first and second digest data items to confirm that the encryption key and the first audio-visual data item were used to generate the first digest data item;
  • step d) subsequent to step c), receiving a second audio-visual data item, comparing it with the first audio-visual data item to determine a degree of congruence and, depending upon the degree of congruence, determining whether there is an eavesdropper active on the communications link.
  • Also described herein is a method for transmitting an encryption key via a communications link, comprising the steps of:
  • step a) performing a one-way function on an encryption key and a first audio-visual data item to generate a first digest data item, and transmitting the first digest data item via the communications link;
  • step b) subsequent to step a), transmitting the encryption key and the first audio-visual data item via the communications link;
  • step c) subsequent to step b), transmitting a second audio-visual data item via the communications link, wherein the second audio-visual data item is congruent with the first audio-visual data item.
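The transmitting steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: SHA-256 stands in for the one-way function, the key and audio-visual data are treated as raw bytes, and the names `make_digest` and `transmit_sequence`, the message tags, and the length-prefixed concatenation are hypothetical choices, not taken from the specification.

```python
import hashlib
from typing import Iterator, Tuple


def make_digest(public_key: bytes, av_data: bytes) -> bytes:
    """Apply a one-way function to the key and the first audio-visual item.

    Assumption of this sketch: a 4-byte length prefix keeps the boundary
    between key bytes and audio-visual bytes unambiguous.
    """
    h = hashlib.sha256()
    h.update(len(public_key).to_bytes(4, "big"))
    h.update(public_key)
    h.update(av_data)
    return h.digest()


def transmit_sequence(public_key: bytes, first_av: bytes,
                      second_av: bytes) -> Iterator[Tuple[str, tuple]]:
    """Yield the sender's messages in the order the method requires."""
    # First, only the digest is sent; the key itself is not yet revealed.
    yield ("digest", (make_digest(public_key, first_av),))
    # Next, the key and the first audio-visual item are revealed.
    yield ("reveal", (public_key, first_av))
    # Finally, further audio-visual data congruent with the first item.
    yield ("stream", (second_av,))
```

Because the digest is sent before the key, a party intercepting the link must commit to any substituted key before seeing the audio-visual data that the digest covers.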
  • Also described herein is a method for establishing secure communications comprising receiving a remote party's encryption key by the above method for receiving and transmitting a local party's encryption key by the above method for transmitting.
  • Also described herein is a method for establishing a secure communications channel between a first party and a second party, wherein each party transmits a respective encryption key to the other party using the above method for transmitting and each party receives the other party's encryption key by the above method for receiving, and wherein the transmission steps are performed by the parties antiphonally.
  • the audio-visual data items are preferably contiguous audiovisual data, which are preferably generated contemporaneously with the other steps of the method.
  • the contiguous audio-visual data includes audio-visual data of a party desiring to communicate via the communications link.
  • Also described herein is a communications system comprising:
  • first computer system is adapted to transmit an encryption key to the second party by the method of the second aspect
  • second computer system is adapted to receive the encryption key by the method of the first aspect.
  • the respective first and second audio-visual data items transmitted by the first and second computer systems are preferably contiguous audio-visual data, preferably generated by the first and second computer systems based upon their respective environments.
  • the respective contiguous audio-visual data transmitted by the first and second computer systems include audio-video streams of respective users of the first and second computer systems.
  • Also described herein is a method for receiving an encryption key via a communications link, comprising the steps of:
  • step b) subsequent to step a), receiving an encryption key and a first audio-visual data item via the communications link;
  • step c) subsequent to step b), performing a one-way function on the encryption key and the audio-visual data item to generate a second digest data item and comparing the first and second digest data items to confirm that the encryption key and the first audio-visual data item were used to generate the first digest data item;
  • step d) subsequent to step c), receiving a second audio-visual data item, and comparing it with the first audio-visual data item to assess the congruence of the first and second audio-visual data items and the security of the communications link.
  • Figure 1 is a schematic representation of an embodiment of a communications system
  • Figure 2 is a flow diagram of a process for sending via a communications network a public encryption key of a first node of said network to a second node of said network;
  • Figure 3 is a flow diagram of a process for receiving via a communications network a public encryption key of a node of the network;
  • Figure 4 is a flow diagram illustrating the information flow in the steps occurring in an embodiment of a process for establishing a secure communications channel; and
  • Figure 5 is a schematic diagram of a computer system in which embodiments of the present invention may be implemented.
  • a communications system 100 for secure communication between first and second parties includes a first node 102 for use by the first party ('Alice') and a second node 104 for use by the second party ('Bob'), the first and second nodes 102, 104 being nodes of a communications network.
  • each of the network nodes 102, 104 is also referred to herein as a "computer system", where the scope of this term in this specification includes not only general purpose computers executing software instructions that cause the computer to behave as described below, but also devices and systems configured for more specific purposes, such as teleconference or video conference devices, systems, and/or hardware, mobile telephones and the like.
  • the communications channel 106 may be or include any form or forms of communications channel, including, for example, a fixed wire network, a wireless network, a telecommunications channel such as a mobile telecommunications channel, a TCP/IP communications channel via a computer network such as the Internet, or any combination of these.
  • the computer systems 102, 104 constitute respective nodes of a communications network.
  • the nodes 102, 104 are general purpose computers (for which the reference numerals 102, 104 will continue to be used) executing video telephony software that causes the computers 102, 104 to effect the processes described herein.
  • the software is a complete (perhaps open source) video telephony solution.
  • the processes described herein are provided by a plug-in module for an existing video telephony solution such as Skype.
  • the standard computer systems are 32-bit or 64-bit Intel Architecture based computer systems 500, as shown in Figure 5, and the described processes are implemented in the form of programming instructions of one or more software modules 502 stored on non-volatile (e.g., hard disk) storage 504 associated with the computer system, as shown in Figure 5.
  • ASICs application-specific integrated circuits
  • FPGAs field programmable gate arrays
  • Each of the computer systems 500 includes standard computer components, including random access memory (RAM) 506, at least one processor 508, and external interfaces 510, 512, 514, all interconnected by a bus 516.
  • the external interfaces include universal serial bus (USB) interfaces 510, a network interface connector (NIC) 512 which connects the system 500 to a communications network such as the Internet, and a display adapter 514, which is connected to a display device such as an LCD panel display 522.
  • USB universal serial bus
  • NIC network interface connector
  • At least one of the universal serial bus (USB) interfaces 510 is connected to a keyboard and a pointing device such as a mouse 518, at least one other being connected to a video camera and microphone 112, 114, which may be in the form of physically separate devices or an integrated video/audio capture device.
  • the microphone and video devices are integrated components internal to the computer system 500, as is the case, for example, where the computer system 500 is an Apple iMac desktop computer.
  • a process for transmitting a public key is initiated at step 202 when, for example, a user, Alice, of the first node 102 indicates (to the software 502) a desire to communicate securely with the user, Bob, of the second node 104.
  • Alice's node 102 generates a new public/private encryption key pair (PA, XA) (e.g., an RSA key pair).
  • (PA, XA) public/private encryption key pair
  • the private key XA is to be kept secret for decryption of received data, and in some embodiments is only ever stored temporarily in volatile RAM 506 for the duration of the communications session.
  • the public key PA is to be transmitted to the second node 104 for the encryption of data to be sent to Alice.
  • Alice's node 102 generates first audio-visual data representing at least one of its audio and its visual environment, using the associated microphone and/or camera 112.
  • audio-visual in this specification is to be construed broadly as encompassing: (i) audio only, (ii) visual only, and (iii) both audio and visual.
  • the term “visual” in this context means purely visual or image information without an audio component, and most broadly means at least one image or video frame but is typically a sequence of temporally contiguous video frames; e.g., a video stream.
  • audio environment and visual environment in respect of a node/computer system and/or its user refer respectively to audio information and visual information capable of being captured by a microphone and pure imaging device (i.e., without sound), respectively, and being capable of distinguishing the environment of the node/computer system itself and/or its user from the environments of other nodes/computer systems and/or their users.
  • the first audio-visual data represents both visual and audio information of Alice herself (being deemed a part of the 'environment' of her node 102) and optionally also of her immediate surroundings. That is, the visual component of the first audio-visual data represents at least one image (and typically a sequence of video frames, more typically being a portion of a 'live' video stream) of Alice herself, and optionally also her immediate surroundings (e.g., the room she is in and associated furniture/decor, the chair she is sitting in, etc).
  • the audio component of the first audio-visual data thus represents the accompanying sounds of Alice and/or her environment; for example, Alice talking and/or music and/or ambient sounds (musical or otherwise) of Alice's environment.
  • the music component may be ambient music playing in Alice's environment, and in some embodiments includes a music file stored on Alice's computer 106 that is played by the software 502.
  • the music file may be one of a plurality of music files randomly selected by the software 502.
  • Using a one-way function (e.g., the SHA-2 family of cryptographic hash functions), Alice's node 102 generates at least one hash H1 as a function of at least a portion of the captured first audio-visual data (e.g., the visual component may be a predetermined frame, a predetermined number of consecutive frames, or a predetermined scheme of non-consecutive frames, but most typically a portion of a live stream of video with accompanying audio) and of Alice's public key PA.
  • the hash H1 is generally described herein as being a single hash H1.
  • the first audio-visual data represents a relatively long portion of streaming video
  • multiple hashes may need to be generated; however, only one of them (e.g., the first one) needs to include Alice's public key.
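Where a long portion of streaming data calls for multiple hashes, one possible arrangement is sketched below in Python. It is a sketch under assumptions: SHA-256 (a member of the SHA-2 family) is the one-way function, the stream is a byte string split into fixed-size chunks, and the function name `hash_stream`, the `chunk_size` parameter, and the length-prefix encoding are hypothetical.

```python
import hashlib
from typing import List


def hash_stream(public_key: bytes, av_data: bytes,
                chunk_size: int = 65536) -> List[bytes]:
    """Hash audio-visual data chunk by chunk.

    Only the first hash covers the public key, as the text permits;
    later hashes cover audio-visual bytes alone.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    digests = []
    # max(..., 1) ensures one hash (of the key alone) for empty data.
    for offset in range(0, max(len(av_data), 1), chunk_size):
        h = hashlib.sha256()
        if offset == 0:
            # Hypothetical encoding: length-prefixed key, then data.
            h.update(len(public_key).to_bytes(4, "big"))
            h.update(public_key)
        h.update(av_data[offset:offset + chunk_size])
        digests.append(h.digest())
    return digests
```

For example, ten bytes of data with `chunk_size=4` yield three hashes, of which only the first depends on the key.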
  • Alice's node 102 transmits the hash H1 via the communications link 110 to Bob's node 104.
  • Bob's node 104 receives the hash H1 and in response issues an acknowledgement confirming receipt of the hash H1.
  • the acknowledgement/confirmation is then received at Alice's node 102 at step 212.
  • the acknowledgement is provided in the form of an audio-visual confirmation from Bob himself that the hash has been received, so that Alice and/or Alice's software 502 can be confident that Bob has indeed received the hash H1.
  • Alice's node 102 transmits the public key PA and the first audio-visual data that were used to generate the hash H1, and, at step 216, continues to transmit audio-visual data (referred to herein as 'second' audio-visual data to avoid confusion) captured by the camera/microphone 112 and congruent with the first audio-visual data that was used to generate the hash H1.
  • By 'congruent' is meant that it can be determined by a remote party (in this case, Bob himself and/or Bob's node 104) that the source of the audio-visual data is the same.
  • this will be the case where the first and second audio-visual data are contiguous respective portions of a stream of audio-visual data, such that there is temporal continuity between the first and second audio-visual data, but this might not always be the case in other embodiments.
  • the audio-visual data may, for example, include video of Alice or Bob in which details of their background, attire, etc are visible.
  • a viewer may be able to determine that the audio-visual data is of a single source or origin, notwithstanding a small temporal discontinuity.
  • congruence can be assessed by a computer or other processing device analysing one or more properties of the audio-visual data, for example, pixel colour and brightness.
  • Congruence can also be assessed, at least in part, using an audio component of the audio-visual data captured by the camera/microphone 112 to overcome difficulties caused, for example, by Alice being silent at the time of capture.
  • this audio component may include audio generated from an audio file stored on Alice's computer 102.
  • a video file stored on Alice's computer 106 can be used to assess congruence, rather than, or in addition to, captured video of Alice, although the use of stored information alone may be less robust than using the 'live' or real-time visual and/or audio environment.
  • the first and second audio-visual data are contiguous portions of a live video stream (with audio) showing Alice and perhaps also of part of her surroundings, being initial portions of the same video stream that will constitute Alice's contributions to the video conference, once established.
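An automated congruence assessment based on pixel colour and brightness might look like the following Python sketch. It is deliberately crude and rests on assumptions: frames are modelled as sequences of (R, G, B) tuples, the two portions are compared by mean colour only, and the names `mean_colour` and `is_congruent` and the `tolerance` value are hypothetical. A practical system would need far more robust measures.

```python
from statistics import fmean
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]
Frame = Sequence[Pixel]


def mean_colour(frames: Sequence[Frame]) -> Tuple[float, ...]:
    """Average each colour channel over every pixel of every frame."""
    pixels = [p for frame in frames for p in frame]
    return tuple(fmean(p[c] for p in pixels) for c in range(3))


def is_congruent(first: Sequence[Frame], second: Sequence[Frame],
                 tolerance: float = 20.0) -> bool:
    """Treat two portions as congruent if their mean colours are close.

    The tolerance is an arbitrary illustrative threshold, not a value
    from the specification.
    """
    a = mean_colour(first)
    b = mean_colour(second)
    return all(abs(x - y) <= tolerance for x, y in zip(a, b))
```

Two portions captured in the same room under the same lighting would typically pass such a check, whereas a portion substituted from a different scene would be likely to fail it.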
  • Bob's node 104 commences the process for receiving the key at step 302, in response to receiving a request to establish a secure communications channel.
  • the receipt of that hash by Bob's node 104 at step 304 provides the request.
  • Bob confirms or acknowledges receipt of the hash.
  • this can be an automated acknowledgement generated and sent by Bob's computer 108.
  • the acknowledgement is in the form of, or at least includes, audio-visual data representing Bob's verbal and visual acknowledgement of receipt of the hash, so that Alice can be confident that Bob genuinely has received the hash H1, rather than a man-in-the-middle eavesdropper.
  • audio-visual data from Bob may also be in the form of a hash, and the unhashed audio-visual data sent subsequently.
  • When Alice's node 102 receives Bob's confirmation of receipt of the hash, it transmits the public key PA and the first audio-visual data that was used to generate the hash H1. They are received by Bob's node 104 at step 306.
  • Bob's node 104 uses the same one-way function that was used by Alice's node 102 and in the same way to generate a hash of the received public key PA and the received first audio-visual data.
  • Bob's node 104 compares the received and the generated hashes. If they are equal to one another, Bob's node 104 deduces that the public key PA and the first audio-visual data that it received are the same as were used to generate the hash H1.
  • Bob's node 104 receives further or second audio-visual data from Alice's node 102 (sent by Alice in response to receipt of Bob's acknowledgement sent at step 311), and then, at step 314, the first and second audio-visual data are compared to assess their mutual congruence.
  • this step is performed automatically by Bob's node 104 (e.g., by comparison of one or more selected properties of the audio-visual data; for example, pixel brightness and colour or spatial distributions thereof).
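The check performed by the receiving node can be sketched in Python as below. The sketch assumes SHA-256 as the shared one-way function and a length-prefixed concatenation of key and audio-visual bytes; the function name `verify_reveal` and that encoding are hypothetical. The standard-library call `hmac.compare_digest` is used for the equality test as a defensive habit, not because the specification requires it.

```python
import hashlib
import hmac


def verify_reveal(received_digest: bytes, public_key: bytes,
                  first_av: bytes) -> bool:
    """Recompute the hash from the revealed values and compare.

    Must use the same one-way function, in the same way, as the sender
    (here: SHA-256 over a length-prefixed key followed by the data,
    which is an assumption of this sketch).
    """
    h = hashlib.sha256()
    h.update(len(public_key).to_bytes(4, "big"))
    h.update(public_key)
    h.update(first_av)
    return hmac.compare_digest(h.digest(), received_digest)
```

If an intermediary replaces the public key after the hash has been delivered, the recomputed hash will not match the received one and the check returns `False`.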
  • the assessment is performed manually by Bob, who compares the first and second audio-visual data to determine whether they are congruent (e.g.
  • this comparison can be facilitated by the software 502 displaying the first and second audio-visual data in a picture-in-picture or split screen arrangement.
  • the data is checked in both ways.
  • the software 502 can also be configured to automatically compare the two portions of the video stream to look for any discontinuities or other forms of inconsistency between them, and to generate an alert if any is found.
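One way software might look for discontinuities between the two portions and raise an alert is sketched below in Python. It assumes each frame has already been reduced to a single brightness value; the names `find_discontinuities` and `check_stream`, the `max_jump` threshold, and the alert text are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def find_discontinuities(brightness: Sequence[float],
                         max_jump: float = 30.0) -> List[int]:
    """Return indices of frames whose brightness jumps sharply
    relative to the preceding frame."""
    return [i for i in range(1, len(brightness))
            if abs(brightness[i] - brightness[i - 1]) > max_jump]


def check_stream(first: Sequence[float], second: Sequence[float],
                 max_jump: float = 30.0) -> str:
    """Join the first and second portions and report any break,
    including one at the boundary between them."""
    joined = list(first) + list(second)
    breaks = find_discontinuities(joined, max_jump)
    if breaks:
        return f"ALERT: discontinuity at frame(s) {breaks}"
    return "ok"
```

A genuine scene change (for example, a light being switched on) would also trigger this check, which is one reason the text contemplates combining automated and manual assessment.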
  • Bob's node 104 may also execute the process of Figure 2 to generate and transmit a public encryption key that is received by Alice's node 102 using the process of Figure 3, thereby providing an overall secure communications process such as the one shown in Figure 4, as is the case in the described video conferencing example scenario.
  • the transmission of the hash of Bob's public encryption key and first audio-visual data by Bob's node 104 at step 210 of the transmission process may be performed in response to receipt of the hash from Alice's node 102, and the receipt of Bob's hash by Alice therefore also serves the purpose of securely acknowledging receipt of the hash from Alice's node 102 (i.e., step 305 of the receiving process).
  • both the initiating and the responding nodes 102, 104 execute the same processes, but in which the various steps where hashes and audiovisual data are transmitted to the other party are interleaved or performed antiphonally, as shown in Figure 4, thereby further enhancing the security of the processes.
  • In response to receipt of Bob's hash, Alice then sends to Bob her public key and first audio-visual data, and when Bob receives those, he generates a corresponding hash and compares it to the hash he received from Alice, and only if the hashes match does Bob then send his own public key and first audio-visual data to Alice. Alice then generates a hash of Bob's public key and first audio-visual data and compares it to the hash previously received from Bob. Only if the hashes match does Alice then continue to send further audio-visual data to Bob.
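The interleaved ordering described above can be simulated in a few lines of Python. This is a toy, in-process sketch and not a network implementation: SHA-256 is assumed as the one-way function, keys and audio-visual data are placeholder byte strings, and the names `Party`, `digest`, `antiphonal_exchange`, the `tamper` flag, and the log labels are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


def digest(public_key: bytes, av: bytes) -> bytes:
    # Hypothetical encoding: length-prefixed key followed by AV bytes.
    return hashlib.sha256(
        len(public_key).to_bytes(4, "big") + public_key + av).digest()


@dataclass
class Party:
    name: str
    public_key: bytes
    first_av: bytes

    def commitment(self) -> bytes:
        return digest(self.public_key, self.first_av)

    def reveal(self) -> Tuple[bytes, bytes]:
        return self.public_key, self.first_av


def antiphonal_exchange(alice: Party, bob: Party,
                        tamper: bool = False) -> List[str]:
    """Run the exchange and return a log of the messages sent."""
    log = []
    h_a = alice.commitment()
    log.append("A->B hash")
    h_b = bob.commitment()  # Bob's hash doubles as his acknowledgement
    log.append("B->A hash")
    key_a, av_a = alice.reveal()
    log.append("A->B key+av")
    if tamper:
        key_a = b"attacker-key"  # simulate substitution in transit
    if digest(key_a, av_a) != h_a:
        log.append("B aborts")
        return log
    key_b, av_b = bob.reveal()
    log.append("B->A key+av")
    if digest(key_b, av_b) != h_b:
        log.append("A aborts")
        return log
    log.append("A->B further av")
    return log
```

With `tamper=True` the log ends at "B aborts", so Bob never reveals his own key to a party whose revealed key does not match the hash received earlier.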
  • Bob's node 104 can simply (generate and) encrypt Bob's public key with Alice's public key, and send the encrypted key to Alice, so that only Alice can determine Bob's public key.
  • Bob's node 104 can generate a symmetric session key (perhaps from Alice's public key), encrypt that with Alice's public key, and send the encrypted key to Alice for subsequent encrypted communications.
  • Bob's confirmation of receipt of Alice's hash at step 212 can still include audiovisual data representing Bob's audio-visual confirmation of Alice's hash.
  • a session key (i.e., a symmetric encryption key)
  • the described processes can be used to provide secure communications between parties without using a third party to provide public keys.
  • the public and private encryption key pairs can be generated on demand at the beginning of a communications session, and in some embodiments are only temporarily stored in volatile memory of the communicating nodes. The keys, in particular the private keys, can then be securely destroyed at the end of the communications session.
  • the described processes are particularly suited to videoconferencing, where the audiovisual data includes 'live' streaming audio and video of the conference participants, and these are used to assess congruence, either by the human participants or by the participating nodes themselves, or both.
  • the encryption key pairs can be generated once at the beginning of a communications session, or in some embodiments can be generated multiple times during the one session, either periodically, randomly, and/or in response to the arrival of a new participant in the conference and/or in response to the departure of an existing participant, and/or in response to an input from one of the participants, for example.
  • the described processes can be used, for example, in unmanned aircraft, drones, or other vessels or vehicles, where it is undesirable to have a private key stored in persistent memory in case of capture.
  • a computer in an unmanned aircraft (Bob) generates a key pair on the fly and transmits its public key to its controller Alice using the processes described above.
  • the audio-visual data can include video captured by an onboard camera, for example, before or during take-off, showing the view of the camera of ground activity that is verifiable by Alice (for example, particular ground crew activity, perhaps with an aircraft identifier on the body or wing of the aircraft within the field of view of the camera).
  • Alice's public key can be sent to the aircraft using the processes described above, requiring the use of automated processes by Bob to assess congruence of audio-visual data received from Alice.
  • Many suitable processes for comparing audio and/or video data to assess congruence will be apparent to those skilled in the art.
  • the software 502 implementing the processes described above includes the public key of a body such as a government body authorised to intercept communications.
  • the described processes can be used in a tri-partite mode to allow authorised intercepts or, indeed, in a more general multi-partite mode communication among three or more parties.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A process for sending via a communications network a public encryption key of a first node of the network to a second node of the network, the process being executed by the first node, and including the steps of: generating first audio-visual data representing at least one of an audio and a visual environment of the first node; applying a one-way function to the public encryption key and the first audio-visual data to generate first digest data; sending the first digest data to the second node; receiving from the second node confirmation that the first digest data was received by the second node; subsequent to the step of receiving the confirmation, sending the public encryption key and the first audio-visual data to the second node to allow the second node to determine that the public encryption key and the first audio-visual data were used to generate the first digest data; sending second audio-visual data to the second node, the second audio-visual data being different to but congruent with the first audio-visual data to allow the second node to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the second node is that of the first node.

Description

CRYPTOGRAPHIC PROCESSES
TECHNICAL FIELD
The present invention relates to processes and systems for secure communication, including processes for receiving and sending encryption keys and for establishing a secure communications channel, such as may be used for secure video conferencing.
BACKGROUND
Asymmetric encryption systems are known (e.g., public key encryption systems such as RSA) in which a party A (conventionally known as 'Alice') has a pair of keys: a public key, which a counterparty B (conventionally known as 'Bob') uses to encrypt messages intended for Alice; and a corresponding private or secret key that Alice can use to decrypt messages encrypted using the public key. A message encrypted using Alice's public key cannot in theory feasibly be decrypted other than with Alice's private key. Typically, counterparty Bob will also have a key pair, one of which he makes public for Alice, for example, to encrypt messages to him, and one he keeps private for decryption of messages encrypted with his public key. When Alice and Bob have each other's public encryption keys, they may proceed to exchange encrypted messages (for example, by email) which may, theoretically, only be decrypted by the intended recipient.
When the exchange of public keys can be conducted securely (e.g., face-to-face), the system as a whole is relatively secure. However, if Alice and Bob are remote from one another (and communicating with one another by email, for example), there is the potential for an active eavesdropper (say, 'Eve') to compromise the security of the system in the following way. Assuming Eve is able to intercept all messages between Alice and Bob, when Alice and Bob initially exchange public keys, Eve can store the keys and substitute her own public key. Alice and Bob then believe that they have each other's public keys and they use them for communication. When Alice sends an encrypted message to Bob, it is intercepted by Eve, who is able to decrypt it using her private key, read it and then re-encrypt it using Bob's real public key before forwarding it to him. Communications from Bob to Alice are vulnerable in the same way. This type of attack on secure communication is known as a "Man in the Middle" attack.
In part to address these difficulties, infrastructures have been established (referred to as public key infrastructures or PKI) in which trusted third parties (TTP) serve as repositories of public keys, taking responsibility for ensuring that the public keys do in fact belong to the relevant parties. If Bob has provided his public key to such a TTP (and confirmed his identity), Alice may obtain it from the TTP. Alternatively, if the TTP also has Alice's public key, the TTP may provide to both Alice and Bob a session key, encrypted with their respective public keys. Once decrypted, this session key can be used both to encrypt and decrypt communications between Alice and Bob, and is therefore referred to as a symmetric encryption key.
Such systems rely, however, upon one or both parties having provided their public keys to the TTP. Furthermore, where the TTP provides a session key, the TTP itself (or a further party who manages to breach the TTP's security) may be able to decrypt communications encrypted using that session key.
There is thus a need for a mechanism by which parties desiring to communicate with one another may communicate cryptographic keys without fear of interception and substitution, and without using a third party.
It is desired, therefore, to provide a process for sending via a communications network a public encryption key of a first node of said network to a second node of said network, a process for receiving via a communications network a public encryption key of a first node of the network, and a communications system that alleviate one or more difficulties of the prior art, or that at least provide a useful alternative.
SUMMARY
In accordance with some embodiments of the present invention, there is provided a process for sending via a communications network a public encryption key of a first node of said network to a second node of said network, the process being executed by the first node, and including the steps of:
generating first audio-visual data representing at least one of an audio and a visual environment of the first node;
applying a one-way function to the public encryption key and the first audio-visual data to generate first digest data;
sending the first digest data to the second node;
receiving from the second node confirmation that the first digest data was received by the second node;
subsequent to the step of receiving the confirmation, sending the public encryption key and the first audio-visual data to the second node to allow the second node to determine that the public encryption key and the first audio-visual data were used to generate the first digest data;
sending second audio-visual data to the second node, the second audio-visual data being different to but congruent with the first audio-visual data to allow the second node to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the second node is that of the first node.
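The digest-generation step above can be illustrated with a minimal sketch. It assumes SHA-256 as the one-way function and treats the public key and the first audio-visual data as opaque byte strings; the helper names `encode_field` and `make_digest`, the length-prefixed encoding, and the placeholder values are assumptions made for illustration and are not taken from the specification.

```python
import hashlib


def encode_field(data: bytes) -> bytes:
    # Assumed encoding: an 8-byte big-endian length prefix keeps the boundary
    # between the key and the audio-visual data unambiguous.
    return len(data).to_bytes(8, "big") + data


def make_digest(public_key: bytes, av_data: bytes) -> bytes:
    # One-way function applied to the public key and the first audio-visual data.
    h = hashlib.sha256()
    h.update(encode_field(public_key))
    h.update(encode_field(av_data))
    return h.digest()


# Placeholder values standing in for a real key and captured audio/video.
public_key = b"example-public-key-bytes"
first_av = b"frame-0|frame-1|frame-2"

# The first node sends only this digest at first; the key and the
# audio-visual data follow after the second node confirms receipt.
first_digest = make_digest(public_key, first_av)
```

Because only the digest is sent initially, the sending node is bound to the key and audio-visual data before the other party reveals anything that depends on them.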
In some embodiments, the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node. In other embodiments, the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node and from a public encryption key of the second node. In some embodiments, the process includes generating the public encryption key and a corresponding private encryption key at the beginning of a communications session, and disposing of at least the private encryption key at the end of the communications session. In some embodiments, the generated private encryption key is only stored in volatile memory.
In some embodiments, the process includes:
receiving, from the second node, encrypted data encrypted by the second node using the public encryption key of the first node; and
decrypting the received encrypted data using the private encryption key of the first node.
In some embodiments, the encrypted data includes encrypted audio-visual data representing at least one of an audio and a visual environment of the second node, including at least one of streaming audio and streaming video of one or more participants in a teleconference or a video conference.
In some embodiments, the process includes the steps of:
receiving third digest data from the communications network, the third digest data purportedly being sent from the second node and generated by applying a one way function to the public encryption key of the second node and third audio-visual data representing at least one of an audio environment and a visual environment of the second node;
sending to the second node a confirmation of receipt of the third digest data;
subsequent to the step of sending the confirmation of receipt to the second node, receiving via the communications network a public encryption key and audio-visual data, the received public encryption key purportedly being the public encryption key of the second node, and the received audio-visual data purportedly being the third audio-visual data of the second node;
applying the one-way function to the received encryption key and the received audio-visual data to generate fourth digest data;
comparing the third digest data to the fourth digest data to determine whether the received encryption key and the received audio-visual data were used to generate the third digest data; and
only if said step of comparing determines that the received encryption key and the received audio-visual data were used to generate the third digest data, then:
receiving from the communications network fourth audio-visual data representing at least one of the audio environment and the visual environment of the second node, and comparing the fourth audio-visual data with the received audio-visual data to assess their mutual congruence, and, based on the assessment, determining whether the received public encryption key is that of the second node.
In some embodiments, the third digest data provides the confirmation that the first digest data was received by the second node, and the public encryption key and the first audiovisual data sent to the second node provide the confirmation of receipt of the third digest data.
In accordance with some embodiments of the present invention, there is provided a process for receiving via a communications network a public encryption key of a first node of the network, the process being executed by a second node of the network and including the steps of:
receiving first digest data from the communications network, the first digest data purportedly being sent from the first node and generated by applying a one way function to the public encryption key of the first node and first audio-visual data representing at least one of an audio environment and a visual environment of the first node;
sending to the first node a confirmation of receipt of the first digest data;
subsequent to the step of sending the confirmation, receiving via the communications network a public encryption key and audio-visual data, the received public encryption key purportedly being the public encryption key of the first node, and the received audio-visual data purportedly being the first audio-visual data of the first node; applying the one-way function to the received encryption key and the received audio-visual data to generate second digest data; comparing the first digest data to the second digest data to determine whether the received encryption key and the received audio-visual data were used to generate the first digest data; and
only if said step of comparing determines that the received encryption key and the received audio-visual data were used to generate the first digest data, then:
receiving from the communications network second audio-visual data representing at least one of the audio environment and the visual environment of the first node, and comparing the second audio-visual data with the received audio-visual data to assess their mutual congruence, and, based on the assessment, determining whether the received public encryption key is that of the first node.
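The receiving steps above might be sketched as follows. This is an illustration only: the class `KeyReceiver`, its method names, the plain "ACK" string, and the use of SHA-256 with length-prefixed fields are all assumptions, and the final congruence assessment is deliberately left out, so a key that passes here is only a candidate.

```python
import hashlib
import hmac


def make_digest(public_key: bytes, av_data: bytes) -> bytes:
    # Assumed one-way function: SHA-256 over length-prefixed fields.
    h = hashlib.sha256()
    for field in (public_key, av_data):
        h.update(len(field).to_bytes(8, "big") + field)
    return h.digest()


class KeyReceiver:
    """Hypothetical second-node state for the digest-then-reveal ordering."""

    def __init__(self) -> None:
        self.first_digest = None
        self.candidate_key = None

    def receive_digest(self, digest: bytes) -> str:
        # Store the first digest data and confirm receipt. The specification
        # describes an audio-visual confirmation; a string stands in for it.
        self.first_digest = digest
        return "ACK"

    def receive_reveal(self, public_key: bytes, av_data: bytes) -> bool:
        # The key and audio-visual data are only accepted after the digest.
        if self.first_digest is None:
            raise RuntimeError("key received before any digest was confirmed")
        second_digest = make_digest(public_key, av_data)
        if hmac.compare_digest(second_digest, self.first_digest):
            # Still provisional: congruence of later audio-visual data
            # must be assessed before the key is trusted.
            self.candidate_key = public_key
            return True
        return False
```

A mismatch at `receive_reveal` indicates that the key or the audio-visual data is not what was used to generate the first digest data, in which case the process stops before any congruence assessment.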
In accordance with some embodiments of the present invention, there is provided a process for establishing secure communications between first and second nodes of a communications network, the process being executed by the first node, and including the steps of:
sending via the communications network a public encryption key of a first node of said network to a second node of said network by executing any one of the above processes for sending; and
receiving via the communications network a public encryption key of the second node of the network by executing any one of the above processes for receiving, but with the first node acting as the second node, and vice-versa.
In accordance with some embodiments of the present invention, there is provided a process for establishing secure communications between a first party and a second party, wherein each party sends its public encryption key to the other party using any one of the above processes for sending, and each party receives the public encryption key of the other party using any one of the above processes for receiving, and wherein the sending steps are performed by the parties antiphonally. In some embodiments, the audio-visual data being assessed for congruence are contiguous portions of an audio-visual data stream. In some embodiments, the first audio-visual data represents a request to communicate via the communications network.
In some embodiments, each said audio-visual data represents an audio and a visual environment of the corresponding node.
In accordance with some embodiments of the present invention, there is provided a communications system configured to execute any one of the above processes. In accordance with some embodiments of the present invention, there is provided at least one computer-readable medium storing computer-executable instructions that, when executed by at least one processor of a computer system, cause the processor to execute any one of the above processes. In accordance with some embodiments of the present invention, there is provided a communications system, including:
a first node of a communications network for use by a first party; and
a second node of a communications network for use by a second party;
wherein the first node is configured to send its public encryption key to the second party by executing any one of the above processes for sending, and wherein the second node is configured to receive the public encryption key of the first node by executing any one of the above processes for receiving.
In accordance with some embodiments of the present invention, there is provided a communications device configured for secure communications with at least one other communications device of the same type over a communications network, the communications device including:
an audio-visual input module configured to receive captured audio and/or video of the environment of the communications device;
a hash component configured to generate first digest data by applying a one-way function to a public encryption key of the communications device and first audio-visual data representing the captured audio and/or visual environment of the communications device; and
a transmission component configured to send the first digest data to the other communications device, and, responsive to receipt of a confirmation that the first digest data was received by the other communications device, to send the public encryption key and the first audio-visual data to the other communications device to allow it to determine that the public encryption key and the first audio-visual data were used to generate the first digest data; and to send second audio-visual data to the other communications device, the second audio-visual data being different to but congruent with the first audio-visual data to allow the other communications device to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the other communications device is that of the communications device.
Also described herein is a method for receiving an encryption key via a communications link, comprising the steps of:
a) receiving a first digest data item via the communications link;
b) subsequent to step a), receiving an encryption key and a first audio-visual data item via the communications link;
c) subsequent to step b), performing a one-way function on the encryption key and the audio-visual data item to generate a second digest data item and comparing the first and second digest data items to confirm that the encryption key and the first audio-visual data item were used to generate the first digest data item; and
d) subsequent to step c), receiving a second audio-visual data item, comparing it with the first audio-visual data item to determine a degree of congruence and, depending upon the degree of congruence, determining whether there is an eavesdropper active on the communications link.
Also described herein is a method for transmitting an encryption key via a communications link, comprising the steps of:
a) performing a one-way function on an encryption key and a first audio-visual data item to generate a first digest data item, and transmitting the first digest data item via the communications link;
b) subsequent to step a), transmitting the encryption key and the first audio-visual data item via the communications link; and
c) subsequent to step b), transmitting a second audio-visual data item via the communications link, wherein the second audio-visual data item is congruent with the first audio-visual data item.
Also described herein is a method for establishing secure communications comprising receiving a remote party's encryption key by the above method for receiving and transmitting a local party's encryption key by the above method for transmitting.
Also described herein is a method for establishing a secure communications channel between a first party and a second party, wherein each party transmits a respective encryption key to the other party using the above method for transmitting and each party receives the other party's encryption key by the above method for receiving, and wherein the transmission steps are performed by the parties antiphonally.
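One way to picture the antiphonal ordering is as an in-memory simulation in which each party's digest is exchanged before either key is revealed, and each reveal is checked before the next message is sent. The `Party` class, the `antiphonal_exchange` function, the trace strings, and the `substituted_key` parameter are all hypothetical. The sketch models only a substitution of the key at the reveal step; an attacker who also replaces the digests is intended to be caught by the congruence assessment, which is not modelled here.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import List, Optional, Tuple


def make_digest(public_key: bytes, av_data: bytes) -> bytes:
    # Assumed one-way function: SHA-256 over length-prefixed fields.
    h = hashlib.sha256()
    for field in (public_key, av_data):
        h.update(len(field).to_bytes(8, "big") + field)
    return h.digest()


@dataclass
class Party:
    name: str
    public_key: bytes
    av_data: bytes


def antiphonal_exchange(
    alice: Party, bob: Party, substituted_key: Optional[bytes] = None
) -> Tuple[List[str], bool]:
    trace = []
    # 1. Alice commits to her key and first audio-visual data.
    h_alice = make_digest(alice.public_key, alice.av_data)
    trace.append("A->B digest")
    # 2. Bob's own digest doubles as the confirmation of receipt.
    h_bob = make_digest(bob.public_key, bob.av_data)
    trace.append("B->A digest")
    # 3. Alice reveals; an attacker may have swapped the key in transit.
    seen_key = alice.public_key if substituted_key is None else substituted_key
    trace.append("A->B key+av")
    if not hmac.compare_digest(make_digest(seen_key, alice.av_data), h_alice):
        return trace, False  # Bob stops before revealing his own key
    # 4. Bob reveals only after Alice's reveal has matched her digest.
    trace.append("B->A key+av")
    if not hmac.compare_digest(make_digest(bob.public_key, bob.av_data), h_bob):
        return trace, False
    # 5. Alice continues with further (second) audio-visual data.
    trace.append("A->B further av")
    return trace, True
```

Calling `antiphonal_exchange(alice, bob)` with two honest parties returns the full five-message trace, whereas passing a `substituted_key` ends the exchange at Alice's reveal.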
In each of the above aspects, the audio-visual data items are preferably contiguous audiovisual data, which are preferably generated contemporaneously with the other steps of the method. In a particularly preferred embodiment, the contiguous audio-visual data includes audio-visual data of a party desiring to communicate via the communications link.
Also described herein is a communications system comprising:
a communications channel;
a first computer system for use by a first party; and
a second computer system for use by a second party;
wherein the first computer system is adapted to transmit an encryption key to the second party by the method of the second aspect, and wherein the second computer system is adapted to receive the encryption key by the method of the first aspect. The respective first and second audio-visual data items transmitted by the first and second computer systems are preferably contiguous audio-visual data, preferably generated by the first and second computer systems based upon their respective environments. In a particularly preferred embodiment, the respective contiguous audio-visual data transmitted by the first and second computer systems include audio-video streams of respective users of the first and second computer systems.
Also described herein is a method for receiving an encryption key via a communications link, comprising the steps of:
a) receiving a first digest data item via the communications link;
b) subsequent to step a), receiving an encryption key and a first audio-visual data item via the communications link;
c) subsequent to step b), performing a one-way function on the encryption key and the audio-visual data item to generate a second digest data item and comparing the first and second digest data items to confirm that the encryption key and the first audio-visual data item were used to generate the first digest data item; and
d) subsequent to step c), receiving a second audio-visual data item, and comparing it with the first audio-visual data item to assess the congruence of the first and second audio-visual data items and the security of the communications link.
BRIEF DESCRIPTION OF THE DRAWINGS
Some embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 is a schematic representation of an embodiment of a communications system;
Figure 2 is a flow diagram of a process for sending via a communications network a public encryption key of a first node of said network to a second node of said network;
Figure 3 is a flow diagram of a process for receiving via a communications network a public encryption key of a node of the network;
Figure 4 is a flow diagram illustrating the information flow in the steps occurring in an embodiment of a process for establishing a secure communications channel; and
Figure 5 is a schematic diagram of a computer system in which embodiments of the present invention may be implemented.
DETAILED DESCRIPTION
With reference to Figure 1, a communications system 100 for secure communication between first and second parties includes a first node 102 for use by the first party ('Alice') and a second node 104 for use by the second party ('Bob'), the first and second nodes 102, 104 being nodes of a communications network. In the described embodiments, each of the network nodes 102, 104 is also referred to herein as a "computer system", where the scope of this term in this specification includes not only general purpose computers executing software instructions that cause the computer to behave as described below, but also devices and systems configured for more specific purposes, such as teleconference or video conference devices, systems, and/or hardware, mobile telephones and the like. Between the computer systems 102, 104 is a communications channel 106 by which data is exchanged between the computer systems 102, 104. The communications channel 106 may be or include any form or forms of communications channel, including, for example, a fixed wire network, a wireless network, a telecommunications channel such as a mobile telecommunications channel, a TCP/IP communications channel via a computer network such as the Internet, or any combination of these. Thus the computer systems 102, 104 constitute respective nodes of a communications network.
In the specific example now described, the nodes 102, 104 are general purpose computers (for which the reference numerals 102, 104 will continue to be used) executing video telephony software that causes the computers 102, 104 to effect the processes described herein. In some embodiments, the software is a complete (perhaps open source) video telephony solution. In others, the processes described herein are provided by a plug-in module for an existing video telephony solution such as Skype. In the described embodiments, the standard computer systems are 32-bit or 64-bit Intel Architecture based computer systems 500, as shown in Figure 5, and the described processes are implemented in the form of programming instructions of one or more software modules 502 stored on non-volatile (e.g., hard disk) storage 504 associated with the computer system, as shown in Figure 5. However, it will be apparent that in other embodiments at least parts of the processes could alternatively be implemented as one or more dedicated hardware components, such as application-specific integrated circuits (ASICs) and/or field programmable gate arrays (FPGAs), for example.
Each of the computer systems 500 includes standard computer components, including random access memory (RAM) 506, at least one processor 508, and external interfaces 510, 512, 514, all interconnected by a bus 516. The external interfaces include universal serial bus (USB) interfaces 510, a network interface connector (NIC) 512 which connects the system 500 to a communications network such as the Internet, and a display adapter 514, which is connected to a display device such as an LCD panel display 522.
At least one of the universal serial bus (USB) interfaces 510 is connected to a keyboard and a pointing device such as a mouse 518, at least one other being connected to a video camera and microphone 112, 114, which may be in the form of physically separate devices or an integrated video/audio capture device. In some embodiments, the microphone and video devices are integrated components internal to the computer system 500, as is the case, for example, where the computer system 500 is an Apple iMac desktop computer.
With reference to Figures 2 and 3, processes for transmitting and receiving public keys will now be described, followed by a description of an exemplary scenario in which the processes are used. In the following description, for convenience only, and unless the context indicates otherwise, it will be apparent that a reference to Bob can be understood as a reference to Bob's node or computer 104, and a reference to Alice can be understood as a reference to Alice's node or computer 102.
Turning to Figure 2 (and with reference also to the data flow illustrated in Figure 4), a process for transmitting a public key is initiated at step 202 when, for example, a user, Alice, of the first node 102 indicates (to the software 502) a desire to communicate securely with the user, Bob, of the second node 104. At step 204, Alice's node 102 generates a new public/private encryption key pair (PA, XA) (e.g., an RSA key pair). The private key XA is to be kept secret for decryption of received data, and in some embodiments is only ever stored temporarily in volatile RAM 506 for the duration of the communications session. The public key PA is to be transmitted to the second node 104 for the encryption of data to be sent to Alice. At step 206, Alice's node 102 generates first audio-visual data representing at least one of its audio and its visual environment, using the associated microphone and/or camera 112.
In general, the term "audio-visual" as used in this specification is to be construed broadly as encompassing: (i) audio only, (ii) visual only, and (iii) both audio and visual. The term "visual" in this context means purely visual or image information without an audio component, and most broadly means at least one image or video frame but is typically a sequence of temporally contiguous video frames; e.g., a video stream. In this specification, the terms "audio environment" and "visual environment" in respect of a node/computer system and/or its user refer respectively to audio information and visual information capable of being captured by a microphone and pure imaging device (i.e., without sound), respectively, and being capable of distinguishing the environment of the node/computer system itself and/or its user from the environments of other nodes/computer systems and/or their users.
In the described example scenario, and most typically, the first audio-visual data represents both visual and audio information of Alice herself (being deemed a part of the 'environment' of her node 102) and optionally also of her immediate surroundings. That is, the visual component of the first audio-visual data represents at least one image (and typically a sequence of video frames, more typically being a portion of a 'live' video stream) of Alice herself, and optionally also her immediate surroundings (e.g., the room she is in and associated furniture/decor, the chair she is sitting in, etc). The audio component of the first audio-visual data thus represents the accompanying sounds of Alice and/or her environment; for example, Alice talking and/or music and/or ambient sounds (musical or otherwise) of Alice's environment. Where the audio includes music, the music component may be ambient music playing in Alice's environment, and in some embodiments includes a music file stored on Alice's computer 102 that is played by the software 502. The music file may be one of a plurality of music files randomly selected by the software 502.
At step 208, using a one-way function (e.g., one of the SHA-2 family of cryptographic hash functions), Alice's node 102 generates at least one hash H1 as a function of at least a portion of the captured first audio-visual data (e.g., the visual component may be a predetermined frame, a predetermined number of consecutive frames, or a predetermined scheme of non-consecutive frames, but most typically a portion of a live stream of video with accompanying audio) and of Alice's public key PA. For convenience of description, the hash H1 is generally described herein as being a single hash. However, where the first audio-visual data represents a relatively long portion of streaming video, multiple hashes may need to be generated, although only one of them (e.g., the first one) needs to include Alice's public key. At step 210, Alice's node 102 transmits the hash H1 via the communications link 110 to Bob's node 104. Bob's node 104 receives the hash H1 and in response issues an acknowledgement confirming receipt of the hash H1. The acknowledgement/confirmation is then received at Alice's node 102 at step 212. As it is important that the acknowledgement confirming receipt is genuinely from Bob's node 104 and not from a man-in-the-middle eavesdropper 'Eve', in the described embodiments the acknowledgement is provided in the form of an audio-visual confirmation from Bob himself that the hash has been received, so that Alice and/or Alice's software 502 can be confident that Bob has indeed received the hash H1.
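The case of multiple hashes for a relatively long portion of streaming audio-visual data can be sketched as follows. The function name `hash_stream`, the fixed `segment_size`, the choice of SHA-256, and the length prefix on the key are assumptions for illustration; the specification does not say how a stream is divided into portions.

```python
import hashlib
from typing import List


def hash_stream(public_key: bytes, av_stream: bytes, segment_size: int) -> List[bytes]:
    """Return one SHA-256 digest per segment of the audio-visual stream.

    Only the first digest also covers the public key, following the remark
    that only one of the hashes needs to include it.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    if not av_stream:
        raise ValueError("av_stream must not be empty")
    digests = []
    for index, start in enumerate(range(0, len(av_stream), segment_size)):
        segment = av_stream[start:start + segment_size]
        h = hashlib.sha256()
        if index == 0:
            # Length-prefix the key so its boundary with the segment is clear.
            h.update(len(public_key).to_bytes(8, "big") + public_key)
        h.update(segment)
        digests.append(h.digest())
    return digests
```

With this layout, substituting the public key changes only the first digest, while altering any part of the stream changes the digest of the segment that contains it.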
Only after Alice has received Bob's confirmation of receipt of the hash H1 does Alice's node 102, at step 214, transmit the public key PA and the first audio-visual data that were used to generate the hash H1 and, at step 216, continue to transmit audio-visual data (referred to herein as 'second' audio-visual data to avoid confusion) captured by the camera/microphone 112 and congruent with the first audio-visual data that was used to generate the hash H1. By congruent is meant that it can be determined by a remote party (in this case, Bob himself and/or Bob's node 104) that the source of the audio-visual data is the same. Typically, this will be the case where the first and second audio-visual data are contiguous respective portions of a stream of audio-visual data, such that there is temporal continuity between the first and second audio-visual data, but this might not always be the case in other embodiments.
For example, it may be apparent to a viewer of the audio-visual data that the data is of a single source or origin, even if the audio-visual data is not continuous in a temporal sense. The audio-visual data may, for example, include video of Alice or Bob in which details of their background, attire, etc. are visible. In such cases, a viewer may be able to determine that the audio-visual data is of a single source or origin, notwithstanding a small temporal discontinuity. Alternatively or additionally, congruence can be assessed by a computer or other processing device analysing one or more properties of the audio-visual data, for example, pixel colour and brightness. Congruence can also be assessed, at least in part, using an audio component of the audio-visual data captured by the camera/microphone 112 to overcome difficulties caused, for example, by Alice being silent at the time of capture. As described above, this audio component may include audio generated from an audio file stored on Alice's computer 106. Similarly, a video file stored on Alice's computer 106 can be used to assess congruence, rather than, or in addition to, captured video of Alice, although the use of stored information alone may be less robust than using the 'live' or real-time visual and/or audio environment.
In the described example scenario, the first and second audio-visual data are contiguous portions of a live video stream (with audio) showing Alice and perhaps also of part of her surroundings, being initial portions of the same video stream that will constitute Alice's contributions to the video conference, once established.
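The ordering constraint on the sending side (hash first, key and data only after the acknowledgement, then further data) can be sketched as a small state machine. This is an illustrative sketch only: the class `Sender`, its method names, and the use of SHA-256 over length-prefixed fields are assumptions and do not appear in the description. The step numbers in the comments refer to Figure 2.

```python
import hashlib
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    DIGEST_SENT = auto()
    ACKNOWLEDGED = auto()
    REVEALED = auto()


class Sender:
    """Hypothetical sending node that enforces the commit-then-reveal order."""

    def __init__(self, public_key: bytes, first_av: bytes):
        self.public_key = public_key
        self.first_av = first_av
        self.state = State.IDLE

    def commit(self) -> bytes:
        # Steps 208-210: hash the key and the first audio-visual data.
        h = hashlib.sha256()
        for field in (self.public_key, self.first_av):
            h.update(len(field).to_bytes(8, "big") + field)
        self.state = State.DIGEST_SENT
        return h.digest()

    def on_acknowledgement(self) -> None:
        # Step 212: the other party has confirmed receipt of the hash.
        if self.state is not State.DIGEST_SENT:
            raise RuntimeError("no hash is awaiting acknowledgement")
        self.state = State.ACKNOWLEDGED

    def reveal(self) -> tuple[bytes, bytes]:
        # Step 214: release key and data only after the acknowledgement.
        if self.state is not State.ACKNOWLEDGED:
            raise RuntimeError("hash has not been acknowledged yet")
        self.state = State.REVEALED
        return self.public_key, self.first_av

    def stream(self, second_av: bytes) -> bytes:
        # Step 216: continue with congruent 'second' audio-visual data.
        if self.state is not State.REVEALED:
            raise RuntimeError("key and first data have not been sent yet")
        return second_av
```

Calling `reveal()` before `on_acknowledgement()` raises an error, which mirrors the requirement that the hashed material is not released until the hash is known to have arrived.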
Turning to Figure 3, a process for receiving a public key (for example, Alice's transmitted public encryption key as discussed above) will now be described. In the described embodiments, Bob's node 104 commences the process for receiving the key at step 302, in response to receiving a request to establish a secure communications channel. In one embodiment, assuming that Alice's node 102 has transmitted a hash H1 at step 210 of the process discussed above with reference to Figure 2, the receipt of that hash by Bob's node 104 at step 304 provides the request.
At step 305, Bob confirms or acknowledges receipt of the hash. As described above, in some embodiments, this can be an automated acknowledgement generated and sent by Bob's computer 108. However, in the described embodiments, the acknowledgement is in the form of, or at least includes, audio-visual data representing Bob's verbal and visual acknowledgement of receipt of the hash, so that Alice can be confident that Bob genuinely has received the hash H1, rather than a man-in-the-middle eavesdropper. As described below, such audio-visual data from Bob may also be in the form of a hash, and the unhashed audio-visual data sent subsequently.
As discussed above, once Alice's node 102 receives Bob's confirmation of receipt of the hash, it transmits the public key PA and the first audio-visual data that were used to generate the hash H1. They are received by Bob's node 104 at step 306.
At step 308, Bob's node 104 uses the same one-way function that was used by Alice's node 102 and in the same way to generate a hash of the received public key PA and the received first audio-visual data.
At step 310, Bob's node 104 compares the received and the generated hashes. If they are equal to one another, Bob's node 104 deduces that the public key PA and the first audio-visual data that it received are the same as were used to generate the hash H1, and sends a corresponding acknowledgement to Alice's node 102 at step 311. However, Bob's node 104 is not yet able to be sure that the hash, the audio-visual data and the public key were not all substituted by an eavesdropper.
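The recomputation and comparison of steps 308 and 310 can be sketched as follows, again assuming SHA-256 over length-prefixed fields. The names `commitment` and `verify_reveal` are hypothetical. The constant-time comparison via `hmac.compare_digest` is a precaution chosen for this sketch and is not a requirement stated in the description.

```python
import hashlib
import hmac


def commitment(public_key: bytes, av_data: bytes) -> bytes:
    """Assumed hash construction: SHA-256 over length-prefixed fields."""
    h = hashlib.sha256()
    for field in (public_key, av_data):
        h.update(len(field).to_bytes(8, "big") + field)
    return h.digest()


def verify_reveal(received_hash: bytes, received_key: bytes,
                  received_av: bytes) -> bool:
    """Steps 308-310: recompute the hash and compare it with the one received."""
    recomputed = commitment(received_key, received_av)
    return hmac.compare_digest(received_hash, recomputed)
```

A matching hash shows only that the received key and data are the ones that were hashed. As the description notes, it does not by itself exclude substitution of all three items, which is why the congruence assessment follows.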
At step 312, Bob's node 104 receives further or second audio-visual data from Alice's node 102 (sent by Alice in response to receipt of Bob's acknowledgement sent at step 311), and then, at step 314, the first and second audio-visual data are compared to assess their mutual congruence. In some embodiments, this step is performed automatically by Bob's node 104 (e.g., by comparison of one or more selected properties of the audio-visual data; for example, pixel brightness and colour or spatial distributions thereof). In other embodiments, the assessment is performed manually by Bob, who compares the first and second audio-visual data to determine whether they are congruent (e.g., that they are of the same subject, that the background is the same and that there are no temporal discontinuities in the visual and audio components). In some embodiments, this comparison can be facilitated by the software 502 displaying the first and second audio-visual data in a picture-in-picture or split screen arrangement. In some embodiments, the data is checked in both ways. In the described example scenario where the first and second audio-visual data are contiguous portions of a live video stream (with audio) of Alice, it is usually straightforward for Bob (and any other co-located participants) to manually assess whether the two portions of the video stream of Alice are congruent. However, as a precaution, the software 502 can also be configured to automatically compare the two portions of the video stream to look for any discontinuities or other forms of inconsistency between them, and to generate an alert if any is found.
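One possible automatic check of the kind mentioned above compares the brightness distribution of the last frame of the first audio-visual data with that of the first frame of the second. The sketch below is an assumption-laden illustration, not the method of the described embodiments: frames are modelled as lists of (R, G, B) tuples with 8-bit channels, brightness uses the Rec. 601 luma weights, and the bin count and threshold are arbitrary hypothetical values.

```python
Frame = list[tuple[int, int, int]]  # assumed frame model: 8-bit RGB pixels


def brightness_histogram(frame: Frame, bins: int = 8) -> list[float]:
    """Normalised histogram of per-pixel luma."""
    if not frame:
        raise ValueError("frame contains no pixels")
    counts = [0] * bins
    for r, g, b in frame:
        luma = 0.299 * r + 0.587 * g + 0.114 * b  # Rec. 601 weights
        index = min(int(luma * bins / 256), bins - 1)
        counts[index] += 1
    return [count / len(frame) for count in counts]


def histogram_distance(a: list[float], b: list[float]) -> float:
    """Total variation distance between two histograms, in the range 0 to 1."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def frames_congruent(last_of_first: Frame, first_of_second: Frame,
                     threshold: float = 0.2) -> bool:
    """Hypothetical test: similar brightness distributions suggest one source."""
    distance = histogram_distance(
        brightness_histogram(last_of_first),
        brightness_histogram(first_of_second),
    )
    return distance <= threshold
```

A check this coarse would only flag gross discontinuities such as a change of scene; a practical implementation would presumably combine several properties, including colour, spatial distribution and audio.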
In any event, once it has been established that the data are congruent, it follows that the data now being received from Alice's node 102 was generated in the same way as was the data that was used to generate the hash H1. Furthermore, since the data that was used to generate the hash was not transmitted until after the hash had been received by Bob, it follows that the received public key PA was also the public key used by Alice's node 102 to generate the hash. That is to say, the public key received by Bob has not been substituted by a man-in-the-middle eavesdropper but is genuinely Alice's public key.
While the foregoing description has made reference to a key being transmitted by Alice's node 102 and being received by Bob's node 104, it will be apparent that Bob's node 104 may also execute the process of Figure 2 to generate and transmit a public encryption key that is received by Alice's node 102 using the process of Figure 3, thereby providing an overall secure communications process such as the one shown in Figure 4, as is the case in the described video conferencing example scenario. In such cases, the transmission of the hash of Bob's public encryption key and first audio-visual data by Bob's node 104 at step 210 of the transmission process may be performed in response to receipt of the hash from Alice's node 102, and the receipt of Bob's hash by Alice therefore also serves the purpose of securely acknowledging receipt of the hash from Alice's node 102 (i.e., step 305 of the receiving process). In this arrangement, both the initiating and the responding nodes 102, 104 execute the same processes, but the various steps in which hashes and audio-visual data are transmitted to the other party are interleaved or performed antiphonally, as shown in Figure 4, thereby further enhancing the security of the processes.
Accordingly, as shown in Figure 4, in response to receipt of Bob's hash, Alice then sends to Bob her public key and first audio-visual data, and when Bob receives those, he generates a corresponding hash and compares it to the hash he received from Alice, and only if the hashes match does Bob then send his own public key and first audio-visual data to Alice. Alice then generates a hash of Bob's public key and first audio-visual data and compares it to the hash previously received from Bob. Only if the hashes match does Alice then continue to send further audio-visual data to Bob. As an alternative to such two-way processes, once Bob's node 104 has securely received Alice's public key as described above, Bob's node 104 can simply (generate and) encrypt Bob's public key with Alice's public key, and send the encrypted key to Alice, so that only Alice can determine Bob's public key. Alternatively, Bob's node 104 can generate a symmetric session key (perhaps from Alice's public key), encrypt that with Alice's public key, and send the encrypted key to Alice for subsequent encrypted communications. In either case, Bob's confirmation of receipt of Alice's hash at step 212 can still include audio-visual data representing Bob's audio-visual confirmation of Alice's hash.
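The interleaved (antiphonal) message order described above can be traced with a small in-process simulation. Everything here is a hypothetical sketch: the `Party` record, the `channel` hook that stands in for the network (and for a possible man-in-the-middle), the message labels, and the SHA-256 construction are assumptions made for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass


def commitment(public_key: bytes, av_data: bytes) -> bytes:
    h = hashlib.sha256()
    for field in (public_key, av_data):
        h.update(len(field).to_bytes(8, "big") + field)
    return h.digest()


@dataclass
class Party:
    name: str
    public_key: bytes
    first_av: bytes


def antiphonal_exchange(alice: Party, bob: Party,
                        channel=lambda label, payload: payload) -> list[str]:
    """Run the two-way exchange and return the ordered list of messages."""
    trace: list[str] = []

    def send(label, payload):
        trace.append(label)
        return channel(label, payload)  # the channel may alter the payload

    hash_a = send("Alice -> Bob: hash",
                  commitment(alice.public_key, alice.first_av))
    # Bob's hash also acknowledges receipt of Alice's hash.
    hash_b = send("Bob -> Alice: hash",
                  commitment(bob.public_key, bob.first_av))
    key_a, av_a = send("Alice -> Bob: key + first AV",
                       (alice.public_key, alice.first_av))
    if not hmac.compare_digest(hash_a, commitment(key_a, av_a)):
        raise ValueError("Bob: data from Alice does not match her hash")
    key_b, av_b = send("Bob -> Alice: key + first AV",
                       (bob.public_key, bob.first_av))
    if not hmac.compare_digest(hash_b, commitment(key_b, av_b)):
        raise ValueError("Alice: data from Bob does not match his hash")
    trace.append("Alice -> Bob: second AV")
    return trace
```

Passing a `channel` function that replaces the key in transit causes the corresponding hash check to fail, so the exchange stops before any further data is sent. The congruence assessment of the second audio-visual data is outside the scope of this sketch.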
Once the public encryption keys have been exchanged by the processes described above, further communication (e.g., video conferencing) can be encrypted using the public keys so exchanged. The public keys may then also be used to encrypt documents to be transmitted between the parties. In some embodiments, a session key (i.e., a symmetric encryption key) is agreed securely between the parties using the public keys to reduce the volume of traffic encrypted using the public encryption keys. It will be apparent from the above description that the described processes can be used to provide secure communications between parties without using a third party to provide public keys. Indeed, the public and private encryption key pairs can be generated on demand at the beginning of a communications session, and in some embodiments are only temporarily stored in volatile memory of the communicating nodes. The keys, in particular the private keys, can then be securely destroyed at the end of the communications session.
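The idea of key material that exists only for the duration of a session can be illustrated with a context manager that generates random key bytes in memory and overwrites them on exit. This is a sketch under stated assumptions: the name `ephemeral_key` is hypothetical, random bytes stand in for a session key or private key, and the overwrite is best-effort only, since the Python runtime may retain other copies of the bytes.

```python
import secrets
from contextlib import contextmanager


@contextmanager
def ephemeral_key(length: int = 32):
    """Yield in-memory key bytes and zero them when the session ends."""
    # A mutable bytearray is used so that the buffer can be overwritten.
    key = bytearray(secrets.token_bytes(length))
    try:
        yield key
    finally:
        # Best-effort destruction: overwrite the buffer in place.
        for index in range(len(key)):
            key[index] = 0
```

A usage example, with the session body left as a placeholder:

```python
with ephemeral_key() as session_key:
    pass  # encrypt and decrypt session traffic with session_key here
```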
The described processes are particularly suited to videoconferencing, where the audiovisual data includes 'live' streaming audio and video of the conference participants, and these are used to assess congruence, either by the human participants or by the participating nodes themselves, or both. The encryption key pairs can be generated once at the beginning of a communications session, or in some embodiments can be generated multiple times during the one session, either periodically, randomly, and/or in response to the arrival of a new participant in the conference and/or in response to the departure of an existing participant, and/or in response to an input from one of the participants, for example.
Moreover, one or more of the participants need not be human. For example, the described processes can be used in unmanned aircraft, drones, or other vessels or vehicles, where it is undesirable to have a private key stored in persistent memory in case of capture. In one embodiment, a computer in an unmanned aircraft (Bob) generates a key pair on the fly and transmits its public key to its controller Alice using the processes described above. In such cases, the audio-visual data can include video captured by an onboard camera, for example, before or during take-off, showing the view of the camera of ground activity that is verifiable by Alice (for example, particular ground crew activity, perhaps with an aircraft identifier on the body or wing of the aircraft within the field of view of the camera). Alice's public key can be sent to the aircraft using the processes described above, requiring the use of automated processes by Bob to assess congruence of audio-visual data received from Alice. Many suitable processes for comparing audio and/or video data to assess congruence will be apparent to those skilled in the art. In yet further embodiments, the software 502 implementing the processes described above includes the public key of a body such as a government body authorised to intercept communications. Alternatively, the described processes can be used in a tri-partite mode to allow authorised intercepts or, indeed, in a more general multi-partite mode of communication among three or more parties.
Many modifications will be apparent to those skilled in the art without departing from the scope of the present invention.

Claims

1. A process for sending via a communications network a public encryption key of a first node of said network to a second node of said network, the process being executed by the first node, and including the steps of:
generating first audio-visual data representing at least one of an audio and a visual environment of the first node;
applying a one-way function to the public encryption key and the first audio-visual data to generate first digest data;
sending the first digest data to the second node;
receiving from the second node confirmation that the first digest data was received by the second node;
subsequent to the step of receiving the confirmation, sending the public encryption key and the first audio-visual data to the second node to allow the second node to determine that the public encryption key and the first audio-visual data were used to generate the first digest data;
sending second audio-visual data to the second node, the second audio-visual data being different to but congruent with the first audio-visual data to allow the second node to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the second node is that of the first node.
2. The process of claim 1, wherein the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node.
3. The process of claim 1, wherein the confirmation that the first digest data was received by the second node includes digest data generated by the second node from audio-visual data representing at least one of an audio and a visual environment of the second node and from a public encryption key of the second node.
4. The process of any one of claims 1 to 3, including generating the public encryption key and a corresponding private encryption key at the beginning of a communications session, and disposing of at least the private encryption key at the end of the communications session.
5. The process of claim 4, wherein the generated private encryption key is only stored in volatile memory.
6. The process of any one of claims 1 to 5, including:
receiving, from the second node, encrypted data encrypted by the second node using the public encryption key of the first node; and
decrypting the received encrypted data using the private encryption key of the first node.
7. The process of claim 6, wherein the encrypted data includes encrypted audio-visual data representing at least one of an audio and a visual environment of the second node, including at least one of streaming audio and streaming video of one or more participants in a teleconference or a video conference.
8. The process of any one of claims 1 to 7, including the steps of:
receiving third digest data from the communications network, the third digest data purportedly being sent from the second node and generated by applying a one-way function to the public encryption key of the second node and third audio-visual data representing at least one of an audio environment and a visual environment of the second node;
sending to the second node a confirmation of receipt of the third digest data;
subsequent to the step of sending the confirmation of receipt to the second node, receiving via the communications network a public encryption key and audio-visual data, the received public encryption key purportedly being the public encryption key of the second node, and the received audio-visual data purportedly being the third audio-visual data of the second node;
applying the one-way function to the received encryption key and the received audio-visual data to generate fourth digest data;
comparing the third digest data to the fourth digest data to determine whether the received encryption key and the received audio-visual data were used to generate the third digest data; and
only if said step of comparing determines that the received encryption key and the received audio-visual data were used to generate the third digest data, then:
receiving from the communications network fourth audio-visual data representing at least one of the audio environment and the visual environment of the second node, and comparing the fourth audio-visual data with the received audio-visual data to assess their mutual congruence, and, based on the assessment, determining whether the received public encryption key is that of the second node.
9. The process of claim 8, wherein the third digest data provides the confirmation that the first digest data was received by the second node, and the public encryption key and the first audio-visual data sent to the second node provide the confirmation of receipt of the third digest data.
10. A process for receiving via a communications network a public encryption key of a first node of the network, the process being executed by a second node of the network and including the steps of:
receiving first digest data from the communications network, the first digest data purportedly being sent from the first node and generated by applying a one-way function to the public encryption key of the first node and first audio-visual data representing at least one of an audio environment and a visual environment of the first node;
sending to the first node a confirmation of receipt of the first digest data;
subsequent to the step of sending the confirmation, receiving via the communications network a public encryption key and audio-visual data, the received public encryption key purportedly being the public encryption key of the first node, and the received audio-visual data purportedly being the first audio-visual data of the first node;
applying the one-way function to the received encryption key and the received audio-visual data to generate second digest data;
comparing the first digest data to the second digest data to determine whether the received encryption key and the received audio-visual data were used to generate the first digest data; and
only if said step of comparing determines that the received encryption key and the received audio-visual data were used to generate the first digest data, then:
receiving from the communications network second audio-visual data representing at least one of the audio environment and the visual environment of the first node, and comparing the second audio-visual data with the received audio-visual data to assess their mutual congruence, and, based on the assessment, determining whether the received public encryption key is that of the first node.
11. A process for establishing secure communications between first and second nodes of a communications network, the process being executed by the first node, and including the steps of:
sending via the communications network a public encryption key of a first node of said network to a second node of said network by executing the process of any one of claims 1 to 7; and
receiving via the communications network a public encryption key of the second node of the network by executing the process of claim 10, but with the first node acting as the second node, and vice-versa.
12. A process for establishing secure communications between a first party and a second party, wherein each party sends its public encryption key to the other party using the process of any one of claims 1 to 7, and each party receives the public encryption key of the other party using the process of claim 10, and wherein the sending steps are performed by the parties antiphonally.
13. The process of any one of claims 1 to 12, wherein the audio-visual data being assessed for congruence are contiguous portions of an audio-visual data stream.
14. The process of claim 13, wherein the first audio-visual data represents a request to communicate via the communications network.
15. The process of any one of claims 1 to 14, wherein each said audio-visual data represents an audio and a visual environment of the corresponding node.
16. A communications system configured to execute the process of any one of claims 1 to 15.
17. At least one computer-readable medium storing computer-executable instructions that, when executed by at least one processor of a computer system, cause the processor to execute the process of any one of claims 1 to 15.
18. A communications system, including:
a first node of a communications network for use by a first party; and
a second node of a communications network for use by a second party;
wherein the first node is configured to send its public encryption key to the second party by executing the process of any one of claims 1 to 7, and wherein the second node is configured to receive the public encryption key of the first node by executing the process of claim 10.
19. A communications device configured for secure communications with at least one other communications device of the same type over a communications network, the communications device including:
an audio-visual input module configured to receive captured audio and/or video of the environment of the communications device;
a hash component configured to generate first digest data by applying a one-way function to a public encryption key of the communications device and first audio-visual data representing the captured audio and/or visual environment of the communications device; and
a transmission component configured to send the first digest data to the other communications device, and, responsive to receipt of a confirmation that the first digest data was received by the other communications device, to send the public encryption key and the first audio-visual data to the other communications device to allow it to determine that the public encryption key and the first audio-visual data were used to generate the first digest data; and to send second audio-visual data to the other communications device, the second audio-visual data being different to but congruent with the first audio-visual data to allow the other communications device to determine that the second audio-visual data is congruent with the first audio-visual data, and consequently that the public encryption key received by the other communications device is that of the communications device.
PCT/AU2012/000843 2011-07-14 2012-07-13 Cryptographic processes WO2013006919A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP12811527.6A EP2732577A4 (en) 2011-07-14 2012-07-13 Cryptographic processes
US14/232,795 US20150256334A1 (en) 2011-07-14 2012-07-13 Cryptographic processes
AU2012283683A AU2012283683A1 (en) 2011-07-14 2012-07-13 Cryptographic processes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2011902809A AU2011902809A0 (en) 2011-07-14 Cryptographic methods
AU2011902809 2011-07-14

Publications (1)

Publication Number Publication Date
WO2013006919A1 (en) 2013-01-17

Family

ID=47505429

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/AU2012/000843 WO2013006919A1 (en) 2011-07-14 2012-07-13 Cryptographic processes
PCT/AU2012/000842 WO2013006918A1 (en) 2011-07-14 2012-07-13 Cryptographic processes

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/AU2012/000842 WO2013006918A1 (en) 2011-07-14 2012-07-13 Cryptographic processes

Country Status (4)

Country Link
US (1) US20150256334A1 (en)
EP (1) EP2732577A4 (en)
AU (1) AU2012283683A1 (en)
WO (2) WO2013006919A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102061724B1 (en) * 2014-04-07 2020-01-02 마이크로 모우션, 인코포레이티드 Apparatus and method for detecting asymmetric flow in vibrating flowmeters

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020029350A1 (en) * 2000-02-11 2002-03-07 Cooper Robin Ross Web based human services conferencing network
US20080263648A1 (en) * 2007-04-17 2008-10-23 Infosys Technologies Ltd. Secure conferencing over ip-based networks

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2366470B (en) * 2000-08-25 2005-07-20 Hewlett Packard Co Improvements relating to document transmission techniques iv
US7804961B2 (en) * 2000-12-19 2010-09-28 Qualcomm Incorporated Method and apparatus for fast crytographic key generation
GB2367986B (en) * 2001-03-16 2002-10-09 Ericsson Telefon Ab L M Address mechanisms in internet protocol
GB2381700B (en) * 2001-11-01 2005-08-24 Vodafone Plc Telecommunication security arrangements and methods
NO319858B1 (en) * 2003-09-26 2005-09-26 Tandberg Telecom As Identification procedure
US7434051B1 (en) * 2003-09-29 2008-10-07 Sun Microsystems, Inc. Method and apparatus for facilitating secure cocktail effect authentication
US8572387B2 (en) * 2006-07-26 2013-10-29 Panasonic Corporation Authentication of a peer in a peer-to-peer network
GB0811210D0 (en) * 2008-06-18 2008-07-23 Isis Innovation Improvements related to the authentication of messages
US8621210B2 (en) * 2008-06-26 2013-12-31 Microsoft Corporation Ad-hoc trust establishment using visual verification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2732577A4 *

Also Published As

Publication number Publication date
EP2732577A1 (en) 2014-05-21
WO2013006918A1 (en) 2013-01-17
EP2732577A4 (en) 2015-06-24
AU2012283683A1 (en) 2014-01-23
US20150256334A1 (en) 2015-09-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12811527

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2012811527

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2012811527

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2012283683

Country of ref document: AU

Date of ref document: 20120713

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 14232795

Country of ref document: US