GB2600348A - Video compression and decompression using neural networks - Google Patents

Video compression and decompression using neural networks

Info

Publication number
GB2600348A
GB2600348A
Authority
GB
United Kingdom
Prior art keywords
images
features
image
neural networks
computer system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2201144.9A
Other versions
GB202201144D0 (en)
Inventor
Liu Ming-Yu
Wang Ting-Chun
Mohanray Mallya Arun
Tapani Karras Tero
Matias Laine Samuli
Patrick Luebke David
Lehtinen Jaakko
Samuli Aittala Miika
Oskari Aila Timo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nvidia Corp
Original Assignee
Nvidia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/069,253 (published as US20210329306A1)
Application filed by Nvidia Corp filed Critical Nvidia Corp
Publication of GB202201144D0
Publication of GB2600348A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing

Abstract

Apparatuses, systems, and techniques to perform compression of video data using neural networks to facilitate video streaming, such as video conferencing. In at least one embodiment, a sender transmits to a receiver a key frame from video data and one or more keypoints identified by a neural network from said video data, and the receiver reconstructs the video data using said key frame and the one or more received keypoints.
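The scheme summarized in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `extract_keypoints` function below is a stand-in (a brightest-pixel heuristic, not the patent's neural network), used only to show the bandwidth asymmetry between transmitting one full key frame and per-frame keypoints.

```python
import numpy as np

def extract_keypoints(frame: np.ndarray, n_points: int = 10) -> np.ndarray:
    """Stand-in for the neural-network keypoint detector: returns the
    (row, col) coordinates of the n_points brightest pixels."""
    flat = frame.mean(axis=-1).ravel()
    idx = np.argpartition(flat, -n_points)[-n_points:]
    return np.stack(np.unravel_index(idx, frame.shape[:2]), axis=1)

def encode(frames):
    """Sender side: transmit the first frame in full (the key frame) and
    only keypoints for every subsequent frame."""
    payload = [("key_frame", frames[0])]
    for f in frames[1:]:
        payload.append(("keypoints", extract_keypoints(f)))
    return payload

# Five synthetic 64x64 RGB frames standing in for captured video.
frames = [np.random.default_rng(s).integers(0, 256, (64, 64, 3), dtype=np.uint8)
          for s in range(5)]
payload = encode(frames)

full_bytes = sum(f.nbytes for f in frames)
sent_bytes = payload[0][1].nbytes + sum(p[1].nbytes for p in payload[1:])
print(f"raw: {full_bytes} B, transmitted: {sent_bytes} B")
```

On the receiver side, a generator network (not shown here) would warp or re-render the key frame according to each keypoint set to regenerate the dropped frames; the savings grow with the number of frames covered by a single key frame.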

Claims (74)

1. A processor comprising: one or more circuits to use one or more neural networks of a first computer system to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
2. The processor of claim 1, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted over the network to the second computer system; and the first image and the one or more features are usable to regenerate the one or more additional images.
3. The processor of claim 2, wherein the one or more neural networks use one or more graphics processing units of the first computer system to identify the one or more features.
4. The processor of claim 2, wherein the one or more features indicate anatomical aspects of the one or more objects.
5. The processor of claim 2, wherein the one or more images are one or more video frames.
6. The processor of claim 2, wherein the second computer system is a computing resource services provider.
7. The processor of claim 2, wherein the first image is determined from the one or more images based, at least in part, on a request from the second computer system.
8. A processor comprising: one or more circuits to use one or more neural networks on a first computer system to generate one or more images based, at least in part, on one or more features of one or more objects of the one or more images received from a second computer system.
9. The processor of claim 8, wherein: the one or more images comprise a first image received from the second computer system; the one or more neural networks generate the one or more images based, at least in part, on the first image and the one or more features.
10. The processor of claim 9, wherein the first computer system comprises one or more parallel processing units usable by the one or more neural networks to generate the one or more images.
11. The processor of claim 9, wherein the first computer system is a computing resource services provider comprising one or more parallel processing units usable by the one or more neural networks.
12. The processor of claim 9, wherein the one or more neural networks determine that the one or more features are usable to generate the one or more images using the first image.
13. The processor of claim 12, wherein the first computer system requests a second image from the second computer system if the one or more features are not usable to generate the one or more images using the first image.
14. The processor of claim 8, wherein the one or more images are one or more video frames of video data for display on one or more video output devices of the first computer system.
15. A computer system comprising: one or more processors to use one or more neural networks to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
16. The computer system of claim 15, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted over the network to the second computer system; and the first image and the one or more features are usable to regenerate the one or more additional images.
17. The computer system of claim 16, wherein the one or more images are one or more frames of video data captured by one or more video capture devices for the computer system.
18. The computer system of claim 16, wherein the first image is selected by the computer system based, at least in part, on one or more notifications from the second computer system.
19. The computer system of claim 16, wherein the second computer system is a computing resource services provider.
20. The computer system of claim 16, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
21. The computer system of claim 15, wherein the one or more images comprise individual video frames from video data.
22. A system comprising: one or more processors to use one or more features of one or more objects of one or more images identified by one or more neural networks to generate the one or more objects of the one or more images.
23. The system of claim 22, wherein: the one or more images comprise a first image received from a second computer system; and one or more second neural networks generate the one or more objects of the one or more images based, at least in part, on the first image and the one or more features.
24. The system of claim 23, wherein the system comprises one or more parallel processing units usable by the one or more second neural networks to generate the one or more objects of the one or more images.
25. The system of claim 23, wherein the system is a computing resource services provider comprising one or more parallel processing units usable by the one or more second neural networks.
26. The system of claim 23, wherein the one or more second neural networks determine that the one or more features are usable to generate the one or more objects of the one or more images from the first image.
27. The system of claim 26, wherein the system requests a second image from the second computer system if the one or more features are not usable to generate the one or more objects of the one or more images from the first image.
28. The system of claim 22, wherein the one or more images are one or more video frames of video data for display on one or more video output devices of the system.
29. A method comprising: using one or more neural networks to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
30. The method of claim 29, further comprising: identifying a first image from the one or more images; identifying one or more additional images from the one or more images, wherein the one or more additional images do not contain the first image; identifying the one or more features from the one or more additional images; and transmitting the first image and one or more features to the second computer system.
31. The method of claim 30, wherein the one or more features are identified from the one or more additional images using the one or more neural networks.
32. The method of claim 30, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
33. The method of claim 30, wherein the first image and the one or more features are transmitted to a computing resource services provider.
34. The method of claim 30, wherein the one or more features are keypoints indicating information about the one or more objects of the one or more additional images.
35. The method of claim 29, wherein the one or more images are frames of video data captured by one or more video capture devices.
36. A method comprising: using one or more features of one or more objects of one or more images identified by one or more neural networks to generate the one or more objects of the one or more images.
37. The method of claim 36, further comprising: identifying a first image based, at least in part, on the one or more images; and generating the one or more images based, at least in part, on the first image and the one or more features of the one or more objects.
38. The method of claim 37, wherein the one or more images are video frames of video data received over a network and the first image is an initial video frame of the video data.
39. The method of claim 37, wherein the one or more features of the one or more objects of the one or more images are key points indicating information about locations in the one or more images comprising the one or more objects.
40. The method of claim 37, wherein if the one or more images are not generated from the first image and the one or more features of the one or more objects, a request is generated for a second image.
41. The method of claim 37, wherein the one or more images are generated by one or more second neural networks.
42. The method of claim 41, wherein the one or more second neural networks use one or more parallel processing units to facilitate generation of the one or more images.
43. The method of claim 36, wherein the one or more images are received over a network from a computing resource services provider comprising one or more second neural networks to generate the one or more objects of the one or more images.
44. A processor comprising: one or more circuits to use compressed information for teleconferencing by at least generating one or more objects of one or more images based, at least in part, on one or more features of the one or more objects identified using one or more neural networks.
45. The processor of claim 44, wherein: a sender computer system identifies the one or more features of the one or more objects of the one or more images using the one or more neural networks; the sender computer system transmits, over a network, the one or more features to a receiver computer system; and the receiver computer system generates the one or more images using one or more second neural networks based, at least in part, on the one or more features and a first image.
46. The processor of claim 45, wherein the receiver computer system is a computing resource services provider and the one or more images are transmitted over the network to a client computer system.
47. The processor of claim 45, wherein the one or more neural networks identify the one or more features of the one or more objects of the one or more images using one or more parallel processing units.
48. The processor of claim 45, wherein the one or more second neural networks generate the one or more objects of the one or more images using one or more parallel processing units.
49. The processor of claim 45, wherein the first image is a previously stored image on the receiver computer system, the first image comprising the one or more objects.
50. The processor of claim 45, wherein the first image is transmitted, by the sender computer system, over the network to the receiver computer system.
51. A system comprising: a memory to store video conferencing software; and one or more processors to use one or more neural networks to identify one or more features in one or more images to transmit using the video conferencing software, such that the one or more images may be regenerated using the one or more features.
52. The system of claim 51, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted by the video conferencing software to the receiver computer system; and the first image and the one or more features are usable to regenerate the one or more images.
53. The system of claim 52, wherein the one or more images are one or more frames of video data captured by one or more video capture devices of the system.
54. The system of claim 52, wherein the first image is selected by the system based, at least in part, on a request received by the video conferencing software.
55. The system of claim 51, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
56. A system comprising: a memory to store video conferencing software; and one or more processors to use one or more neural networks to generate one or more images based, at least in part, on one or more features received using the video conferencing software.
57. The system of claim 56, wherein: the video conferencing software receives a first image associated with the one or more features; and the one or more neural networks generate the one or more images based, at least in part, on the first image and the one or more features.
58. The system of claim 57, wherein the system comprises one or more parallel processing units usable by the one or more neural networks to generate the one or more images.
59. The system of claim 57, wherein the system determines, based at least in part on output from the one or more neural networks, that the first image and the one or more features are usable to generate the one or more images.
60. The system of claim 59, wherein the system requests a second image using the video conferencing software if the first image and the one or more features are not usable to generate the one or more images.
61. The system of claim 57, wherein the system is a computing resource services provider comprising one or more parallel processing units usable by the one or more neural networks.
62. The system of claim 61, wherein the system transmits, over a network, the one or more images to a client computer system.
63. A method comprising: using compressed information for teleconferencing by at least generating one or more objects of one or more images based, at least in part, on one or more features of the one or more objects identified using one or more neural networks.
64. The method of claim 63, wherein: a sender identifies the one or more features of the one or more objects of the one or more images using the one or more neural networks; the sender transmits the one or more features to a receiver over a network; and the receiver generates the one or more images using one or more second neural networks based, at least in part, on the one or more features and a first image.
65. The method of claim 64, wherein the receiver is a computing resource services provider and the one or more images are transmitted over the network to a user of the computing resource services provider.
66. The method of claim 64, wherein the one or more neural networks identify the one or more features of the one or more objects of the one or more images using one or more parallel processing units.
67. The method of claim 64, wherein the one or more second neural networks generate the one or more images using one or more parallel processing units.
68. The method of claim 64, wherein the first image is transmitted by the sender over the network to the receiver.
69. The method of claim 68, wherein the sender transmits the first image in response to a request by the receiver.
70. The method of claim 64, wherein the first image is a previously stored image on the receiver, the first image comprising the one or more objects.
71. The method of claim 64, wherein the one or more neural networks generate one or more second images based, at least in part, on the one or more images, the one or more second images comprising one or more modifications to the one or more images.
72. The method of claim 71, wherein the one or more modifications comprise adding one or more second objects to the one or more images.
73. The method of claim 71, wherein the one or more modifications comprise rotating the one or more objects in the one or more images.
74. The method of claim 71, wherein the one or more modifications comprise adjusting the one or more features of the one or more objects.
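Claims 13, 27, and 60 describe a fallback in which the receiver requests a fresh key frame (a "second image") when the stored one no longer supports reconstruction. The exchange can be sketched as below; the `Sender`/`Receiver` classes and the drift-based usability test are hypothetical illustrations, and frame reconstruction is stubbed out rather than performed by the claimed second neural networks.

```python
class Sender:
    """Holds the full video and serves a key frame on request (claims 7, 69)."""
    def __init__(self, frames):
        self.frames = frames

    def key_frame(self, index):
        return self.frames[index]

class Receiver:
    def __init__(self, sender):
        self.sender = sender
        self.key = sender.key_frame(0)   # initial key frame
        self.key_index = 0
        self.requests = 0                # how many second images were requested

    def usable(self, frame_index):
        """Hypothetical usability test: features are assumed unusable once
        the scene has drifted too far from the stored key frame."""
        return frame_index - self.key_index < 3

    def reconstruct(self, frame_index):
        if not self.usable(frame_index):
            # Request a second image from the sender (claims 13, 27, 60).
            self.key = self.sender.key_frame(frame_index)
            self.key_index = frame_index
            self.requests += 1
        return self.key  # stand-in for the frame a generator network would produce

sender = Sender(frames=list(range(10)))  # frame payloads stubbed as integers
receiver = Receiver(sender)
out = [receiver.reconstruct(i) for i in range(10)]
print(out, "key-frame requests:", receiver.requests)
```

With a drift threshold of 3 frames, the receiver refreshes its key frame at frames 3, 6, and 9; in practice the refresh criterion would come from the networks' own confidence in the reconstruction rather than a fixed count.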
GB2201144.9A 2020-04-15 2021-04-14 Video compression and decompression using neural networks Pending GB2600348A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063101511P 2020-04-15 2020-04-15
US17/069,253 US20210329306A1 (en) 2020-04-15 2020-10-13 Video compression using neural networks
PCT/US2021/027343 WO2021211750A1 (en) 2020-04-15 2021-04-14 Video compression and decompression using neural networks

Publications (2)

Publication Number Publication Date
GB202201144D0 GB202201144D0 (en) 2022-03-16
GB2600348A true GB2600348A (en) 2022-04-27

Family

ID=80621107

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2201144.9A Pending GB2600348A (en) 2020-04-15 2021-04-14 Video compression and decompression using neural networks

Country Status (1)

Country Link
GB (1) GB2600348A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5764803A (en) * 1996-04-03 1998-06-09 Lucent Technologies Inc. Motion-adaptive modelling of scene content for very low bit rate model-assisted coding of video sequences
US20180075581A1 (en) * 2016-09-15 2018-03-15 Twitter, Inc. Super resolution using a generative adversarial network
US20180268571A1 (en) * 2017-03-14 2018-09-20 Electronics And Telecommunications Research Institute Image compression device
CN108629753A (en) * 2018-05-22 2018-10-09 广州洪森科技有限公司 A kind of face image restoration method and device based on Recognition with Recurrent Neural Network
US20190139218A1 (en) * 2017-11-06 2019-05-09 Beijing Curacloud Technology Co., Ltd. System and method for generating and editing diagnosis reports based on medical images
US20190246130A1 (en) * 2018-02-08 2019-08-08 Samsung Electronics Co., Ltd. Progressive compressed domain computer vision and deep learning systems
US20190287024A1 (en) * 2018-03-13 2019-09-19 Lyft, Inc. Low latency image processing using byproduct decompressed images
US20190377953A1 (en) * 2018-06-06 2019-12-12 Seventh Sense Artificial Intelligence Pvt Ltd Network switching appliance, process and system for performing visual analytics for a streaming video


Also Published As

Publication number Publication date
GB202201144D0 (en) 2022-03-16

Similar Documents

Publication Publication Date Title
WO2021244211A1 (en) Blockchain message processing method and apparatus, computer and readable storage medium
US10924783B2 (en) Video coding method, system and server
US11196962B2 (en) Method and a device for a video call based on a virtual image
CN110602519B (en) Continuous-microphone video processing method and device, storage medium and electronic equipment
WO2015120766A1 (en) Video optimisation system and method
US10924782B2 (en) Method of providing streaming service based on image segmentation and electronic device supporting the same
US20220253268A1 (en) Smart screen share reception indicator in a conference
CN114928758A (en) Live broadcast abnormity detection processing method and device
CN111131843A (en) Network live broadcast system and method
GB2575388A (en) Method, apparatus and system for discovering and displaying information related to video content
US20220309725A1 (en) Edge data network for providing three-dimensional character image to user equipment and method for operating the same
GB2600348A (en) Video compression and decompression using neural networks
CN110351014B (en) Data processing method, data processing device, computer readable storage medium and computer equipment
CN110753243A (en) Image processing method, image processing server and image processing system
JP2004159042A (en) Information processor, information processing method, and program
IL292731A (en) Privacy secure batch retrieval using private information retrieval and secure multi-party computation
JP2006333417A (en) Multivideo chat system
CN113709401A (en) Video call method, device, storage medium, and program product
CN111353133B (en) Image processing method, device and readable storage medium
JP2020053904A (en) Data receiving apparatus, data distribution control method, and data distribution control program
US20240089410A1 (en) Method of allowing user to participate in video conference using qr code and method of participating, by user, in video conference using qr code
US11463336B2 (en) Conferencing session management
CN112398884B (en) Flow scheduling control method under mirror image back source scene, readable storage medium and computer equipment
CN111818300B (en) Data storage method, data query method, data storage device, data query device, computer equipment and storage medium
US20240070806A1 (en) System and method for transmission and receiving of image frames