GB2600348A - Video compression and decompression using neural networks - Google Patents

Video compression and decompression using neural networks

Info

Publication number
GB2600348A
GB2600348A
Authority
GB
United Kingdom
Prior art keywords
images
features
image
neural networks
computer system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2201144.9A
Other versions
GB202201144D0 (en)
Inventor
Liu Ming-Yu
Wang Ting-Chun
Mohanray Mallya Arun
Tapani Karras Tero
Matias Laine Samuli
Patrick Luebke David
Lehtinen Jaakko
Samuli Aittala Miika
Oskari Aila Timo
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nvidia Corp
Original Assignee
Nvidia Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/069,253 (published as US20210329306A1)
Application filed by Nvidia Corp filed Critical Nvidia Corp
Publication of GB202201144D0
Publication of GB2600348A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/20Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video object coding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • G06T9/002Image coding using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46Embedding additional information in the video signal during the compression process
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing

Abstract

Apparatuses, systems, and techniques to perform compression of video data using neural networks to facilitate video streaming, such as video conferencing. In at least one embodiment, a sender transmits to a receiver a key frame from video data and one or more keypoints identified by a neural network from said video data, and the receiver reconstructs the video data using said key frame and the one or more received keypoints.
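The scheme summarized in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `extract_keypoints` function below is a stand-in (a brightest-pixel heuristic, not the patent's neural network), used only to show the bandwidth asymmetry between transmitting one full key frame and per-frame keypoints.

```python
import numpy as np

def extract_keypoints(frame: np.ndarray, n_points: int = 10) -> np.ndarray:
    """Stand-in for the neural-network keypoint detector: returns the
    (row, col) coordinates of the n_points brightest pixels."""
    flat = frame.mean(axis=-1).ravel()
    idx = np.argpartition(flat, -n_points)[-n_points:]
    return np.stack(np.unravel_index(idx, frame.shape[:2]), axis=1)

def encode(frames):
    """Sender side: transmit the first frame in full (the key frame) and
    only keypoints for every subsequent frame."""
    payload = [("key_frame", frames[0])]
    for f in frames[1:]:
        payload.append(("keypoints", extract_keypoints(f)))
    return payload

# Five synthetic 64x64 RGB frames standing in for captured video.
frames = [np.random.default_rng(s).integers(0, 256, (64, 64, 3), dtype=np.uint8)
          for s in range(5)]
payload = encode(frames)

full_bytes = sum(f.nbytes for f in frames)
sent_bytes = payload[0][1].nbytes + sum(p[1].nbytes for p in payload[1:])
print(f"raw: {full_bytes} B, transmitted: {sent_bytes} B")
```

On the receiver side, a generator network (not shown here) would warp or re-render the key frame according to each keypoint set to regenerate the dropped frames; the savings grow with the number of frames covered by a single key frame.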

Claims (74)

1. A processor comprising: one or more circuits to use one or more neural networks of a first computer system to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
2. The processor of claim 1, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted over the network to the second computer system; and the first image and the one or more features are usable to regenerate the one or more additional images.
3. The processor of claim 2, wherein the one or more neural networks use one or more graphics processing units of the first computer system to identify the one or more features.
4. The processor of claim 2, wherein the one or more features indicate anatomical aspects of the one or more objects.
5. The processor of claim 2, wherein the one or more images are one or more video frames.
6. The processor of claim 2, wherein the second computer system is a computing resource services provider.
7. The processor of claim 2, wherein the first image is determined from the one or more images based, at least in part, on a request from the second computer system.
8. A processor comprising: one or more circuits to use one or more neural networks on a first computer system to generate one or more images based, at least in part, on one or more features of one or more objects of the one or more images received from a second computer system.
9. The processor of claim 8, wherein: the one or more images comprise a first image received from the second computer system; the one or more neural networks generate the one or more images based, at least in part, on the first image and the one or more features.
10. The processor of claim 9, wherein the first computer system comprises one or more parallel processing units usable by the one or more neural networks to generate the one or more images.
11. The processor of claim 9, wherein the first computer system is a computing resource services provider comprising one or more parallel processing units usable by the one or more neural networks.
12. The processor of claim 9, wherein the one or more neural networks determine that the one or more features are usable to generate the one or more images using the first image.
13. The processor of claim 12, wherein the first computer system requests a second image from the second computer system if the one or more features are not usable to generate the one or more images using the first image.
14. The processor of claim 8, wherein the one or more images are one or more video frames of video data for display on one or more video output devices of the first computer system.
15. A computer system comprising: one or more processors to use one or more neural networks to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
16. The computer system of claim 15, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted over the network to the second computer system; and the first image and the one or more features are usable to regenerate the one or more additional images.
17. The computer system of claim 16, wherein the one or more images are one or more frames of video data captured by one or more video capture devices for the computer system.
18. The computer system of claim 16, wherein the first image is selected by the computer system based, at least in part, on one or more notifications from the second computer system.
19. The computer system of claim 16, wherein the second computer system is a computing resource services provider.
20. The computer system of claim 16, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
21. The computer system of claim 15, wherein the one or more images comprise individual video frames from video data.
22. A system comprising: one or more processors to use one or more features of one or more objects of one or more images identified by one or more neural networks to generate the one or more objects of the one or more images.
23. The system of claim 22, wherein: the one or more images comprise a first image received from a second computer system; and one or more second neural networks generate the one or more objects of the one or more images based, at least in part, on the first image and the one or more features.
24. The system of claim 23, wherein the system comprises one or more parallel processing units usable by the one or more second neural networks to generate the one or more objects of the one or more images.
25. The system of claim 23, wherein the system is a computing resource services provider comprising one or more parallel processing units usable by the one or more second neural networks.
26. The system of claim 23, wherein the one or more second neural networks determine that the one or more features are usable to generate the one or more objects of the one or more images from the first image.
27. The system of claim 26, wherein the system requests a second image from the second computer system if the one or more features are not usable to generate the one or more objects of the one or more images from the first image.
28. The system of claim 22, wherein the one or more images are one or more video frames of video data for display on one or more video output devices of the system.
29. A method comprising: using one or more neural networks to identify one or more features of one or more objects within one or more images, wherein the one or more features are to be used to regenerate the one or more objects of the one or more images after being transmitted over a network to a second computer system.
30. The method of claim 29, further comprising: identifying a first image from the one or more images; identifying one or more additional images from the one or more images, wherein the one or more additional images do not contain the first image; identifying the one or more features from the one or more additional images; and transmitting the first image and one or more features to the second computer system.
31. The method of claim 30, wherein the one or more features are identified from the one or more additional images using the one or more neural networks.
32. The method of claim 30, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
33. The method of claim 30, wherein the first image and the one or more features are transmitted to a computing resource services provider.
34. The method of claim 30, wherein the one or more features are keypoints indicating information about the one or more objects of the one or more additional images.
35. The method of claim 29, wherein the one or more images are frames of video data captured by one or more video capture devices.
36. A method comprising: using one or more features of one or more objects of one or more images identified by one or more neural networks to generate the one or more objects of the one or more images.
37. The method of claim 36, further comprising: identifying a first image based, at least in part, on the one or more images; and generating the one or more images based, at least in part, on the first image and the one or more features of the one or more objects.
38. The method of claim 37, wherein the one or more images are video frames of video data received over a network and the first image is an initial video frame of the video data.
39. The method of claim 37, wherein the one or more features of the one or more objects of the one or more images are key points indicating information about locations in the one or more images comprising the one or more objects.
40. The method of claim 37, wherein if the one or more images are not generated from the first image and the one or more features of the one or more objects, a request is generated for a second image.
41. The method of claim 37, wherein the one or more images are generated by one or more second neural networks.
42. The method of claim 41, wherein the one or more second neural networks use one or more parallel processing units to facilitate generation of the one or more images.
43. The method of claim 36, wherein the one or more images are received over a network from a computing resource services provider comprising one or more second neural networks to generate the one or more objects of the one or more images.
44. A processor comprising: one or more circuits to use compressed information for teleconferencing by at least generating one or more objects of one or more images based, at least in part, on one or more features of the one or more objects identified using one or more neural networks.
45. The processor of claim 44, wherein: a sender computer system identifies the one or more features of the one or more objects of the one or more images using the one or more neural networks; the sender computer system transmits, over a network, the one or more features to a receiver computer system; and the receiver computer system generates the one or more images using one or more second neural networks based, at least in part, on the one or more features and a first image.
46. The processor of claim 45, wherein the receiver computer system is a computing resource services provider and the one or more images are transmitted over the network to a client computer system.
47. The processor of claim 45, wherein the one or more neural networks identify the one or more features of the one or more objects of the one or more images using one or more parallel processing units.
48. The processor of claim 45, wherein the one or more second neural networks generate the one or more objects of the one or more images using one or more parallel processing units.
49. The processor of claim 45, wherein the first image is a previously stored image on the receiver computer system, the first image comprising the one or more objects.
50. The processor of claim 45, wherein the first image is transmitted, by the sender computer system, over the network to the receiver computer system.
51. A system comprising: a memory to store video conferencing software; and one or more processors to use one or more neural networks to identify one or more features in one or more images to transmit using the video conferencing software, such that the one or more images may be regenerated using the one or more features.
52. The system of claim 51, wherein: the one or more images comprise a first image and one or more additional images; the one or more neural networks identify the one or more features of the one or more objects in the one or more additional images; the first image and the one or more features are transmitted by the video conferencing software to the receiver computer system; and the first image and the one or more features are usable to regenerate the one or more images.
53. The system of claim 52, wherein the one or more images are one or more frames of video data captured by one or more video capture devices of the system.
54. The system of claim 52, wherein the first image is selected by the system based, at least in part, on a request received by the video conferencing software.
55. The system of claim 51, wherein the one or more neural networks identify the one or more features using one or more parallel processing units.
56. A system comprising: a memory to store video conferencing software; and one or more processors to use one or more neural networks to generate one or more images based, at least in part, on one or more features received using the video conferencing software.
57. The system of claim 56, wherein: the video conferencing software receives a first image associated with the one or more features; and the one or more neural networks generate the one or more images based, at least in part, on the first image and the one or more features.
58. The system of claim 57, wherein the system comprises one or more parallel processing units usable by the one or more neural networks to generate the one or more images.
59. The system of claim 57, wherein the system determines, based at least in part on output from the one or more neural networks, that the first image and the one or more features are usable to generate the one or more images.
60. The system of claim 59, wherein the system requests a second image using the video conferencing software if the first image and the one or more features are not usable to generate the one or more images.
61. The system of claim 57, wherein the system is a computing resource services provider comprising one or more parallel processing units usable by the one or more neural networks.
62. The system of claim 61, wherein the system transmits, over a network, the one or more images to a client computer system.
63. A method comprising: using compressed information for teleconferencing by at least generating one or more objects of one or more images based, at least in part, on one or more features of the one or more objects identified using one or more neural networks.
64. The method of claim 63, wherein: a sender identifies the one or more features of the one or more objects of the one or more images using the one or more neural networks; the sender transmits the one or more features to a receiver over a network; and the receiver generates the one or more images using one or more second neural networks based, at least in part, on the one or more features and a first image.
65. The method of claim 64, wherein the receiver is a computing resource services provider and the one or more images are transmitted over the network to a user of the computing resource services provider.
66. The method of claim 64, wherein the one or more neural networks identify the one or more features of the one or more objects of the one or more images using one or more parallel processing units.
67. The method of claim 64, wherein the one or more second neural networks generate the one or more images using one or more parallel processing units.
68. The method of claim 64, wherein the first image is transmitted by the sender over the network to the receiver.
69. The method of claim 68, wherein the sender transmits the first image in response to a request by the receiver.
70. The method of claim 64, wherein the first image is a previously stored image on the receiver, the first image comprising the one or more objects.
71. The method of claim 64, wherein the one or more neural networks generate one or more second images based, at least in part, on the one or more images, the one or more second images comprising one or more modifications to the one or more images.
72. The method of claim 71, wherein the one or more modifications comprise adding one or more second objects to the one or more images.
73. The method of claim 71, wherein the one or more modifications comprise rotating the one or more objects in the one or more images.
74. The method of claim 71, wherein the one or more modifications comprise adjusting the one or more features of the one or more objects.
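Claims 13, 27, and 60 describe a fallback in which the receiver requests a fresh key frame (a "second image") when the stored one no longer supports reconstruction. The exchange can be sketched as below; the `Sender`/`Receiver` classes and the drift-based usability test are hypothetical illustrations, and frame reconstruction is stubbed out rather than performed by the claimed second neural networks.

```python
class Sender:
    """Holds the full video and serves a key frame on request (claims 7, 69)."""
    def __init__(self, frames):
        self.frames = frames

    def key_frame(self, index):
        return self.frames[index]

class Receiver:
    def __init__(self, sender):
        self.sender = sender
        self.key = sender.key_frame(0)   # initial key frame
        self.key_index = 0
        self.requests = 0                # how many second images were requested

    def usable(self, frame_index):
        """Hypothetical usability test: features are assumed unusable once
        the scene has drifted too far from the stored key frame."""
        return frame_index - self.key_index < 3

    def reconstruct(self, frame_index):
        if not self.usable(frame_index):
            # Request a second image from the sender (claims 13, 27, 60).
            self.key = self.sender.key_frame(frame_index)
            self.key_index = frame_index
            self.requests += 1
        return self.key  # stand-in for the frame a generator network would produce

sender = Sender(frames=list(range(10)))  # frame payloads stubbed as integers
receiver = Receiver(sender)
out = [receiver.reconstruct(i) for i in range(10)]
print(out, "key-frame requests:", receiver.requests)
```

With a drift threshold of 3 frames, the receiver refreshes its key frame at frames 3, 6, and 9; in practice the refresh criterion would come from the networks' own confidence in the reconstruction rather than a fixed count.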
GB2201144.9A 2020-04-15 2021-04-14 Video compression and decompression using neural networks Pending GB2600348A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063101511P 2020-04-15 2020-04-15
US17/069,253 US20210329306A1 (en) 2020-04-15 2020-10-13 Video compression using neural networks
PCT/US2021/027343 WO2021211750A1 (en) 2020-04-15 2021-04-14 Video compression and decompression using neural networks

Publications (2)

Publication Number Publication Date
GB202201144D0 GB202201144D0 (en) 2022-03-16
GB2600348A true GB2600348A (en) 2022-04-27

Family

ID=80621107

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2201144.9A Pending GB2600348A (en) 2020-04-15 2021-04-14 Video compression and decompression using neural networks

Country Status (1)

Country Link
GB (1) GB2600348A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5764803A (en) * 1996-04-03 1998-06-09 Lucent Technologies Inc. Motion-adaptive modelling of scene content for very low bit rate model-assisted coding of video sequences
US20180075581A1 (en) * 2016-09-15 2018-03-15 Twitter, Inc. Super resolution using a generative adversarial network
US20180268571A1 (en) * 2017-03-14 2018-09-20 Electronics And Telecommunications Research Institute Image compression device
CN108629753A (en) * 2018-05-22 2018-10-09 广州洪森科技有限公司 A kind of face image restoration method and device based on Recognition with Recurrent Neural Network
US20190139218A1 (en) * 2017-11-06 2019-05-09 Beijing Curacloud Technology Co., Ltd. System and method for generating and editing diagnosis reports based on medical images
US20190246130A1 (en) * 2018-02-08 2019-08-08 Samsung Electronics Co., Ltd. Progressive compressed domain computer vision and deep learning systems
US20190287024A1 (en) * 2018-03-13 2019-09-19 Lyft, Inc. Low latency image processing using byproduct decompressed images
US20190377953A1 (en) * 2018-06-06 2019-12-12 Seventh Sense Artificial Intelligence Pvt Ltd Network switching appliance, process and system for performing visual analytics for a streaming video


Also Published As

Publication number Publication date
GB202201144D0 (en) 2022-03-16

Similar Documents

Publication Publication Date Title
WO2021244211A1 (en) Blockchain message processing method and apparatus, computer and readable storage medium
US10924783B2 (en) Video coding method, system and server
US11196962B2 (en) Method and a device for a video call based on a virtual image
CN110602519B (en) Continuous-microphone video processing method and device, storage medium and electronic equipment
WO2015120766A1 (en) Video optimisation system and method
US10924782B2 (en) Method of providing streaming service based on image segmentation and electronic device supporting the same
US20220253268A1 (en) Smart screen share reception indicator in a conference
CN114928758A (en) Live broadcast abnormity detection processing method and device
CN111131843A (en) Network live broadcast system and method
GB2575388A (en) Method, apparatus and system for discovering and displaying information related to video content
US20220309725A1 (en) Edge data network for providing three-dimensional character image to user equipment and method for operating the same
GB2600348A (en) Video compression and decompression using neural networks
CN110351014B (en) Data processing method, data processing device, computer readable storage medium and computer equipment
CN110753243A (en) Image processing method, image processing server and image processing system
JP2004159042A (en) Information processor, information processing method, and program
IL292731A (en) Privacy secure batch retrieval using private information retrieval and secure multi-party computation
JP2006333417A (en) Multivideo chat system
CN113709401A (en) Video call method, device, storage medium, and program product
CN111353133B (en) Image processing method, device and readable storage medium
JP2020053904A (en) Data receiving apparatus, data distribution control method, and data distribution control program
US20240089410A1 (en) Method of allowing user to participate in video conference using qr code and method of participating, by user, in video conference using qr code
US11463336B2 (en) Conferencing session management
CN112398884B (en) Flow scheduling control method under mirror image back source scene, readable storage medium and computer equipment
CN111818300B (en) Data storage method, data query method, data storage device, data query device, computer equipment and storage medium
US20240070806A1 (en) System and method for transmission and receiving of image frames