US20210160556A1 - Method for enhancing resolution of streaming file - Google Patents

Method for enhancing resolution of streaming file

Info

Publication number
US20210160556A1
Authority
US
United States
Prior art keywords
neural network
video data
data
learning
network file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/618,335
Other languages
English (en)
Inventor
Kyoung Ik Jang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GDFLab Co Ltd
Original Assignee
GDFLab Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GDFLab Co Ltd filed Critical GDFLab Co Ltd
Priority claimed from PCT/KR2019/004891 (WO2019209006A1)
Assigned to GDFLAB CO., LTD. reassignment GDFLAB CO., LTD. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: JANG, KYOUNG IK
Publication of US20210160556A1 publication Critical patent/US20210160556A1/en

Classifications

    • H04N 21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N 21/47202: End-user interface for requesting content on demand, e.g. video on demand
    • G06N 3/04: Neural networks; architecture, e.g. interconnection topology
    • G06N 3/08: Neural networks; learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06T 3/4053: Super resolution, i.e. output image resolution higher than sensor resolution
    • G06V 10/764: Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition using pattern recognition or machine learning, using neural networks
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V 20/49: Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • H04L 65/60: Network streaming of media packets
    • H04N 19/59: Predictive video coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N 21/222: Secondary servers, e.g. proxy server, cable television head-end
    • H04N 21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/234363: Reformatting operations of video signals by altering the spatial resolution, e.g. for clients with a lower screen resolution
    • H04N 21/2393: Interfacing the upstream path of the transmission network, involving handling client requests
    • H04N 21/2408: Monitoring of the upstream path of the transmission network, e.g. client requests
    • H04N 21/251: Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N 21/2662: Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H04N 21/435: Processing of additional data at the client, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N 21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/4662: Learning process for intelligent management characterized by learning algorithms
    • H04N 21/4666: Learning process for intelligent management using neural networks, e.g. processing the feedback provided by the user
    • G06N 3/045: Combinations of networks
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Definitions

  • the present disclosure relates to a method for enhancing the resolution of a video file based on artificial intelligence. More specifically, the present disclosure relates to a method for recovering low image quality data into high image quality data based on an artificial neural network that is trained on information about a grid pattern occurring when image data of a streaming file is converted into the low image quality data.
  • an object of the present disclosure is to enhance the resolution of a streaming video file in a convenient manner.
  • the present disclosure enables generating video data in a divided file format and generating a neural network file so that the resolution of streamed video data can be enhanced during video streaming.
  • a method for enhancing resolution includes a processing operation for generating the video data, a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance the resolution of the video data based on the grid generation pattern information, and a transmitting operation for, in response to reception of a streaming request from a user device, dividing the requested video data and a neural network file required to recover the resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device.
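  • the three operations named above can be illustrated with a minimal Python sketch. Every helper used here (downscale, extract_grid_pattern, train_network, divide) is a hypothetical placeholder for the described operations, not an API the disclosure defines.

```python
def serve_streaming_request(server, video_id):
    """Hypothetical sketch of the processing/generating/transmitting flow."""
    # Processing operation: generate downscaled video data for streaming.
    processed = server.downscale(video_id)
    # Generating operation: learn the grid generation pattern and build
    # the neural network file needed to recover the original resolution.
    pattern = server.extract_grid_pattern(video_id, processed)
    nn_file = server.train_network(pattern)
    # Transmitting operation: divide both artifacts into matching segments
    # and send each (video segment, neural network segment) pair.
    return list(zip(server.divide(processed), server.divide(nn_file)))
```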
  • FIG. 1 is a diagram illustrating a configuration of a system for enhancing resolution according to an embodiment of the present disclosure.
  • FIG. 2 is a diagram illustrating an example of an image quality enhancing operation according to an embodiment of the present disclosure.
  • FIG. 3 is a block diagram illustrating a configuration of a server according to an embodiment of the present disclosure.
  • FIG. 4 is a diagram illustrating a configuration of a material processor according to an embodiment of the present disclosure.
  • FIG. 5 is a diagram illustrating a configuration of a neural network trainer according to an embodiment of the present disclosure.
  • FIG. 6 is a diagram illustrating an example of size change performed by the size changer according to an embodiment of the present disclosure.
  • FIG. 7 is a diagram illustrating an example of a deep-learning training operation according to an embodiment of the present disclosure.
  • FIG. 8 is a diagram illustrating a user device according to an embodiment of the present disclosure.
  • FIG. 9 is a diagram illustrating a process of generating and transmitting a neural network file for image quality improvement according to an embodiment of the present disclosure.
  • FIG. 10 is a flowchart illustrating a process of generating a specialized neural network file based on additional learning according to an embodiment of the present disclosure.
  • a method for enhancing resolution at a server for providing video data for streaming includes a processing operation for generating the video data, a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance the resolution of the video data based on the grid generation pattern information, and a transmitting operation for, in response to reception of a streaming request from a user device, dividing the requested video data and a neural network file required to recover the resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device.
  • elements of the drawings described in the invention are drawn independently for convenience of explaining their different specific functions; this does not mean that the elements are embodied by independent hardware or independent software.
  • two or more elements out of the elements may be combined to form a single element, or one element may be split into plural elements.
  • Embodiments in which the elements are combined and/or split belong to the scope of the invention without departing from the concept of the invention.
  • FIG. 1 is a diagram illustrating a configuration of a system for enhancing resolution according to an embodiment of the present disclosure.
  • the system for enhancing resolution may include a server 100 and a user device 200 , as shown in FIG. 1 .
  • the server 100 may include a server for providing a video on demand (VOD) service to the user device 200 .
  • the server 100 may transmit video data to the user device 200 to provide a VOD service. At this point, the server 100 may transmit, to the user device 200 , not just original video data, but a downscaled file of the original video data of which resolution is degraded.
  • the server 100 may calculate a neural network file, which is a file required to recover resolution of the video data (the downscaled file) to a preset match rate or higher, and the server may transmit the neural network file to the user device 200 . Accordingly, the user device 200 may enhance the resolution of the low-quality data (the downscaled file), provided from the server 100 , based on the neural network file.
  • the user device 200 may select video data (e.g., content name selection) to be transmitted, and request streaming or downloading of the selected video data from the server 100 .
  • the user device 200 may calculate user viewing pattern information based on video data selection information and video data reproduction information of the user device, and transmit the user viewing pattern information to the server 100 .
  • FIG. 2 is now referred to in order to briefly explain an operation performed by the user device 200 to enhance resolution.
  • FIG. 2 is a diagram illustrating an example of an image quality enhancing operation according to an embodiment of the present disclosure.
  • a user device 200 may generate a video file of which resolution is enhanced through a neural network file.
  • the neural network file according to an embodiment of the present disclosure may be combined with any video file transmitted to the user device 200 , thereby enhancing resolution.
  • a video file transmitted from a server 100 to the user device 200 for a streaming or downloading purpose may be a content divided into multiple segments, as shown in FIG. 2 .
  • the neural network file may be also divided to correspond to respective video file segments.
  • the respective neural network file segments and the respective video file segments may be labeled so as to be combined in the user device 200 .
  • each video file segment may be matched with a corresponding neural network file segment, thereby enhancing resolution.
  • the neural network file may include data on an artificial neural network algorithm for recovering resolution of the video file, and accordingly, the user device may perform an artificial neural network computing process using the respective video file segments and the neural network file segments so as to recover resolution.
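  • a minimal sketch of this client-side pairing follows. The segment labels, the load_model helper, and the model call are illustrative assumptions, not names used by the disclosure.

```python
def enhance_segments(video_segments, nn_segments, load_model):
    """Match each labeled video segment with its neural network file
    segment and run the recovery computation (hypothetical names)."""
    nn_by_label = {seg.label: seg for seg in nn_segments}
    enhanced = []
    for vseg in video_segments:
        model = load_model(nn_by_label[vseg.label])  # combine matching segments
        enhanced.append(model(vseg.frames))          # resolution-recovered frames
    return enhanced
```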
  • a video file of the present disclosure may be a downscaled file, that is, low image quality data obtained by converting the resolution of video data stored in the server, or may be original video data having resolution equal to or lower than a reference.
  • FIG. 3 is a block diagram illustrating a configuration of a server according to an embodiment of the present disclosure.
  • the server 100 may include a communicator 110 , a storage 120 , and a controller 130 , as shown in FIG. 3 .
  • the controller 130 may include a material processor 131 , a neural network trainer 132 , and a result evaluator 133 .
  • the communicator 110 may use a network for data transmission and reception between a user device and a server, and a type of the network is not limited.
  • the network may be, for example, an All-IP network providing a service of transmitting and receiving large-scale data through an internet protocol (IP), or may be an All-IP network which is a combination of different IP networks.
  • the network may be one of a wired network, a wireless broadband (Wibro) network, a mobile communication network including WCDMA, a high speed downlink packet access (HSDPA) network, a mobile communication network including a long term evolution (LTE) network, a mobile communication network including LTE-Advanced (LTE-A) and fifth generation (5G), a satellite communication network, and a Wi-Fi network, or may be a combination of at least one of the aforementioned networks.
  • the communicator 110 may perform data communication with an external web server, a plurality of user devices, and the like.
  • the communicator 110 may receive content data (a photo and a video) including an image from another web server or a user device (including a device for a manager).
  • since the server 100 includes a server for providing a VOD service, the server 100 may transmit a VOD content to the user device 200 .
  • the communicator 110 may receive and transmit a VOD file for a VOD service.
  • the communicator 110 may perform a communication function for collecting learning data required to generate a neural network file for enhancing resolution.
  • the neural network file may contain information necessary to recover resolution of damaged image data to be similar to original data through an artificial neural network algorithm, and may include information on various parameters necessary to be selected when the artificial neural network algorithm is driven.
  • the storage 120 may include, for example, an internal memory or an external memory.
  • the internal memory may include, for example, at least one of a volatile memory (e.g., a dynamic RAM (DRAM), a static RAM (SRAM), or a synchronous dynamic RAM (SDRAM), and the like), and a non-volatile memory (e.g., a one time programmable ROM (OTPROM), a programmable ROM (PROM), an erasable and programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM), a mask ROM, a flash ROM, a flash memory (e.g., a NAND flash memory, a NOR flash memory, or the like), a hard drive, or a solid state drive (SSD)).
  • the external memory may further include a flash drive, for example, a CF (compact flash), a SD (secure digital), a Micro-SD (micro secure digital), a Mini-SD (mini secure digital), a XD (extreme digital), an MMC (multi-media card), a memory stick, or the like.
  • the external memory may be functionally and/or physically connected with an electronic device through various interfaces.
  • the storage 120 may store original data matched with processed data (data corresponding to original image data reduced to a predetermined rate or data corresponding to the reduced original data enlarged to an original data size), which is obtained by processing image data (e.g., photo and video data) received from a user device (or a manager device) or another web server.
  • the original data and the processed data may be used to extract information regarding a grid phenomenon that occurs when resolution is reduced.
  • the storage 120 may store a neural network file for recovering resolution to an original image level, by removing a grid from the processed data through an artificial intelligence algorithm (e.g., a Super-Resolution Convolutional Neural Network (SRCNN)).
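  • the SRCNN named above is a published three-layer convolutional model; the PyTorch sketch below shows its standard 9-1-5 layout as one plausible concrete form of the grid-removal network. The disclosure itself does not fix the layer sizes, so the values here are illustrative.

```python
import torch.nn as nn

class SRCNN(nn.Module):
    """Three-layer Super-Resolution CNN (patch extraction, non-linear
    mapping, reconstruction), applied to an already-upscaled image."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=9, padding=4),  # feature extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),                   # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),  # reconstruction
        )

    def forward(self, x):
        return self.net(x)
```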
  • the controller 130 may be referred to as a processor, a controller, a microcontroller, a microprocessor, a microcomputer, or the like.
  • the controller may be implemented by any one of hardware, firmware, and software, or a combination thereof.
  • an embodiment of the present disclosure may be implemented by a module, a procedure, a function, and the like for performing the above-described functions or operations.
  • a software code may be stored in a memory and executed by the controller.
  • the memory may be located inside or outside the user device and a server, and may exchange data with the controller through various well-known means.
  • the controller 130 may generate a neural network file that is a file required to improve resolution of image data through computation based on an artificial neural network.
  • the controller 130 may include a material processor 131 , a neural network trainer 132 , a result evaluator 133 , and a use pattern calculator 134 .
  • the material processor 131 may collect and process learning material necessary to produce a neural network file required to improve image quality of video data.
  • the material processor 131 may perform primary change (reduction) and secondary change (enlargement of reduced image data) on a collected material. A detailed description of an operation of the material processor 131 will be provided with reference to FIG. 4 .
  • the neural network trainer 132 may train a neural network through artificial intelligence based on processed data that is obtained after collecting and processing of a material by the material processor 131 .
  • the neural network trainer 132 may set a parameter required for a training process, and produce a neural network. A detailed description of the neural network trainer 132 will be described with reference to FIG. 5 .
  • the result evaluator 133 may evaluate a result value obtained by applying the neural network file, produced by the neural network trainer 132 , to a user device 200 .
  • the result evaluator 133 may determine the degree to which resolution is enhanced in data resulting from applying the neural network file in the user device 200 .
  • the result evaluator 133 may determine an error rate between result data and original data, the error rate resulting from applying the neural network file.
  • a unit of comparison between the result data and the original data may be each frame included in an image or may be a piece into which an image is divided for transmission of the image.
  • each image data may be divided into a plurality of frame chunks (e.g., when an image is displayed over 100 frames, 100 frames may be set as one chunk and there may be a plurality of frame chunks).
  • a unit of comparison used to compare the result data and the original data may be a chunk unit that is divided based on image identicality.
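  • a fixed-size chunking of frames, matching the 100-frame example above, can be sketched as follows; an identicality-based variant would instead group frames whose image objects match above the reference rate.

```python
def chunk_frames(frames, chunk_size=100):
    """Bind consecutive frames into chunks; e.g. an image displayed over
    100 frames forms one chunk, and a video yields a list of such chunks."""
    return [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]
```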
  • the result evaluator 133 may request modifying a weight, a bias, and the like that form the neural network file. That is, through the comparison between the original data and the result data, the result evaluator 133 may determine whether it is necessary to modify a parameter forming the neural network file.
  • the result evaluator 133 may calculate, from the original data, an importance of each image object, the importance indicating how much the object is required to comprehend the image.
  • when an error rate for one unit (e.g., one frame, one frame chunk, and the like) between the result data (data of which resolution is enhanced by applying the neural network file in the user device 200 ) and the original data is equal to or greater than a preset value, the result evaluator 133 may request modifying a weight, a bias, and the like forming the neural network file.
  • the importance for each image object may be calculated based on a size ratio of a corresponding image object occupied in one frame, a repetition rate of the corresponding image object, and the like.
  • the result evaluator 133 may calculate an importance for an image object based on a content characteristic of video data.
  • the result evaluator 133 may check the content characteristic of the video data.
  • the content characteristic of the video data may be calculated by the material processor 131 .
  • the content characteristic of the video data may be calculated based on information on an uploading path of the video data to the server 100 (e.g., a name of a folder selected by a user or a manager when uploading a video file to the server 100 ), a content genre or field input by the user or the manager when uploading the corresponding video file to the server 100 , and the like.
  • the calculated content characteristic of the video data may be managed as metadata of the corresponding video data.
  • the result evaluator 133 may check content characteristic information of each video data extracted and stored at the time of uploading to the server, and calculate an importance for each image object based on the content characteristic information. For example, the result evaluator 133 may classify an image object into categories including a human face, a text (e.g., a subtitle), an object, and the like, and determine the category matched with the content characteristic information. Specifically, in a drama content the human face may be set as an item with a high image object importance, and in a lecture content a text item may be set to have a high importance.
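  • one plausible way to combine the factors named above (size ratio, repetition rate, and a category weight derived from the content characteristic) is sketched below; the disclosure does not specify the exact weighting, so both the table values and the formula are assumptions.

```python
# Hypothetical category weights: faces matter most in dramas, text in lectures.
CATEGORY_WEIGHTS = {"drama":   {"face": 1.0, "text": 0.3, "object": 0.5},
                    "lecture": {"face": 0.4, "text": 1.0, "object": 0.3}}

def image_object_importance(genre, category, size_ratio, repetition_rate):
    """Combine the object's size ratio within a frame and its repetition
    rate, scaled by a content-characteristic weight (illustrative formula)."""
    weight = CATEGORY_WEIGHTS.get(genre, {}).get(category, 0.5)
    return weight * (size_ratio + repetition_rate) / 2.0
```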
  • FIG. 4 is a diagram illustrating a configuration of a material processor according to an embodiment of the present disclosure.
  • the material processor 131 may include a size changer 131 a , an image divider 131 b , and a characteristic area extractor 131 c , as illustrated in FIG. 4 .
  • a material to be input to an input layer, or a feature value of the input material, needs to be prepared for the neural network trainer.
  • the material and data to be input to the input layer may be prepared by the material processor 131 .
  • the size changer 131 a may perform primary size change for reducing a size of an image of video data from an original size to a preset value, and secondary size change for enlarging an image resulting from the primary adjustment to the original size. At this point, the secondary size change may be selectively performed. Size change performed by the size changer 131 a will be described with reference to FIG. 6 .
  • FIG. 6 is a diagram illustrating an example of size change performed by the size changer according to an embodiment of the present disclosure.
  • the size changer 131 a may perform an operation a, which corresponds to primary size change for reducing an original image 605 at a predetermined rate, and may perform an operation b, which corresponds to secondary size change for enlarging the reduced image 610 resulting from the operation a to the same size as the original image.
  • an image 615 generated after the processing operation (the primary change (a) and the secondary change (b)) may have resolution lower than the resolution of the original image 605 . This is because only the size of the image is enlarged, without increasing the number of pixels that form the image.
  • when the image 615 (having resolution identical to the resolution of the image 610 ) and the original image 605 are compared, the pixels of the image 615 are enlarged and accordingly a grid is formed in a mosaic shape.
  • the server 100 may perform neural network training based on the processed image 615 and the original image 605 in order to convert the resolution of the low-quality downscaled image 615 into the resolution of the original image 605 .
  • the size changer 131 a of the material processor 131 may perform primary size change for reducing the size of the original image by a preset value, and secondary size change for enlarging the image reduced by the primary size change to the same size as the original image.
  • the material processor 131 may extract the original image and a processed image, generated through the primary size change and the secondary size change, as learning data.
  • the material processor 131 may extract pattern information (location information, color information, and the like) of a grid formed in the processed image (an image enlarged after size reduction), and utilize data on the pattern information as input data for neural network training.
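  • a minimal sketch of producing one learning pair follows: the original image is reduced at a preset rate (primary change) and enlarged back to the original size (secondary change), and the per-pixel difference approximates the grid pattern. Pillow is used here purely for illustration; the resampling modes and rate are assumptions.

```python
from PIL import Image, ImageChops

def make_learning_pair(path, rate=0.5):
    """Primary change: reduce; secondary change: enlarge back. The enlarged
    image has fewer effective pixels, so a mosaic-like grid appears."""
    original = Image.open(path).convert("RGB")
    w, h = original.size
    reduced = original.resize((max(1, int(w * rate)), max(1, int(h * rate))),
                              Image.BICUBIC)
    processed = reduced.resize((w, h), Image.NEAREST)
    # The difference image carries the grid's location/color information.
    grid_pattern = ImageChops.difference(original, processed)
    return processed, original, grid_pattern
```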
  • the image divider 131 b may divide video data stored in the server 100 by a preset standard. At this point, the image divider 131 b may perform an operation of dividing the video data based on the number of frames. Alternatively, the image divider 131 b may divide the video data by binding frames into chunks each having a match rate of image objects equal to or greater than a preset reference (e.g., 90%). For example, a unit of division may be a series of frames photographing the same person. In addition, the image divider 131 b may divide video data into a plurality of chunks on the basis of a unit transmitted from the server 100 to the user device when a streaming service is provided.
  • the chunks divided by the image divider 131 b may be utilized to train an artificial neural network and evaluate a resolution enhancement result.
  • the characteristic area extractor 131 c may extract a characteristic area with a characteristic image included therein with reference to each frame or division unit of video data.
  • the characteristic area extractor 131 c may determine as to whether an image area meeting a preset characteristic area requirement is present in each frame or division unit.
  • the characteristic area may be determined according to whether there is an image object whose image object importance corresponding to a content genre is equal to or greater than a preset value.
  • the characteristic area extractor 131 c may set a higher image object importance for a face image of a main character in a drama content, and accordingly a characteristic area may be set as an area in which the face image of the main character is displayed (e.g., an object display area distinguishable from the background).
  • the characteristic area extractor 131 c may extract not just a characteristic area in an image, but also a specific frame or a specific division unit in images of the whole frames of the video data.
  • a learning importance weight may be applied to the characteristic area extracted by the characteristic area extractor 131 c , so that the learning repetition number increases.
  • for the characteristic area extracted by the characteristic area extractor 131 c , generation of an increased number of processed data items may be requested. For example, when it is assumed that one frame contains an area a set as a characteristic area and an area b set as a normal area, five processed images for the area a may be generated through size reduction (e.g., 80%, 50%, 30%, 20%, and 10%) performed an increased number of times (e.g., five), and two processed images for the area b may be generated through size reduction performed a normal number of times (e.g., two). As a result, the resolution recovery accuracy of a characteristic area selected by the characteristic area extractor 131 c may be set higher than that of a normal area.
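  • the unequal processing of characteristic and normal areas can be sketched as below. The five rates for the characteristic area come from the example above; the two rates for the normal area are assumed values.

```python
from PIL import Image

CHARACTERISTIC_RATES = (0.8, 0.5, 0.3, 0.2, 0.1)  # five reductions (from the example)
NORMAL_RATES = (0.5, 0.2)                         # two reductions (assumed values)

def processed_versions(area_image, is_characteristic):
    """Generate more degraded variants for characteristic areas so their
    resolution recovery is learned more accurately."""
    rates = CHARACTERISTIC_RATES if is_characteristic else NORMAL_RATES
    w, h = area_image.size
    versions = []
    for rate in rates:
        reduced = area_image.resize((max(1, int(w * rate)), max(1, int(h * rate))),
                                    Image.BICUBIC)
        versions.append(reduced.resize((w, h), Image.NEAREST))
    return versions
```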
  • FIG. 5 is a diagram illustrating a configuration of a neural network trainer according to an embodiment of the present disclosure.
  • the neural network trainer 132 may include a learning importance identifier 132 a , a similar data learning supporter 132 b , and a neural network file calculator 132 c , as illustrated in FIG. 5 .
  • the neural network trainer 132 may perform a deep-learning training process based on an artificial neural network and accordingly generate a neural network file, which is a file required to improve the image quality of low-resolution video data.
  • for a brief description of a deep-learning training operation based on a neural network, FIG. 7 is referred to.
  • FIG. 7 is a diagram illustrating an example of a deep-learning training operation according to an embodiment of the present disclosure.
  • FIG. 7 illustrates a perceptron, that is, a neural network model including an input layer, a hidden layer, and an output layer.
  • neural network training according to the present disclosure may be performed using a multi-layer perceptron that is implemented to include at least one hidden layer. Basically, a perceptron may receive multiple signals as input and output one signal in response.
  • a weight and a bias required for a computation process using an artificial neural network model may be calculated through backward propagation.
  • the artificial neural network training process extracts proper weight data and bias data through the backward propagation.
  • a neural network file calculated through an artificial neural network may include the proper weight data, the bias data, and the like.
  • a training method using an artificial neural network through backward propagation, and parameter modification are well-known technologies and thus a detailed description thereof is omitted.
  • the neural network trainer 132 may perform training using a convolution neural network (CNN) model in artificial neural network models.
  • the CNN is characterized by maintaining a shape of input/output data of each layer, effectively recognizing a feature of an adjacent image while maintaining spatial information of an image, and extracting and learning a feature of an image through a plurality of filters.
  • a basic operating method of the CNN is to learn by scanning a partial area of one image with a filter and discovering a result value for that area. In this case, discovering a filter having proper weights is the goal of the CNN.
  • the filter may be generally defined as a square matrix such as (4,4) and (3,3).
  • a set value of the filter for the CNN according to an embodiment of the present disclosure is not limited.
  • the filter may calculate a convolution by iterating over input data at a predetermined interval.
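  • the filter operation described here is an ordinary two-dimensional convolution with a stride; a plain NumPy sketch follows.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide a square filter (e.g. 3x3 or 4x4) over the image at a fixed
    interval (stride) and record one result value per scanned area."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out_h = (ih - kh) // stride + 1
    out_w = (iw - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out
```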
  • the learning importance identifier 132 a may identify a learning importance assigned to a characteristic area or a specific frame chunk of the learning data.
  • the characteristic area extractor 131 c may divide one frame into a plurality of areas and set at least one area from among the plurality of divided areas as a characteristic area.
  • a criterion for dividing a frame is not limited.
  • the characteristic area extractor 131 c may assign a different learning importance for each characteristic area according to a set reference element (e.g., an image object importance, a size, and the like).
  • the learning importance identifier 132 a may identify a learning importance included in the learning data.
  • the learning data received from the material processor 131 may be identified as a plurality of divided segments, and accordingly, the learning importance identifier 132 a may identify a learning importance assigned to each divided unit.
  • the learning data received from the material processor 131 may be in a divided state, and accordingly, the learning importance identifier 132 a may identify a learning importance assigned to each frame.
  • the learning importance identifier 132 a may identify a learning importance for each area in a frame.
  • the learning importance identifier 132 a may extract learning option information regarding a learning number indicated by a learning importance assigned to each division unit or each frame, whether learning is performed using similar data, and the like.
  • the operation of extracting the option information indicated by the learning importance may be, for example, extracting, for an item with a learning importance of 1, option information indicating that the learning number is three and a learning process using similar data is not performed, and extracting, for an item with a learning importance of 2, option information indicating that the learning number is four and a learning process using similar data is not performed.
  • the learning importance identifier 132 a may transmit an instruction in accordance with option information indicated by a learning importance to the similar data learning supporter 132 b and the neural network file calculator 132 c.
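  • the mapping from a learning importance to its option information can be represented as a simple lookup table; the entries below follow the example given above, and the default entry is an assumption.

```python
# learning importance -> learning number and whether similar-data learning runs
LEARNING_OPTIONS = {
    1: {"learning_number": 3, "use_similar_data": False},
    2: {"learning_number": 4, "use_similar_data": False},
}

def options_for(importance):
    """Look up the option information indicated by a learning importance."""
    return LEARNING_OPTIONS.get(importance,
                                {"learning_number": 1, "use_similar_data": False})
```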
  • the similar data learning supporter 132 b may support performing learning using similar data.
  • when the learning importance identifier 132 a identifies option information corresponding to a learning importance and accordingly determines that learning through similar data is to be performed, the learning importance identifier 132 a may transmit a relevant instruction to the similar data learning supporter 132 b and the neural network file calculator 132 c.
  • the similar data learning supporter 132 b may perform an operation of acquiring similar data similar to a target image.
  • the similar data may indicate a similar image that is found through an external web and the like.
  • for example, when a target image shows cosmos flowers, the similar data learning supporter 132 b may search for and acquire cosmos images through a web portal and the like. From among the found images, the similar data learning supporter 132 b may select and acquire similar data based on similarity in the number of objects in the respective found images, similarity in resolution, similarity in color combination, and the like.
  • the neural network file calculator 132 c may set an initial parameter value for performing a process regarding image data through a CNN.
  • the neural network file calculator 132 c may determine a frame size of original data and a reduction rate which is set in the original data when processed data is generated, and may set a corresponding initial parameter.
  • the neural network file calculator 132 c may specify a type of image data required for artificial neural network training, and request inputting the corresponding image data as learning data.
  • the neural network file calculator 132 c may additionally request frame information including a relevant image object as learning data from the similar data learning supporter 132 b in order to perform repetitive learning with respect to a main character.
  • the neural network file calculator 132 c may perform learning by inputting a material processed by the material processor 131 into a preset artificial neural network model.
  • the neural network file calculator 132 c may extract information on the grid generated in the course of changing original data into processed data (grid generation pattern information) by inputting the original data and the processed data (reduced at a preset rate) into a CNN algorithm. More specifically, the grid generation pattern information calculated by the neural network file calculator 132 c may be calculated based on a difference between the original data and the processed data, and may include pattern information regarding a location of the grid, a color change of the grid, and the like.
  • the neural network file calculator 132 c may generate a neural network file required to recover the original image, by removing the grid from the processed data based on the calculated grid generation pattern information.
  • the neural network file calculator 132 c may perform computation by inputting downscaled data (processed data) as input data into an artificial neural network algorithm.
  • when a match rate between the artificial neural network computation result and the original data is equal to or greater than a preset value, the neural network file calculator may terminate the data learning process.
  • the neural network file calculator 132 c may repeatedly perform an operation of inputting a vast number of various types of processed data into an input layer to determine a match rate between the artificial neural network computation result and the original data.
  • the neural network file calculator 132 c may calculate grid generation pattern information that is created when an image of a specific size is reduced by inputting various types of original data and processed data. Accordingly, the neural network file calculator 132 c may calculate grid generation pattern information that is commonly created not just in a specific image but also in various images when image reduction is performed.
  • the neural network file calculator 132 c may input processed data into the input layer, and, when a match rate between output data and original data is equal to or greater than the preset value, the neural network file calculator 132 c may generate a neural network file including information regarding parameters (e.g., a weight, a bias, a learning rate, and the like) set in a corresponding artificial neural network algorithm, an activation function for each layer, and the like.
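  • a sketch of this final step follows: once the match rate between output data and original data reaches the preset value, the parameters are serialized into a neural network file. The 0.95 threshold, the file layout, and the use of torch.save are illustrative assumptions.

```python
import torch

def export_neural_network_file(model, match_rate, path, threshold=0.95,
                               hyperparams=None):
    """Serialize weights/biases plus hyperparameters (learning rate,
    activation function per layer, ...) once training is good enough."""
    if match_rate < threshold:
        return False  # keep training; parameters not yet acceptable
    torch.save({"state_dict": model.state_dict(),
                "hyperparams": hyperparams or {}}, path)
    return True
```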
  • the user device 200 may receive the neural network file and perform artificial neural network computation on low image quality video data (downscaled data) based on the information in the neural network file, and accordingly the user device may perform a function of enhancing the resolution of the video data.
  • FIG. 8 is a diagram illustrating a user device according to an embodiment of the present disclosure.
  • a user device 200 may include a communicator 210 , a storage 220 , an input part 230 , a display part 240 , a camera part 250 , and a controller 260 .
  • the controller 260 may include a video reproducer 261 , a resolution converter 262 , and a user information collector 263 .
  • the communicator 210 may perform a communication function to receive a neural network file and video data from a server 100 . Further, the communicator 210 may perform a communication operation to transmit feedback information collected from the user device 200 to the server 100 .
  • the storage 220 may store the neural network file and the video data, each received from the server 100 .
  • the storage 220 may store or temporarily store result data (resolution enhanced data), that is a result from computation that is performed by applying a neural network file to downscaled data having resolution equal to or smaller than a preset reference.
  • the storage 220 may store the generated feedback information.
  • the storage 220 may store information required to calculate feedback information. For example, when one frame is extracted from result data (resolution enhanced data), generated as the result of the computation of the artificial neural network algorithm, to provide feedback, the storage 220 may store reference information regarding the extraction (e.g., whether a user's frown face is detected during reproduction of a video, a content regarding extracting a frame corresponding to a timing when the frown face is detected, and the like).
  • the input part 230 may receive user selection information regarding a content genre, a content name, and the like.
  • the display part 240 may display a reproduction screen of a corresponding video.
  • the camera part 250 may photograph a picture and a video in response to a user request. Image information regarding the picture and the video photographed by the camera part 250 may be uploaded to the server 100 or another web server. Alternatively, image information photographed by the camera part 250 may be transmitted to another user device.
  • the camera part 250 may first determine resolution based on a user request. According to an embodiment, based on whether a neural network file for image quality improvement is installed, the camera part 250 may store the photographed picture or video in a manner in which resolution of the photographed picture or video is reduced to a preset level or lower.
  • the camera part 250 may operate a camera that photographs the user's face regularly at a preset reference interval. It is thereby possible to determine the user's facial expression, such as a frowning face, and to extract feedback information in response to the determination.
  • the controller 260 may convert resolution of a video file downloaded from the server 100 or reproduce the video file.
  • the controller 260 may include a video reproducer 261 , a resolution converter 262 , and a user information collector 263 .
  • the video reproducer 261 may perform a control to reproduce a streamed video file so that the streamed video file is displayed on the display part 240 .
  • the video reproducer 261 may determine the resolution of video data which is requested to be output. When it is determined that the resolution of the video data requested to be output is at a preset level or lower and thus needs to be improved, the video reproducer 261 may request resolution enhancement from the resolution converter 262 . Then, the video reproducer 261 may reproduce the resolution-enhanced file obtained through the resolution converter 262 .
  • the resolution converter 262 may determine a current resolution of image data (a picture and a video) and a target resolution requested by a user. At this point, during a streaming operation, the resolution converter 262 may match segmented video data received from the server 100 and a neural network file and then execute an artificial neural network algorithm to convert downscaled data to a desired resolution.
  • the user information collector 263 may collect user information for feedback.
  • the user information collector 263 may select a frame to be used as a feedback information from among result data obtained after resolution enhancement is performed based on an artificial neural network algorithm, and may store the selected frame. For example, while a user reproduces resolution enhanced video, the user information collector 263 may acquire the user's face information, and, when an event such as the user's frowning face occurs, the user information collector 263 may collect video frame information being displayed at a time when the event occurs.
  • the user information collector 263 may collect content information such as an item, a genre, and the like of a content that has been reproduced at or above a reference level. For example, the user information collector 263 may determine a reproduction frequency of an animation content compared to a documentary content (based on a photo image), a frequency of reproduction of a subtitleless content compared to a subtitle-contained content, and the like, and collect information regarding the determination. Reproduction information collected by the user information collector 263 may be provided to the server 100 , and the server may generate user pattern information based on the reproduction information.
  • FIG. 9 is a diagram illustrating a process of generating and transmitting a neural network file for image quality improvement according to an embodiment of the present disclosure.
  • a server 100 may generate a neural network file for resolution enhancement and transmit the neural network file to a user device.
  • the server 100 may first perform an operation 705 to process video data present in a preset data set.
  • the data set may mean a learning data set for generating a basic neural network file.
  • the data set may consist of random video data having various genres, various subjects, and various formats, and some meta-information including resolution may be normalized. Accordingly, various types of video data included in the data set may have been pre-processed to have the same resolution.
  • the operation 705 may be an operation for generating data to be learned by an artificial neural network algorithm, and may perform a downscaling process to degrade resolution of video data to generate data appropriate for learning.
  • the operation 705 may be a processing operation (image reproduction, downscaling) for each frame included in a video file.
  • the operation 705 may be an operation of selecting, through sampling on a per-division-unit basis, a frame to be input for training an artificial neural network, and then processing (downscaling to a preset rate) the selected frame. For example, when a video file having 2400 frames in total is composed of 100 chunks each consisting of 24 frames, the server may sample one frame per video division unit and thereby process a total of 100 frames into learning data.
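  • As a minimal sketch of this sampling-and-downscaling step, assuming OpenCV and a first-frame-per-chunk sampling rule (neither is specified in the disclosure), the processing could look like this; for the 2400-frame example above it yields 100 training pairs.

```python
# Hypothetical sketch of operation 705: take one frame per 24-frame chunk and
# downscale it to a preset rate to form (original, downscaled) learning pairs.
import cv2

def build_learning_data(video_path: str, frames_per_chunk: int = 24, rate: float = 0.5):
    """Yield one (original_frame, downscaled_frame) pair per division unit."""
    cap = cv2.VideoCapture(video_path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % frames_per_chunk == 0:  # sample the first frame of each chunk
            h, w = frame.shape[:2]
            small = cv2.resize(frame, (int(w * rate), int(h * rate)),
                               interpolation=cv2.INTER_AREA)
            yield frame, small
        index += 1
    cap.release()
```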
  • the server 100 may perform an operation 710 to acquire grid generation pattern information based on processed video data.
  • the processed video data may mean data obtained by reducing the size of original data (original data for learning is designated from among data having a resolution equal to or greater than a preset resolution) to a preset rate.
  • the server may acquire grid generation pattern information by comparing the processed image, in which the grid phenomenon occurs, with the original image.
  • the acquired grid generation pattern information may be used later to recover resolution by removing the grid from an image in which the grid phenomenon occurs.
  • the server 100 may perform an operation 715 to generate a neural network file for image quality improvement based on the grid generation pattern information. Then, the server 100 may generate a basic neural network file by calculating the artificial neural network algorithm information (an activation function for each layer, a weight, a bias, and the like) that is required to recover the original image by removing the grid from downscaled image data in which the grid has occurred.
  • An element provided as a result value, such as a weight or a bias, may be determined based on a match rate between the final resulting product (image-quality-improved data) and the original image data.
  • the server 100 may determine weight and bias information, which has been applied when computing the corresponding artificial neural network, as information to be included in a neural network file.
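  • The disclosure does not specify how the grid generation pattern is represented; one simple, purely illustrative approximation is the residual between an original frame and its downscaled-then-enlarged counterpart, where the grid artifacts concentrate:

```python
# Hypothetical sketch: approximate the "grid generation pattern" as the residual
# between the original frame and a degraded (downscaled, then enlarged) version.
import cv2
import numpy as np

def grid_pattern(original: np.ndarray, rate: float = 0.5) -> np.ndarray:
    h, w = original.shape[:2]
    small = cv2.resize(original, (int(w * rate), int(h * rate)),
                       interpolation=cv2.INTER_AREA)
    # Enlarging the downscaled frame back to the original size exposes the grid.
    degraded = cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)
    return original.astype(np.int16) - degraded.astype(np.int16)
```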
  • the server 100 may perform an operation 720 to confirm that a first streaming request (or a download request) regarding video data is received from the user device 200 .
  • the server 100 may perform an operation 725 to transmit a low image quality version of the requested video data (downscaled data) along with a basic neural network file for image quality improvement.
  • the user device 200 may receive content easily without being constrained by the network environment.
  • FIG. 10 is a flowchart illustrating a process of generating a specialized neural network file based on additional learning according to an embodiment of the present disclosure.
  • the controller 130 of the server 100 may perform an operation 805 to acquire new video data and confirm the acquisition of the data. Then, the controller 130 may perform an operation 810 to identify an additional learning condition of the new video data. For example, the controller 130 may determine whether a result of performing a recovery operation based on the basic neural network file generated in the operation 715 of FIG. 9 has a recovery rate equal to or greater than a reference level, and accordingly the controller 130 may determine whether to perform additional learning. In this case, the determination as to the recovery rate may be performed based on a structural similarity (SSIM), which is an indicator for measuring the similarity of two images, and a peak signal-to-noise ratio (PSNR).
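  • A minimal sketch of this operation-810 check, assuming the scikit-image metric implementations and example thresholds (the reference levels are not given in the disclosure):

```python
# Hypothetical sketch: decide whether new video data needs additional learning
# by measuring how well the basic neural network file recovers it.
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def needs_additional_learning(original, recovered,
                              ssim_min: float = 0.95, psnr_min: float = 35.0) -> bool:
    """original, recovered: uint8 HxWx3 frames; True means run specialized learning."""
    ssim = structural_similarity(original, recovered, channel_axis=2)
    psnr = peak_signal_noise_ratio(original, recovered, data_range=255)
    # A recovery rate below the reference level triggers additional learning.
    return ssim < ssim_min or psnr < psnr_min
```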
  • the controller 130 may perform an operation 815 to perform additional learning on the new video data.
  • the additional learning may mean performing specialized learning of an image with respect to one item of the new video data.
  • the controller 130 may change a meta-value, including the resolution, of the new video data according to the standard used in the previously performed image learning. Once this pre-processing, which equalizes the standard with the data set used when generating the basic neural network file, is completed, additional image learning may be performed.
  • the controller 130 may perform an operation 820 to generate a specialized neural network file regarding the new video data.
  • the specialized neural network file may be generated by additionally performing artificial neural network training on the newly added video data after the artificial neural network algorithm and its parameters are set to initial values by the basic neural network file. That is, in order to generate the specialized neural network file, an operation of retrieving the basic neural network file generated in the operation 715 of FIG. 9 may need to be performed beforehand.
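  • For illustration, a fine-tuning sketch of this additional learning, reusing the hypothetical SRNet from the earlier sketch; the optimizer, learning rate, and epoch count are assumptions:

```python
# Hypothetical sketch of operation 820: initialize from the basic neural network
# file, continue training on the new video's (downscaled, original) frame pairs,
# and save the result as the specialized neural network file.
import torch
import torch.nn as nn

def train_specialized(basic_path: str, pairs, out_path: str, epochs: int = 10):
    model = SRNet()                                  # architecture from earlier sketch
    model.load_state_dict(torch.load(basic_path))    # start from the basic file
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for low, high in pairs:                      # (downscaled, original) tensors
            opt.zero_grad()
            loss = loss_fn(model(low), high)
            loss.backward()
            opt.step()
    torch.save(model.state_dict(), out_path)         # the specialized file
```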
  • in response to reception of a second streaming request regarding the video data, the controller 130 may perform an operation 830 to transmit a downscaled file and the specialized neural network file to the user device 200 .
  • the second streaming request may be a request for the specialized neural network file to recover the resolution of the video data.
  • the first streaming request and the second streaming request may be differentiated based on the type of service used by a user. For example, a request received from a user on a low pricing plan may be the first streaming request, corresponding to a streaming method that provides a basic neural network file, while a streaming request received from a user on a relatively high pricing plan may be the second streaming request, corresponding to a method that transmits a specialized neural network file.
  • the user device 200 may transmit feedback information regarding the state of video data that has been completely reproduced or converted, according to various embodiments. Accordingly, the user device 200 may calculate reproduction-associated information for each user, such as a content genre reproduced at or above a reference level, a content characteristic, a primarily requested reproduction time, and the like, and transmit the reproduction-associated information to the server 100 .
  • the user device 200 may provide a frame sample of the resolution-enhanced result data to the server 100 at a preset period. Accordingly, the server 100 may compare a result data frame calculated after resolution enhancement, received from the user device 200 , with an original data frame of the same content.
  • the transmitted frame information may include reproduction location information within a content, and accordingly the server 100 may retrieve a comparable frame image from the original data.
  • the server 100 may compare an image frame provided as feedback with the original image frame of the corresponding content, and determine a match rate therebetween. When it is determined that the match rate is equal to or smaller than a preset reference, the server 100 may request a re-learning operation to update the neural network file and accordingly perform the re-learning operation.
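  • A minimal sketch of this feedback check, assuming the reproduction location is a timestamp and SSIM serves as the match rate (both assumptions):

```python
# Hypothetical sketch: fetch the original frame at the reported reproduction
# position, compare it with the enhanced frame, and flag re-learning when the
# match rate falls to or below a preset reference.
import cv2
from skimage.metrics import structural_similarity

def should_relearn(original_path: str, enhanced_frame, position_sec: float,
                   match_min: float = 0.9) -> bool:
    cap = cv2.VideoCapture(original_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    cap.set(cv2.CAP_PROP_POS_FRAMES, int(position_sec * fps))
    ok, original_frame = cap.read()
    cap.release()
    if not ok:
        return False
    match_rate = structural_similarity(original_frame, enhanced_frame, channel_axis=2)
    return match_rate <= match_min   # True => request a re-learning operation
```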
  • a neural network file generated according to various embodiments of the present disclosure may be compressed, when necessary.
  • the server may compress a neural network file in consideration of performance of the user device 200 and transmit the compressed neural network file to the user device 200 .
  • the neural network file may be compressed using at least one of Pruning, Quantization, Decomposition, or Knowledge Distillation.
  • Pruning is a compression technique for deleting a weight and a bias that are insignificant or do not affect an output value from among weights and biases of a neural network file.
  • Quantization is a compression technique for quantizing respective weights to a preset bit.
  • Decomposition is a compression technique for reducing a size of a weight by performing approximated decomposition of a weight matrix or tensor which is a set of weights.
  • Knowledge Distillation is a compression technique for generating a student model smaller than an original model by using the original model as a teacher model and training the student model against it. In this case, the student model may be generated through Pruning, Decomposition, or Quantization, described above.
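  • As purely illustrative sketches of the first two techniques (the sparsity level and bit width are assumptions, and production systems would typically use a framework's own compression tooling):

```python
# Hypothetical sketches of magnitude pruning and uniform weight quantization.
import numpy as np

def prune(weights: np.ndarray, amount: float = 0.5) -> np.ndarray:
    """Pruning: zero the smallest-magnitude weights, which affect the output least."""
    threshold = np.quantile(np.abs(weights), amount)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize(weights: np.ndarray, bits: int = 8):
    """Quantization: map each float weight onto a preset-bit integer grid."""
    levels = 2 ** bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / levels          # assumes non-constant weights
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, w_min, scale                    # dequantize with: q * scale + w_min
```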
  • a degree of compression in accordance with performance of the user device 200 may be determined in various ways.
  • the degree of compression of a neural network file may be determined based simply on the specifications of the user device 200 . That is, the degree of compression may be determined based on the specifications of the processor and the memory of the user device 200 .
  • the degree of compression of the neural network file may be determined based on a use state of the user device 200 .
  • the server 100 may receive use state information from the user device 200 , and acquire available resource information of the user device 200 in accordance with the received use state information.
  • the server 100 may determine a degree of compression of a neural network file based on the available resource information.
  • the available resource information may be information regarding an application being executed by the user device 200 , a CPU or GPU occupancy rate determined depending on the application in execution, and the memory capacity of the user device 200 .
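  • For illustration, a simple policy mapping the reported use state to a degree of compression; the tiers and thresholds below are assumptions, not values from the disclosure:

```python
# Hypothetical sketch: pick the fraction of weights to prune from the device's
# reported CPU occupancy and free memory before transmitting the file.
def compression_degree(cpu_occupancy: float, free_memory_mb: int) -> float:
    if cpu_occupancy > 0.8 or free_memory_mb < 512:
        return 0.7    # heavily loaded device: strongest compression
    if cpu_occupancy > 0.5 or free_memory_mb < 2048:
        return 0.4    # moderate load: intermediate compression
    return 0.1        # ample resources: nearly uncompressed file
```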
  • a method for enhancing resolution at a server for providing video data for streaming includes a processing operation for processing the video data, a generating operation for acquiring grid generation pattern information based on the processed video data and generating a neural network file required to enhance the resolution of the video data based on the grid generation pattern information, and a transmitting operation for, in response to reception of a streaming request from a user device, dividing the requested video data and a neural network file required to recover the resolution of the requested video data and transmitting the divided video data and the divided neural network file to the user device.
  • the generating operation may include a file generating operation for generating a basic neural network file based on a plurality of video data items included in a preset data set, an additional learning operation for, in response to a determination that any acquired new video data satisfies an additional learning condition, performing additional learning on the new video data, wherein the additional learning is performed through an artificial neural network algorithm to which the basic neural network file is applied, and a specialized neural network file generating operation for generating a downscaled file of the new video data as a result of the additional learning and a specialized neural network file corresponding to the new video data.
  • the additional learning operation may include an operation for determining whether the additional learning condition is satisfied according to a structural similarity (SSIM) and a peak signal-to-noise ratio (PSNR) that are obtained by performing resolution recovery on the downscaled file of the new video data based on the basic neural network file.
  • the processing operation may include a dividing operation for dividing the video data into a plurality of chunks by bundling, into one chunk, a plurality of frames whose match rate of image objects is equal to or greater than a reference, and a size changing operation for performing a primary change to reduce the size of an image included in the video data from its original size by a preset value and selectively performing a secondary change to enlarge the image having undergone the primary change back to the original size.
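  • A minimal sketch of such a dividing operation, with histogram correlation standing in for the unspecified image-object match rate (an assumption):

```python
# Hypothetical sketch: bundle consecutive frames into one chunk while their
# match rate stays at or above a reference; start a new chunk otherwise.
import cv2

def divide_into_chunks(frames, reference: float = 0.9):
    chunks, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        h1 = cv2.calcHist([prev], [0], None, [64], [0, 256])
        h2 = cv2.calcHist([cur], [0], None, [64], [0, 256])
        match = cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL)
        if match >= reference:
            current.append(cur)
        else:
            chunks.append(current)   # match rate dropped: chunk boundary
            current = [cur]
    chunks.append(current)
    return chunks
```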
  • the processing operation may include a characteristic area extracting operation for extracting a characteristic area including a characteristic image on the basis of each frame or division unit of the video data and assigning a learning importance to the extracted characteristic area.
  • the characteristic area may include an image object of which an image object importance corresponding to a content field is equal to or greater than a preset value.
  • the generating operation may include a learning importance identifying operation for identifying a learning importance assigned to a characteristic area or a specific frame chunk of learning video data and extracting option information indicated by the learning importance, and a neural network file calculating operation for inputting, into a convolution neural network (CNN) algorithm to be learned, original data in its original size and processed data reduced to a preset rate from the learning video data.
  • a neural network file is generated including a parameter and an activation function of an artificial neural network, the parameter and the activation function causing a match rate between a computation result value, obtained by inputting the processed data into the artificial neural network, and the original data to be equal to or greater than a preset value.
  • the option information comprises a learning number and information regarding whether learning is performed through similar data.
  • the generating operation may further include a similar data acquiring operation for, when performing learning on learning data with a learning importance set thereto, in response to identifying an instruction for performing learning using the similar data, acquiring similar data that is similar to a target image to be learned.
  • the similar data is acquired based on similarity in resolution and color combinations.
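  • Tying the above together, an illustrative training loop that accepts (processed, original) pairs and stops once the match rate reaches a preset value, using PSNR as the match criterion and reusing the hypothetical SRNet; all hyper-parameters are assumptions:

```python
# Hypothetical sketch of the neural network file calculating operation: train on
# (processed, original) pairs until the result matches the original at or above
# a preset rate, bounded by a learning-number option.
import torch
import torch.nn as nn

def generate_neural_network_file(pairs, out_path: str,
                                 match_min_db: float = 35.0, max_epochs: int = 100):
    model = SRNet()                               # architecture from earlier sketch
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    mse = nn.MSELoss()
    for _ in range(max_epochs):                   # "learning number" option
        worst_psnr = float("inf")
        for low, high in pairs:                   # tensors scaled to [0, 1]
            opt.zero_grad()
            loss = mse(model(low), high)
            loss.backward()
            opt.step()
            psnr = 10 * torch.log10(1.0 / loss.detach())
            worst_psnr = min(worst_psnr, psnr.item())
        if worst_psnr >= match_min_db:            # match rate reached preset value
            break
    torch.save(model.state_dict(), out_path)      # the generated neural network file
```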

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
US16/618,335 2018-04-24 2019-04-23 Method for enhancing resolution of streaming file Abandoned US20210160556A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
KR10-2018-0047305 2018-04-24
KR20180047305 2018-04-24
KR10-2019-0046954 2019-04-22
KR1020190046954A KR102082816B1 (ko) 2018-04-24 2019-04-22 Method for enhancing resolution of streaming file
PCT/KR2019/004891 WO2019209006A1 (ko) 2018-04-24 2019-04-23 Method for enhancing resolution of streaming file

Publications (1)

Publication Number Publication Date
US20210160556A1 true US20210160556A1 (en) 2021-05-27

Family

ID=68731071

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/618,335 Abandoned US20210160556A1 (en) 2018-04-24 2019-04-23 Method for enhancing resolution of streaming file
US17/050,371 Active US11095925B2 (en) 2018-04-24 2019-04-23 Artificial intelligence based resolution improvement system

Family Applications After (1)

Application Number Title Priority Date Filing Date
US17/050,371 Active US11095925B2 (en) 2018-04-24 2019-04-23 Artificial intelligence based resolution improvement system

Country Status (4)

Country Link
US (2) US20210160556A1 (ko)
EP (1) EP3787302A4 (ko)
JP (1) JP7385286B2 (ko)
KR (4) KR102082816B1 (ko)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11182876B2 (en) 2020-02-24 2021-11-23 Samsung Electronics Co., Ltd. Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding on image by using pre-processing
US11190784B2 (en) 2017-07-06 2021-11-30 Samsung Electronics Co., Ltd. Method for encoding/decoding image and device therefor
US11288770B2 (en) 2018-10-19 2022-03-29 Samsung Electronics Co., Ltd. Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image
US11436703B2 (en) * 2020-06-12 2022-09-06 Samsung Electronics Co., Ltd. Method and apparatus for adaptive artificial intelligence downscaling for upscaling during video telephone call
US20220368965A1 (en) * 2019-11-15 2022-11-17 Korea Advanced Institute Of Science And Technology Live video ingest system and method
WO2023210969A1 (en) * 2022-04-26 2023-11-02 Samsung Electronics Co., Ltd. Super-resolution reconstruction method and apparatus for adaptive streaming media and server

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11368758B2 (en) * 2018-05-21 2022-06-21 Gdflab Co., Ltd. VOD service system based on AI video learning platform
US20190392312A1 (en) * 2018-06-21 2019-12-26 Deep Force Ltd. Method for quantizing a histogram of an image, method for training a neural network and neural network training system
WO2020080665A1 (en) 2018-10-19 2020-04-23 Samsung Electronics Co., Ltd. Methods and apparatuses for performing artificial intelligence encoding and artificial intelligence decoding on image
KR102524547B1 (ko) * 2019-11-28 2023-04-24 UNIST (Ulsan National Institute of Science and Technology) Data compression and reconstruction apparatus for lossless image compression
KR20210127412A (ko) * 2020-04-14 2021-10-22 Samsung Electronics Co., Ltd. AI downscaling apparatus and operating method thereof, and AI upscaling apparatus and operating method thereof
CN113556496B (zh) * 2020-04-23 2022-08-09 BOE Technology Group Co., Ltd. Video resolution enhancement method and apparatus, storage medium, and electronic device
KR20220081648A (ko) * 2020-12-09 2022-06-16 Samsung Electronics Co., Ltd. AI encoding apparatus and operating method thereof, and AI decoding apparatus and operating method thereof
US11943271B2 (en) 2020-12-17 2024-03-26 Tencent America LLC Reference of neural network model by immersive media for adaptation of media for streaming to heterogenous client end-points
KR102271371B1 (ko) * 2020-12-24 2021-06-30 Industry-Academic Cooperation Foundation, Chonnam National University Mobile edge computing-based super-resolution streaming video transmission system for reducing network traffic
CN112802338B (zh) * 2020-12-31 2022-07-12 Shandong Aobang Traffic Facilities Engineering Co., Ltd. Deep learning-based expressway real-time early warning method and system
KR102537163B1 (ko) * 2021-06-16 2023-05-26 LG Uplus Corp. Video transmission and reception system of a terminal, a cloud server, and a cloud augmented reality (AR) platform, and method therefor
WO2023286881A1 (ko) * 2021-07-12 2023-01-19 LG Electronics Inc. Image display device and system including the same
KR20230023460A (ko) * 2021-08-10 2023-02-17 Samsung Electronics Co., Ltd. Electronic device for reproducing video based on AI according to an application, and video reproduction method thereby
KR102634627B1 (ko) * 2021-12-14 2024-02-08 Kai Inc. Live streaming method and apparatus
KR102596384B1 (ko) * 2023-02-22 2023-10-30 Jin Byung Sun Camera equipment provided with a neural network processing unit for artificial intelligence computation
KR102647960B1 (ko) * 2023-10-11 2024-03-15 Jisol Co., Ltd. Super-intelligent object analysis method and system applying AI Vision-based high-resolution, high-definition object image conversion

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5293454A (en) * 1989-03-08 1994-03-08 Sharp Kabushiki Kaisha Learning method of neural network
US6735253B1 (en) * 1997-05-16 2004-05-11 The Trustees Of Columbia University In The City Of New York Methods and architecture for indexing and editing compressed video over the world wide web
US7149262B1 (en) * 2000-07-06 2006-12-12 The Trustees Of Columbia University In The City Of New York Method and apparatus for enhancing data resolution
AU2002334720B8 (en) * 2001-09-26 2006-08-10 Interact Devices, Inc. System and method for communicating media signals
JP4784190B2 (ja) 2005-07-28 2011-10-05 Casio Computer Co., Ltd. Content reproduction device, content distribution device, and program
WO2007029443A1 (ja) 2005-09-09 2007-03-15 Image processing method, image recording method, image processing device, and image file format
KR101789845B1 (ko) * 2010-01-22 2017-11-20 Thomson Licensing Method and apparatus for sampling-based super-resolution video encoding and decoding
KR101344828B1 (ko) * 2012-02-27 2013-12-26 Tigrape Co., Ltd. Digital content distribution method and system
EP3259914A1 (en) * 2015-02-19 2017-12-27 Magic Pony Technology Limited Interpolating visual data
KR102450971B1 (ko) 2015-05-08 2022-10-05 Samsung Electronics Co., Ltd. Object recognition apparatus and method
US9940539B2 (en) 2015-05-08 2018-04-10 Samsung Electronics Co., Ltd. Object recognition apparatus and method
US9609307B1 (en) * 2015-09-17 2017-03-28 Legend3D, Inc. Method of converting 2D video to 3D video using machine learning
JP2017103744A (ja) 2015-12-04 2017-06-08 Panasonic Intellectual Property Corporation of America Image decoding method, image encoding method, image decoding device, image encoding device, and image encoding/decoding device
KR101803471B1 (ko) * 2016-02-15 2017-12-01 Sungkyunkwan University Industry-Academic Cooperation Foundation Deep learning system using convolutional neural network-based image patterning and image learning method using the same
JP6689656B2 (ja) 2016-04-18 2020-04-28 Renesas Electronics Corporation Image processing system, image processing method, and image transmission device
US10432685B2 (en) * 2016-05-31 2019-10-01 Brightcove, Inc. Limiting key request rates for streaming media
KR101780057B1 (ko) 2016-08-02 2017-09-19 Hanyang University ERICA Industry-Academic Cooperation Foundation High-resolution image restoration method and apparatus
CN106791927A (zh) 2016-12-23 2017-05-31 Fujian Imperial Vision Information Technology Co., Ltd. Deep learning-based video enhancement and transmission method
WO2018216207A1 (ja) 2017-05-26 2018-11-29 Rakuten, Inc. Image processing device, image processing method, and image processing program

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11190784B2 (en) 2017-07-06 2021-11-30 Samsung Electronics Co., Ltd. Method for encoding/decoding image and device therefor
US11288770B2 (en) 2018-10-19 2022-03-29 Samsung Electronics Co., Ltd. Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image
US11688038B2 (en) 2018-10-19 2023-06-27 Samsung Electronics Co., Ltd. Apparatuses and methods for performing artificial intelligence encoding and artificial intelligence decoding on image
US20220368965A1 (en) * 2019-11-15 2022-11-17 Korea Advanced Institute Of Science And Technology Live video ingest system and method
US11877018B2 (en) * 2019-11-15 2024-01-16 Korea Advanced Institute Of Science And Technology Live video ingest system and method
US11182876B2 (en) 2020-02-24 2021-11-23 Samsung Electronics Co., Ltd. Apparatus and method for performing artificial intelligence encoding and artificial intelligence decoding on image by using pre-processing
US11436703B2 (en) * 2020-06-12 2022-09-06 Samsung Electronics Co., Ltd. Method and apparatus for adaptive artificial intelligence downscaling for upscaling during video telephone call
WO2023210969A1 (en) * 2022-04-26 2023-11-02 Samsung Electronics Co., Ltd. Super-resolution reconstruction method and apparatus for adaptive streaming media and server

Also Published As

Publication number Publication date
KR20190130479A (ko) 2019-11-22
JP7385286B2 (ja) 2023-11-22
US11095925B2 (en) 2021-08-17
KR102190483B1 (ko) 2020-12-11
JP2021522723A (ja) 2021-08-30
KR102179436B1 (ko) 2020-11-16
KR20190130478A (ko) 2019-11-22
EP3787302A1 (en) 2021-03-03
US20210058653A1 (en) 2021-02-25
EP3787302A4 (en) 2022-04-20
KR102082815B1 (ko) 2020-02-28
KR20190140825A (ko) 2019-12-20
KR20190130480A (ko) 2019-11-22
KR102082816B1 (ko) 2020-02-28

Similar Documents

Publication Publication Date Title
US20210160556A1 (en) Method for enhancing resolution of streaming file
US11234006B2 (en) Training end-to-end video processes
US10904541B2 (en) Offline training of hierarchical algorithms
CN107222795B (zh) Multi-feature fusion video summary generation method
KR102130076B1 (ko) Method for improving resolution of streaming file based on learning importance of characteristic area
US11887277B2 (en) Removing compression artifacts from digital images and videos utilizing generative machine-learning models
CN112383824A (zh) Video advertisement filtering method, device, and storage medium
KR20220021495A (ko) Method for improving resolution of streaming file based on AI
KR20220021494A (ko) AI-based resolution improvement system
KR20220070866A (ko) Image improvement method, apparatus, and program applying deep learning technology
CN113542780B (zh) Compression artifact removal method and apparatus for live network streaming video
KR102130078B1 (ko) System for changing artificial intelligence parameters based on degree of resolution improvement
KR102130077B1 (ko) System for improving resolution based on grid generation pattern information
KR101997909B1 (ko) AI image learning parameter extraction program and recording medium for resolution restoration
WO2019209006A1 (ko) Method for enhancing resolution of streaming file
WO2019209005A1 (ko) Artificial intelligence based resolution improvement system
CN113628121A (zh) Data processing and multimedia data training method and apparatus
KR20220070868A (ko) Electronic device control method, apparatus, and program

Legal Events

Date Code Title Description
AS Assignment

Owner name: GDFLAB CO., LTD., KOREA, REPUBLIC OF

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:JANG, KYOUNG IK;REEL/FRAME:051138/0785

Effective date: 20191129

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION