DE102018008502A1

DE102018008502A1 - Video rendering system

Info

Publication number: DE102018008502A1
Application number: DE102018008502.4A
Authority: DE
Inventors: Frederick George Walls; Richard Hayden Wyman; Jason William Herrick; David Wu; Brett J. ANDREWS; Wade Keith Drive
Original assignee: Avago Technologies International Sales Pte Ltd
Current assignee: Avago Technologies International Sales Pte Ltd
Priority date: 2017-10-31
Filing date: 2018-10-29
Publication date: 2019-05-02

Abstract

An apparatus for improved rendering includes a number of processing channels to receive multiple input content sources and to process that input content. A compositor can composite the processed input content to generate a composite output signal. An output adaptation block can adapt the composite output signal, together with dynamic metadata, for display by a display device. Each processing channel includes a statistics generator and an input adaptation block.

Description

The present description relates generally to video processing, and more particularly to a system for improved rendering of video generated by set-top boxes.

Video rendering is generally performed when an electronic device receives video data and outputs the video data to a display device for display. Recent advances in display technology have improved the color volume, dynamic range, and brightness of display devices. With international standards such as ITU-R Recommendation BT.2020 and ITU-R Recommendation BT.2100, the International Telecommunication Union has attempted to create signal formats that allow the transport of video content that uses the larger color volume available in these devices, including wide color gamut (WCG) content, high dynamic range (HDR) content, and/or high brightness (HB) content. However, consumer display devices must also be able to fully reproduce these supported signal formats. In addition, many existing display devices do not even support the reception of video content using these standards. Therefore, a set-top box (for example, a dongle, a receiver, a PC, a converter, a disc player, or the like) or a similar type of terminal may need to adapt HDR, WCG, and/or HB content to the capabilities of the display unit. A display device that supports reception of video content using one or more of these standards may need to adapt HDR, WCG, and/or HB content to its own capabilities. These types of adaptation can be referred to as color volume and luminance adaptation (CVLA).

A simple CVLA, which may be present in a display device, would include a static luminance remapping, in which bright areas of the image are compressed down to less than the maximum brightness of the display device, and a color clipping function that compresses or clips colors at the edge of or outside the supported color gamut. However, the display rendering result achieved with these techniques may appear washed out, or may suffer from loss of detail or loss of color accuracy.
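The simple CVLA described above can be illustrated with a minimal sketch. The knee position, the peak values in nits, the piecewise-linear curve shape, and the function names `remap_luminance` and `clip_color` are assumptions chosen for illustration; the description does not specify a particular curve.

```python
def remap_luminance(nits, knee=400.0, src_peak=1000.0, dst_peak=500.0):
    """Static remap (assumed shape): keep values below the knee unchanged,
    compress the range [knee, src_peak] linearly into [knee, dst_peak]."""
    if nits <= knee:
        return nits
    scale = (dst_peak - knee) / (src_peak - knee)
    # Values above the source peak are limited to the source peak first.
    return knee + (min(nits, src_peak) - knee) * scale


def clip_color(rgb):
    """Hard-clip each linear color component into the supported [0, 1] range."""
    return tuple(min(1.0, max(0.0, c)) for c in rgb)
```

Because the curve is fixed and the clip is hard, highlight detail above the knee is squeezed into a narrow range and out-of-gamut colors collapse onto the gamut boundary, which is consistent with the washed-out appearance and loss of detail noted above.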

According to one aspect of the invention, an apparatus comprises:

a plurality of processing channels configured to receive a plurality of input contents and to process the plurality of input contents;
a compositor configured to composite a plurality of processed input contents into a composite output signal; and
an output adaptation block configured to adapt the composite output signal,
wherein at least one processing channel of the plurality of processing channels comprises a statistics generator and an input adaptation block.

Conveniently, the plurality of input contents comprises input video content and graphics content, wherein the apparatus comprises the display device or comprises a set-top box communicatively coupled to the display device.

Conveniently, the statistics generator is configured to compute statistics on a respective input content of the plurality of input contents, the statistics comprising at least one of a histogram, a binned histogram, a 2D histogram, a 3D histogram, a minimum, a maximum, a sum, or a mean of one or more quantities.

Conveniently, the one or more quantities comprise luminance values, values of the red (R), green (G), and blue (B) components, a value of MAX(a*R, b*G, c*B), or a value of SUM(d*R, e*G, f*B) for each pixel of the respective input content, where a, b, c, d, e, and f are constant values and MAX and SUM represent a maximum function and a sum function, respectively.
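The per-pixel quantities and the statistics derived from them can be sketched as follows. This is an illustration under stated assumptions only: pixels are taken to be linear RGB triples in [0, 1], the default constants a, b, c are set to 1, the default constants d, e, f are set to the BT.2020 luma coefficients, and the names `pixel_quantities` and `frame_statistics` are hypothetical.

```python
def pixel_quantities(r, g, b, max_w=(1.0, 1.0, 1.0),
                     sum_w=(0.2627, 0.6780, 0.0593)):
    """Return MAX(a*R, b*G, c*B) and SUM(d*R, e*G, f*B) for one pixel."""
    a, b_w, c = max_w
    d, e, f = sum_w
    return {"max": max(a * r, b_w * g, c * b),
            "sum": d * r + e * g + f * b}


def frame_statistics(pixels, bins=4):
    """Compute min, max, sum, mean, and a binned histogram of the MAX quantity."""
    values = [pixel_quantities(*p)["max"] for p in pixels]
    histogram = [0] * bins
    for v in values:
        # Clamp the bin index so that v == 1.0 falls into the last bin.
        index = max(0, min(bins - 1, int(v * bins)))
        histogram[index] += 1
    return {"min": min(values), "max": max(values), "sum": sum(values),
            "mean": sum(values) / len(values), "histogram": histogram}
```

For example, `frame_statistics([(0.0, 0.0, 0.0), (1.0, 0.5, 0.25), (0.5, 0.5, 0.5)])` places one pixel in the first bin, none in the second, and one each in the third and fourth.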

Conveniently, the statistics generator is configured to compute statistics based on a luma-difference or color-difference representation.

Conveniently, the input adaptation block comprises a color volume and luminance adaptation (CVLA) block, wherein CVLA blocks of different processing channels of the plurality of processing channels are configured differently.

Conveniently, the CVLA block comprises a plurality of processor-controlled hardware (HW) modules for nonlinearity and color space transformation, wherein the CVLA block is configured to perform a volume transformation as well as static and dynamic tone mapping.

Conveniently, the dynamic tone mapping is performed based on the dynamic metadata included in one or more of the plurality of input contents, or based on an analysis of possible changes to scene parameters of the plurality of input contents.

Conveniently, the output adaptation block is configured to adapt the composite output signal, together with dynamic metadata, for display by a display device, wherein the output adaptation block is configured to adapt the composite output signal for display by adapting at least one of a luminance property or a color property of the composite output signal.

Conveniently, the apparatus further comprises a processor configured to generate the dynamic metadata based on statistics generated by the statistics generators of the plurality of processing channels, and further comprises a metadata formatting block configured to format the dynamic metadata for use by the display device.

Conveniently, the processor is configured to modify the dynamic metadata included in one or more input contents of the plurality of input contents based on statistics generated by the statistics generators of the plurality of processing channels.

Conveniently, a first input content of a first processing channel of the plurality of processing channels comprises the dynamic metadata, which are passed directly to the display device, wherein each input adaptation block of processing channels other than the first processing channel of the plurality of processing channels is configured to perform an inverse function of an expected display processing of the dynamic metadata.

Conveniently, a first input content of a first processing channel of the plurality of processing channels comprises the dynamic metadata, which are passed directly to the display device, wherein the output adaptation block comprises an adaptation block corresponding to each processing channel other than the first processing channel of the plurality of processing channels, the adaptation block being configured to perform an inverse function of an expected display processing of the dynamic metadata.
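The idea of applying an inverse function of the expected display processing can be sketched as follows. The sketch assumes, purely for illustration, that the display reacts to the dynamic metadata by applying a power-law tone curve to normalized values in [0, 1]; real display processing is not specified here, and the names `expected_display_mapping`, `inverse_display_mapping`, and `exponent` are hypothetical.

```python
def expected_display_mapping(value, exponent):
    """Assumed model of what the display does when driven by the metadata."""
    return value ** exponent


def inverse_display_mapping(value, exponent):
    """Pre-compensation applied to content from the other channels
    (for example graphics) before it reaches the display."""
    return value ** (1.0 / exponent)


# Content that is pre-compensated and then processed by the display
# comes out close to its intended value.
intended = 0.5
shown = expected_display_mapping(inverse_display_mapping(intended, 0.8), 0.8)
```

Under this assumed model, content from the first channel is tone mapped by the display as its metadata intends, while pre-compensated content from the other channels is displayed approximately unchanged.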

According to one aspect, a method for improved rendering of video content from a plurality of sources comprises:

converting the video content from the plurality of sources, using dynamic metadata, into one of a composition domain or an output domain; and
incorporating an inverse function of an expected display processing of the dynamic metadata into at least one of the composition domain or the output domain,
wherein using the dynamic metadata comprises directly using or modifying dynamic metadata from one or more selected sources of the plurality of sources.

Conveniently, a domain comprises at least some of a target brightness, a color volume, a white point, or a color component representation, wherein:

the composition domain is associated with a compositor and the output domain is associated with an output adaptation circuit,
the conversion is performed by a plurality of processing channels, and
each processing channel of the plurality of processing channels comprises a statistics generator circuit and an input adaptation circuit.

Conveniently, the method further comprises:

providing output domain content to an output device including one of a display device or a video decoder, and
receiving, by the compositor, output parameters including peak brightness, color primaries, and dynamic range of the overall output.

Conveniently, modifying the dynamic metadata of the one or more selected sources of the plurality of sources is based on a weighted sum of statistics generated by the statistics generator circuits of the plurality of processing channels.
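A weighted sum of per-channel statistics can be sketched as follows. The choice of weights (for example, the fraction of the output area each source covers), the normalization so that the weights sum to one, and the name `blend_statistic` are assumptions made for illustration; the description does not specify how the weights are chosen.

```python
def blend_statistic(per_channel_values, weights):
    """Combine one statistic from several channels as a normalized weighted sum."""
    if len(per_channel_values) != len(weights):
        raise ValueError("one weight is required per channel")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * v for v, w in zip(per_channel_values, weights)) / total


# Hypothetical example: mean luminance of two sources, the first covering
# three quarters of the output and the second one quarter.
blended_mean = blend_statistic([100.0, 300.0], [3.0, 1.0])
```

A blended value of this kind could then replace the corresponding field in the dynamic metadata of a selected source, so that the metadata reflect the composited picture rather than a single input.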

According to one aspect, a set-top box (STB) comprises:

a processor;
a video processing block configured to receive video content from a plurality of sources, the video processing block comprising:
- a plurality of input processing channels configured to process the received video content;
- a compositor configured to composite a plurality of processed input contents to generate a composite output signal; and
- an output adaptation block configured to adapt the composite output signal for an output device,
wherein the processor is configured to incorporate dynamic metadata into an adapted composite output signal of the output adaptation block.

Conveniently, each input processing channel of the plurality of input processing channels comprises a statistics generator and an input adaptation block, wherein the statistics generator is configured to compute statistics on a respective input content, and wherein each of the input adaptation blocks and the output adaptation block comprises a color volume and luminance adaptation (CVLA) block.

Conveniently, the processor is configured to generate the dynamic metadata based on statistics generated by the statistics generators of the plurality of input processing channels, wherein the processor is configured to modify the video metadata included in one or more input contents based on statistics generated by the statistics generators of the plurality of input processing channels, and wherein each input adaptation block of input processing channels other than a first input processing channel of the plurality of input processing channels is configured to perform an inverse function of an expected display processing of the dynamic metadata.

List of figures

Certain features of the claimed technology are set forth in the appended claims. However, for purposes of explanation, several embodiments of the claimed technology are set forth in the following figures.

FIG. 1 is a high-level block diagram illustrating an example of a set-top box (STB) with improved video rendering according to aspects of the claimed technology.
FIG. 2 is a high-level block diagram illustrating an example of an STB with improved video rendering using statistics-based dynamic metadata according to aspects of the claimed technology.
FIG. 3 is a high-level block diagram illustrating an example of an STB with improved video rendering using inputs and statistics-based dynamic metadata according to aspects of the claimed technology.
FIG. 4 is a high-level block diagram illustrating an example of an STB with improved video rendering using dynamic input metadata according to aspects of the claimed technology.
FIG. 5 is a block diagram illustrating an example of an STB with improved video rendering using dynamic input metadata according to aspects of the claimed technology.
FIGS. 6A and 6B are flowcharts illustrating examples of program flows of a color volume and luminance adaptation (CVLA) block according to aspects of the claimed technology.
FIG. 7 is a diagram illustrating an example of a nonlinear luminance mapping function.
FIG. 8 is a flowchart illustrating an exemplary method for improved rendering of video content from a number of sources according to aspects of the claimed technology.
FIG. 9 is a block diagram illustrating an exemplary environment in which the improved video rendering of the claimed technology is implemented.
FIG. 10 is a block diagram illustrating an exemplary system architecture of an STB in which the improved video rendering of the claimed technology is implemented.
FIG. 11 is a block diagram illustrating an example of an STB with improved video rendering using inputs and statistics-based dynamic metadata according to aspects of the claimed technology.

DETAILED DESCRIPTION

The description set forth below is intended as a description of various configurations of the claimed technology and is not intended to represent the only configurations in which the claimed technology may be practiced. The accompanying drawings are incorporated in this document and form part of the detailed description. The detailed description contains specific details intended to provide a better understanding of the claimed technology. However, the claimed technology is not limited to the specific details set forth in this document and may be practiced without one or more of them. In some cases, structures and components are shown in block diagram form to avoid obscuring the concepts of the claimed technology.

In one or more aspects of the claimed technology, systems and configurations for improving video rendering are described. The claimed technology can enhance the user experience, for example, by improving the appearance of standard dynamic range (SDR), high dynamic range (HDR), wide color gamut (WCG), and/or high brightness (HB) video content that is passed from a set-top box (STB) to a display device. The video content may be a composite of video content provided on the basis of a number of input video data items and/or one or more graphics data items. The claimed technology further allows dynamic metadata associated with one or more input video data items to be provided, along with corrections to the processing paths of other input video data and/or one or more graphics data items. In some implementations, the processes of the claimed technology may be implemented by hardware (HW) and software (SW) in an STB (for example, a dongle, a receiver, a PC, a converter, a disc player, or the like). In one or more implementations, the processes of the claimed technology may be implemented in a display device (for example, a television or a monitor) and/or in a transmission system (for example, the facilities of a broadcaster, a computing device such as a PC or laptop computer, a smart communication device such as a mobile phone or tablet computer, or other transmitting facilities and devices). The dynamic metadata provided by existing solutions may not take into account the modifications made to the video content processed by the STB, and may result in incorrect rendering, flickering, a washed-out video image, and/or other undesirable visual artifacts. The claimed technology makes it possible, among a number of advantageous features, to account for modifications and changes that result from processing of the input video content by processing steps that follow the compositing and/or blending, for example by a display device, using suitable inverse functions, as described in this document.

FIG. 1 is a high-level block diagram illustrating an example of a set-top box (STB) 100 with improved video rendering according to aspects of the claimed technology. In general, the term STB, as used in the present disclosure, may represent a video compositing subsystem. The STB 100 includes a number of statistics generator blocks (e.g., circuits) 110 (e.g., 110-1 to 110-n), a number of input adaptation blocks, such as color volume and luminance adaptation (CVLA) blocks (e.g., circuits) 120 (e.g., 120-1 to 120-n), a compositor 130, and an output adaptation block, such as a CVLA block (e.g., circuit) 140. In one or more implementations, the statistics generator blocks 110-1 to 110-m receive input video content 102 (e.g., 102-1 to 102-m) from a number of (e.g., m) video content sources. The statistics generator block 110-n receives a graphics input 102-n from a graphics source. In one or more implementations, the video content sources and the graphics source may be storage devices (e.g., a hard disk drive, flash memory, or another type of memory), a radio frequency (RF) circuit (e.g., an analog front end, a tuner, or a demodulator), a wired network interface (e.g., Ethernet), or a wireless network interface (e.g., Wi-Fi, Bluetooth, or another network).

In one or more implementations, the statistics generator 110 includes suitable circuitry, logic, and/or code for computing statistics on the respective input video or graphics content 102. Examples of the statistics include a histogram, a binned histogram, a two-dimensional histogram, a three-dimensional histogram, a minimum, a maximum, a sum, or an average of one or more quantities. Examples of these quantities include luminance values, red (R), green (G), and blue (B) component values, a value of MAX(a*R, b*G, c*B), or a value of SUM(d*R, e*G, f*B) for each pixel of the respective input content 102, where a, b, c, d, e, and f are constants and MAX and SUM denote the maximum and sum functions, respectively. In one or more implementations, the statistics generator 110 may compute statistics based on a luma or color-difference representation, for example the Y'Cb'Cr' and I'Ct'Cp' families of color spaces, where Y' and I' represent luma and volume, respectively, and Cb', Cr', Ct', and Cp' represent color differences.
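
As an illustration of the per-pixel statistics listed above, the following is a minimal sketch in Python. It assumes RGB components normalized to [0, 1]; the names `FrameStats`, `max_rgb`, and `frame_statistics`, the default weights, and the bin count are hypothetical and are not taken from the present document.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B), assumed normalized to [0, 1]


@dataclass
class FrameStats:
    minimum: float
    maximum: float
    total: float
    mean: float
    histogram: List[int]  # binned histogram of the per-pixel quantity


def max_rgb(pixel: Pixel, a: float = 1.0, b: float = 1.0, c: float = 1.0) -> float:
    """Per-pixel quantity MAX(a*R, b*G, c*B) with constant weights a, b, c."""
    r, g, bl = pixel
    return max(a * r, b * g, c * bl)


def frame_statistics(pixels: Sequence[Pixel], bins: int = 8) -> FrameStats:
    """Compute minimum, maximum, sum, mean, and a binned histogram."""
    if not pixels:
        raise ValueError("frame must contain at least one pixel")
    values = [max_rgb(p) for p in pixels]
    histogram = [0] * bins
    for v in values:
        # Clamp the bin index so that v == 1.0 falls into the last bin.
        index = min(max(int(v * bins), 0), bins - 1)
        histogram[index] += 1
    total = sum(values)
    return FrameStats(min(values), max(values), total, total / len(values), histogram)
```

A SUM(d*R, e*G, f*B) quantity could be substituted for `max_rgb` without changing the rest of the sketch.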

In one or more implementations, the CVLA block 120 includes suitable circuitry, logic, and/or code for performing volume transformation and static and dynamic tone mapping. The CVLA block 120 may, for example, include processor-controlled hardware (HW) modules for nonlinearity and color space transformation, as described in more detail in the present document. In one or more implementations, the CVLA blocks 120 (e.g., 120-1 to 120-n) may be configured differently based on the input content 102. The parameters used for the CVLA blocks 120 may be static for a particular usage mode, or they may change in response to the video or graphics content being adapted. Information about the source video or source graphics (e.g., luminance/brightness or color-difference histograms, average picture level (APL), average brightness, peak brightness, scene statistics, region-based statistics, metadata, etc.) can be used to help derive suitable parameters.

The compositor 130 includes suitable circuitry, logic, and/or code for receiving output video content from the processing channels, which include the statistics generators 110 and the CVLA blocks 120, and for compositing video and graphics content and/or blending colors as needed. The output of the compositor 130 is further processed by an output adaptation block, such as the CVLA block 140. A compositing function of the compositor 130 can combine graphics or video content from multiple sources into a single output format. The compositing function may include alpha blending, which, for each pixel position, combines different proportions of one or more video and/or graphics contents from different sources. Before compositing takes place, the video and/or graphics content may be converted (e.g., using the CVLA block 120) into a common video format (e.g., using the same color space and the same transfer function, or no transfer function). After compositing by the compositor 130, the resulting composite video content may be converted using an output adaptation block, such as the CVLA block 140, to adapt it to a desired output format, which may optionally include static or dynamic metadata, as described further below.
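
The alpha blending mentioned above can be sketched as follows. This is an illustrative example only: it assumes that both layers have already been converted to a common format (same color space, no transfer function), and the function names `alpha_blend` and `composite` are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def alpha_blend(graphics: Pixel, video: Pixel, alpha: float) -> Pixel:
    """Blend one pixel: out = alpha * graphics + (1 - alpha) * video."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    r, g, b = (alpha * gc + (1.0 - alpha) * vc for gc, vc in zip(graphics, video))
    return (r, g, b)


def composite(video: Sequence[Pixel], graphics: Sequence[Pixel],
              alpha_plane: Sequence[float]) -> List[Pixel]:
    """Blend a graphics layer over a video layer using a per-pixel alpha plane."""
    if not (len(video) == len(graphics) == len(alpha_plane)):
        raise ValueError("all planes must have the same number of pixels")
    return [alpha_blend(g, v, a) for v, g, a in zip(video, graphics, alpha_plane)]
```

Blending in a common format matters because mixing code values that use different transfer functions or primaries would weight the sources inconsistently.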

A video or graphics source can also be used as a texture in a 3D graphics engine. Many of the methods described in the present document can also be used for rendering such textures. The brightness of the textures may change with lighting, and such effects can be taken into account when determining the parameters for the CVLA block 140. The CVLA block 140 can adapt the composite output signal of the compositor 130 for an output device (e.g., a display device or a decoder). The composite output signal may be a video signal in which multiple video and/or graphics sources have been blended or composited for presentation. In some implementations, the compositor 130 receives output parameters from the output device, including peak brightness, color primaries, and an overall output dynamic range.

Referring again to the generation of statistics, in one or more embodiments the statistics computation is performed by the statistics generator 110 on the sources before compositing takes place, as described above. In some embodiments, the statistics computation is performed on the output video before it is sent for display, for example after the CVLA block 140. In one or more embodiments, the statistics computation is performed on the output video but applied to the subsequent output video frame. In other embodiments, the video signal is captured in a memory to allow the statistics and the corresponding metadata to be computed and aligned with the delayed output frame. The statistics can be used in conjunction with a scene change detection algorithm to derive estimates of the cumulative statistics for a scene.

FIG. 2 is a high-level block diagram illustrating an example of an STB 200 with improved video rendering using statistics-based dynamic metadata according to aspects of the claimed technology. The STB 200 is similar to the STB 100 of FIG. 1, except that the STB 200 further includes the processor 250 and a metadata formatting block 260, as described in the present document. The statistics generators 110, the CVLA blocks 120, the compositor 130, and the CVLA block 140 are similar to the corresponding blocks in the STB 100. The STB 200 has the additional capability of outputting video frames with corresponding dynamic metadata to the display device. In one or more implementations, the dynamic metadata is generated based on the statistics generated by the STB 200. For example, the processor 250 may receive statistics from the statistics generators 110 (e.g., 110-1 to 110-n). As described above, the statistics may include information (e.g., a histogram, a binned histogram, a 2D histogram, a 3D histogram, a minimum, a maximum, a sum, or an average) on one or more quantities, such as luminance values, red (R), green (G), and blue (B) component values, a value of MAX(a*R, b*G, c*B), or a value of SUM(d*R, e*G, f*B) for each pixel of the respective input content 102. The dynamic metadata generated by the processor 250 may be formatted by the metadata formatting block 260 to better match a format of the output device (e.g., a display unit or a decoder). The formatted metadata is provided, together with the output video content of the CVLA block 140, to the output device, such as a display device, as output 242.

FIG. 3 is a high-level block diagram illustrating an example of an STB 300 with improved video rendering using input and statistics-based dynamic metadata according to aspects of the claimed technology. The STB 300 is similar to the STB 200 of FIG. 2, except that in the STB 300 the processor 350 additionally receives dynamic metadata from one of the input sources. The statistics generators 110, the CVLA blocks 120, the compositor 130, the CVLA block 140, the processor 250, and the metadata formatting block 260 are similar to the corresponding blocks in the STB 200. The STB 300 receives the input metadata 305, which describes an input source, for example the metadata from the input video content 102-1, and statistics from the statistics generators 110. This additional capability allows the processor 350 to modify the input metadata 305 based on statistics generated from the input video content 102-1 as well as from other input video and graphics content (e.g., 102-2 to 102-n) from other sources. The modified metadata from the processor 350 is provided to the metadata formatting block 360 for format adaptation to the output device, such as a display device. The processor 350 also provides the modified metadata 310 to the CVLA block 140. The CVLA block 140 may use the modified metadata 310 in some of its functions, for example in the color space transformation described in the present document, to generate an output video 342.

FIG. 4 is a high-level block diagram illustrating an example of an STB 400 with improved video rendering using dynamic input metadata according to aspects of the claimed technology. The STB 400 is similar to the STB 100 of FIG. 1, except that in the STB 400 metadata 405 from an input source is added to the output video, and the functions of various CVLA blocks 120 may differ, as described in the present document. The statistics generators 110, the compositor 130, and the CVLA block 140 are similar to the corresponding blocks in the STB 100. The STB 400 has the additional capability of outputting to the display device using dynamic metadata. However, the appearance of the video and graphics composition can be improved if the STB (e.g., STB 400) is designed with knowledge of how an output device, such as a display device, may use this dynamic metadata to map color values to the display.

It is understood that some of the dynamic metadata may be global and may affect all pixels in the raster equally, and that other dynamic metadata may be local and may affect a selected number of pixels in the raster. In the STB 400, the first input video content 102-1 includes the dynamic metadata 405, which is provided directly to the output device. In some implementations, the first input adaptation block (e.g., the CVLA block 120-1) is bypassed. In one or more implementations, the functions of the other input adaptation blocks (e.g., the CVLA blocks 120-2 to 120-n) include performing an inverse function of the anticipated display processing of the dynamic metadata. Because the output device (e.g., a television or other display device) can be expected to use the static or dynamic metadata in a reasonably predictable manner, the domain(s) of the CVLA blocks 120-2 to 120-n can be chosen to represent an inverse of the expected processing, so that the final output 442 can represent the desired video content more accurately. A domain in the context of the present discussion may include a target brightness, a color volume, a white point, or a color component representation.
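
The idea of applying an inverse of the anticipated display processing can be illustrated with a small sketch. The display curve below is purely hypothetical (a piecewise-linear knee with assumed `knee` and `slope` parameters on a normalized signal); the present document does not specify the display's actual mapping, and the function names are illustrative.

```python
def display_knee(x: float, knee: float = 0.75, slope: float = 0.5) -> float:
    """Hypothetical display curve: identity below the knee, compressed above it."""
    return x if x <= knee else knee + (x - knee) * slope


def inverse_display_knee(y: float, knee: float = 0.75, slope: float = 0.5) -> float:
    """Analytic inverse of display_knee."""
    return y if y <= knee else knee + (y - knee) / slope


def precompensate(value: float, knee: float = 0.75, slope: float = 0.5) -> float:
    """Pre-distort a graphics value so the display's curve restores it.

    The result is clipped to the signal range [0, 1], so values brighter than
    the display can reproduce after its own compression are not recoverable.
    """
    return min(max(inverse_display_knee(value, knee, slope), 0.0), 1.0)
```

With these assumed parameters, a graphics value of 0.8 is sent as 0.85, and the display's knee maps it back to 0.8. Values above 0.875 saturate, because 0.875 is the brightest level this hypothetical display curve can produce.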

FIG. 5 is a block diagram illustrating an example of an STB 500 with improved video rendering using dynamic input metadata according to aspects of the claimed technology. The STB 500 is similar to the STB 400 of FIG. 4, except that in the STB 500 the CVLA block 140 of FIG. 4 is replaced by an output adaptation block 510. The statistics generators 110, the CVLA block 120, and the compositor 130 are similar to the corresponding blocks in FIG. 1, as described above. The output adaptation block 510 includes the CVLA blocks 540 (e.g., 540-1 to 540-n) and a multiplexer (MUX) 550. The CVLA blocks 540 are generally similar to the CVLA block 140 of FIG. 1.

In one or more implementations, the CVLA block 540-1 corresponding to the first processing channel is bypassed, and the CVLA blocks 540-2 to 540-n further perform an inverse function of the anticipated display processing of the dynamic metadata 505 included in the input video content 102-1 from the first source. The MUX 550 can pass an output video from a selected one of the CVLA blocks 540 to an output device (e.g., a display device or a decoder). In some implementations, the select signal 532 for the MUX 550 is issued by the compositor 130. The select signal 532 enables selection of the processing channel corresponding to the source that makes the largest contribution to the compositor output signal (e.g., largest alpha value, largest pixel count, etc.). The dynamic metadata 505 is provided together with the output of the selected one of the CVLA blocks 540 as the final output video content 552.
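
One possible way to derive such a select signal is sketched below. It assumes that each source's contribution is measured as the sum of its effective per-pixel alpha values; this metric and the names `contribution`, `select_signal`, and `mux` are illustrative assumptions, not details taken from the present document.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def contribution(alpha_plane: Sequence[float]) -> float:
    """Assumed contribution metric: sum of a source's effective alpha values."""
    return sum(alpha_plane)


def select_signal(alpha_planes: Sequence[Sequence[float]]) -> int:
    """Return the index of the channel with the largest contribution.

    Ties resolve to the lowest channel index.
    """
    if not alpha_planes:
        raise ValueError("at least one channel is required")
    scores = [contribution(plane) for plane in alpha_planes]
    return max(range(len(scores)), key=scores.__getitem__)


def mux(channel_outputs: Sequence[T], select: int) -> T:
    """Pass through the output of the selected channel."""
    return channel_outputs[select]
```

A pixel-count metric could be obtained from the same sketch by counting alpha values above a threshold instead of summing them.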

FIGS. 6A and 6B are flow diagrams illustrating examples of process flows 600A and 600B of a color volume and luminance adaptation (CVLA) block according to aspects of the claimed technology. Each of the CVLA blocks 120 and 140 of FIGS. 1 to 4 may generally follow the process flow 600A or 600B. The process flow 600A includes several color saturation transformation (CST) blocks 602 (e.g., 602-1 to 602-5), a number of nonlinear function application (NFA) blocks 604 (e.g., 604-1 to 604-4), and color saturation adjustment (CSA) blocks (e.g., 606-1 and 606-2). It is understood that the nonlinear functions may include linear functions as a subset. The input 601 of the process flow 600A may be an output of the statistics generator 110 or of the compositor 130 of FIG. 1. In some implementations, the input 601 may be SDR video content. The input 601 is provided to the CST block 602-1. The CST block 602-1 may perform a color space transformation to (or from) a luminance/color-difference representation from (or to) another set of color primaries, for example mastering display primaries, BT.2020 primaries, or DCI-P3 primaries as defined in Society of Motion Picture and Television Engineers (SMPTE) standard RP 431-2, or from and to other primaries. In some implementations, the color space transformation may be implemented using a matrix multiplication and/or a color clipping or color compression function. In one or more implementations, the CST block 602-1 may perform one or more color space transformations, for example from YCbCr to RGB. The output of the CST block 602-1 is passed to the NFA block 604-1.
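
A color space transformation from YCbCr to RGB by matrix multiplication followed by clipping might look like the following sketch. It assumes non-constant-luminance Y'CbCr with BT.2020 luma coefficients (Kr = 0.2627, Kb = 0.0593), Y' normalized to [0, 1], and Cb/Cr normalized to [-0.5, 0.5]. The function names are hypothetical, and the choice of coefficients is an assumption rather than something the present document prescribes.

```python
from typing import List, Sequence, Tuple


def ycbcr_to_rgb_matrix(kr: float = 0.2627, kb: float = 0.0593) -> List[List[float]]:
    """Build the 3x3 Y'CbCr -> R'G'B' matrix from the luma coefficients Kr, Kb."""
    kg = 1.0 - kr - kb
    cr_to_r = 2.0 * (1.0 - kr)
    cb_to_b = 2.0 * (1.0 - kb)
    return [
        [1.0, 0.0, cr_to_r],
        [1.0, -kb * cb_to_b / kg, -kr * cr_to_r / kg],
        [1.0, cb_to_b, 0.0],
    ]


def mat_vec(matrix: Sequence[Sequence[float]], vec: Sequence[float]) -> List[float]:
    """Multiply a 3x3 matrix by a 3-component vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def clip01(x: float) -> float:
    """Simple color clipping to the nominal range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def ycbcr_to_rgb(ycbcr: Sequence[float]) -> Tuple[float, float, float]:
    """Convert one normalized Y'CbCr pixel to R'G'B' and clip out-of-range values."""
    r, g, b = (clip01(c) for c in mat_vec(ycbcr_to_rgb_matrix(), ycbcr))
    return (r, g, b)
```

Hard clipping is the simplest option. A color compression function would replace `clip01` with a softer mapping for out-of-gamut values.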

In the NFA blocks 604, a nonlinear function may be applied to the luminance to remove the source transfer function (OOTF) and/or previously applied perceptual transfer functions (for example, opto-optical transfer function, gamma, etc.). In some implementations, NFA block 604-1 removes the electro-optical transfer function (EOTF) according to ST.2084. The output of NFA block 604-1 is provided to CST block 602-2, which may apply a further color space transformation to another set of primaries (for example, BT.2020 primaries, DCI-P3 primaries, display primaries, etc.). The color-transformed output of CST block 602-2 is passed to NFA block 604-2. NFA block 604-2 may apply a nonlinear function, for example, to invert the anticipated effects of subsequent display processing, such as the anticipated luminance effects of dynamic metadata, the brightness-limiting knee function in the display, and other subsequent display processing. Applying a nonlinear function by means of NFA block 604-2, such as a global or local tone mapping operator that improves the appearance of the video or graphics content, can improve the perceived quality of the output.
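
By way of example only, one reading of the ST.2084 step of NFA block 604-1 is the conversion of a PQ-coded signal value to linear luminance. The sketch below uses the published ST.2084 constants; the function name `pq_to_nits` is hypothetical, and the description does not state that the block is implemented in this way.

```python
# ST 2084 (PQ) constants.
M1 = 2610.0 / 16384.0
M2 = 2523.0 / 4096.0 * 128.0
C1 = 3424.0 / 4096.0
C2 = 2413.0 / 4096.0 * 32.0
C3 = 2392.0 / 4096.0 * 32.0


def pq_to_nits(signal):
    """Map a nonlinear PQ signal value in [0, 1] to luminance in cd/m^2 (nits)."""
    e = min(1.0, max(0.0, signal)) ** (1.0 / M2)
    numerator = max(e - C1, 0.0)
    denominator = C2 - C3 * e
    return 10000.0 * (numerator / denominator) ** (1.0 / M1)
```

A signal value of 1.0 corresponds to the 10000-nit ceiling of the PQ curve, and 0.0 corresponds to black.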

In CSA block 606-1, a color saturation adjustment may optionally be applied. The color saturation adjustment performed by CSA block 606-1 may, for example, invert the anticipated color effects of the dynamic metadata in order to improve the perceived colors of the output video content. CSA block 606-1 may also receive a parameter, such as the ratio Yin/Yout, from NFA block 604-2. In some implementations, the output of CSA block 606-1 is not processed by CST block 602-3, NFA block 604-3, and CSA block 606-2 (which are bypassed) and is passed to CST block 602-4, where one or more color space transformations may take place, for example from the mastering color space to the DCI-P3 color space.
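
As a hedged illustration of how a parameter such as the ratio Yin/Yout might drive a color saturation adjustment, the following sketch scales the color-difference components by a gain derived from that ratio. The gain law, the `strength` exponent, and the name `adjust_saturation` are assumptions for illustration and are not taken from the description.

```python
def adjust_saturation(cb, cr, yin_over_yout, strength=1.0):
    """Scale chroma so that it follows the luminance change applied upstream.

    yin_over_yout is the ratio Yin/Yout received from the preceding
    nonlinear function block; strength is a hypothetical tuning exponent
    (0.0 leaves chroma unchanged).
    """
    if yin_over_yout <= 0.0:
        # Degenerate ratio: leave chroma untouched rather than divide by zero.
        return cb, cr
    gain = (1.0 / yin_over_yout) ** strength
    return cb * gain, cr * gain
```

With `strength=1.0`, halving the luminance (Yin/Yout = 2) also halves both color-difference components, which keeps the ratio of chroma to luminance constant.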

NFA block 604-4 may apply a nonlinear function to convert pixel values to a standard signal format, for example one based on ITU-T Recommendation BT.709 or on the BT.2100 Perceptual Quantization and BT.2100 Hybrid Log-Gamma specifications, or to another nonlinear format. In some implementations, NFA block 604-4 may apply a nonlinear function to convert pixel values to the ST.2084 signal format by means of an inverse EOTF. In some implementations, the output of NFA block 604-4 is provided as output 610 of process flow 600A, and CST block 602-5 is bypassed.
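
For illustration, the conversion of linear pixel values to the ST.2084 signal format by means of an inverse EOTF can be sketched as follows. The constants are those published in ST.2084; the function name `nits_to_pq` is hypothetical.

```python
# ST 2084 (PQ) constants.
M1 = 2610.0 / 16384.0
M2 = 2523.0 / 4096.0 * 128.0
C1 = 3424.0 / 4096.0
C2 = 2413.0 / 4096.0 * 32.0
C3 = 2392.0 / 4096.0 * 32.0


def nits_to_pq(nits):
    """Map luminance in cd/m^2 (nits) to a nonlinear PQ signal value in [0, 1]."""
    # Normalize to the 10000-nit PQ ceiling and clamp out-of-range input.
    y = min(1.0, max(0.0, nits / 10000.0)) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2
```

Luminance above 10000 nits is clamped to a signal value of 1.0 in this sketch; a real implementation might handle out-of-range values differently.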

Process flow 600B is similar to process flow 600A, except that NFA block 604-2 of process flow 600A is replaced by an NFA block 614-2. NFA block 614-2 may apply a nonlinear function, which may be a tone mapping function parameterized by the collected input statistics provided by the statistics generator 110 of FIG. 1 or by the dynamic input metadata from the input streams 102. The tone mapping curve parameterized by the collected input statistics may target a specific brightness (for example, 1000 nits, the peak brightness of the display, the expected peak-brightness signal level that the display expects to receive, etc.).
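
A minimal sketch of a tone mapping curve parameterized by collected statistics is given below. It assumes that the only statistic used is the peak luminance of the source, and it substitutes an extended Reinhard curve for whatever operator an actual implementation would use; `build_tone_curve` and its parameters are hypothetical names.

```python
def build_tone_curve(source_peak_nits, target_peak_nits=1000.0):
    """Return a function mapping input luminance to output luminance (nits).

    source_peak_nits stands in for a collected input statistic;
    target_peak_nits is the brightness the curve aims for.
    """
    if source_peak_nits <= target_peak_nits:
        # Content already fits the target: pass luminance through unchanged.
        return lambda nits: nits

    w = source_peak_nits / target_peak_nits

    def curve(nits):
        x = max(0.0, nits) / target_peak_nits
        # Extended Reinhard: the source peak lands exactly on the target peak.
        return target_peak_nits * x * (1.0 + x / (w * w)) / (1.0 + x)

    return curve
```

For example, `build_tone_curve(4000.0)` yields a curve that maps 4000 nits to 1000 nits while leaving black at 0. Rebuilding the curve whenever the statistics change gives a simple form of dynamic tone mapping.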

FIG. 7 is a diagram illustrating an example of a graph 700 of a nonlinear luminance mapping function 710. The nonlinear luminance mapping function 710 shows the variation of output luminance versus input luminance, which begins with a rising slope and ends in saturation at a certain value of the input luminance. The nonlinear luminance mapping function 710 is an example of the nonlinear functions used by the NFA blocks 604 of FIGS. 6A and 6B.
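
The general shape described for function 710, a rising portion followed by saturation, can be illustrated with the following sketch. The linear segment, the exponential roll-off, and the default `knee` and `max_out` values are assumptions chosen for illustration; the figure itself does not specify a formula.

```python
import math


def luminance_map(l_in, knee=500.0, max_out=1000.0):
    """Linear up to the knee, then a smooth roll-off that saturates at max_out."""
    l_in = max(0.0, l_in)
    if l_in <= knee:
        return l_in
    headroom = max_out - knee
    # Slope is 1 at the knee and decays toward 0 as the output nears max_out.
    return knee + headroom * (1.0 - math.exp(-(l_in - knee) / headroom))
```

Inputs far above the knee approach, but do not exceed, `max_out`.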

FIG. 8 is a flow diagram illustrating an example method 800 of enhanced rendering of video content from a number of sources according to aspects of the claimed technology. The method 800 includes converting, using dynamic metadata (for example, 105 of FIG. 3), the video content (for example, 102 of FIG. 3) from a number of sources into video content of one of a composition domain (for example, relating to 110, 120, and 130 of FIG. 1) or an output domain (for example, relating to 342 of FIG. 3) (810). A domain in the context of the present discussion may include a target brightness, a color volume, a white point, or a color component representation. The method 800 further includes incorporating an approximate inverse function (for example, by NFA 604 of FIG. 6A) of anticipated display processing that may be performed using the dynamic metadata into at least one of the composition domain or the output domain (820). Using the dynamic metadata comprises, in some embodiments (for example, as in FIG. 4), using the dynamic metadata directly or, in other embodiments (for example, as in FIGS. 2 and 3), modifying dynamic metadata from one or more selected sources. The composition domain may be associated with a compositor, and the output domain may be associated with an output adaptation circuit or output adaptation block (for example, 140 of FIG. 1). The conversion may be performed by a number of processing channels, and each processing channel may include a statistics generator circuit (for example, 110 of FIG. 1) and an input adaptation circuit (for example, 120 of FIG. 1).

FIG. 9 illustrates an example network environment 900 in which a video rendering system according to the claimed technology may be implemented. Not all of the depicted components may be required, however, and one or more implementations may include additional components not shown in the figure. Variations in the arrangement and type of the components may be made without departing from the spirit or scope of the claims as set forth herein. Additional components, different components, or fewer components may be provided.

The example network environment 900 includes a content delivery network (CDN) 910 that is communicatively coupled to an electronic device 920, such as by a network 908. The CDN 910 may include, and/or may be communicatively coupled to, a content server 912 for encoding and/or transmitting encoded data streams, such as HEVC-encoded video streams, AVI-encoded video streams, and/or H.266-encoded video streams, over the network 908, an antenna 916 for transmitting encoded data streams over the air, and a satellite transmitting device 918 for transmitting encoded data streams to a satellite 915.

The electronic device 920 may include, and/or may be coupled to, a satellite receiving device 922, such as a satellite dish, that receives encoded data streams from the satellite 915. In one or more implementations, the electronic device 920 may further include an antenna for receiving encoded data streams, such as encoded video streams, over the air from the antenna 916 of the CDN 910. The content server 912 and/or the electronic device 920 may be, or may include, one or more components of the electronic system discussed below with reference to FIG. 10.

The network 908 may be a public communication network (such as the Internet, a cellular data network, or dial-up modems over a telephone network) or a private communication network (such as a private local area network ("LAN") or leased lines). The network 908 may also include, but is not limited to, any one or more of the following network topologies: a bus network, a star network, a ring network, a mesh network, a star-bus network, a tree or hierarchical network, and the like. In one or more implementations, the network 908 may include transmission lines, such as coaxial transmission lines, fiber optic transmission lines, or generally any transmission lines, that communicatively couple the content server 912 and the electronic device 920.

The content server 912 may include, or may be coupled to, one or more processing devices, a data store 914, and/or an encoder. The one or more processing devices execute computer instructions stored in the data store 914, for example, to implement a content delivery network. The data store 914 may store the computer instructions on a non-volatile computer-readable medium. The data store 914 may further store one or more programs, for example video and/or audio streams, that are delivered by the CDN 910. The encoder may use a codec, such as an HEVC codec, an AVI codec, an H.266 codec, or any other suitable codec, to encode video streams.

In one or more implementations, the content server 912 may be a single computing device, such as a computer server. Alternatively, the content server 912 may represent multiple computing devices that work together to perform the actions of a server computer (such as a "cloud" of computers and/or a distributed system). The content server 912 may be coupled to various databases, storage services, or other computing devices, such as an adaptive bit rate (ABR) server, which may be co-located with the content server 912 or located remotely from it.

The electronic device 920 may include, or may be coupled to, one or more processing devices, a memory, and/or a decoder, such as a hardware decoder. The electronic device 920 may be any device capable of decoding an encoded data stream, such as an encoded video stream. In one or more implementations, the decoder may implement one or more of the decoding techniques described below.

In one or more implementations, the electronic device 920 may be, or may include all or part of, a laptop or desktop computer, a smartphone, a tablet device, a wearable electronic device, such as a pair of glasses or a watch with one or more processors coupled thereto and/or embedded therein, a set-top box, a television or other display unit with one or more processors coupled thereto and/or embedded therein, or other suitable electronic devices that can be used to decode an encoded data stream, such as an encoded video stream.

In FIG. 9, the electronic device 920 is depicted as a set-top box, for example a device that is coupled to, and is capable of displaying video content on, a display unit 924, such as a television, a monitor, or any device capable of displaying video content. In one or more implementations, the electronic device 920 may be integrated into the display unit 924, and/or the display unit 924 may be capable of outputting audio content in addition to video content. The electronic device 920 may receive streams from the CDN 910, such as encoded data streams, that include content items such as television programs, movies, or generally any content items. The electronic device 920 may receive the encoded data streams from the CDN 910 via the antenna 916, via the network 908, and/or via the satellite 915, and may decode the encoded data streams, for example, using the hardware decoder. In one or more implementations, the electronic device 920 may implement one or more of the video rendering techniques for displaying video content on a display unit 924, as described further herein.

For example, when a set-top box or a similar type of terminal device, such as the electronic device 920, receives video data and outputs the video data to a display device, such as the display unit 924, the video data may be processed to adapt the content of the video data to the capabilities of the display unit. Such adaptation may be performed using color volume and luminance adaptation (CVLA). To improve the adaptation to the display unit, the CVLA processing may be performed such that a nonlinear function and/or tone mapping is applied to the video data before the video data is output to the display unit. The nonlinear function may be applied to adjust various display characteristics of the video data. In one or more implementations, static tone mapping or dynamic tone mapping may be applied to adjust various display characteristics of the video data. In one or more implementations, the nonlinear function is applied to each component. In one or more implementations, the nonlinear function is a nonlinear function of luminance that removes at least one of a source transfer function or a previously applied perceptual transfer function, or that applies a color saturation adjustment. In one or more implementations, the nonlinear function is a nonlinear transfer function that remaps one or more luminance values to account for a difference between source-referred pixel values and scene-referred linear pixel values. In one or more implementations, the nonlinear function is a nonlinear function that performs at least one of the following: inverting anticipated effects of subsequent display processing or applying a color saturation adjustment.

In one or more implementations, the video data includes video data from multiple video sources, and the video data is further processed by combining the video data into a single output video data item using a compositing function. The video data is processed by performing CVLA processing on each of the video data items and by performing CVLA processing on the single output video data item.
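
The ordering described above, adaptation of each source, then composition, then adaptation of the single composited output, can be sketched as follows. Pixels are reduced to luminance values, composition is reduced to back-to-front alpha blending, and the names `composite` and `render` are hypothetical; the adapter callables stand in for the CVLA processing.

```python
def composite(layers):
    """Alpha-blend layers back to front; each layer is (pixels, alpha)."""
    out = [0.0] * len(layers[0][0])
    for pixels, alpha in layers:
        out = [alpha * p + (1.0 - alpha) * o for p, o in zip(pixels, out)]
    return out


def render(sources, input_adapters, alphas, output_adapter):
    """Adapt each source, composite the results, then adapt the output."""
    adapted = [
        [adapt(p) for p in pixels]
        for pixels, adapt in zip(sources, input_adapters)
    ]
    mixed = composite(list(zip(adapted, alphas)))
    return [output_adapter(p) for p in mixed]
```

For instance, a video layer and a half-transparent graphics layer can each be given their own input adapter, while a single output adapter limits the composited result to what the display accepts.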

In one or more implementations, the electronic device 920 may process the video data by performing tone mapping that is static tone mapping, which maps one or more input color values of the video data to one or more respective output color values for the display. The one or more input color values are scene-referred or display-referred. In one or more implementations, the tone mapping includes converting a format of the video data to a standard signal format. In one or more implementations, the conversion of the format of the video data to the standard signal format is performed using dynamic metadata. In one or more implementations, the tone mapping includes adjusting a brightness level of the video data based on picture characteristics. In one or more implementations, the video data processed for output is pre-distorted to reduce the effects of a brightness-limiting knee function and of color compression in the display. In one or more implementations, the tone mapping includes providing an indication to the display unit that the video content has been adapted with respect to one or more video characteristics, the one or more video characteristics including at least one of primaries, a white point, or a display brightness profile.

In one or more implementations, the tone mapping includes providing mastering metadata that indicates at least one of mastering display primaries, peak luminance, or content characteristics that were used in creating the video data. In one or more implementations, the mastering metadata is used to determine a target brightness level for the composition of graphics and/or video. In one or more implementations, the content characteristics within the mastering metadata are modified to correspond to the content characteristics of the result of combining the video data.

In one or more implementations, the electronic device 920 may process the video data by performing tone mapping that is dynamic tone mapping, which uses dynamic metadata to map one or more input color values of the video data to one or more respective output color values for the display. In one or more implementations, the tone mapping includes providing dynamic metadata from the video data to the display. In one or more implementations, the dynamic metadata is modified to account for characteristics of composited video data, and/or the dynamic metadata is modified before being sent to the display unit. In one or more implementations, the dynamic metadata is modified to indicate to the display unit that it should apply an alternative tone mapping to regions of the video data in which video content or graphics content is present.

In one or more implementations, processing the video data includes computing statistics on the video sources and generating values for the dynamic metadata based on the computed statistics. In one or more implementations, the statistics are computed before the video data is combined. In one or more implementations, the statistics are computed on the output video data before it is provided to the display unit. In one or more implementations, the statistics are computed on the output video in order to be applied to subsequent output video data.
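As an illustration only, the statistics-to-metadata step described above might be sketched as follows. The sketch assumes pixels are normalized (R, G, B) tuples and uses the per-pixel quantity MAX(a*R, b*G, c*B) named in the claims; the function names, the histogram bin count, and the metadata field names `scene_max` and `scene_average` are hypothetical and are not taken from the description or from any standard.

```python
from statistics import mean


def max_rgb(pixel, wr=1.0, wg=1.0, wb=1.0):
    """Per-pixel quantity MAX(a*R, b*G, c*B) with constant weights."""
    r, g, b = pixel
    return max(wr * r, wg * g, wb * b)


def frame_statistics(pixels, bins=4, peak=1.0):
    """Compute min, max, mean and a histogram of the per-pixel quantity."""
    values = [max_rgb(p) for p in pixels]
    histogram = [0] * bins
    for v in values:
        # Clamp the top edge so that v == peak falls into the last bin.
        index = min(int(v / peak * bins), bins - 1)
        histogram[index] += 1
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "histogram": histogram,
    }


def dynamic_metadata_from_stats(stats):
    """Populate hypothetical dynamic metadata fields from the statistics."""
    return {"scene_max": stats["max"], "scene_average": stats["mean"]}


pixels = [(0.1, 0.2, 0.3), (0.9, 0.5, 0.4), (0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
stats = frame_statistics(pixels)
metadata = dynamic_metadata_from_stats(stats)
```

The same routine could be run on each source before composition or on the composited output, which corresponds to the alternatives listed in the paragraph above.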

FIG. 10 is a block diagram illustrating an example system architecture of an STB 1000 in which the improved video rendering of the subject technology is implemented. The STB 1000 includes input content sources 1005, a video transport manager (VTM) 1040, a number of video decoders 1050 (1050-1 to 1050-n), a graphics processing unit (GPU) 1060, other input video interfaces (IVI) 1070, a video processor (VP) 1080, a video display interface (VDI) 1082, a compressed video encoder (CVE) 1084, and other video interfaces (VI) 1086. The input content sources 1005 include, for example, an analog front end (AFE) 1010 (for example, including one or more tuners and/or one or more demodulators) that receives input content 1002, a memory 1020 (for example, a hard disk drive (HDD), flash memory, or other memory types), and a network interface 1030 (for example, Ethernet, WiFi, and/or other networks) that receives input content 1004. The VTM 1040 is responsible for transferring the input video content from the input content sources 1005 and manages its transfer to the video decoders 1050 for decoding. The video processor 1080 receives video and graphics content from the video decoders 1050, the GPU 1060, and the IVI 1070. The IVI 1070 includes input video interfaces such as HDMI (High-Definition Multimedia Interface). The processing of the video content described above with reference to FIGS. 1 to 6B is performed by the VP 1080. The processing steps performed by the VP 1080 include static and dynamic tone mapping, as mentioned above and described in more detail herein.

In some scenarios, a display unit can be placed in a mode in which it performs static tone mapping, that is, a particular input color value is mapped to a specific output color value on the display unit. The rendering of the output color may be uniform, or it may differ on certain types of display units (for example, televisions) because of dynamic contrast, hue/saturation/color adjustment or stretching, dynamic backlight adjustment, and other adjustments. The input color value may be scene-referred or display-referred. In one or more embodiments, an STB (for example, 1000) converts video and graphics to a standard signal format (for example, according to ITU-R Recommendation BT.709, BT.2100 Perceptual Quantization, BT.2100 Hybrid Log-Gamma, etc.). In these cases, a CVLA (for example, 120 of FIG. 1) involving color transformation, luminance remapping, and/or saturation adjustment can be applied to the video content if that video content is not available in the selected standard format. Some types of graphics (for example, graphics with limited dynamic range, subtitles, closed captions, etc.) and some types of video content (for example, SDR video) can be adjusted to appear brighter or darker so that they can be viewed comfortably after being composited with HB video content, low-brightness video content, and/or HDR video content. This adjustment can be made in response to either global or local image characteristics, such as average or peak brightness or luminance. In some embodiments, the output video is pre-distorted to reduce the effects of the brightness-limiting knee function and the color compression of the display, if known.
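A minimal sketch of two ideas from this paragraph follows: a static tone map with a brightness-limiting knee, and a gain that makes graphics brighter or darker in response to a global image characteristic. The piecewise-linear knee shape, the default knee point and peaks, and the clamping limits are assumptions chosen for illustration; the description does not specify them.

```python
def knee_tone_map(value, knee=0.75, in_peak=4.0, out_peak=1.0):
    """Static tone map: identity below the knee, linear compression above.

    The same input always yields the same output, which is what makes the
    mapping static. Values above in_peak are clipped to out_peak.
    """
    if value <= knee:
        return value
    slope = (out_peak - knee) / (in_peak - knee)
    return min(out_peak, knee + (value - knee) * slope)


def graphics_gain(video_mean, graphics_mean, lowest=0.5, highest=2.0):
    """Gain that moves the graphics mean toward the video mean.

    The gain is clamped so that graphics are never scaled by more than the
    assumed limits, whatever the video brightness is.
    """
    if graphics_mean <= 0:
        return 1.0
    return max(lowest, min(highest, video_mean / graphics_mean))


mapped_low = knee_tone_map(0.5)      # below the knee: unchanged
mapped_peak = knee_tone_map(4.0)     # input peak maps to output peak
mapped_over = knee_tone_map(10.0)    # beyond the input peak: clipped
gain_bright = graphics_gain(0.9, 0.3)  # bright video: gain limited to 2.0
gain_dark = graphics_gain(0.1, 0.4)    # dark video: gain limited to 0.5
```

Pre-distorting the output for a display with a known knee would amount to applying the inverse of such a curve before transmission, which is only possible where the curve is invertible (here, below the clipping point).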

In one or more embodiments, the VP 1080 provides a display, via the VDI 1082, with mastering metadata (for example, the metadata defined in SMPTE ST.2086, the MaxCLL and/or MaxFALL metadata defined in Annex P of the CTA 861-G standard, etc.) that indicates the mastering display primaries, peak luminance, content characteristics, etc. that were used in creating the video content. CVLA can be used to adapt other graphics and/or video sources to the color volume of the mastering display. In some embodiments, the mastering metadata is modified by the VP 1080 (for example, to a color volume that circumscribes all of the color volumes of the composited sources, to the expected peak brightness of the display unit, etc.), and the CVLA is applied as needed to any composition of graphics and/or video. In one or more embodiments, the mastering metadata is used to determine a target brightness level for the composition of graphics and/or video. In some embodiments, metadata describing the content characteristics is modified so that it corresponds to the content characteristics of the result of the compositing operation in the VP 1080.
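The last point, updating content-characteristic metadata so that it matches the composited result, can be illustrated with a short sketch. It assumes each frame is given as a flat list of per-pixel light levels in cd/m², takes MaxCLL as the largest pixel light level over all frames and MaxFALL as the largest frame-average light level, and uses hypothetical dictionary keys; it is not the procedure of the VP 1080.

```python
def content_light_levels(frames):
    """Return (max_cll, max_fall) for a sequence of frames.

    Each frame is a non-empty list of per-pixel light levels in cd/m^2.
    """
    max_cll = max(max(frame) for frame in frames)
    max_fall = max(sum(frame) / len(frame) for frame in frames)
    return max_cll, max_fall


def update_content_metadata(mastering_metadata, composited_frames):
    """Copy the metadata and overwrite the content characteristics.

    The key names 'max_cll' and 'max_fall' are placeholders chosen for this
    sketch. Mastering display primaries and other fields are left untouched.
    """
    max_cll, max_fall = content_light_levels(composited_frames)
    updated = dict(mastering_metadata)
    updated["max_cll"] = max_cll
    updated["max_fall"] = max_fall
    return updated


source_metadata = {"peak_luminance": 1000, "max_cll": 800, "max_fall": 300}
composited = [[100, 200, 300], [1000, 50, 150]]
updated_metadata = update_content_metadata(source_metadata, composited)
```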

In one or more embodiments, the VP 1080 provides the display unit with an indication that the video content has been adapted to the primaries indicated in the display unit's capabilities (for example, according to VESA EDID, according to VESA DisplayID, by retrieving capability information from a database of display units, etc.). This indication is not yet standardized, but the signaling method may be standardized in the future or may be vendor-specific. The CVLA can be used to adapt graphics and/or video sources to this color volume.

In one or more embodiments, the VP 1080 provides the display unit with an indication that the video content has been adapted to the DCI-P3 primaries and the DCI-P3 white point (for example, according to the SMPTE RP 431-2 standard). The CVLA can be used to adapt graphics and/or video sources to this color volume. In one or more embodiments, the VP 1080 provides the display unit with an indication that the video content has been adapted to the display brightness profile. The display brightness profile or the peak brightness can be signaled to the STB using a standards-based signaling method or a vendor-specific implementation. The brightness profile or peak brightness may be a scalar, or it may consist of multiple quantities (for example, a function or an indication of how the peak brightness varies with the video content characteristics).

In one or more embodiments, the VP 1080 uses dynamic metadata (for example, metadata as defined in SMPTE ST.2094-10, ST.2094-20, ST.2094-30, or ST.2094-40; global tone mapping or luminance adjustment; local tone mapping or luminance adjustment; global or local color saturation adjustment; etc.) to convert video content to one of the static formats listed above. The CVLA can be used to perform this conversion.

In some embodiments, the VP 1080 composites video and/or graphics sources in one of the static formats listed above. In one or more embodiments, the VP 1080 composites the video and/or graphics in a different common static format, and the CVLA can be used to convert to the output format.

In some scenarios, output to the display unit using dynamic metadata may be desirable. The appearance of a composition of video and graphics can be improved if the VP 1080 has some knowledge of how a display device might use this dynamic metadata to map color values to the display. Some dynamic metadata is global and can affect all pixels in the raster equally, and some dynamic metadata is local and can affect only some of the pixels in the raster.

In one or more embodiments, the VP 1080 provides the display unit with dynamic metadata (for example, as defined in CTA 861-G) from the video content. The STB can adjust composited graphics and video content using CVLA or brightness adjustment so that it has a comfortable viewing quality. This adjustment can be made in response to global or local image characteristics, such as average or peak brightness or luminance. This adjustment can also be made in response to one or more values of dynamic metadata elements.

In some embodiments, the VP 1080 can provide the display device with dynamic metadata (for example, as defined in CTA 861-G) that is based on the video content but has been modified to account for the characteristics of the composition of graphics and video. In one or more embodiments, the VP 1080 can alter the dynamic metadata before it is sent to the display unit in order to apply a different or a static tone mapping to the entire video raster. The VP 1080 can use the dynamic source metadata to convert the source video to a known format, and it can use the CVLA to adapt the result to the anticipated modified dynamic or static tone mapping. In one or more embodiments, the VP 1080 alters the dynamic metadata to instruct the display unit to apply an alternative tone mapping only to the regions in which video and/or graphics are located. In one or more embodiments, the VP 1080 can adjust composited graphics and video content using CVLA or brightness adjustment so that it has a comfortable viewing quality. This adjustment can be based on global or local pixel statistics or on the incoming dynamic metadata. In one or more embodiments, the VP 1080 can convert all composited video and graphics sources to a common format and use the CVLA (and possibly other processing) to reverse the anticipated effect of the dynamic metadata, which may originate wholly or partly from the source video or may be a different set of dynamic metadata for which the effects are known.

In some embodiments, the VP 1080 receives primary video content that has associated dynamic metadata. The inverse function of the anticipated effect of the dynamic metadata can be determined. The VP 1080 can apply the inverse function to graphics, to video without dynamic metadata, and/or to video with dynamic metadata to which the dynamic metadata has already been applied. The graphics and/or video processed with the inverse function can be composited with the primary video, optionally using alpha blending. In one or more embodiments, the VP 1080 computes statistics on the video and/or graphics sources and uses these statistics to populate the values of the dynamic metadata.
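The inverse-function approach can be sketched as follows. The sketch assumes, purely for illustration, that the display's anticipated response to the dynamic metadata is a gain followed by a power function; real display processing is typically not known this precisely, and all names and parameter values here are hypothetical.

```python
def anticipated_display_map(value, gain=0.8, gamma=1.2):
    """Assumed effect of the dynamic metadata in the display."""
    return gain * value ** gamma


def inverse_display_map(value, gain=0.8, gamma=1.2):
    """Inverse of the assumed display mapping, applied before composition."""
    return (value / gain) ** (1.0 / gamma)


def alpha_blend(foreground, background, alpha):
    """Blend two sample values; alpha = 1.0 shows only the foreground."""
    return alpha * foreground + (1.0 - alpha) * background


def compose_graphics_over_video(graphics, video, alpha):
    """Pre-compensate the graphics, then blend them over the primary video."""
    return alpha_blend(inverse_display_map(graphics), video, alpha)


# Where graphics are opaque, the display mapping cancels the inverse and the
# graphics are shown at approximately their intended level.
intended = 0.4
shown = anticipated_display_map(compose_graphics_over_video(intended, 0.7, 1.0))

# Where graphics are fully transparent, the primary video is left unchanged.
passthrough = compose_graphics_over_video(intended, 0.7, 0.0)
```

With partial transparency the cancellation is only approximate, because the display mapping is nonlinear and is applied to the blended value rather than to each layer. The inverse can also push values above the nominal signal range, so a practical implementation would need headroom or clipping.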

FIG. 11 is a block diagram illustrating an example of an STB 1100 with improved video rendering using input and statistics-based dynamic metadata according to aspects of the subject technology. The STB 1100 is similar to the STB 300 of FIG. 3, except for additional metadata inputs, for example a metadata input from the input video content 102-m to the processor 350, and control outputs from the processor 350 to the input CVLA blocks 120. The STB 1100 summarizes the general case in which a combination of metadata and statistics from more than one input video and graphics stream is used by the processor 350 to determine the respective configuration for each input CVLA block 120 and for the output CVLA block 140. The processor 350 can also send an aggregate metadata element 1105 to the metadata formatting block 360. The formatted metadata is finally output together with the output video as combined output 1142.

In some embodiments, the configuration of the STB 1100 makes it possible to select a common composition domain on the basis of all available information from the input video streams and graphics streams and, if the composition domain differs from the output domain, to separately adapt the output to the output domain. Depending on the selected composition domain, each input video stream or graphics stream may require a different configuration of the CVLA block, which is dynamically influenced by metadata (if present) and by the computed statistics.
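One way to picture this selection is the sketch below. It assumes a simple policy in which the composition domain is the most capable format among the inputs; this policy, the format labels, the ranking, and every name in the code are hypothetical, since the description leaves the selection criteria open.

```python
from dataclasses import dataclass

# Assumed ordering of formats from least to most capable (illustrative only).
FORMAT_RANK = {"SDR_BT709": 0, "HLG_BT2100": 1, "PQ_BT2100": 2}


@dataclass
class InputStream:
    name: str
    fmt: str
    has_dynamic_metadata: bool = False


def choose_composition_domain(inputs):
    """Pick the highest-ranked input format as the composition domain."""
    return max((stream.fmt for stream in inputs), key=FORMAT_RANK.__getitem__)


def plan_cvla(inputs, output_domain):
    """Derive a per-input CVLA configuration and the output adaptation flag."""
    domain = choose_composition_domain(inputs)
    per_input = {
        stream.name: {
            # An input already in the composition domain needs no conversion.
            "convert": stream.fmt != domain,
            # Metadata, when present, influences the block's configuration.
            "use_metadata": stream.has_dynamic_metadata,
        }
        for stream in inputs
    }
    return {
        "composition_domain": domain,
        "inputs": per_input,
        # The output is adapted separately only if the domains differ.
        "adapt_output": domain != output_domain,
    }


streams = [
    InputStream("main_video", "PQ_BT2100", has_dynamic_metadata=True),
    InputStream("pip_video", "SDR_BT709"),
    InputStream("graphics", "SDR_BT709"),
]
plan = plan_cvla(streams, output_domain="HLG_BT2100")
```

In this example the main video passes through unconverted, the SDR sources are each converted into the composition domain, and the composited result is then adapted once to the output domain.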

The predicate phrases "configured to", "operable to", and "programmed to" do not imply any particular tangible or intangible modification of a subject, but rather are intended to be used interchangeably. For example, the statement that a processor is configured to monitor and control an operation or a component may also mean that the processor is programmed to monitor and control the operation, or that the processor is operable to monitor and control the operation. Likewise, the statement that a processor is configured to execute code can be construed as meaning that the processor is programmed to execute code or is operable to execute code.

Claims

An apparatus comprising: a plurality of processing channels configured to receive a plurality of input contents and to process the plurality of input contents; a compositor configured to compose a plurality of processed input contents into a composite output signal; and an output adjustment block configured to adjust the composite output signal, wherein at least one processing channel of the plurality of processing channels comprises a statistics generator and an input adaptation block.

The apparatus of claim 1, wherein the plurality of input contents comprises input video content and graphics content, and wherein the apparatus comprises a display device or a set-top box communicatively coupled to a display device.

The apparatus of claim 1, wherein the statistics generator is configured to compute statistics on a respective input content of the plurality of input contents, the statistics being at least one of a histogram, a binned histogram, a 2D histogram, a 3D histogram, a minimum, a maximum, a sum, or an average of one or more quantities.

The apparatus of claim 3, wherein the one or more quantities are luminance values, values of the red (R), green (G), and blue (B) components, a value of MAX(a*R, b*G, c*B), or a value of SUM(d*R, e*G, f*B) for each pixel of the respective input content, where a, b, c, d, e, and f are constant values and MAX and SUM represent a maximum function and a sum function, respectively.

The apparatus of claim 1, wherein the statistics generator is configured to compute statistics based on a luma or color-difference representation.

The apparatus of claim 1, wherein the input adaptation block comprises a color volume and luminance adaptation (CVLA) block, and wherein CVLA blocks of different processing channels of the plurality of processing channels are configured differently.

The apparatus of claim 6, wherein the CVLA block comprises a plurality of processor-controlled hardware (HW) modules for nonlinearity and color space transformation, and wherein the CVLA block is configured to perform volume transformation and static and dynamic tone mapping.

The apparatus of claim 7, wherein the dynamic tone mapping is performed based on dynamic metadata included in one or more of the plurality of input contents, or based on an analysis of possible changes to scene parameters of the plurality of input contents.

A method for improved rendering of video content from a plurality of sources, the method comprising: converting the video content from the plurality of sources, using dynamic metadata, into one of a composition domain or an output domain; and incorporating an inverse function of an anticipated display processing of dynamic metadata into at least one of the composition domain or the output domain, wherein using the dynamic metadata comprises directly using or modifying dynamic metadata from one or more selected sources of the plurality of sources.

A set-top box (STB) comprising: a processor; and a video processing block configured to receive video content from a plurality of sources, the video processing block comprising: a plurality of input processing channels configured to process the received video content; a compositor configured to compose a plurality of processed input contents to generate a composite output signal; and an output adjustment block configured to adjust the composite output signal for an output device; wherein the processor is configured to receive dynamic metadata in an adjusted composite output signal of the output adjustment block.