CN110073667B

CN110073667B - Display device, content recognition method, and non-transitory computer-readable recording medium

Info

Publication number: CN110073667B
Application number: CN201780076820.0A
Authority: CN
Inventors: 吕海东
Original assignee: Samsung Electronics Co Ltd
Current assignee: Samsung Electronics Co Ltd
Priority date: 2016-12-21
Filing date: 2017-12-20
Publication date: 2021-07-13
Anticipated expiration: 2037-12-20
Also published as: CN110073667A; KR102468258B1; KR20180072522A

Abstract

The invention provides a display device, a content recognition method thereof, and a non-transitory computer-readable recording medium. The display device includes: a display configured to store a memory storing information on a fingerprint generated by extracting a characteristic of content, and content corresponding to the fingerprint; a communication device configured to communicate with a server; and at least one processor configured to extract a characteristic of a content screen currently being reproduced on the display and generate a fingerprint, to search for the presence/absence of a fingerprint matching the generated fingerprint in a memory, and to determine whether to transmit a query including the generated fingerprint to a server to request information on the currently reproduced content, based on a result of the search.

Description

Display device, content recognition method, and non-transitory computer-readable recording medium

Technical Field

The present disclosure relates to a display device, a content recognition method thereof, and at least one non-transitory computer-readable recording medium. More particularly, the present disclosure relates to a display device capable of effectively recognizing content viewed by a user, a content recognition method thereof, and a non-transitory computer-readable recording medium.

In addition, apparatus and methods consistent with various embodiments relate to Artificial Intelligence (AI) systems that simulate human brain functions such as recognition, determination, etc., for example, by utilizing machine learning algorithms such as deep learning and techniques applied thereto.

Background

In recent years, display apparatuses such as Televisions (TVs) increasingly use set-top boxes instead of directly receiving broadcast signals. In this case, the display device cannot know what content the user is currently viewing.

If the display device knows what content the user is currently viewing, intelligent services such as targeted advertising, content recommendations, related information services can be provided. For this reason, Automatic Content Recognition (ACR) has been developed as a technique for recognizing currently displayed content at a display device.

In the related art method, the display device periodically captures a screen currently being viewed, extracts a characteristic for identifying the screen, and periodically identifies the current screen by querying the request server.

However, the display device has no choice but to frequently send a query to the server to quickly detect a change in the content being viewed, and thus, the ACR requires many resources and a large cost.

As computer technology has developed and data traffic has increased in the form of exponential functions, Artificial Intelligence (AI) has become an important trend leading to future innovations. AI can be applied indefinitely to all industries due to the way it mimics human thinking.

The AI system refers to a computer system that implements high intelligence as human intelligence, and is a system that makes a machine learn and determine itself and become smarter, unlike existing rule-based intelligent systems. AI systems can improve recognition rates when used and can accurately understand a user's tastes, so existing rule-based intelligent systems are increasingly being replaced by deep learning-based AI systems.

AI techniques include machine learning (e.g., deep learning) and component techniques that use machine learning.

Machine learning is an algorithmic technique of classifying/learning features of input data by itself, and an element technique is a technique for simulating human brain functions such as recognition, determination, and the like by using a machine learning algorithm such as deep learning, and may include technical fields such as language understanding, visual understanding, inference/prediction, knowledge representation, operation control, and the like.

Various fields to which the AI technique is applied are as follows. Language understanding is a technique for recognizing human languages/characters and applying/processing them and may include natural language processing, machine translation, dialog systems, questions and answers, speech recognition/synthesis. Visual understanding is a technique for recognizing things in the same way as humans use eyes, and may include object recognition, object tracking, image search, character recognition, scene understanding, spatial understanding, and image enhancement. Inference/prediction is a technique for logically inferring and predicting by determining information, and may include knowledge/probability based inference, optimized prediction, preference based planning, recommendation, and the like. Knowledge representation is a technique for automating human experience information into knowledge data, and may include knowledge building (data generation/classification), knowledge management (data utilization), and the like. The operation control is a technique for controlling the automatic driving of the vehicle and the movement of the robot, and may include movement control (navigation, collision, driving), steering control (behavior control), and the like.

The above information is presented merely as background information to aid in understanding the present disclosure. No determination is made and no assertion is made as to whether any of the above applies to the prior art with respect to the present disclosure.

Disclosure of Invention

Aspects of the present disclosure are directed to solving at least the above problems and/or disadvantages and to providing at least the advantages described below. Accordingly, it is an aspect of the present disclosure to provide a display apparatus capable of adjusting a content recognition period using information of content, a content recognition method thereof, and a non-transitory computer-readable recording medium.

According to an aspect of the present disclosure, there is provided a display device. The display device includes a display; a memory configured to store a fingerprint generated by extracting a characteristic of content and information on the content corresponding to the fingerprint; a communication device configured to communicate with a server; and at least one processor configured to extract characteristics of a content screen currently being reproduced on the display and generate a fingerprint, to search for the presence/absence of a fingerprint matching the generated fingerprint in the memory, and to determine whether to transmit a query including the generated fingerprint to the server to request information on the currently reproduced content, based on a result of the search.

The at least one processor may be configured to, in response to searching for a fingerprint matching the generated fingerprint in the memory, identify currently-reproduced content based on information about content corresponding to the searched fingerprint, and control the communication device to transmit a query including the fingerprint to the server to request information about the currently-reproduced content in response to not searching for a fingerprint matching the generated fingerprint in the memory.

The at least one processor may be configured to control the communication device to receive information about the currently reproduced content and a fingerprint of the currently reproduced content from the server in response to the query, in response to not searching for a fingerprint in the memory that matches the generated fingerprint.

The at least one processor may be configured to determine a type of the content based on the information on the content currently being reproduced, and change the content identification period according to the determined type of the content.

Additionally, the at least one processor may be configured to identify content in each first time period in response to the content being advertising content and identify content in each second time period that is longer than the first time period in response to the content being broadcast program content.

The at least one processor may be configured to determine a type of content based on the information on the currently reproduced content, and change the number of fingerprints of the currently reproduced content to be received according to the determined type of content.

The at least one processor may be configured to calculate a probability that the reproduced content is changed based on the information on the currently reproduced content and the viewing history, and change the content recognition period according to the calculated probability.

The at least one processor may be configured to predict content to be reproduced next based on the viewing history, and request information about the predicted content from the server.

The at least one processor may be configured to receive additional information related to the currently reproduced content from the server and control the display to display the received additional information with the currently reproduced content.

According to another aspect of the present disclosure, a method for identifying content of a display device is provided. The method includes extracting characteristics of a currently reproduced content screen and generating a fingerprint, searching whether a fingerprint matching the generated fingerprint is stored in a display device, and determining whether to transmit a query including the generated fingerprint to an external server to request information on the currently reproduced content based on a result of the searching.

Determining whether to transmit the query to the external server may include identifying the currently reproduced content based on information of the content corresponding to the searched fingerprint in response to searching for a fingerprint matching the generated fingerprint in the display device, and transmitting the query including the fingerprint to the server to request information on the currently reproduced content in response to not searching for a fingerprint matching the generated fingerprint in the display device.

In addition, the method may further include receiving information on the currently reproduced content and a fingerprint of the currently reproduced content from a server in response to the query in response to a fingerprint matching the generated fingerprint not being searched in the display apparatus.

The method may further include determining a type of the content based on the information on the currently reproduced content, and changing the content recognition period according to the determined type of the content.

Changing the content identification period may include identifying the content in each first period in response to the content being advertising content, and identifying the content in each second period that is longer than the first period in response to the content being broadcast program content.

The method may further include determining a type of the content based on the information on the currently reproduced content, and changing the number of fingerprints of the currently reproduced content to be received according to the determined type of the content.

The method may further include calculating a probability that the reproduction content is changed based on the information on the currently reproduced content and the viewing history, and changing the content recognition period according to the calculated probability.

The method may further include predicting content to be reproduced next based on the viewing history, and requesting information about the predicted content from the server.

The method may further include receiving additional information related to the currently reproduced content from the server, and displaying the received additional information and the currently reproduced content.

According to another aspect of the present disclosure, at least one non-transitory computer-readable recording medium is provided. The at least one non-transitory computer-readable recording medium includes a program for executing a method for identifying content of a display device, the method including extracting characteristics of a screen of currently reproduced content and generating a fingerprint, searching whether a fingerprint matching the generated fingerprint is stored in the display device, and determining whether to transmit a query including the generated fingerprint to an external server to request information on the currently reproduced content based on a result of the searching.

According to the various embodiments described above, the display apparatus dynamically adjusts the ratio and the operation period between the server Automatic Content Recognition (ACR) and the local ACR, thereby reducing the load on the server and improving the accuracy of recognizing the content.

Other aspects, advantages, and salient features of the disclosure will become apparent to those skilled in the art from the following detailed description, which, taken in conjunction with the annexed drawings, discloses various embodiments of the disclosure.

Drawings

The above and other aspects, features and advantages of certain embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings, in which:

FIG. 1 illustrates a display system according to an embodiment of the present disclosure;

fig. 2a and 2b are schematic block diagrams illustrating a configuration of a display device according to an embodiment of the present disclosure;

FIG. 3 is a block diagram of a processor according to an embodiment of the present disclosure;

FIG. 4a is a block diagram of a data learning unit according to an embodiment of the present disclosure;

FIG. 4b is a block diagram of a data identification unit according to an embodiment of the present disclosure;

fig. 5 is a block diagram illustrating a configuration of a display device according to an embodiment of the present disclosure;

fig. 6 is a view illustrating hybrid Automatic Content Recognition (ACR) according to an embodiment of the present disclosure;

FIG. 7 is a diagram illustrating fingerprint information having different granularities, according to an embodiment of the present disclosure;

fig. 8 is a view illustrating viewing history information according to an embodiment of the present disclosure;

fig. 9 is a view illustrating display of additional information on content according to an embodiment of the present disclosure;

fig. 10, 11, 12a, 12b, 13a, 13b, 14a, 14b, 15a, and 15b are flowcharts illustrating a content recognition method of a display device according to various embodiments of the present disclosure.

Fig. 16 is a view illustrating data learned and recognized by a display device and a server interlocked with each other according to an embodiment of the present disclosure;

fig. 17 is a flowchart illustrating a content recognition method of a display system according to an embodiment of the present disclosure;

fig. 18 is a flowchart illustrating a content recognition method of a display system according to an embodiment of the present disclosure;

fig. 19 is a view showing a case in which a display apparatus changes a content recognition period according to a probability of changing content by interlocking with a server according to an embodiment of the present disclosure;

fig. 20 is a view illustrating a method in which a display device predicts content to be reproduced next time and receives information on the predicted content in advance by interlocking with a server according to an embodiment of the present disclosure; and

fig. 21 is a view illustrating a method in which a display device predicts content to be reproduced next time and receives information on the predicted content in advance by interlocking with a plurality of servers according to an embodiment of the present disclosure.

Throughout the drawings, like reference numerals will be understood to refer to like parts, components and structures.

Detailed Description

The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of various embodiments of the disclosure as defined by the claims and their equivalents. It includes various specific details to aid understanding, but these are to be regarded as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the various embodiments described herein can be made without departing from the scope and spirit of the present disclosure. In addition, descriptions of well-known functions and constructions may be omitted for clarity and conciseness.

The terms and words used in the following description and claims are not limited to the written meaning, but are used only by the inventors to enable a clear and consistent understanding of the disclosure. Accordingly, it will be apparent to those skilled in the art that the following descriptions of the various embodiments of the present disclosure are provided for illustration only and not for the purpose of limiting the disclosure as defined by the appended claims and their equivalents.

It is understood that the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to a "component surface" includes reference to one or more such surfaces.

The term "substantially" means that the feature, parameter, or value does not need to be achieved exactly, but includes deviations or variations from, for example, tolerances, measurement error, measurement accuracy limitations, and other factors known to those of skill in the art, which may occur in amounts that do not preclude the effect that the feature is intended to provide.

Terms such as "first" and "second" as used in various embodiments may be used to describe various elements but do not limit the corresponding elements. These terms may be used for the purpose of distinguishing one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. The term "and/or" includes a combination of multiple related items or any one of multiple related items.

The terminology used in the various embodiments of the disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting and/or limiting of the disclosure. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises" or "comprising" mean the presence of the features, numbers, operations, elements, and components, or combinations thereof, described in the specification, and do not preclude the presence or addition of one or more other features, numbers, operations, elements, or components, or groups thereof.

In addition, a "module" or a "unit" used in the embodiments performs one or more functions or operations, and may be implemented by using hardware or software or a combination of hardware and software. In addition, in addition to "modules" or "units" that need to be implemented by specific hardware, a plurality of "modules" or a plurality of "units" may be integrated into one or more modules and may be implemented as one or more processors.

Hereinafter, the present disclosure will be described below with reference to the accompanying drawings.

Fig. 1 illustrates a display system according to an embodiment of the present disclosure.

Referring to fig. 1, a display system 1000 includes a display apparatus 100 and a server 200.

The display apparatus 100 may be a smart Television (TV), but this is just an example, and the display apparatus 100 may be implemented by using various types of apparatuses, such as a projection television, a monitor, a kiosk, a notebook, a computer (PC), a tablet, a smart phone, a Personal Digital Assistant (PDA), an electronic photo frame, a desktop display, and the like.

The display apparatus 100 may extract characteristics from a content screen currently being reproduced, and may generate a fingerprint. In addition, the display apparatus 100 may perform local Automatic Content Recognition (ACR) by searching for the generated fingerprint in a fingerprint database stored in the display apparatus 100 and recognize the content currently being reproduced, and may perform server ACR by transmitting a query including the generated fingerprint to the server 200 and recognizing the content. More specifically, the display apparatus 100 can appropriately adjust the ratio of operations between the local ACR and the server ACR by adjusting a content identification period, the number of fingerprints to be received from the server 200, and the like using the identified content information, viewing history, and the like.

The server 200 may be implemented by using a device capable of transmitting information including identification (or Identification (ID)) information for distinguishing a specific image from other images. For example, the server 200 may transmit the fingerprint to the display device 100. A fingerprint is an identification information that can distinguish an image from other images.

Specifically, the fingerprint is characteristic data extracted from video and audio signals included in a frame. Unlike text-based metadata, fingerprints may reflect unique characteristics of a signal. For example, when an audio signal is included in a frame, a fingerprint may be data representing characteristics of the audio signal, such as frequency, amplitude, and the like. When a video (or still image) signal is included in a frame, a fingerprint may be data representing a feature, such as a motion vector, a color, and the like.

Fingerprints may be extracted by various algorithms. For example, the display apparatus 100 or the server 200 may divide the audio signal according to regular time intervals, and may calculate the magnitude of the signal of the frequency included in each time interval. In addition, the display apparatus 100 or the server 200 may calculate the frequency slope by obtaining a magnitude difference between signals of adjacent frequency intervals. The fingerprint on the audio signal may be generated by setting 1 when the calculated frequency slope is a positive value and setting 0 when the calculated frequency slope is a negative value.

The server 200 may store the fingerprint on a particular image. The server 200 may store one or more fingerprints on the already registered images, and when two or more fingerprints regarding a specific image are stored, the fingerprints may be managed as a list of fingerprints for the specific image.

The term "fingerprint" used in the present disclosure may refer to one fingerprint on a specific image or, as the case may be, a fingerprint list formed of a plurality of fingerprints on a specific image.

The term "frame" as used in this disclosure refers to a series of data having information about audio or images. The frame may be data regarding audio or image during a predetermined time. In the case of digital image content, a frame may be formed of 30-60 image data per second, and these 30-60 image data may be referred to as a frame. For example, when a current frame and a next frame of image content are used together, the frame may refer to respective image screens included in the content and displayed continuously.

Although fig. 1 depicts the display system 1000 including one display apparatus 100 and one server 200, a plurality of display apparatuses 100 may be connected to one server 200, or a plurality of servers 200 may be connected to one display apparatus 100. Other combinations are also possible.

Fig. 2a is a block diagram illustrating a configuration of a display device according to an embodiment of the present disclosure.

Referring to fig. 2a, the display device 100 may include a display 110, a memory 120, a communication unit 130, and a processor 140.

The display 110 may display various image contents, information, a User Interface (UI), and the like provided by the display apparatus 100. For example, the display apparatus 100 may display image content, broadcast program images, user interface windows provided by a set-top box (not shown).

The memory 120 may store various modules, software, and data for driving the display apparatus 100. For example, the memory 120 may store a plurality of fingerprints, information about content corresponding to the fingerprints, viewing history information, and the like. The fingerprint stored in the memory 120 may be a fingerprint generated by the display apparatus 100 itself, or may be a fingerprint received from the server 200. The memory 120 may attach index information for the local ACR to the fingerprint and store the fingerprint.

In addition, when the display apparatus 100 reproduces the contents stored in the memory 120, the fingerprint may be paired with the corresponding contents and may be stored in the memory 120. For example, a fingerprint may be added to each frame of memory 120 and may be stored in the form of a new file that combines the content and the fingerprint. In another example, the fingerprint may also include information mapped to corresponding frames of the content.

The communication unit 130 may communicate with an external device such as the server 200 using a wired/wireless communication method. For example, the communication unit 130 may exchange data such as a fingerprint, content information, viewing history information, additional information related to the content, and a control signal (such as an identification period change control signal) with the server 200.

The processor 140 may identify content currently being reproduced, and may control to perform ACR with appropriate accuracy based on the identification result. For example, the processor 140 may adjust the content identification period based on the content information, and may determine the content to be previously received and stored from the server 200 and the number of fingerprints about the content.

The processor 140 may extract characteristics of the content screen currently being reproduced, and may generate a fingerprint. In addition, the processor 140 may search for whether there is a fingerprint matching the generated fingerprint from among the plurality of fingerprints stored in the memory 120. In addition, the processor 140 may transmit a query to the server 200 according to the search result, and may determine whether to attempt to execute the server ACR. For example, the processor 140 may first attempt to execute a local ACR in order to reduce the load on the server 200.

In response to searching for a fingerprint that matches the generated fingerprint, the processor 140 may identify the currently reproduced content based on information of the content corresponding to the searched fingerprint. For example, the information on the content corresponding to the fingerprint may include a position of the current frame in the total frame, a reproduction time, and the like, which is information on the current frame. In addition, the information on the content corresponding to the fingerprint may include at least one of a content name, a content ID, a content provider, content series information, a genre, information on whether the content is a real-time broadcast, and information on whether the content is a pay content.

On the other hand, in response to not searching for a fingerprint matching the generated fingerprint, the processor 140 may control the communication unit 130 to transmit a query requesting information about the currently reproduced content to the server 200. For example, the query may include a generated fingerprint, a viewing history, information about the display device 100, and the like.

In addition, the processor 140 may control the communication unit 130 to receive information on the currently reproduced content and a fingerprint of the currently reproduced content from the server 200 in response to the query. Here, the fingerprint of the content currently reproduced may be a fingerprint about a frame located after the current frame in the entire content. Since the processor 140 knows the time indicated by the position of the current frame in the entire content based on the fingerprint included in the query, the processor 140 may receive a fingerprint on a frame expected to be reproduced after the current frame from the server.

As described above, the processor 140 may identify content by appropriately combining the local ACR and the server ACR. By doing so, the processor 140 may identify the content currently being rendered on the display 110 while minimizing the load on the server 200.

In order to appropriately combine the local ACR and the server ACR, the processor 140 may determine a fingerprint received from the server 200 for the local ACR in advance based on at least one of the result of the content recognition and the viewing history, and may determine whether to change the content recognition period. For example, the processor 140 may determine which content to consider to receive a fingerprint, and how many fingerprints will be received at one time.

According to an embodiment of the present disclosure, the processor 140 may determine the type of the content based on information about the content currently being reproduced. In addition, the processor 140 may change the content recognition period according to the type of the content. The type of the content may be classified according to criteria such as the details of the content, the type, information on whether the content is broadcast in real time, importance, and the like.

For example, in response to the currently reproduced content being identified as advertising content, the processor 140 may adjust the content identification period to be short (e.g., to identify the currently displayed content screen in each frame). In response to the currently reproduced content being identified as movie content or broadcast program content, the processor 140 may adjust the content identification period to be long (e.g., to identify the currently displayed content screen every 30 seconds).

Each type of identification period may be a predetermined period. The various time periods may be personalized and set according to the various criteria and viewing history described above.

In the case of advertising content, the processor 140 needs to frequently identify the content because an advertisement is typically changed to another advertisement in a short time. In contrast, in the case of movie content, the processor 140 need not identify the content frequently, as it is only determined whether the movie is being viewed continuously.

In the above example, the types of the contents are classified according to the kinds of the contents. As in the example above, in the case of advertising content, the content may be identified in each frame, but may be identified infrequently according to other criteria such as importance or viewing history.

The number of frames received and stored in advance from the server 200 may vary according to the identified content type. The number of fingerprints received and stored in advance may also vary, since there are fingerprints corresponding to the respective frames. For example, in case of a Video On Demand (VOD) or a Digital Video Recorder (DVR), the server 200 may have all image information, but in case of a live broadcast, the server 200 may receive the image information several seconds before the display device 100. Let us take an example of a 60Hz image showing 60 frames per second. In the case of one hour of VOD, the server 200 may possess a fingerprint corresponding to about 200,000 frames, but in the case of live broadcast, the server 200 may possess only a fingerprint corresponding to about several hundred frames.

Accordingly, the processor 140 may determine the number of fingerprints to request based on the identified content. In addition, the processor 140 may change the content identification period based on the number of fingerprints received at one time.

According to an embodiment of the present disclosure, the processor 140 may change the content identification period based on the information on the identified content and the viewing history. The viewing history may include content that the user has viewed, viewing time, additional applications that have been run while viewing.

For example, the processor 140 may determine whether the currently reproduced content will be continuously reproduced or another content will be reproduced by comparing the currently reproduced content with the viewing history. In addition, the processor 140 may request information about a fingerprint corresponding to content expected to be reproduced next time from the server 200.

According to an embodiment of the present disclosure, the processor 140 may receive additional information related to the content in addition to the fingerprint of the content currently being reproduced and the fingerprint corresponding to the content expected to be reproduced next time. For example, the additional information may include a content name, a content reproduction time, a content provider, PPL product information appearing in the content, advertisements related to the PPL product information, and additional executable applications, etc.

In addition, the processor 140 may control the display 110 to display the received additional information and the currently reproduced content.

In the above example, the display apparatus 100 requests information of content such as a fingerprint from the server 200. However, the server 200 may transmit information (e.g., a fingerprint) required for the display apparatus 100 to the server 200 without receiving the request.

According to various embodiments of the present disclosure, the display apparatus 100 may estimate the type of content using a data recognition model. The data recognition model may be, for example, a set of algorithms that use statistical machine learning to estimate the type of content using information of the content and/or fingerprints generated from the content.

In addition, the display apparatus 100 may calculate a probability that the content is changed using the data recognition model. The data recognition model may be, for example, a set of algorithms for estimating a probability that reproducing the content is changed using information of the content (e.g., content reproduction time, content reproduction channel, type of content, etc.).

The data recognition model may be implemented by using software or an engine for running the set of algorithms. The data recognition model implemented using software or an engine may be run by a processor in the display device 100 or a processor of a server (e.g., server 200 in fig. 1).

According to an embodiment of the present disclosure, the server 200 may include a configuration of a general server device. For example, the server 200 may include a memory 210, a communication unit 220, a broadcast signal receiver 230, and a processor 240.

The server 200 may capture video/audio information for a plurality of contents. For example, the server 200 may collect images based on frames. For example, the server 200 may previously divide various contents into data of frame units, and may collect the data. In addition, the server 200 may generate a fingerprint by analyzing the collected frames. In another example, the server 200 may receive a broadcast signal from a broadcasting station and may capture video/audio information from the received signal. The server 200 may receive the broadcast signal before the display apparatus 100 receives it.

For example, the fingerprint generated by the server 200 may be information for distinguishing a screen and audio at a specific time. Alternatively, the fingerprint generated by the server 200 may include information on a scene change pattern, and may be information indicating what content is being continuously viewed. The server 200 may build a database in which the generated fingerprints and information corresponding to the content of the fingerprints are indexed to facilitate searching. For example, the information on the content corresponding to the fingerprint may include a position of the current frame in the entire content, a reproduction time, and the like. In addition, the information on the content corresponding to the fingerprint may include at least one of a content name, a content ID, a content provider, content series information, a genre, information on whether the content is a real-time broadcast, and information on whether the content is a pay content.

In response to receiving a query from the display device 100, the server 200 may extract at least one fingerprint from the query. In addition, the server 200 may receive information on the display device 100 that has transmitted the query.

The server 200 may match the extracted fingerprint with information stored in a database and may determine what content the display device 100 is currently viewing. The server 200 may transmit a response to the determined content information to the display device 100.

In addition, the server 200 may manage the viewing history of each of the display devices 100 using the information received on the display devices 100 and the determined content information. By doing so, the server 200 may provide a personalized service for each of the display devices 100.

The server 200 may predict content to be displayed next on the display device 100 using information of content currently displayed on the display device 100 and viewing history information. In addition, the server 200 may transmit the fingerprint extracted from the predicted content to the display device 100. For example, the extracted fingerprint may be a fingerprint corresponding to a frame following the current frame in the entire content. In another example, the extracted fingerprint may be a fingerprint on content of another broadcast channel predicted based on the viewing history information.

In addition, the server 200 may determine the content recognition period by analyzing the content image or using Electronic Program Guide (EPG) information. According to the determined content recognition period, the server 200 may determine the number of fingerprints required to perform the local ACR at the display device 100. The display apparatus 100 may generate a fingerprint by analyzing a currently displayed frame in each content recognition period. In addition, the display apparatus 100 may search for the generated fingerprint in a fingerprint database received and stored from the server 200. Accordingly, the server 200 may transmit only the fingerprint corresponding to the frame for the display apparatus 100 to identify the content. Since only the necessary number of fingerprints are transmitted, the server 200 can minimize the communication load even when the server ACR is executed.

Fig. 2b illustrates a configuration of a display device according to an embodiment of the present disclosure.

Referring to fig. 2b, the display apparatus 100 may include a first processor 140-1, a second processor 140-2, a display 110, a memory 120, and a communication unit 130. However, not all of the elements shown in the figures are essential elements.

The first processor 140-1 may control the operation of at least one application installed in the display apparatus 100. For example, the first processor 140-1 may generate a fingerprint by capturing an image displayed on the display 110, and may perform ACR. The first processor 140-1 may be implemented in the form of a system on chip (SoC) integrating functions of a Central Processing Unit (CPU), a Graphic Processing Unit (GPU), a communication chip, and a sensor. Alternatively, the first processor 140-1 may be an Application Processor (AP).

The second processor 140-2 may estimate the type of content using a data recognition model. The data recognition model may be, for example, a set of algorithms for estimating the type of content using information of the content and/or fingerprints generated from the content using results of statistical machine learning.

In addition, the second processor 140-2 may calculate a probability that the content is changed using the data recognition model. The data recognition model may be, for example, a set of algorithms for estimating a probability that reproduction content is changed using information of the content (e.g., content reproduction time, content reproduction channel, type of content, etc.) and viewing history.

In addition, the second processor 140-2 may estimate the content to be reproduced next after reproducing the content using the data recognition model. The data recognition model may be, for example, a set of algorithms for estimating a probability that the reproduction content is changed using information of the content (e.g., content reproduction time, content reproduction channel, type of content, etc.) and viewing history.

The second processor 140-2 may be manufactured in the form of a dedicated hardware chip for AI that performs a function of estimating the type of content and estimating the probability of a change in the content using a data recognition model.

According to an embodiment of the present disclosure, the first processor 140-1 and the second processor 140-2 may interlock with each other to perform a series of processes like the processor 140 to generate a fingerprint from content and identify the content using ACR as described above with reference to fig. 2 a.

The display 110, the memory 120, and the communication unit 130 correspond to the display 110, the memory 120, and the communication unit 130 in fig. 2a, respectively, and thus redundant description thereof is omitted.

Fig. 3 is a block diagram of a processor according to an embodiment of the present disclosure.

Referring to fig. 3, the processor 140 according to the embodiment may include a data learning unit 141 and a data recognition unit 142.

The data learning unit 141 may learn to make the data recognition model have a standard for analyzing characteristics of predetermined video/audio data. The processor 140 may generate the fingerprint by analyzing characteristics of each of the captured frames (e.g., a change in frequency of audio data, a change in color of video data per frame, or a change in motion vector).

The data learning unit 141 may determine what learning data is to be used to determine characteristics of the captured content screen (or frame). In addition, the data learning unit 141 may learn the criterion for extracting the characteristics of the captured content using the determined learning data.

According to various embodiments of the present disclosure, the data learning unit 141 may learn to make the data recognition model have a criterion for estimating the type of video/audio data based on the learning data related to the information about the predetermined video/audio and the type of video/audio data.

The information on the video/audio data may include, for example, the position of the current frame in the entire video/audio and the position of the reproduction time, which are information on the current frame. In addition, the information on the video/audio may include at least one of a video/audio name, a video/audio ID, a video/audio provider, video/audio series information, a genre, information on whether the video/audio is real-time broadcast, and information on whether the video/audio is pay content.

The type of video data may include, for example, drama, advertisement, movie, news, and the like. The type of audio data may include, for example, music, news, advertisements, and the like. However, the type of audio/video data is not limited thereto.

According to various embodiments of the present disclosure, the data learning unit 141 may learn to make the data recognition model have a criterion for estimating a probability that the video/audio data is changed to another video/audio data during reproduction, or a criterion for estimating video/audio data to be reproduced next after completion of reproduction, based on learning data related to information on predetermined video/audio data, a type of the video/audio data, and viewing history/audio data of the video (e.g., a history of having changed to another video/audio data to be viewed).

The data recognition unit 142 may recognize the situation based on predetermined recognition data using the learned data recognition model. The data recognition unit 142 may obtain predetermined recognition data according to a predetermined criterion obtained according to the learning, and may use the data recognition model using the obtained recognition data as an input value.

For example, using a learned feature extraction model, the data recognition unit 142 may extract features on individual frames included in the recognition data (such as captured content) and may generate a fingerprint. In addition, the data recognition unit 142 may update the data recognition model using output data obtained as a result of being applied to the data recognition model again as an input value.

According to various embodiments of the present disclosure, the data recognition unit 142 may obtain a result of determining the type of the video/audio data by applying recognition data related to information on predetermined video/audio as an input value to the data recognition model.

According to various embodiments of the present disclosure, by applying identification data related to information of predetermined video/audio data and the type of the video/audio data as input values to the data recognition model, the data recognition unit 142 may obtain a result of estimating a probability that the video/audio data is changed to another video/audio data while reproducing the video/audio data, or a result of estimating video/audio data reproduced next after completion of reproduction.

At least one of the data learning unit 141 and the data recognition unit 142 may be manufactured in the form of one hardware chip or a plurality of hardware chips, and may be installed in the display device 100. For example, the at least one data learning unit 141 and the data recognition unit 142 may be manufactured in the form of a dedicated hardware chip for AI, or may be manufactured as a part of an existing general-purpose processor (e.g., CPU or AP) or a graphic dedicated processor (e.g., GPU, ISP), and may be installed in the various display devices 100 described above.

In this case, the dedicated hardware chip for AI may be a dedicated processor dedicated to calculating probabilities, and may have higher parallel processing performance than an existing general-purpose processor, and thus may quickly process operation tasks in the AI domain, such as machine learning. When the data learning unit 141 and the data recognition unit 142 are implemented by using software modules (or program modules including instructions), the software modules may be stored in a non-transitory computer-readable recording medium. In this case, the software module may be provided by an Operating System (OS) or may be provided by a predetermined application. Alternatively, a part of the software module may be provided by the OS and another part may be provided by a predetermined application.

Although fig. 3 depicts that the data learning unit 141 and the data recognition unit 142 are both installed in the display apparatus 100, they may be installed in separate devices. For example, one of the data learning unit 141 and the data recognition unit 142 may be included in the display device 100, and the other may be included in the server 200. Further, the data learning unit 141 and the data recognition unit 142 may be connected to each other in a wired or wireless method, and the model information established by the data learning unit 141 may be provided to the data recognition unit 142, and the data input to the data recognition unit 142 may be provided to the data learning unit 141 as additional learning data.

Fig. 4A is a block diagram of a data learning unit according to an embodiment of the present disclosure.

Referring to fig. 4A, the data learning unit 141 according to the embodiment may include a data acquisition unit 141-1 and a model learning unit 141-4. In addition, the data learning unit 141 may further selectively include at least one of the preprocessing unit 141-2, the learning data selection unit 141-3, and the model evaluation unit 141-5.

The data acquisition unit 141-1 can obtain learning data necessary for determining the situation. For example, the data acquisition unit 141-1 may acquire an image frame by capturing a screen reproduced on the display 110. Further, the data acquisition unit 141-1 may receive image data from an external device, such as a set-top box. The image data may be formed from a plurality of image frames. In addition, the data acquisition unit 141-1 may receive the learning image data from the server 200 or a network such as the internet.

The model learning unit 141-4 may learn to make the data recognition model have a criterion for determining a situation based on the learning data. In addition, the model learning unit 141-4 may learn to have the data recognition model with criteria for selecting what learning data will be used to determine the situation.

For example, the model learning unit 141-4 may learn physical characteristics for distinguishing images by comparing a plurality of image frames. The model learning unit 141-4 may learn a criterion for distinguishing the image frames by extracting a ratio between the foreground and the background in the image frames, the size, the position, and the arrangement of the object, and the feature point.

In addition, the model learning unit 141-4 may learn a criterion for recognizing the type of content including the image frame. For example, the model learning unit 141-4 may learn criteria for identifying a frame having a text box at the upper end or the lower end of an image frame as one type. This is because the image of the news content has a text box at the upper or lower end of the left side to display the news content.

According to various embodiments of the present disclosure, the model learning unit 141 may learn to make the data recognition model have a criterion for estimating the type of video/audio data based on the learning data related to the information of the predetermined video/audio data and the kind of the video/audio data.

The information of the video/audio data may include, for example, the position and reproduction time of the current frame in the entire video/audio, which are information on the current frame. In addition, the information on the video/audio may include at least one of a video/audio name, a video/audio ID, a video/audio provider, video/audio series information, a genre, information on whether the video/audio is real-time broadcast, and information on whether the video/audio is pay content.

According to various embodiments of the present disclosure, the model learning unit 141 may learn to make the data recognition model have a criterion for estimating a probability that the video/audio data is changed to another video/audio data during reproduction or a criterion for estimating video/audio data to be reproduced next after completion of reproduction, based on data related to information of predetermined video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history that has been changed to another audio/video data or a history that has completed selecting another audio/video data after viewing the audio/video data).

The data recognition model may be an already established model. For example, the data recognition model may be a model that receives basic learning data (e.g., sample images) and has been built.

The data learning unit 141 may further include a preprocessing unit 141-2, a learning data selection unit 141-3, and a model evaluation unit 141-5 in order to improve the result of data recognition model recognition or in order to save resources or time required to generate the data recognition model.

The preprocessing unit 141-2 may preprocess the obtained learning data so that the obtained learning data can be used for learning to determine the situation. The preprocessing unit 141-2 may process the obtained data in a predetermined format so that the model learning unit 141-4 may perform learning using the obtained data to determine the situation.

For example, the preprocessing unit 141-2 may generate image frames of the same format by performing decoding, scaling, noise filtering, resolution conversion, and the like on input image data. In addition, the preprocessing unit 141-2 may clip only a specific region included in each of the inputted plurality of image frames. The display apparatus 100 may consume less resources to distinguish one of the frames from the other frames only in response to the particular region being cropped.

In another example, the preprocessing unit 141-2 may extract a text region included in the input image frame. In addition, the preprocessing unit 141-2 may generate text data by performing Optical Character Recognition (OCR) on the extracted text regions. The text data preprocessed as described above may be used to distinguish image frames.

The learning data selection unit 141-3 may select data required for learning from the preprocessed data. The selected data may be provided to the model learning unit 141-4. The learning data selection unit 141-3 may select data required for learning from the preprocessed data according to a predetermined criterion for determining a situation. In addition, the learning data selection unit 141-3 may select data according to a predetermined criterion determined by the learning of the model learning data selection unit 141-3, which will be described below.

For example, at the initial timing of learning, the learning data selection unit 141-3 may remove image frames having high similarity from the pre-processed image frames. For example, the learning data selection unit 141-3 may select data having a low degree of similarity for initial learning, so that a criterion for easy learning may be learned.

In addition, the learning data selection unit 141-3 may select a preprocessed image frame that generally satisfies one of the criteria determined by learning. By doing so, the model learning unit 141-4 can learn different criteria from the criteria that have been learned.

The model evaluation unit 141-5 may input evaluation data into the data recognition model, and in response to a recognition result output from the evaluation data that does not satisfy a predetermined criterion, may cause the model learning unit 141-4 to learn again. In this case, the evaluation data may be predetermined data for evaluating the data recognition model.

In the initial recognition model configuration operation, the evaluation data may be image frames representing different content types. Thereafter, the evaluation data may be replaced with a set of image frames having a higher similarity. By doing so, the model evaluation unit 141-5 can verify the performance of the data recognition model in stages.

For example, the model evaluation unit 141-5 may evaluate that the predetermined criterion is not satisfied in response to the number or ratio of evaluation data that results in an inaccurate recognition result from among recognition results of learning data recognition models with respect to the evaluation data exceeding a predetermined threshold. For example, the predetermined criterion may be defined as a ratio of 2%. In this case, in response to the learning data recognition model outputting the erroneous recognition results for 20 or more evaluation data out of the 1000 total evaluation data, the model evaluation unit 141-5 may evaluate that the learning data recognition model is not appropriate.

In response to the presence of a plurality of learning data recognition models, the model evaluation unit 141-5 may evaluate whether each of the learning data recognition models satisfies a predetermined criterion, and may determine a model satisfying the predetermined criterion as a final data recognition model. In this case, in response to a plurality of models satisfying the predetermined criterion, the model evaluation unit 141-5 may determine an arbitrary predetermined model or a predetermined number of models in order from the highest evaluation score as the final data recognition model.

The data recognition model may be built based on the application domain of the recognition model, the learning purpose, or the computer capabilities of the device. The data recognition model may be based on, for example, a neural network. For example, a model such as a Deep Neural Network (DNN), a Recurrent Neural Network (RNN), a Bidirectional Recurrent Deep Neural Network (BRDNN) may be used as the data recognition model, but the data recognition model is not limited thereto.

According to various embodiments of the present disclosure, in response to having established a plurality of data recognition models, the model learning unit 141-4 may determine a data recognition model having a high correlation with the input learning data and the basic learning data as the data learning recognition model. In this case, the basic learning data may be classified according to the data type, and the data recognition model may be established according to the data type. For example, the basic learning data may have been classified according to various criteria, such as an area in which the learning data is generated, a time at which the learning data is generated, a size of the learning data, a type of the learning data, a generator of the learning data, a type of an object in the learning data, and the like.

In addition, the model learning unit 141-4 may learn the data recognition model by using a learning algorithm including, for example, an error back propagation or gradient descent method.

For example, the model learning unit 141-4 may cause the data recognition model to learn by supervised learning in which learning data for learning is regarded as having determination criteria as input values. In another example, the model learning unit 141-4 may learn the data type required for the self-determination case without separate supervision, thereby causing the data recognition model to learn by unsupervised learning that finds the criteria for determining the case. In another example, the model learning unit 141-4 may cause the data recognition model to learn by reinforcement learning using feedback on whether the result of determining the situation according to learning is correct.

In addition, in response to learning the data recognition model, the model learning unit 141-4 may store the learned data recognition model. In this case, the model learning unit 141-4 may store the learning data recognition model in the memory 120 of the display device 100. Alternatively, the model learning unit 141-4 may store the learning data recognition model in a memory of the server 200 connected to the electronic device in the wired or wireless network.

In this case, the memory 120 in which the learned data recognition model is stored may also store instructions or data related to at least one other element of the display device 100. Further, the memory 120 may store software and/or programs. For example, a program may include a kernel, middleware, an Application Programming Interface (API), and/or an application (or application).

At least one of the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 included in the data learning unit 141 may be manufactured in the form of at least one hardware chip and may be installed in an electronic device. For example, at least one of the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 may be manufactured in the form of a dedicated hardware chip for AI, or may be manufactured as a part of an existing general-purpose processor (e.g., CPU or AP) or a graphic dedicated processor (e.g., GPU, ISP), and may be installed in the various display devices 100 described above.

In addition, the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 may be installed in one electronic device, or may be installed in separate electronic devices, respectively. For example, a part of the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 may be included in the display device 100, and the other part may be included in the server 200.

At least one of the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 may be implemented by using a software module. When at least one of the data acquisition unit 141-1, the preprocessing unit 141-2, the learning data selection unit 141-3, the model learning unit 141-4, and the model evaluation unit 141-5 is implemented by using a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. The at least one software module may be provided by the OS or may be provided by a predetermined application. Alternatively, a part of the at least one software module may be provided by the OS, and the other part may be provided by a predetermined application.

Fig. 4b is a block diagram of a data identification unit according to an embodiment of the present disclosure.

Referring to fig. 4b, the data recognition unit 142 according to the embodiment may include a data acquisition unit 142-1 and a recognition result providing unit 142-4. In addition, the data recognition unit 142 may further selectively include at least one of a preprocessing unit 142-2, a recognition data selection unit 142-3, and a model update unit 142-5.

The data acquisition unit 142-1 may obtain identification data required to determine the situation.

The recognition result providing unit 142-4 may determine the situation by applying the selected recognition data to the data recognition model. The recognition result providing unit 142-4 may provide a recognition result according to the purpose of recognizing the input data. The recognition result providing unit 142-4 may apply the selected data to the data recognition model by using the recognition data selected by the recognition data selecting unit 142-3 as an input value. In addition, the recognition result may be determined by a data recognition model.

For example, the recognition result providing unit 142-4 may classify the selected image frame according to the classification criteria determined at the data recognition model. In addition, the recognition result providing unit 142-4 may output the classified feature values so that the processor 140 generates a fingerprint. In another example, the recognition result providing unit 142-4 may apply the selected image frame to the data recognition model and may determine a type of content to which the image frame belongs. In response to the determined type of content, the processor 140 may request fingerprint data from the server 200 corresponding to a granularity of the type of content.

According to various embodiments of the present disclosure, the recognition result providing unit 142-4 may obtain a result of determining the type of the video/audio data by using recognition data related to information of predetermined video/audio data as an input value.

The information of the video/audio data may include, for example, the position and reproduction time of the current frame in the entire video/audio, which are information of the current frame. In addition, the information on the video/audio may include at least one of a video/audio name, a video/audio ID, a video/audio provider, video/audio series information, a genre, information on whether the video/audio is real-time broadcast, and information on whether the video/audio is pay content.

According to various embodiments of the present disclosure, by using identification data related to information on predetermined video/audio data and the type of the video/audio data as input values, the identification result providing unit 142-4 may obtain a result of estimating a probability that the video/audio data is changed to another video/audio data during production, or a result of estimating that the video/audio data is reproduced next after reproduction is completed.

The data recognition unit 142 may further include a preprocessing unit 142-2, a recognition data selection unit 142-3, and a model update unit 142-5 in order to improve the result of recognizing the data recognition model or in order to save resources or time required to provide the recognition result.

The preprocessing unit 142-2 may preprocess the obtained data so that the obtained identification data may be used to determine the situation. The preprocessing unit 142-2 may process the obtained identification data in a predetermined format so that the identification result providing unit 142-4 may determine a situation using the obtained identification data.

The identification data selection unit 142-3 may select identification data required for determining the situation from the preprocessed data. The selected recognition data may be provided to the recognition result providing unit 142-4. The identification data selection unit 142-3 may select identification data required for determining the situation from the preprocessed data according to a predetermined selection criterion for determining the situation. In addition, the recognition data selecting unit 142-3 may select data according to a predetermined selection criterion learned by the above-described model learning unit 141-4.

The model updating unit 142-5 may control updating of the data recognition model based on the evaluation of the recognition result provided by the recognition result providing unit 142-4. For example, the model updating unit 142-5 may provide the recognition result provided by the recognition result providing unit 142-4 to the model learning unit 141-4, so that the model learning unit 141-4 controls to update the data recognition model.

At least one of the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selection unit 142-3, the recognition result providing unit 142-4, and the model update unit 142-5 included in the data recognition unit 142 may be manufactured in the form of at least one hardware chip and may be installed in an electronic device. For example, at least one of the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selection unit 142-3, the recognition result providing unit 142-4, and the model update unit 142-5 may be manufactured in the form of a dedicated hardware chip for AI, or may be manufactured as a part of an existing general-purpose processor (e.g., CPU or AP) or a graphic dedicated processor (e.g., GPU, ISP), and may be installed in the various display devices 100 described above.

In addition, the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selection unit 142-3, the recognition result providing unit 142-4, and the model updating unit 142-5 may be installed in one electronic device, or may be installed in separate electronic devices, respectively. For example, a part of the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selection unit 142-3, the recognition result providing unit 142-4, and the model update unit 142-5 may be included in the display device 100, and the other part may be included in the server 200.

At least one of the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selection unit 142-3, the recognition result providing unit 142-4, and the model update unit 142-5 may be implemented by using a software module. When at least one of the data acquisition unit 142-1, the preprocessing unit 142-2, the recognition data selecting unit 142-3, the recognition result providing unit 142-4, and the model updating unit 142-5 is implemented by using a software module (or a program module including instructions), the software module may be stored in a non-transitory computer-readable recording medium. The at least one software module may be provided by the OS or may be provided by a predetermined application. Alternatively, a part of the at least one software module may be provided by the OS, and the other part may be provided by a predetermined application.

Fig. 5 is a block diagram illustrating a configuration of a display device according to an embodiment of the present disclosure.

Referring to fig. 5, the display apparatus 100 may include a display 110, a memory 120, a communication unit 130, a processor 140, an image receiver 150, an image processor 160, an audio processor 170, and an audio outputter 180.

The display 110 may display various image contents, information, UIs, and the like provided by the display apparatus 100. In particular, the display 110 may display image content and UI windows provided by an external device (e.g., a set-top box). For example, the UI window may include an EPG, a menu for selecting content to be reproduced, content-related information, an additional application execution button, a guide message, a notification message, a function setting menu, a calibration setting menu, an operation execution button, and the like. The display 110 may be implemented in various forms such as a Liquid Crystal Display (LCD), an Organic Light Emitting Diode (OLED), an active matrix OLED (AM-OLED), a Plasma Display Panel (PDP), and the like.

The memory 120 may store various programs and data required for the operation of the display apparatus 100. The memory 120 may be implemented in the form of a flash memory, a hard disk, or the like. For example, the memory 120 may include a Read Only Memory (ROM) for storing a program for performing an operation of the display apparatus 100, and a Random Access Memory (RAM) for temporarily storing data accompanying the operation of the display apparatus 100. In addition, the memory 120 may further include an electrically erasable programmable rom (eeprom) for storing various reference data.

The memory 120 may store programs and data for configuring various screens to be displayed on the display 110. In addition, the memory 120 may store programs and data for performing specific services. For example, the memory 120 may store a plurality of fingerprints, viewing histories, content information, and the like. The fingerprint may be generated by the processor 140 or may be received from the server 200.

The communication unit 130 may communicate with the server 200 according to various types of communication methods. The communication unit 130 may exchange fingerprint data with the connection server 200 thereof in a wired or wireless manner. In addition, the communication unit 130 may receive content information, a control signal for changing a content identification period, additional information, information on a product appearing in the content, and the like from the server 200. In addition, the communication unit 130 can stream (stream) image data from an external server. The communication unit 130 may include various communication chips for supporting wired/wireless communication. For example, the communication unit 130 may include a chip operating in a wired Local Area Network (LAN), a wireless LAN (wlan), Wi-Fi, Bluetooth (BT), or Near Field Communication (NFC) method.

Image receiver 150 may receive image content data through various sources. For example, the image receiver 150 may receive broadcast data from an external broadcasting station. In another example, the image receiver 150 may receive image data from an external device (e.g., a set-top box, a Digital Versatile Disc (DVD) player), or may receive streamed image data from an external server through the communication unit 130.

The image processor 160 performs image processing on the image data received from the image receiver 150. The image processor 160 may perform various image processing operations on the image data, such as decoding, scaling, noise filtering, frame rate conversion, or resolution conversion.

The audio processor 170 may perform processing of audio data. For example, the audio processor 170 may perform decoding, amplification, noise filtering, and the like on the audio data.

The audio outputter 180 may output not only various audio data processed at the audio processor but also various notification sounds or voice messages.

The processor 140 may control the above-described elements of the display apparatus 100. For example, the processor 140 may receive fingerprint or content information through the communication unit 130. Further, the processor 140 may adjust the content identification period using the received content information. The processor 140 may be implemented by using a single CPU to perform a control operation, a search operation, etc., or may be implemented by using a plurality of processors and IP to perform a specific function.

Hereinafter, the operation of the processor 140 will be described below with reference to the drawings.

Fig. 6 is a view illustrating a hybrid ACR according to an embodiment of the present disclosure.

Referring to fig. 6, the hybrid ACR refers to a method of using a combination of a local ACR according to which the processor 140 identifies reproduced content using fingerprint information stored in the memory 120 and a server ACR according to which content is identified by information received from the server. When the local ACR and the server ACR are combined, the load on the server 200 can be reduced, and the display apparatus 100 can accurately recognize what content is being reproduced.

The processor 140 may recognize what content is currently being reproduced, and may adjust to perform ACR with appropriate accuracy based on the recognition result. For example, the processor 140 may adjust the content identification period based on the content information, and may determine in advance the content to be received and stored from the server 200, and the number of fingerprints about the content.

Referring to fig. 6, the processor 140 may extract characteristics of a content screen currently being reproduced, and may generate a fingerprint. In addition, the processor 140 may search for whether there is a fingerprint (phi local ACR) matching the generated fingerprint from among the plurality of fingerprints stored in the memory 120.

In response to searching for a fingerprint in the memory 120 that matches the generated fingerprint, the processor 140 may identify the currently reproduced content based on information of the content corresponding to the searched fingerprint. For example, the information on the content corresponding to the fingerprint may include a position of the current frame in the entire content, a reproduction time, and the like, which are information on the current frame. In addition, the information on the content corresponding to the fingerprint may include at least one of a content name, a content ID, a content provider, content series information, a genre, information on whether the content is broadcast in real time, and information on whether the content is pay content.

In response to the local ACR success as described above, the processor 140 does not have to attempt to execute the server ACR, and thus the load on the server 200 can be reduced. For a proper local ACR, the memory 120 should store the necessary fingerprint information and content information. This will be described again below.

On the other hand, in response to the fingerprint matching the generated fingerprint not being searched in the memory 120, the processor 140 may control the communication unit 130 to transmit a query for requesting information on the content currently reproduced to the server 200 (server ACR). For example, the query may include a generated fingerprint, a viewing history, information about the display device 100, and the like.

In the case of receiving a server ACR request from the display apparatus 100, the server 200 may previously establish a fingerprint database regarding various image contents. The server 200 may analyze the image content and may extract characteristics of all image frames. In addition, the server 200 may generate fingerprints for distinguishing image frames from each other using the extracted characteristics. The server 200 may use the generated fingerprint to build a database.

The server 200 may extract at least one fingerprint from the requested query. In addition, the server 200 may search the created database for the extracted fingerprint, and may identify what content the display apparatus 100 is currently reproducing. The server 200 may transmit the identified content information to the display device 100. Further, the server 200 may add the identified content information to the viewing history of each display apparatus, and may manage the viewing history.

In addition, in response to the query, the processor 140 may control the communication unit 130 to receive information on the currently reproduced content and the fingerprint of the currently reproduced content from the server 200. Here, the fingerprint of the currently reproduced content may be a fingerprint about a frame temporally (in time) after the frame currently displayed in the entire content. Since the processor 140 knows the time indicated by the position of the currently displayed frame in the entire content based on the fingerprint included in the query, the processor 140 may receive a fingerprint on a frame expected to be reproduced after the currently displayed frame from the server 200.

In order to appropriately combine the local ACR and the server ACR, the processor 140 may determine a fingerprint to be received from the server 200 for the local ACR based on at least one of the result of the content recognition and the viewing history, and may determine whether to change the content recognition period.

According to an embodiment of the present disclosure, the processor 140 may determine the type of the content based on information about the content currently being reproduced. In addition, the processor 140 may change the content recognition period according to the determined type of the content. The type of the content may be classified according to criteria such as the details of the content, the type, information on whether the content is broadcast in real time, importance, and the like.

According to another embodiment of the present disclosure, the processor 140 may estimate the type of content using a data recognition model arranged to estimate the type of content based on information about the content. The data recognition model for estimating the type of content may be, for example, a data recognition model set to learn to have a criterion for estimating the type of content (e.g., video/audio data) based on learning data related to information of the content (e.g., video/audio data) and the type of the content (e.g., video/audio data).

Fig. 7 is a diagram illustrating fingerprint information having different granularities according to an embodiment of the present disclosure.

Referring to fig. 7(a) and 7(b), for example, in response to identifying that news content or advertisement content is being reproduced, the processor 140 may decrease the content identification period. Since news content or advertisement content often changes its details, the processor 140 may need to accurately identify the reproduced content. For example, the processor 140 may set a content identification period such that the processor 140 attempts to identify the content in each frame. In this case, as shown in fig. 7(a), the processor 140 may request fingerprint information on all frames to be reproduced after the currently displayed frame from the server 200.

In another example, in response to identifying that broadcast program content or movie content is being reproduced, processor 140 may increase the content identification period. With respect to the broadcast program content or the movie content, the processor 140 does not need to grasp the details included in each frame, and may grasp only whether the same content is continuously reproduced. Accordingly, the processor 140 may request that only frames displayed at a time corresponding to the content identification period among frames to be reproduced after the currently displayed frame include fingerprint information. In the example of fig. 7(b), the granularity of the fingerprint is lower than that of fig. 7(a), since fingerprint information is required only once every four frames.

According to an embodiment of the present disclosure, the processor 140 may determine the content recognition period by analyzing an application running while the user of the display apparatus 100 is viewing the content. For example, in the case of drama content, the processor 140 may generally set the content identification period to be long. However, in response to determining that there is a history that the user has run the shopping application and purchased the PPL product while viewing the drama, the processor 140 may reduce the content identification period for the drama content. The processor 140 may determine which frame of the theatrical content displays the PPL product and may control the display 110 to display the associated advertisement with the theatrical content. In addition, the processor 140 may control the display 110 to display a UI for immediately running a shopping application.

As described above, the processor 140 may learn the criteria for determining the content recognition period based on the viewing history information and information about the running additional applications. By doing so, the processor 140 may personalize the criteria used to determine the content identification period for each user. When learning the criteria for determining the content recognition period, the processor 140 may use an AI learning scheme, such as the unsupervised learning described above.

According to an embodiment of the present disclosure, the processor 140 may determine the number of fingerprints to be requested from the server differently in advance according to the identified content. In addition, the processor 140 may determine the number of fingerprints for subsequent frames of the currently reproduced content according to the identified content type and the determined content identification period.

The number of frames received and stored in advance from the server 200 may vary according to the identified content type. For example, in the case of VOD or DVR, the server 200 may have information on all image frames, but in the case of live broadcasting, the server 200 may receive only several seconds of image information (e.g., information of hundreds of image frames in the case of 60 Hz) before the display device 100. Since there are fingerprints corresponding to the respective frames, the number and storage of fingerprints that the display device 100 has previously received from the server 200 may also vary.

For example, in response to determining that the identified content type is theatrical content and thus setting the content to be identified every 30 seconds, the processor 140 may determine that the number of fingerprints to request is 0. Since the server 200 does not have a fingerprint corresponding to a frame to be reproduced after 30 seconds in the live broadcast, the processor 140 may omit an unnecessary communication process.

According to an embodiment of the present disclosure, the processor 140 may change the content identification period based on the information on the identified content and the viewing history. The viewing history may include content that the user has viewed, viewing time, and additional applications that have been run during the viewing time.

Fig. 8 is a view illustrating viewing history information according to an embodiment of the present disclosure.

Referring to fig. 8, the processor 140 may determine whether a currently reproduced content will be continuously reproduced or another content will be reproduced by comparing the identified current content and the viewing history. In addition, the processor 140 may change the content recognition period according to a probability of reproducing another content. In addition, the processor 140 may request information about content expected to be reproduced next time from the server 200, and may receive information required for the local ACR from the server 200 in advance. By doing so, the probability of executing the server ACR can be reduced, and thus the load on the server 200 can be reduced, and the display apparatus 100 can also accurately recognize the content.

According to various embodiments of the present disclosure, the processor 140 may estimate a probability of reproducing another content or estimate a content expected to be reproduced next time using a data recognition model that is set to estimate another content to be reproduced during reproduction or to estimate a content expected to be reproduced next time after reproduction is completed based on information of the reproduced content and a type of the content.

Based on learning data related to information of content (e.g., video/audio data), a type of content (e.g., video/audio data), and a viewing history of content (e.g., video/audio data) (e.g., history of having changed to another video/audio data), a data recognition model set to estimate a probability of reproducing another content during reproduction or to estimate a content expected to be reproduced next after completion of reproduction may estimate a probability of reproducing another content during reproduction or estimate a content expected to be reproduced next.

For example, referring to the viewing history of fig. 8, the user of the display apparatus 100 typically views news content on channel 3 between 17:00 and 18: 00. When the currently identified content is music content on channel 2 at 17:30, the processor 140 may determine that the probability of the channel change is high. Accordingly, the processor 140 may adjust the content recognition period to be short and may frequently check whether the reproduced content is changed.

According to an embodiment of the present disclosure, the processor 140 may receive additional information related to the content from the server 200 using the fingerprint. For example, the additional information may include a content name, a content reproduction time, a content provider, information about PPL products appearing in the content, advertisements related to PPL generation information, executable additional applications, and the like.

Fig. 9 is a view illustrating display of additional information having contents according to an embodiment of the present disclosure.

Referring to fig. 9, the processor 140 may receive additional information indicating that the PPL product 910 is included in a particular image frame. In addition, the processor 140 may control the display 110 to display the UI 920 including the received additional information and content. The UI 920 may include a photo of the PPL product 910, a guide message, an additional application launch button, and the like.

According to the above-described embodiment of the present disclosure, the display apparatus 100 may reduce the load on the server 200 while accurately performing the ACR by dynamically adjusting the content recognition period.

Fig. 10 is a flowchart illustrating a content recognition method of a display device according to an embodiment of the present disclosure.

Referring to fig. 10, the display device 100 may first capture a content screen currently being reproduced. In addition, the display apparatus 100 may extract characteristics from the captured screen, and may generate a fingerprint using the extracted characteristics at operation S1010. Fingerprints are identification information for distinguishing one image from another. Specifically, the fingerprint is characteristic data extracted from a video or audio signal included in a frame.

According to various embodiments of the present disclosure, the display device 100 may generate a fingerprint using the data recognition model described above with reference to fig. 3 to 4 b.

At operation S1020, the display apparatus 100 may search whether a fingerprint matching the generated fingerprint is stored in the display apparatus 100. For example, the display apparatus 100 may first perform local ACR. In response to the local ACR success, the display device 100 may recognize what content is currently being reproduced without sending a query to the server 200. Since the internal storage space of the display apparatus 100 is limited, the display apparatus 100 should appropriately select and store fingerprint information to be received in advance.

The display apparatus 100 may determine whether to transmit a query including the generated fingerprint to the external server 200 according to the search result, i.e., the result of the local ACR at operation S1030.

Fig. 11 is a flowchart illustrating a content recognition method of a display device according to an embodiment of the present disclosure.

Referring to fig. 11, since operations S1110 and S1120 correspond to operations S1010 and S1020, a repetitive description is omitted.

In response to searching for a fingerprint matching the generated fingerprint in the display apparatus 100 in operation S1130-Y, the display apparatus 100 may identify the currently reproduced content using the stored fingerprint in operation S1140.

On the other hand, in response to not searching for a fingerprint matching the generated fingerprint at operation S1130-N, the display apparatus 100 may transmit a query for requesting information about the currently reproduced content to the external server 200 at operation S1150. At operation S1160, the display device 100 may receive content information from the server 200. In addition, the display apparatus 100 may also receive a fingerprint about content expected to be reproduced next from the server 200. For example, the display apparatus 100 may receive a fingerprint of a frame located after a currently reproduced frame in the entire content and a fingerprint of a frame regarding another content expected to be reproduced next time.

As described above, the display apparatus 100 can recognize the content currently reproduced through the local ACR or the server ACR. Hereinafter, the operation of the display device after the content is recognized will be described.

Fig. 12a is a view illustrating a method for changing a content recognition period of a display device according to an embodiment of the present disclosure.

Referring to fig. 12a, the display apparatus 100 may identify a content currently reproduced at operation S1210. In addition, the display apparatus 100 may determine the type of the content using the information of the identified content at operation S1220. For example, the type of content may be classified based on criteria such as the details of the content, the type, whether the content is real-time broadcast information, importance, and the like. The criterion for classifying the type of the content may be learned by the display apparatus 100 itself by using AI (e.g., the data recognition model described in fig. 3 to 4 b).

In addition, the display apparatus 100 may change the content recognition period according to the determined type of the content at operation S1230. For example, in the case of news content or advertisement content in which the details of the reproduced content are frequently changed, the display apparatus 100 may set the content recognition period to be short. In addition, in the case of a VOD content that only requires a decision as to whether to continuously reproduce the currently reproduced content, the display apparatus 100 may set the content recognition period to be long.

The criteria may vary depending on the taste of the individual viewing. The display apparatus 100 may set the personalization criteria using the viewing history. The display apparatus 100 may learn the standard by itself using an unsupervised learning method.

Fig. 12b is a view illustrating a method for changing a content recognition period of a display device according to an embodiment of the present disclosure.

Referring to fig. 12b, the first processor 140-1 may control to run at least one application installed in the display apparatus 100. For example, the first processor 140-1 may capture an image displayed on a display and generate a fingerprint, and may perform ACR.

The second processor 140-2 may estimate the type of content using a data recognition model. The data recognition model may be, for example, a set of algorithms that use the results of statistical machine learning to estimate the type of content using information of the content and/or fingerprints generated from the content.

Referring to fig. 12b, the first processor 140-1 may identify the currently reproduced content at operation S1240. For example, the first processor 140-1 may identify content using a local ACR or a server ACR.

The first processor 140-1 may transmit the result of the content recognition to the second processor 140-2 at operation S1245. For example, the first processor 140-1 may transmit the result of the content recognition to the second processor 140-2 to request the second processor 140-2 to estimate the type of the reproduced content.

The second processor 140-2 may estimate the type of the reproduction content using the data recognition model at operation S1250. For example, the data recognition model may estimate the type of content (e.g., video/audio data) based on learning data related to information of the content (e.g., video/audio data) and the type of the content (e.g., video/audio data).

The second processor 140-2 may derive an identification period of the content according to the estimated content type at operation S1255. For example, in the case of news content or advertisement content in which the details of the reproduced content are frequently changed, the display apparatus 100 may set the content recognition period to be short. In addition, in the case of a VOD content that only requires a decision as to whether to continuously reproduce the currently reproduced content, the display apparatus 100 may set the content recognition period to be long.

At operation S1260, the second processor 140-2 may transmit the derived content identification period to the first processor 140-1. The first processor 140-1 may change the content recognition period based on the received content recognition period at operation S1270.

According to various embodiments of the present disclosure, the first processor 140-1 may receive the estimated content type from the second processor 140-2, and may execute at operation S1255.

Fig. 13a is a view illustrating a method for determining the number of fingerprints to be requested by a display device according to an embodiment of the present disclosure.

Referring to fig. 13a, the display apparatus 100 may identify a content currently reproduced at operation S1310. In addition, the display apparatus 100 may determine the type of the content using the information of the identified content at operation S1320. For example, the display apparatus 100 may determine (or estimate) the type of the content using a data recognition model (e.g., the data recognition model described in fig. 3 to 4 b).

Similar to the method of changing the content recognition period according to the determined content type, the display apparatus 100 may determine the number of fingerprints to be received from the server 200 at operation S1330. Since the number of image frames existing in the server 200 varies according to the type of content, the number of fingerprints corresponding to the respective frames and existing in the server 200 may also vary according to the type of content.

The display apparatus 100 may determine the number of fingerprints to be received by considering the type of content, viewing history, information on whether the content is live, and the like. In response to the determined number of fingerprints received, the display device 100 may perform an optimized local ACR while minimizing the number of fingerprints stored in the display device 100.

Fig. 13b is a view illustrating a method for determining an amount of fingerprints to be requested by a display device according to another embodiment of the present disclosure.

Referring to fig. 13b, the first processor 140-1 may control to run at least one application installed in the display apparatus 100. For example, the first processor 140-1 may capture an image displayed on a display and generate a fingerprint, and may perform ACR.

The second processor 140-2 may estimate the type of content using a data recognition model. The data recognition model may be, for example, a set of algorithms that estimate the type of content using the results of statistical machine learning, information of the content, and fingerprints generated from the content.

Referring to fig. 13b, the first processor 140-1 may identify the currently reproduced content at operation S1340.

The first processor 140-1 may transmit the result of the content recognition to the second processor 140-2 at operation S1345.

The second processor 140-2 may estimate the type of the reproduced content using the data recognition model at operation S1350. For example, the data recognition model may estimate the type of content (e.g., video/audio data) based on learning data related to information of the content (e.g., video/audio data) and the type of the content (e.g., video/audio data).

At operation S1355, the second processor 140-2 may derive a number of fingerprints to be requested from a server (e.g., the server 200 of fig. 1) according to the estimated content type. Since the number of image frames existing in the server 200 varies according to the type of content, the number of fingerprints corresponding to the respective frames and existing in the server 200 may also vary according to the type of content.

The second processor 140-2 may transmit the derived number of fingerprints to be requested to the first processor 140-1 at operation S1360. At operation S1365, the first processor 140-1 may determine the number of fingerprints to request based on the number of received fingerprints.

According to various embodiments of the present disclosure, the first processor 140-1 may receive the estimated content type from the second processor 140-2, and may execute at operation S1355.

Fig. 14a, 14b, 15a, and 15b are views illustrating a method for predicting contents of a display device according to various embodiments of the present disclosure.

Referring to fig. 14a, the display apparatus 100 may identify content currently being reproduced at operation S1410.

In addition, the display apparatus 100 may calculate a probability that the currently reproduced content is changed based on the information on the identified content and the viewing history at operation S1420. For example, the viewing history may include a channel that the user has viewed, a viewing time, an ID of the display device, user information, an additional application that is running, and the like.

According to various embodiments of the present disclosure, the display apparatus 100 may estimate a probability that the currently reproduced content is changed using a data recognition model (e.g., the data recognition model described in fig. 3 to 4 b).

In addition, the display apparatus 100 may change the content recognition period according to the calculated probability at operation S1430. For example, in response to determining that the user generally likes to view a content different from the currently identified content with reference to the viewing history, the display apparatus 100 may determine that the currently reproduced content is likely to be changed. In this case, the display apparatus 100 may change the content recognition period to be short.

On the other hand, in response to determining to reproduce the content corresponding to the normal viewing history, the display apparatus 100 may determine that the possibility that the content currently being reproduced is changed is low. In this case, the display apparatus 100 may change the content recognition period to be long.

Fig. 14b is a view illustrating a method for predicting content and changing a content recognition period in a display device including a first processor and a second processor according to an embodiment of the present disclosure.

Referring to fig. 14b, the first processor 140-1 may control to run at least one application installed in the display apparatus 100. For example, the first processor 140-1 may capture an image displayed on a display and generate a fingerprint, and may perform ACR.

The second processor 140-2 may estimate a probability that the content is changed using the data recognition model. The data recognition model may be, for example, a set of algorithms for estimating a probability that video/audio data is changed to another video/audio data during reproduction based on learning data related to information of the video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history of having changed to another video/audio data).

Referring to fig. 14b, the first processor 140-1 may identify the content currently being reproduced at operation S1440.

The first processor 140-1 may transmit the result of the content recognition to the second processor 140-2 at operation S1445.

At operation S1450, the second processor 140-2 may estimate a probability that the reproduction content is changed using the data recognition model.

At operation S1455, the second processor 140-2 may derive a content recognition period according to the estimated probability.

At operation S1460, the second processor 140-2 may transmit the derived content identification period to the first processor 140-1.

At operation S1465, the first processor 140-1 may change the content recognition period based on the received content recognition period.

According to various embodiments of the present disclosure, the first processor 140-1 may receive the estimated probability of content change from the second processor 140-2, and may perform at operation S1455.

Referring to fig. 15a, the display apparatus 100 may identify a content currently reproduced at operation S1510. In addition, the display device 100 may predict content to be reproduced next based on the viewing history at operation S1520. For example, in response to a user of the display apparatus 100 having a viewing history of frequently viewing a specific two channels, the display apparatus 100 may predict content to be reproduced in the two channels as content to be reproduced next time.

According to various embodiments of the present disclosure, the display apparatus 100 may estimate a content to be reproduced after a content currently reproduced using a data recognition model (e.g., the data recognition model described in fig. 3 to 4 b).

At operation S1530, the display device 100 may request fingerprint information of the predicted content from the server 200. In addition, the display apparatus 100 may receive information on the predicted content from the server 200 in advance, and may store the information at operation S1540.

The information on the predicted content transmitted from the server 200 to the display apparatus 100 may include at least one of information of a content currently reproduced in the display apparatus 100, a fingerprint of the currently reproduced content and a content predicted to be reproduced next, and a control signal for changing a content recognition period of the display apparatus 100. For example, the display apparatus 100 may receive fingerprint information of contents to be reproduced in the above-described two channels from the server 200 in advance and may store it. By doing so, the display apparatus 100 may receive an optimized fingerprint to be used for the local ACR.

Fig. 15b is a view illustrating a method for predicting content and receiving information about the predicted content in a display device including the first and second processors 140-2 according to an embodiment of the present disclosure.

Referring to fig. 15b, the first processor 140-1 may control to run at least one application installed in the display apparatus 100. For example, the first processor 140-1 may capture an image displayed on a display and generate a fingerprint, and may perform ACR.

The second processor 140-2 may estimate a probability that the content is changed using the data recognition model. The data recognition model may be, for example, an algorithm for estimating video/audio data to be reproduced after completion of reproduction based on learning data relating to information of the video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history of having changed to another video/audio data).

Referring to fig. 15b, the first processor 140-1 may identify the currently reproduced content at operation S1550.

The first processor 140-1 may transmit the result of the content recognition to the second processor 140-2 at operation S1555.

At operation S1560, the second processor 140-2 may estimate content to be reproduced subsequent to the currently reproduced content using the data recognition model.

The second processor 140-2 may transmit the content to be reproduced next to the first processor 140-1 at operation S1565.

At operation S1570, the first processor 140-1 may request information about the predicted content from a server (e.g., the server 200 of fig. 1).

The first processor 140-1 may receive information about the predicted content from a server (e.g., the server 200 of fig. 1) and store the information at operation S1575.

According to various embodiments of the present disclosure, the second processor 140-2 may execute at operation S1570.

Fig. 16 is a view illustrating data learned and recognized by a display device and a server interlocked with each other according to an embodiment of the present disclosure.

Referring to fig. 16, the server 200 may learn a criterion for identifying a content and/or a criterion for estimating a type of the content and/or a criterion for estimating a probability of the content being changed, and the display apparatus 100 may set a criterion for discriminating an image frame based on a learning result of the server 200, and may be the type of the content and the probability of the content being changed.

In this case, the data learning unit 240 of the server 200 may include a data acquisition unit 240-1, a preprocessing unit 240-2, a learning data selection unit 240-3, a model learning unit 240-4, and a model evaluation unit 240-5. The data learning unit 240 may perform the function of the data learning unit 141 shown in fig. 4 a. The data learning unit 240 of the server 200 may learn to make the data recognition model have a standard for analyzing characteristics of the video/audio data. Server 200 may analyze the characteristics of each captured frame according to learned criteria and may generate a fingerprint.

The data learning unit 240 may determine what learning data will be used to determine characteristics of the captured content screen (or frame). In addition, the data learning unit 240 may learn the criteria for extracting the characteristics of the captured content using the determined learning data. The data learning unit 240 may obtain data to be used for learning, and may learn a criterion for analyzing characteristics by applying the obtained data to a data recognition model, which will be described below.

According to various embodiments of the present disclosure, the data learning unit 240 may learn to make the data recognition model have a criterion for estimating the type of the video/audio data based on the learning data related to the information of the predetermined video/audio data and the type of the video/audio data.

According to various embodiments of the present disclosure, the data learning unit 240 may learn to make the data recognition model have a criterion for estimating a probability that the video/audio data is changed to another video/audio data during reproduction or estimating an audio data video to be reproduced next after completion of reproduction, based on the learned data related to the information of the predetermined video/audio data, the type of the video/audio data, and the viewing history of the video/audio data (e.g., having a history of changing to another video/audio data).

In addition, the recognition result providing unit 142-4 of the display device 100 may determine the situation by applying the data selected by the recognition data selecting unit 142-3 to the data recognition model generated by the server 200. Further, the recognition result providing unit 142-4 may receive the data recognition model generated by the server 200 from the server 200, and may analyze the image or determine the type of the content using the received data recognition model. In addition, the model updating unit 142-5 of the display device 100 may provide the analyzed image and the determined content type to the model learning unit 240-4 of the server 200, so that the data recognition model may be updated.

For example, the display device 100 may recognize a model using data generated by using the computing power of the server 200. Further, the plurality of display devices 100 transmit the learned or recognized data information to the server 200 so that the data can update the recognition model of the server 200. In addition, each of the plurality of display apparatuses 100 transmits the learned or recognized data information to the server 200 so that the server 200 can generate a data recognition model personalized for each display apparatus 100.

Fig. 17 is a flowchart illustrating a content recognition method of a display system according to an embodiment of the present disclosure.

Referring to fig. 17, the display system 1000 may include the display device 100 and the server 200. Fig. 17 illustrates a pull method of requesting a fingerprint at the display device 100.

First, the display apparatus 100 may capture a content screen currently being reproduced at operation S1605. In addition, the display apparatus 100 may analyze the captured screen and extract characteristics. At operation S1610, the display device 100 may generate a fingerprint for distinguishing the captured screen from other image frames using the extracted characteristics.

At operation S1615, the display apparatus 100 may perform the local ACR to match the generated fingerprint with the stored fingerprint. The case where the currently reproduced content is recognized by the local ACR will not be described. In response to the currently reproduced content not being identified by the local ACR, the display apparatus 100 may transmit a query including the generated fingerprint to the server at operation S1625.

The server 200 may have analyzed various contents and may establish a fingerprint database at operation S1620. The server 200 may extract a fingerprint from the received query. In addition, the server 200 may match the extracted fingerprint with a plurality of fingerprints stored in a fingerprint database at operation S1630. The server 200 can recognize what the contents queried by the display device 100 are by matching fingerprints. The server 200 may transmit information on the identified content and a fingerprint on a next image frame of the identified content to the display device 100 at operation S1635.

For example, the display apparatus 100 may transmit a query for requesting information of a fingerprint generated at the server 200. Server 200 may use the query API to generate a response to the received query and may provide the response. The query API may be an API that searches a fingerprint database for fingerprints included in the query and provides stored relevant information. In response to receiving the query, the query API of the server 200 may search whether a fingerprint included in the query exists in the fingerprint database. In response to the searched fingerprint, the query API may transmit the name of the content, the position of the frame corresponding to the searched fingerprint in the entire content, the reproduction time, and the like to the display device 100 in response to the query. In addition, the query API may transmit a frame fingerprint corresponding to a frame temporally subsequent to a frame displaying the fingerprint in the entire content to the display apparatus 100.

In addition, when the content reproduced at the display apparatus 100 may be streamed through the server 200 (e.g., VOD or broadcast signal), the server 200 may transmit the fingerprint and the next frame of the identified content to the display apparatus 100. In this case, the fingerprint may be paired with a corresponding frame of the identified content and transmitted. For example, the fingerprint may be provided in the form of one file added to the content, and information for mapping the fingerprint and the corresponding frame may be included in the fingerprint.

The display apparatus 100 may determine the type, importance, etc. of the content using the received content information. In addition, the display device 100 may change the content recognition period based on criteria such as the type, importance, and the like of the content at operation S1640. In addition, the display apparatus 100 may predict content to be reproduced next using the viewing history having the content information at operation S1645. The content to be reproduced next refers to a content different from the content currently reproduced.

At operation S1650, the display device 100 may request a fingerprint about the predicted content from the server 200. In addition, the display device 100 may request the predicted content itself and the fingerprint from the server 200. In response thereto, the server 200 may transmit the requested fingerprint to the display device 100 at operation S1655. In response to the predicted content being stored in the server 200 or the predicted content being able to be streamed to the display device 100 through the server 200, the server 200 may transmit not only the requested fingerprint to the display device 100 but also the content paired with the fingerprint.

The display apparatus 100 may store the received fingerprint and may use it for the local ACR when the next content recognition period comes.

Fig. 18 is a flowchart illustrating a content recognition method of a display system according to an embodiment of the present disclosure.

Referring to fig. 18, the display system 1000 may include the display device 100 and the server 200. Fig. 18 shows a push method in which the server 200 preemptively sends a fingerprint.

First, the display apparatus 100 may capture a content screen currently being reproduced in operation S1805. In addition, the display apparatus 100 may analyze the captured screen and extract characteristics. At operation S1810, the display device 100 may generate a fingerprint for distinguishing the captured screen from other image frames using the extracted characteristics.

At operation S1815, the display device 100 may perform the local ACR to match the generated fingerprint with the stored fingerprint. In addition, in response to the currently reproduced content not being identified by the local ACR, the display apparatus 100 may transmit a query including the generated fingerprint to the server at operation S1825.

In addition to the fingerprint, the query may also include information of the display device 100. For example, the information of the display apparatus 100 may be a physical ID of the display apparatus 100, an IP address of the display apparatus 100, or information for specifying a user of the display apparatus 100, such as a user ID transmitted to the server 200 through the display apparatus 100.

The server 200 may manage the viewing history on each display device 100 using the information of the display device 100. For example, in response to a client device access, the server 200 may collect device IDs, and may perform the above-described operations using a client management API for managing a viewing history for each device ID.

The server 200 may have analyzed various contents and may establish a fingerprint database at operation S1820. The server 200 may store information corresponding to the content of the fingerprint in a database. For example, the server 200 may store in the database the name of the content corresponding to the fingerprint, the position of the frame corresponding to the fingerprint in the entire content, reproduction time, content ID, content provider, content series information, genre, information on whether the content is real-time broadcast, information on whether the content is pay content, and the like.

The server 200 may extract a fingerprint from the received query. For example, the query API of the server 200 may extract information corresponding to a fingerprint only from character string information of a received query. In addition, the server 200 may match the extracted fingerprint with a plurality of fingerprints stored in a fingerprint database at operation S1830.

The server 200 can recognize what the contents queried by the display device 100 are by matching fingerprints. The server 200 may determine the type, importance, etc. of the content using the determined content information. In addition, the server 200 may change the content identification period based on criteria such as the type, importance, and the like of the content at operation S1835. In addition, the server 200 may predict content to be reproduced next using the viewing history having the content information at operation S1840. For example, in the embodiment of fig. 18, the server 200 may perform operations of changing the content identification period and predicting the content to be reproduced next time.

Based on the information grasped in this process, the server 200 may transmit fingerprint information or the like to the display device 100 without receiving a request from the display device 100 at operation S1845. For example, the server 200 may transmit information of content currently reproduced at the display apparatus 100, fingerprints of the currently reproduced content and content predicted to be reproduced next, and a control signal for changing a content recognition period of the display apparatus 100 to the display apparatus 100. In another example, the server 200 may transmit the content itself predicted to be reproduced next time and a fingerprint. In this case, the fingerprint may be combined with all frames of the content, or the fingerprint may be combined with every interval frame set according to the content recognition period.

In addition, the server 200 may transmit an advertisement screen, a product purchase UI, and the like with respect to a product included in a frame of content to be displayed on the display device 100 to the display device 100 without receiving a request from the display device 100. The server 200 may grasp the history of purchasing a product during the viewing time based on the information of the display apparatus 100 received from the display apparatus 100. The server 200 may change the frequency of transmitting the advertisement screen or the like using personalized information such as a product purchase history or a viewing history.

The server 200 may generate an advertisement screen having a size suitable for each screen display on the display device 100 using resolution information of the display device 100, information indicating which portion of a frame of content corresponds to a background, and the like. In addition, the server 200 may transmit a control signal for displaying an advertisement screen at an appropriate position on the screen of the display device 100 to the display device 100 together with the advertisement screen.

In another example, the server 200 may request the second server 300 to provide an advertisement screen. For example, the second server 300 may be a separate server providing an advertisement providing function. The second server 300 may receive information including a product to be advertised, a resolution of the display device 100, and the like from the server 200. Based on the received information, the second server 300 may generate an advertisement screen having an appropriate size. The second server 300 may transmit the generated advertisement screen to the server 200, or may directly transmit the advertisement screen to the display device 100. In an embodiment where the second server 300 directly transmits the advertisement screen to the display device 100, the server 200 may provide the second server 300 with communication information such as an IP address of the display device 100.

Fig. 19 is a view illustrating a case where a display device according to an embodiment of the present disclosure changes a content recognition period according to a probability that content is changed by interlocking with a server.

Referring to fig. 19, the server 200 may estimate a probability that content is changed using a data recognition model. The data recognition model may be, for example, a set of algorithms that estimates a probability that video/audio data is changed to another video/audio data during reproduction based on learning data related to information of the video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history of having changed to another video/audio data).

In this case, an interface for transmitting/receiving data between the display device 100 and the server 200 may be defined.

For example, an API having learning data to be applied to the data recognition model as a factor value (or parameter value or transfer value) may be defined. The API may be defined as a set of subroutines or functions that are called at one protocol (e.g., the protocol defined by the display device 100) to perform certain processing of another protocol (e.g., the protocol defined by the server 200). For example, through an API, an environment may be provided in which one protocol may perform operations of another protocol.

Referring to fig. 19, the display apparatus 100 may identify a content currently being reproduced at operation S1910. For example, the display apparatus 100 may identify content using a local ACR or a server ACR.

At operation S1920, the display apparatus 100 may transmit the result of the content recognition to the server 200. For example, the display device 100 may transmit the result of content recognition to the server 200 to request the server 200 to estimate the probability that the reproduced content is changed.

The server 200 may estimate a probability that the reproduction content is changed using the data recognition model at operation S1930.

The server 200 may derive a content recognition period according to the probability of the content change at operation S1940. For example, in response to determining that the user generally likes viewing a content different from the currently identified content with reference to the viewing history based on the query history (e.g., channel change) requested from the server 200 by the display apparatus 100, the display apparatus 100 may determine that the currently reproduced content is likely to be changed. In this case, the display apparatus 100 may change the content recognition period to be short.

The server 200 may transmit the derived content identification period to the display device 100 at operation S1950. The display apparatus 100 may change the content recognition period based on the received content recognition period at operation S1960.

Fig. 20 is a view illustrating a method in which a display device predicts content to be reproduced next by interlocking with a server and receives information on the predicted content in advance according to an embodiment of the present disclosure.

Referring to fig. 20, the server 200 may estimate a probability that content is changed using a data recognition model. The data recognition model may be, for example, an algorithm for estimating video/audio data to be reproduced next after completion of reproduction, based on learning data relating to information of the video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history of having changed to another video/audio data).

At operation S2010, the display apparatus 100 may identify the content currently being reproduced.

The display device 100 may transmit the result of the content recognition to the server 200 at operation S2020.

At operation S2030, the server 200 may estimate content to be reproduced after the content currently being reproduced using the data recognition model.

The server 200 may search for information about the estimated content at operation S2040.

The server 200 may transmit information about the content to be reproduced next to the display device 100 at operation S2050.

The display apparatus 100 may receive information about the estimated content from the server 200 and may store the information at operation S2060.

Fig. 21 is a view illustrating a method in which a display device predicts content to be reproduced next by interlocking with a plurality of servers and receives information on the predicted content in advance according to an embodiment of the present disclosure.

Referring to fig. 21, the first server 200 may estimate a probability that the content is changed using a data recognition model. The data recognition model may be, for example, an algorithm for estimating video/audio data to be reproduced next after completion of reproduction, based on learning data relating to information of the video/audio data, a type of the video/audio data, and a viewing history of the video/audio data (e.g., a history of having changed to another video/audio data).

For example, the third server 201 may include a cloud server storing information about content.

At operation S2110, the display device 100 may identify the content currently reproduced.

The display device 100 may transmit the result of the content recognition to the first server 200 at operation S2120.

At operation S2130, the first server 200 may estimate a content to be reproduced after the currently reproduced content using the data recognition model.

The first server 200 may transmit the estimated content to the second server to request the second server to search for information at operation S2140.

The third server 201 may search for information about the content in operation S2150.

The third server 201 may transmit information about the content to be reproduced next to the first server 200 at operation S2160. In addition, the first server 200 may transmit information about the content to be reproduced next to the display apparatus 100 at operation S2170. However, according to various embodiments of the present disclosure, the third server 201 may transmit information about the content to be reproduced next to the display apparatus 100.

The display apparatus 100 may receive information on the estimated content from the first server 200 or the third server 201, and may store the information at operation S2180.

Certain aspects of the present disclosure may also be embodied as computer readable code on a non-transitory computer readable recording medium. The non-transitory computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the non-transitory computer-readable recording medium include read-only memory (ROM), random-access memory (RAM), compact disc-ROMs (CD-ROMs), magnetic tapes, floppy disks, and optical data storage devices. The non-transitory computer-readable recording medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion. In addition, functional programs, codes, and code segments for accomplishing the present disclosure may be easily construed by programmers skilled in the art to which the present disclosure pertains.

At this point it should be noted that the various embodiments of the present disclosure as described above generally involve the processing of input data and the generation of output data to some extent. The input data processing and output data generation may be implemented in hardware or software in conjunction with hardware. For example, certain electronic components may be employed in a mobile device or similar or related circuitry to implement the functions associated with the various embodiments of the present disclosure as described above. Alternatively, one or more processors operating in accordance with stored instructions may implement the functions associated with the various embodiments of the disclosure as described above. If this is the case, it is within the scope of the disclosure that the instructions may be stored on one or more non-transitory processor-readable media. Examples of the processor-readable medium include a ROM, a RAM, a CD-ROM, a magnetic tape, a floppy disk, and an optical data storage device. The processor-readable medium can also be distributed over network-coupled computer systems so that the instructions are stored and executed in a distributed fashion. In addition, functional computer programs, instructions, and instruction segments for implementing the present disclosure can be easily construed by programmers skilled in the art to which the present disclosure pertains.

According to various embodiments of the present disclosure, the disclosed embodiments may be implemented by using an S/W program including instructions stored in a computer-readable storage medium.

A computer is a device that calls stored instructions from a storage medium and may perform operations according to the disclosed embodiments according to the called instructions, and may include a display apparatus according to the disclosed embodiments.

The computer-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" simply means that the storage medium does not include a signal and is tangible, and does not consider whether data is semi-permanently or temporarily stored in the storage medium.

In addition, a control method according to the disclosed embodiments may be included in a computer program product and provided. The computer program product may be used as a product for conducting a transaction between a seller and a buyer.

The computer program product may include an S/W program, and a computer-readable storage medium storing the S/W program. For example, the computer program product may include a product (e.g., a downloadable application) in the form of an S/W program that is electronically distributed through a manufacturer of the display device or an electronic marketplace (e.g., Google Play store, application store). For electronic distribution, at least a part of the S/W program may be stored in a storage medium or may be temporarily generated. In this case, the storage medium may be a storage medium of a server of a manufacturer, a server of an electronic market, or an intermediate server that temporarily stores the S/W program.

The computer program product may comprise a storage medium of a server or a storage medium of a device in a system comprising a server and a display apparatus. Alternatively, when there is a third device (e.g., a smartphone) communicatively connected to the server or the display apparatus, the computer program product may include a storage medium of the third device. Alternatively, the computer program product may comprise the S/W program itself sent from the server to the display device or the third device, or from the third device to the display device.

In this case, one of the server, the display apparatus and the third device may execute the computer program product and perform the method according to the disclosed embodiments. Alternatively, two or more of the server, the display apparatus and the third device may execute the computer program product and perform the method according to the disclosed embodiments in a distributed manner.

For example, a server (e.g., a cloud server or an AI server) may execute a computer program product stored in the server and may control a display device communicatively connected with the server to perform a method according to the disclosed embodiments.

In another example, a third device may run the computer program product and may control a display apparatus communicatively connected with the third device to perform a method in accordance with the disclosed embodiments. When the third device runs the computer program product, the third device may download the computer program product from the server and may run the downloaded computer program product. Alternatively, the third device may run a computer program product provided in a preloaded state and may perform a method according to the disclosed embodiments.

While the disclosure has been shown and described with reference to various embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents.

Claims

1. A display device, comprising:

a display;

a memory configured to store a fingerprint obtained by extracting a characteristic of content and information on the content corresponding to the fingerprint;

a communication device configured to communicate with the server, an

At least one processor configured to:

identifying a characteristic of a screen of content currently reproduced on the display and obtaining a fingerprint to search the memory for the presence/absence of a fingerprint matching the generated fingerprint,

identifying, based on a result of the search, whether a query containing the obtained fingerprint is sent to the server to request information on the currently reproduced content,

identifying a type of the content based on information on the content currently being reproduced, an

Changing a content identification period according to the identified type of the content.

2. The display apparatus of claim 1, wherein the at least one processor is further configured to:

in response to searching for a fingerprint matching the obtained fingerprint in the memory, identifying the currently reproduced content based on information about the content corresponding to the searched fingerprint, and

in response to not searching for a fingerprint in the memory that matches the obtained fingerprint, control the communication device to send a query containing the fingerprint to the server to request information about the currently rendered content.

3. The display apparatus of claim 2, wherein the at least one processor is further configured to: in response to not searching for a fingerprint matching the obtained fingerprint in the memory, controlling the communication device to receive information about the currently reproduced content and the fingerprint of the currently reproduced content from the server in response to the query.

4. The display apparatus of claim 1, wherein the at least one processor is further configured to:

in response to the content being advertising content, identifying content in each first time period, an

In response to the content being broadcast program content, identifying content in each second time period that is longer than the first time period.

5. The display apparatus of claim 3, wherein the at least one processor is further configured to:

Changing the number of fingerprints of the currently reproduced content to be received according to the identified type of the content.

6. The display apparatus of claim 3, wherein the at least one processor is further configured to:

calculating a probability that the reproduced content is changed based on information on the currently reproduced content and the viewing history, an

The content recognition period is changed according to the calculated probability.

7. The display apparatus of claim 2, wherein the at least one processor is further configured to:

predicting content to be reproduced next based on the viewing history, an

Information about the predicted content is requested from the server.

8. The display apparatus of claim 3, wherein the at least one processor is further configured to:

receiving additional information related to the currently reproduced content from the server, an

Controlling the display to display the received additional information and the currently reproduced content.

9. A method of identifying content of a display device, the method comprising:

recognizing a characteristic of a screen of the currently reproduced content and obtaining a fingerprint;

searching whether a fingerprint matching the obtained fingerprint is stored in the display device;

identifying whether to transmit a query containing the obtained fingerprint to an external server to request information on the currently reproduced content based on a result of the search;

identifying a type of the content based on information on the content currently being reproduced; and

10. The method of claim 9, wherein identifying whether to send the query to the external server comprises:

in response to a search for a fingerprint matching the obtained fingerprint in the display device, identifying the currently reproduced content based on information about the content corresponding to the searched fingerprint, and

in response to no fingerprint matching the obtained fingerprint being searched for in the display device, sending a query containing the fingerprint to the server to request information about the currently reproduced content.

11. The method of claim 10, further comprising: in response to a fingerprint matching the obtained fingerprint not being searched for in the display device, information on the currently reproduced content and a fingerprint of the currently reproduced content are received from the server in response to the query.

12. The method of claim 9, wherein changing the content identification period comprises:

13. The method of claim 11, further comprising: