WO2023069047A1 - A face recognition system to identify the person on the screen - Google Patents


Info

Publication number
WO2023069047A1
Authority
WO
WIPO (PCT)
Prior art keywords
face recognition
content
image
person
face
Prior art date
Application number
PCT/TR2022/051066
Other languages
French (fr)
Inventor
Muvaffak Amasya
Original Assignee
Siskon Endustriyel Otomasyon Sistemleri Sanayi Ve Ticaret Anonim Sirketi
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siskon Endustriyel Otomasyon Sistemleri Sanayi Ve Ticaret Anonim Sirketi filed Critical Siskon Endustriyel Otomasyon Sistemleri Sanayi Ve Ticaret Anonim Sirketi
Publication of WO2023069047A1 publication Critical patent/WO2023069047A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635Overlay text, e.g. embedded captions in a TV program

Definitions

  • weighted colors or the field color on the screen (400) can be detected as a sign; the type of sport can be added to the filter according to the field color, and the search can then be limited to the staff of the sports clubs in that sports branch.
  • the processor unit is configured to request feedback on whether the matching person is correct from the said user interface and to edit the face recognition algorithm used in the face recognition process according to the feedback it receives.
  • the face recognition system (10) includes a subtitle database (253).
  • the subtitle database (253) contains subtitles in various languages associated with content names. If the processor unit detects that a text in the auxiliary information region (412) is a subtitle, it queries that subtitle in the subtitle database (253). As a result of the query, it adds the content names containing the subtitle as parameters to the filter. For example, when a subtitle is detected in the auxiliary information region (412) and it is determined that this subtitle is “Legolas! What do your Elf eyes see?”, the contents containing this subtitle are determined from the subtitle database. In this case, for example, one of the contents containing this subtitle can be determined as "Lord of the Rings: Two Towers".
  • the filter of the processor unit is then arranged to include the parameter "Lord of the Rings: Two Towers".
  • the processor unit determines the matches by running the face recognition algorithm based on the actors of the content named "Lord of the Rings: Two Towers" from the data source (251). In this case, for example, it can be ensured that the person's name "Viggo Mortensen" is displayed in the user interface.
  • the processor unit may also display the data received from the person detail information source (252) in the user interface. This information may include, for example, the date and place of birth of the identified person, movies they have played in, etc. Thus, by narrowing the pool of people for face recognition, the probability of correct results is increased and system resources are used in a reduced way.
  • the said face recognition algorithms are one of the face recognition algorithms known in the art.
  • the person detected by face recognition can be displayed in the user interface in such a way that the first image and a face image are displayed at the same time.
  • these images can be provided side by side.
  • it can be ensured that these images are displayed overlapping by changing the transparency ratio.
  • the processor unit can update the face recognition algorithm or edit the filter according to the input it receives from the user terminal (100).
  • the screen (400) and the user terminal (100) are integrated.
  • This embodiment may be provided in a smart television.
  • the image capture unit (110) captures the screen (400) image.
  • when the screen (400) and the user terminal (100) are integrated, they may be a computer, a smart television, or a tablet computer.
  • Text detection and text reading processes in the auxiliary information region (412) can be performed with optical character recognition algorithms.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Studio Devices (AREA)
  • Collating Specific Patterns (AREA)

Abstract

The invention is a face recognition system for identifying the said person in at least one image frame including the face of at least one person on a screen (400) that allows displaying visual content, comprising a user terminal (100) having an image capture unit (110) for capturing the image of the said screen (400). Accordingly, it comprises a processor unit that is associated to communicate with the said image capture unit (110), the said processor unit is configured to receive the first image captured by the said image capture unit as an input, detect at least one face region (411) in the said first image, detect at least one auxiliary information region (412) in the said first image, detect at least one sign and at least one text in the said auxiliary information region (412), create a filter containing at least one parameter value according to at least one of the detected sign and text, access a database storing the person information comprising at least one face image of the persons such that the said person information is associated with at least one of the said parameters, and perform a face recognition process based on the faces of the persons who fit the filter created for at least one of the faces in the said face regions (411).

Description

A FACE RECOGNITION SYSTEM TO IDENTIFY THE PERSON ON THE SCREEN
TECHNICAL FIELD
The invention relates to identification systems for identifying the said person in at least one image frame including the face of at least one person on a screen that allows displaying visual content, comprising a user terminal having an image capture unit for capturing the image of the said screen.
BACKGROUND
People can watch content such as series, movies, and sports competitions by playing live broadcasts or recorded data. People may want to know who the persons in this content, such as actors and athletes, are. For this purpose, the user can learn the identities of these individuals by researching electronic program guides and web pages containing information about the watched content.
US application numbered US2014068670 discloses a system and method in which additional information about actors in series or movie scenes can be accessed in a system with an optional content playback service. The existing additional information can be accessed through a user interface where the user can interact on the screen, and the pictures and detailed information of the actors can be accessed. However, for this, additional information must be associated with each video in advance. In addition, it is not possible to identify athletes in events such as live sports competitions with this method.
Application numbered US2009091629 discloses a system in which a screenshot is taken on the playback device, face recognition is applied to the face in this screenshot, and the matches in the database used in face recognition are presented to the user. However, because such a scan searches the databases very broadly, substantial system resources are used, and since the number of similar people may be high, the probability of obtaining wrong results increases. In addition, the device on which the content is watched must be specially configured, and it is not possible to identify people on screens that are not programmed for this task. As a result, all the problems mentioned above have made it necessary to make an innovation in the relevant technical field.
BRIEF DESCRIPTION OF THE INVENTION
The present invention relates to a system to eliminate the above-mentioned disadvantages and bring new advantages to the relevant technical field.
An object of the invention is to provide a system and method to detect the people in the content displayed on the screen, such as the television screen, with less use of system resources and with increased accuracy.
Another object of the invention is to provide a system and method to obtain accurate results close to those of advanced face recognition algorithms by using simpler face recognition algorithms.
Another object of the invention is to reduce the system resources used by existing face recognition algorithms and to increase their accuracy.
Another object of the invention is to provide a system to identify the people on the screen without requiring the special programming of the device with the screen.
To achieve all the objects mentioned above and that will emerge from the following detailed description, the present invention is a recognition system that includes a user terminal with an image capture unit for identifying the said person in at least one image frame containing the face of at least one person on a screen that enables the display of image content. Accordingly, it comprises a processor unit that is associated to communicate with the said image capture unit, and the said processor unit is configured to:
- receive a first image captured by the image capture unit as input,
- detect at least one face region in the said first image,
- detect at least one auxiliary information region in the said first image,
- detect at least one sign and at least one text in the said auxiliary information region,
- create a filter comprising at least one parameter value with respect to at least one of the detected signs and text,
- access a database storing the person information comprising at least one face image of the persons such that the said person information is associated with at least one of the said parameters, and
- perform the face recognition process based on face images of persons who fit the filter created for at least one of the faces in the said face regions. Thus, by searching only in a person pool suitable for filters, the possibility of obtaining correct results is increased and system resources are used in a reduced way. In addition, accurate results can be obtained by using less complicated face recognition algorithms.
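The steps above can be sketched as a minimal pipeline. All names below (function names, filter keys, the toy person database) are illustrative assumptions rather than the patent's implementation; a real system would use an actual face detector, an OCR engine, and a face-matching step in place of the stubs.

```python
def build_filter(signs, texts):
    """Create filter parameters from detected signs and texts (illustrative)."""
    params = {}
    for text in texts:
        if text.lower() == "live":
            params["live"] = True        # live broadcast marker
        else:
            params["content_name"] = text  # e.g. a detected series name
    for sign in signs:
        params.setdefault("channel", sign)  # e.g. a detected channel logo
    return params

def recognize(face_regions, params, person_db):
    """Run recognition only against persons matching the filter parameters.
    A real system would compare face embeddings of `face_regions` here;
    this sketch only narrows the candidate pool."""
    return [p for p in person_db
            if all(p.get(k) == v for k, v in params.items() if k != "live")]

# Hypothetical person database associated with filter parameters.
person_db = [
    {"name": "Actor A", "content_name": "Series X", "channel": "CH1"},
    {"name": "Actor B", "content_name": "Series Y", "channel": "CH1"},
]
params = build_filter(signs=["CH1"], texts=["Series X"])
matches = recognize(face_regions=[(10, 10, 50, 50)], params=params,
                    person_db=person_db)
```

Only persons tagged with the detected content name and channel are ever compared against the detected faces, which is the source of the claimed resource savings.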
A possible embodiment of the invention is characterized in that the said database is configured to include person information indicating the time intervals at which persons are present in the said content or an organization related to the content, and the processor unit is configured to:
- detect that it has received a user entry from the user terminal that the content is broadcasted live, and
- add the current date and time information to the filter.
Another possible embodiment of the invention is characterized in that the said database is configured to include person information indicating the time intervals at which persons are present in the said content or an organization related to the content, and the processor unit is configured to:
- question whether one of the said texts expresses that the content is being broadcasted live; and
- add the current date and time information to the filter if it is determined that it expresses that it is broadcasted live. Thus, when it is determined from the texts or signs that the content belongs to a sports competition between the two teams, it is ensured that the filtration process is performed specifically for the people in the existing teams in this sports competition, and the system resources are used in a reduced way and a more accurate result is obtained.
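The live-broadcast branch above can be sketched as follows; the field names (`from`, `to`, `timestamp`) and the roster data are hypothetical, and the current time is passed in explicitly for clarity.

```python
from datetime import datetime

def add_live_filter(texts, params, now):
    """If any detected text indicates a live broadcast, add the current
    date/time to the filter (structure is illustrative)."""
    if any(t.strip().lower() == "live" for t in texts):
        params["timestamp"] = now
    return params

def matches_interval(person, params):
    """True if the person is with the club/content during the filtered time."""
    ts = params.get("timestamp")
    return ts is None or person["from"] <= ts <= person["to"]

now = datetime(2022, 6, 1)
params = add_live_filter(["LIVE"], {}, now)
roster = [
    {"name": "Current player", "from": datetime(2020, 1, 1), "to": datetime(2023, 1, 1)},
    {"name": "Former player", "from": datetime(2010, 1, 1), "to": datetime(2015, 1, 1)},
]
active = [p for p in roster if matches_interval(p, params)]
```

This is how the date/time parameter restricts a live sports search to the current squad rather than every person ever associated with the club.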
Another possible embodiment of the invention is characterized in that if the processor unit detects that the detected text is a subtitle text, it is configured to:
- access a subtitle database where subtitles related to content are stored in relation to content names,
- query the detected text in the subtitle database,
- detect the content names containing subtitles that match the detected text, and
- add the detected content name to the filter. Thus, since only the people in the content matching the subtitle are searched, the likelihood of the face recognition process obtaining correct results is increased and the system resources used in doing so are significantly reduced.
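The subtitle-lookup steps above can be sketched with a toy subtitle database; the dictionary structure and the substring match below are illustrative assumptions (a production system would likely use fuzzy or indexed text search).

```python
def filter_by_subtitle(detected_text, subtitle_db):
    """Return content names whose stored subtitles contain the detected text.
    `subtitle_db` maps content name -> list of subtitle lines (illustrative)."""
    return [name for name, subs in subtitle_db.items()
            if any(detected_text in s for s in subs)]

# Hypothetical subtitle database keyed by content name.
subtitle_db = {
    "Lord of the Rings: Two Towers": ["Legolas! What do your Elf eyes see?"],
    "Some Other Film": ["A different line."],
}
hits = filter_by_subtitle("What do your Elf eyes see?", subtitle_db)
```

Each returned content name would then be added to the filter as a parameter before face recognition runs.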
Another possible embodiment of the invention is characterized in that the processor unit is configured to display, on a user interface of the user terminal, at least one person matched with the face in the face region as a result of the face recognition process.
Another possible embodiment of the invention is characterized in that the processor unit is configured to request feedback from the said user interface on whether the matching person is correct or not and to edit the face recognition algorithm used in the face recognition process according to the feedback it receives.
Another possible embodiment of the invention is characterized in that the processor unit is configured to access a person detail information source containing additional information about the person and to display additional information about at least one of the matching persons in the user interface.
Another possible embodiment of the invention is characterized in that the processor unit obtains a face image of at least one person from the said additional information.
Another possible embodiment of the invention is characterized in that the processor unit is configured to display the matching results in the user interface to display the face image of the matching person and the first image on the same screen. Thus, the user can easily determine whether the match is correct or not.
Another possible embodiment of the invention is characterized in that the processor unit is configured to display the created filter in the user interface for the user to approve or edit, and to perform the face recognition processes if it receives input from the user terminal regarding the approval.
Another possible embodiment of the invention is characterized in that the said sign is at least one of a broadcast channel symbol, a symbol for the content, a sports club jersey, and a sports club symbol.
Another possible embodiment of the invention is characterized in that the said text is at least one of the subtitle, content name, content section information, text about whether the content is broadcasted live, text about the time the content is broadcasted, sports club name, sports club abbreviation, and competition score text.
Another possible embodiment of the invention is characterized in that the said screen and the user terminal are integrated, and the image capture unit has hardware configured to receive the screen image of the user terminal. Thus, identification can be performed by taking a screenshot on a screen such as a computer or smart TV.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 shows a representative view of the system.
Figure 2 shows a representative view of the first image.
Figure 3 shows a representative view of the system.
Figure 4 shows a representative view of the first image.
Figure 5 shows a schematic view of the system.
DETAILED DESCRIPTION OF THE INVENTION
In this detailed description, the subject matter of the invention is described by using examples only for a better understanding, which will have no limiting effect.
Referring to Figure 1, the invention is a recognition system comprising a user terminal (100) having an image capture unit (110) for receiving an image of the said screen (400), for the identification of the said person in at least one image frame including the face of at least one person on a screen (400) enabling the display of video content. A processor unit (not shown in the figure) allows narrowing the database to be searched for face recognition by using the text and signs in the image in addition to the face, i.e. applying an additional filter to the search, in order to identify the person matching a face in the images captured by the image capture unit (110). The face recognition system (10) comprises a user terminal (100) to enable image acquisition and display for the user; a server (200) that can communicate with the user terminal (100) through a communication network (300); and a data source (251) that has a database storing the person information containing at least one face image of the persons, associated with at least one of the said parameters, to which the said server (200) provides access to perform face recognition operations. In more detail, the face recognition system (10) may also include a person detail information source (252), accessible to the server (200), containing the details of the persons.
The said processor unit may be a terminal processor (120) of the user terminal (100). In a possible embodiment of the invention, the processor unit may be the server processor (210) of a server (200). In another embodiment of the invention, the processor unit may include a co-operating terminal processor (120) and server processor (210) to perform some of the steps of the invention.
The said user terminal (100) may include a communication unit (130) for communicating with the processor unit through the communication network (300). The communication network (300) may be the internet or a similar network. The communication unit (130) may be, for example, hardware that enables a wireless connection to the internet. The user terminal (100) may include a user interface to enable data to be presented to the user. The image capture unit (110) of the user terminal (100) may be a camera. The terminal processor (120) of the user terminal (100) may be associated with a memory unit (140) and may include software consisting of command lines recorded in the said memory unit (140) that contribute to the operation of the invention. The user terminal (100) may be a smartphone, tablet computer, smartwatch, general-purpose computer, etc.
The screen (400) of this description is a device for displaying content. It may be a monitor, television, projector, etc. The screen (400) may play content received from a content source or stored in a memory unit (140). The screen (400) may receive broadcast content such as, for example, satellite broadcast, terrestrial broadcast, or cable broadcast; it may also be able to receive content from an optional video source or a memory. The content may be a video or just an image frame; it may be, for example, a movie, a series, a sports competition, etc.
As the characteristic aspect of the invention, the processor unit receives as input the first image captured by the image capture unit (110). Then, it detects at least one face region (411) in the said first image. The face region (411) herein refers to the portion where the faces of the persons in the first image are detected. In the first image, there may be one or more people. The processor unit then detects at least one auxiliary information region (412) in the said first image. The auxiliary information region (412) may be in the body part associated with the detected face, in the corners of the screen (400), or in the region close to the lower edge of the screen (400). Auxiliary information regions are the regions where information such as the channel name, sports competition score, subtitle, athlete jersey, or content name (series name, movie name, etc.) is located. A representative view of the face region (411) and the auxiliary regions in the first image is given in Figure 2.
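The locations described above suggest a simple geometric check for candidate auxiliary information regions. The sketch below is an illustrative assumption, not part of the patent: the 0.15/0.85 fractions and the function name are hypothetical thresholds for "corner" and "lower edge".

```python
def likely_auxiliary_region(box, frame_w, frame_h):
    """Heuristic following the description: auxiliary information (channel
    logo, score, subtitle, content name) tends to appear in the corners or
    near the lower edge of the screen. `box` is (x, y, w, h) in pixels;
    the 0.15/0.85 thresholds are illustrative assumptions."""
    x, y, w, h = box
    near_bottom = y + h > 0.85 * frame_h
    in_corner = (x < 0.15 * frame_w or x + w > 0.85 * frame_w) and \
                (y < 0.15 * frame_h or y + h > 0.85 * frame_h)
    return near_bottom or in_corner

# A subtitle box near the bottom of a 1920x1080 frame qualifies,
# while a box in the middle of the frame does not.
subtitle_box = (100, 950, 200, 50)
center_box = (900, 500, 100, 50)
```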
In a possible embodiment of the invention, the processor unit may access the electronic program guide (EPG) of the channel when the channel name is detected and may determine the name of the content accordingly. It then updates the filter according to the name of the content.
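The EPG lookup in this embodiment can be sketched as a simple table query keyed by channel name and current time. The channel name, programme names, and table layout below are invented for illustration only.

```python
from datetime import datetime

# Hypothetical EPG lookup: given the detected channel name and the current
# time, return the programme (content) name to add to the filter.
# EPG_TABLE entries are illustrative, not real broadcast data.
EPG_TABLE = {
    "Channel7": [("18:00", "20:00", "Evening News"),
                 ("20:00", "22:00", "Medcezir")],
}

def current_programme(channel, now):
    hhmm = now.strftime("%H:%M")
    for start, end, name in EPG_TABLE.get(channel, []):
        if start <= hhmm < end:
            return name
    return None  # channel unknown or no programme scheduled
```

The returned content name would then be added to the filter as a parameter value.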
The processor unit detects at least one sign and at least one text in the auxiliary information region (412). The processor unit then creates a filter containing at least one parameter value with respect to at least one of the detected signs and texts. For example, when it detects a series name in the text, the processor unit sets the series name as a parameter. The processor unit then accesses the aforementioned database and performs face recognition for at least one of the faces in the said face regions (411), based on the face images of persons who fit the created filter. In the case where the series name is a parameter, face recognition is performed using only the images of people indexed as players in this series. Thus, fewer system resources are used, and faster and more accurate results can be obtained.
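The filtering step above can be sketched as follows. This is a minimal illustration under stated assumptions: the person records, face encodings, and the nearest-neighbour distance are invented stand-ins, not the patent's actual database schema or recognition algorithm.

```python
# Sketch of filtered face recognition: the database associates each person
# with parameter values (here, content names), and recognition compares the
# query face only against candidates that satisfy the filter.
# All names, contents, and encodings are illustrative.

PERSON_DB = [
    {"name": "Actor A", "contents": ["Series X"], "encoding": [0.1, 0.2]},
    {"name": "Actor B", "contents": ["Series X"], "encoding": [0.4, 0.1]},
    {"name": "Actor C", "contents": ["Movie Y"],  "encoding": [0.9, 0.8]},
]

def candidates(filter_params):
    """Return only the persons whose indexed contents match the filter."""
    wanted = filter_params.get("content_name")
    return [p for p in PERSON_DB if wanted in p["contents"]]

def recognise(face_encoding, filter_params):
    """Nearest-neighbour match restricted to the filtered candidate pool."""
    pool = candidates(filter_params)
    if not pool:
        return None
    def dist(p):
        return sum((a - b) ** 2 for a, b in zip(p["encoding"], face_encoding))
    return min(pool, key=dist)["name"]
```

Because the pool is narrowed before matching, fewer comparisons are made, which is the resource saving the description refers to.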
In a possible embodiment of the invention, the database may also include the date and time ranges in which persons are included in the contents or in organizations associated with the content. For example, the date range in which a person appears in a series, or the date range in which an athlete belongs to a sports club, can also be associated with the persons. Accordingly, in a possible embodiment of the invention, the processor unit is configured to add the current date and time information to the filter if it receives user input from the user terminal (100) indicating that the content is live-streamed. In another possible embodiment of the invention, the processor unit queries whether one of the said texts expresses that the content is broadcasted live, and if so, adds the current date and time information to the filter. This text may be, for example, the expression "live". Thus, by updating the filter according to the instantaneous date and time, for example when it is desired to recognize a person in a football match, the search is made over the current staff and players of the football clubs. From other auxiliary information, as shown in Figure 3 or Figure 4, club abbreviations can be detected, and scanning can be performed according to, for example, a club name parameter determined from these abbreviations.
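The date-interval filter for live content can be sketched as below. The roster records and club abbreviation are invented for illustration; the point is only that a live broadcast restricts the candidate pool to persons whose membership interval contains the current date.

```python
from datetime import date

# Sketch of the live-content date filter: each person record carries the
# interval in which they belong to the organisation (e.g. a sports club),
# and the current date, added to the filter for live broadcasts, restricts
# the pool to currently-active members. All data is illustrative.

ROSTER = [
    {"name": "Player A", "club": "GS", "from": date(2019, 7, 1), "to": date(2023, 6, 30)},
    {"name": "Player B", "club": "GS", "from": date(2015, 7, 1), "to": date(2018, 6, 30)},
]

def active_members(club, on_date):
    """Persons in the club whose membership interval contains on_date."""
    return [p["name"] for p in ROSTER
            if p["club"] == club and p["from"] <= on_date <= p["to"]]
```

For a match broadcast live on 2022-10-22, only "Player A" would remain in the recognition pool.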
In another possible embodiment of the invention, the dominant colors or the field color on the screen (400) can be detected as a sign, a sport-type parameter determined from the field color can be added to the filter, and scanning can then be performed over the staff of the sports clubs in that sports branch.
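One way to realize this embodiment is to quantize the frame's pixels into coarse color labels and map the dominant label to a sport branch. The color-to-sport mapping below is an assumption for illustration, not part of the patent text.

```python
from collections import Counter

# Illustrative mapping from dominant field colour to a sport-type filter
# parameter. The colour labels and the mapping itself are assumptions.
SPORT_BY_FIELD_COLOUR = {
    "green":  "football",    # grass pitch
    "orange": "basketball",  # hardwood court
    "blue":   "swimming",    # pool
}

def dominant_colour(pixels):
    """pixels: iterable of coarse colour labels after quantisation."""
    return Counter(pixels).most_common(1)[0][0]

def sport_filter(pixels):
    """Return the sport-type parameter for the dominant field colour."""
    return SPORT_BY_FIELD_COLOUR.get(dominant_colour(pixels))
```

The returned sport type would then restrict the scan to clubs of that branch.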
The processor unit is configured to request feedback on whether the matching person is correct from the said user interface and to edit the face recognition algorithm used in the face recognition process according to the feedback it receives.
In a possible embodiment, the face recognition system (10) includes a subtitle database. The subtitle database contains subtitles in various languages associated with content names. If the processor unit detects that a text in the auxiliary information region (412) is a subtitle, it queries that subtitle in the subtitle database (253). As a result of the query, it adds the content names containing the subtitle as parameters to the filter. For example, when a subtitle is detected in the auxiliary information region (412) and it is determined that this subtitle is "Legolas! What do your Elf eyes see?", the contents containing this subtitle are determined from the subtitle database. In this case, one of the contents containing this subtitle can be determined as "The Lord of the Rings: The Two Towers". The processor unit filter is arranged to include the parameter "The Lord of the Rings: The Two Towers". The processor unit then determines the matches by running the face recognition algorithm based on the players of the content with this content name from the data source (251). In this case, for example, it can be ensured that the person's name "Viggo Mortensen" is displayed in the user interface as a result. The processor unit may also display the data received from the person detail information source (252) in the user interface. This information may include, for example, the date and place of birth of the identified person, the movies they have played in, etc. Thus, by narrowing the pool of people for face recognition, the probability of a correct result is increased and fewer system resources are used.
The said face recognition algorithm may be any face recognition algorithm known in the art.
In a possible embodiment of the invention, the face image of the person detected by face recognition is displayed in the user interface at the same time as the first image. Thus, the user can easily observe whether the match is correct or not. In a possible embodiment of the invention, these images can be presented side by side. In a possible embodiment of the invention, these images can be displayed overlapping each other, with an adjustable transparency ratio.
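The overlapping display mode amounts to alpha-blending the matched face image over the first image. The following minimal sketch represents images as nested lists of grey values for brevity; a production system would blend full-colour pixel buffers.

```python
# Minimal alpha-blend sketch for the overlapping display mode: the matched
# face image is blended over the first image with an adjustable
# transparency ratio. Grey-value grids stand in for real images.

def blend(first_image, face_image, alpha):
    """alpha = 0.0 shows only the first image, 1.0 only the face image."""
    return [[round((1 - alpha) * a + alpha * b)
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first_image, face_image)]
```

Varying `alpha` lets the user fade between the broadcast frame and the database face image to judge the match.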
In a possible embodiment of the invention, an entry is requested from the user interface as to whether the matched person is the right person. The processor unit can update the face recognition algorithm or edit the filter according to the input it receives from the user terminal (100).
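One simple form such an update could take is adjusting a match-distance threshold from the feedback. The update rule, step size, and bounds below are assumptions for illustration; the patent does not specify how the algorithm is edited.

```python
# Sketch of feedback-driven tuning (an assumed rule, not the patent's):
# a confirmed match relaxes the acceptance threshold slightly, a rejected
# match tightens it, clamped to a sane range.

def update_threshold(threshold, correct, step=0.02, lo=0.30, hi=0.70):
    """Return the new match-distance threshold after one feedback event."""
    threshold += step if correct else -step
    return min(hi, max(lo, threshold))
```

Repeated user feedback would thus gradually calibrate how strict the matcher is.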
In a possible embodiment of the invention, the screen (400) and the user terminal (100) are integrated. This embodiment may be provided in a smart television. The image capture unit (110) captures the screen (400) image. In this embodiment, the integrated screen (400) and user terminal (100) may be a computer, a smart television, or a tablet computer.
Text detection and text reading processes in the auxiliary information region (412) can be performed with optical character recognition algorithms.
The scope of protection of the invention is specified in the attached claims and cannot be limited to those explained for exemplary purposes in this detailed description. It is evident that a person skilled in the art may produce similar embodiments in light of the above without departing from the main theme of the invention.
REFERENCE NUMBERS GIVEN IN THE FIGURES
10 Face recognition system
100 User terminal
110 Image capture unit
120 Terminal processor
130 Communication unit
140 Memory unit
200 Server
210 Server processor
251 Data source
252 Person detail information source
253 Subtitle database
300 Communication network
400 Screen
410 First image frame
411 Face region
412 Auxiliary information region


CLAIMS
1. A face recognition system (10) for identifying the said person in at least one image frame including the face of at least one person on a screen (400) that allows displaying visual content, comprising a user terminal (100) having an image capture unit (110) for capturing the image of the said screen (400), characterized in that it comprises a processor unit associated to communicate with the said image capture unit (110), and the processor unit is configured to:
- receive a first image captured by the image capture unit as input,
- detect at least one face region (411) in the said first image,
- detect at least one auxiliary information region (412) in the said first image,
- detect at least one of at least one sign and at least one text in the said auxiliary information region (412),
- form a filter comprising at least one parameter value with respect to at least one of the detected signs and text,
- access a database storing the person information comprising at least one face image of the persons such that the said person information is associated with at least one of the said parameters, and
- perform the face recognition process based on face images of persons who fit the filter created for at least one of the faces in the said face regions (411).
2. A face recognition system (10) according to claim 1, characterized in that the said database is configured to include the person information indicating the time intervals when the persons are in the said content or in an organization related to the content, and the processor unit is configured to
- detect that it has received a user entry from the user terminal that the content is being broadcasted live, and
- add the current date and time information to the filter.
3. A face recognition system (10) according to claim 1, characterized in that the said database is configured to include the person information indicating the time intervals when the persons are in the said content or in an organization related to the content, and the processor unit is configured to
- query whether one of the said texts expresses that the content is being broadcasted live, and
- add the current date and time information to the filter if it is determined that the content is broadcasted live.
4. A face recognition system (10) according to claim 1, characterized in that, if the processor unit detects that the detected text is a subtitle text, it is configured to
- access a subtitle database (253) where subtitles related to content are stored in relation to content names,
- query the detected text in the subtitle database (253),
- detect the content names containing subtitles that match the detected text, and
- add the detected content name to the filter.
5. A face recognition system (10) according to claim 1, characterized in that the processor unit is configured to display, in the user interface of the user terminal (100), at least one person matched to the face in the face region (411) as a result of the face recognition process.
6. A face recognition system (10) according to claim 5, characterized in that the processor unit is configured to request feedback on whether the matching person is correct from the said user interface and to edit the face recognition algorithm used in the face recognition process according to the feedback it receives.
7. A face recognition system (10) according to claim 5, characterized in that the processor unit is configured to access a person detail information source (252) containing additional information about the person and to display additional information about at least one of the matching persons in the user interface.
8. A face recognition system (10) according to claim 7, characterized in that the said additional information comprises a face image of at least one person.
9. A face recognition system (10) according to claim 8, characterized in that the processor unit is configured to display the matching results on the user interface such that the face image of the matching person and the first image are displayed on the same screen (400).
10. A face recognition system (10) according to claim 1, characterized in that the processor unit is configured to display the created filter in the user interface for the user to approve or edit, and to perform the face recognition processes in case of receiving an input related to the approval from the user terminal (100).
11. A face recognition system (10) according to claim 1, characterized in that the said sign is at least one of a broadcast channel symbol, a symbol related to the content, a sports club jersey, and a sports club symbol.
12. A face recognition system (10) according to claim 1, characterized in that the said text is at least one of a subtitle, a content name, content section information, a text on whether the content is broadcasted live, a text on the time the content is broadcasted, a sports club name, a sports club abbreviation, and a competition score text.
13. A face recognition system (10) according to claim 1, characterized in that the said processor unit is a terminal processor (120).
14. A face recognition system (10) according to claim 1, characterized in that the said processor unit is a server processor (210).
15. A face recognition system (10) according to claim 1, characterized in that the said screen (400) and the user terminal (100) are integrated, and the image capture unit (110) has hardware configured to receive the screen image of the user terminal (100).
PCT/TR2022/051066 2021-10-22 2022-09-30 A face recognition system to identify the person on the screen WO2023069047A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
TR2021/016527A TR2021016527A2 (en) 2021-10-22 2021-10-22 A FACE RECOGNITION SYSTEM TO IDENTIFY PEOPLE ON THE SCREEN
TR2021/016527 2021-10-22

Publications (1)

Publication Number Publication Date
WO2023069047A1 true WO2023069047A1 (en) 2023-04-27

Family

ID=85113409


Country Status (2)

Country Link
TR (1) TR2021016527A2 (en)
WO (1) WO2023069047A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114640863A (en) * 2022-03-04 2022-06-17 广州方硅信息技术有限公司 Method, system and device for displaying character information in live broadcast room and computer equipment

Citations (3)

Publication number Priority date Publication date Assignee Title
US20120045093A1 (en) * 2010-08-23 2012-02-23 Nokia Corporation Method and apparatus for recognizing objects in media content
US20140282660A1 (en) * 2013-03-14 2014-09-18 Ant Oztaskent Methods, systems, and media for presenting mobile content corresponding to media content
US20160371534A1 (en) * 2015-06-16 2016-12-22 Microsoft Corporation Automatic recognition of entities in media-captured events


Also Published As

Publication number Publication date
TR2021016527A2 (en) 2021-11-22
