CN113273171A - Image processing apparatus, image processing server, image processing method, computer program, and storage medium

Info

Publication number: CN113273171A
Application number: CN201980088091.XA
Authority: CN (China)
Prior art keywords: image processing, information, specific object, processing apparatus, server
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 吉田武弘, 白川雄资, 春山裕介
Current Assignee: Canon Inc
Original Assignee: Canon Inc
Priority claimed from: JP2018209469A, JP2018209494A, JP2018209480A
Application filed by: Canon Inc
Publication of: CN113273171A

Classifications

    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items, of sport video content
    • G03B15/00 Special procedures for taking photographs; Apparatus therefor
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N23/61 Control of cameras or camera modules based on recognised objects
    • H04N23/611 Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N23/633 Control of cameras or camera modules by using electronic viewfinders for displaying additional information relating to control or operation of the camera
    • H04N23/672 Focus control based on electronic image sensor signals based on the phase difference signals
    • H04N23/695 Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • H04N23/73 Circuitry for compensating brightness variation in the scene by influencing the exposure time
    • H04N23/90 Arrangement of cameras or camera modules, e.g. multiple cameras in TV studios or sports stadiums
    • H04N23/959 Computational photography systems for extended depth of field imaging by adjusting depth of field during image capture, e.g. maximising or setting range based on scene characteristics
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G01S5/16 Position-fixing by co-ordinating two or more direction or position line determinations using electromagnetic waves other than radio waves

Abstract

When a professional photographer or a general viewer captures video, position information of a specific object can be displayed in a timely manner. The image processing apparatus includes: a display section configured to display an image; a selection section configured to select a specific object from the image displayed on the display section; a specification information generation section configured to generate specification information of the specific object selected by the selection section; a transmission section configured to transmit the specification information generated by the specification information generation section to a server; an acquisition section configured to acquire, from the server, position information of the specific object based on the specification information; and a control section configured to cause the display section to display additional information based on the position information of the specific object acquired by the acquisition section.

Description

Image processing apparatus, image processing server, image processing method, computer program, and storage medium
Technical Field
The present invention relates to an image processing apparatus and the like used for photography and video surveillance.
Background
In recent years, with the progress of internationalization, many visitors have come to visit Japan. In addition, at sporting events, opportunities to photograph athletes from various countries have also increased remarkably.
However, in the scene of a sports match, for example, it is difficult for professional photographers and ordinary people taking photographs to find a specific athlete among many athletes. Moreover, especially at sporting events, the players move rapidly and cross one another during play, so a player's position is often lost from view. This is not limited to sports matches; the same can occur when photographing or monitoring a specific person in a crowd.
Patent document 1 discloses a plurality of cameras that capture an object from a plurality of directions, and a plurality of image processing apparatuses that each extract a predetermined region from the images captured by the corresponding camera. It also discloses an image generating device that generates a virtual viewpoint image based on the image data of the predetermined regions extracted by the plurality of image processing apparatuses from the images captured by the plurality of cameras.
Patent document 2 discloses an auto focus detection apparatus that drives a focus lens based on an AF evaluation value acquired from a captured image and controls auto focus detection.
Reference list
Patent document
Patent document 1: Japanese Patent Laid-Open No. 2017-211828
Patent document 2: Japanese Patent No. 5322629
Disclosure of Invention
Technical problem
However, in a sports scene or the like in which many players are gathered together, players may overlap one another or be hidden from view. In addition, a player may move out of sight, making it even more difficult to photograph that player at the right moment.
Further, although a professional photographer in particular needs to send a captured photograph to a news editor or the like immediately, if the photographer does not know the referee's decision, it takes time to recognize that decision, which is a disadvantage.
Furthermore, even if the photographer finds the athlete he or she wishes to photograph, the photographer needs to keep tracking that athlete after focusing on him or her. In sports involving fast movement, such tracking is very difficult, and if the photographer concentrates on the tracking, good pictures may be missed, which is a disadvantage.
In addition, although omnidirectional video and various information about the field of play can be grasped on the server side, and valuable information inside and outside the sports field can be obtained, conventional systems have the problem that the server is not sufficiently utilized.
Also, a general user watching a match on a terminal in the arena or at home often loses sight of a specific player or cannot keep up with the state of the match. Likewise, in motor racing, air racing, horse racing, and the like, objects such as a particular vehicle, airplane, or horse may not be visible. In addition, when a specific person is tracked on a street corner, that person may be lost in the crowd. Moreover, when a person concentrates on visually tracking a specific object of interest, photographing, focusing, exposure adjustment, and the like for that object may not be performed smoothly.
The present invention is made to solve the above-described problems, and provides an image processing apparatus capable of displaying useful information to a photographer or an observer in a timely manner.
Means for solving the problems
An image processing apparatus, comprising: a display section configured to display an image; a selection section configured to select a specific object from the image displayed on the display section; a specification information generation section configured to generate specification information of the specific object selected by the selection section; a transmission section configured to transmit the specification information generated by the specification information generation section to a server; an acquisition section configured to acquire, from the server, position information of the specific object based on the specification information; and a control section configured to cause the display section to display additional information based on the position information of the specific object acquired by the acquisition section.
Advantageous effects of the invention
According to the present invention, if a user designates a specific object as a target, the user can easily grasp the position of the specific object on the screen even when, for example, the user can hardly see the specific object while monitoring or photographing it.
Drawings
Fig. 1 is a diagram schematically showing the configuration of a system using an exemplary image processing apparatus.
Fig. 2 is a detailed block diagram of the server side.
Fig. 3 is a detailed block diagram of the terminal side.
Fig. 4 is a detailed block diagram of the terminal side.
Fig. 5 is a diagram showing an example of a focused athlete display start sequence.
Fig. 6 is a diagram showing an example of a focused athlete display tracking sequence.
Fig. 7 is a diagram showing another example of a focused athlete display tracking sequence.
Fig. 8 is a diagram showing a focused athlete display tracking control flow on the camera side.
Fig. 9 is a diagram showing another example of the focused athlete display tracking control flow on the camera side.
Fig. 10 is a block diagram showing a functional configuration example of the tracking unit 371 of the digital camera.
Fig. 11 is a diagram showing a focused athlete detection control flow on the server side.
Fig. 12 is a diagram showing a flow of detecting a player uniform number on the server side.
Fig. 13 is a diagram showing another example of the focused athlete detection control flow on the server side.
Fig. 14 is a diagram showing another example of the focused athlete detection control flow on the server side.
Fig. 15 is a diagram showing another example of the focused athlete detection control flow on the server side.
Fig. 16 is a diagram showing another example of the focused athlete detection control flow on the server side.
Fig. 17 is a diagram showing a display example of the camera display unit.
Fig. 18 is a diagram showing another example of the focused athlete display tracking control flow on the camera side.
Fig. 19 is a diagram showing another example of the focused athlete display tracking control flow on the camera side.
Fig. 20 is a diagram showing another example of the focused athlete display tracking control flow on the camera side.
Fig. 21 is a diagram showing another example of the focused athlete display tracking control flow on the camera side.
Fig. 22 is a diagram showing a server-side player foul detection flow.
Fig. 23 is a diagram showing a try determination control flow on the server side.
Fig. 24 is a diagram showing a flow of try determination based on the ball position and the referee's actions.
Fig. 25 is a diagram showing a flow of try and score determination based on the screen display.
Fig. 26 is a diagram showing a flow of try determination from audio information.
Fig. 27 is a diagram showing a flow of try determination control on the camera side.
Fig. 28 is a diagram showing a server-side player foul decision control flow.
Fig. 29 is a diagram showing a server-side player foul determination flow based on the referee's actions and voice.
Fig. 30 is a diagram showing a foul judgment flow on the camera side.
Fig. 31 is a diagram showing the referee's signal with respect to a try determination.
Fig. 32 is a diagram showing an example of a focused athlete detection control flow including player substitutions.
Fig. 33 is a diagram showing an example of a focused athlete detection control flow.
Fig. 34 is a diagram showing an example of the AF display of a focused athlete on the camera display unit.
Fig. 35 is a diagram showing an example of a focused athlete detection control and AF flow.
Fig. 36 is a diagram showing another example of a focused athlete detection control and AF flow.
Fig. 37 is a diagram showing a display example of the camera display unit at the time of automatic tracking.
Fig. 38 is a diagram showing a display example of a camera display unit of a focused athlete at the time of automatic tracking.
Fig. 39 is a diagram showing an example of a focused athlete detection control flow at the time of automatic tracking.
Fig. 40 is a diagram showing an example of a focused athlete detection control flow at the time of automatic tracking.
Fig. 41 is a diagram showing an example of a focused athlete change detection control flow.
Fig. 42 is a diagram showing an example of a focused athlete change detection control flow.
Fig. 43 is a diagram showing an example of a substitute athlete identification control flow.
Detailed Description
Embodiments for implementing the present invention will be described using examples.
First, an outline of a system using an image processing apparatus supporting photography and video monitoring will be described using fig. 1.
In fig. 1, the server (image processing server) side, which has a plurality of cameras for the server (fixed cameras, mobile cameras mounted on drones, or the like), tracks in real time the position of a focused player (specific object) across the entire field of the arena as well as the latest situation of the match. In addition, an example is shown in which the server provides information necessary for, for example, camera shooting or image monitoring to terminals carried by individual viewers in a timely manner.
Typically, a professional photographer or a general photographer may be in a location from which a player cannot be recognized or tracked at the angle or field of view being captured with the camera. The same applies to spectators who are outside the arena and are not taking pictures. In contrast, the server-side system can grasp and map information about the entire game field (coordinate information of the field, etc.) and omnidirectional video in advance based on the videos from the plurality of cameras for the server.
Therefore, information that is difficult for an individual user to see and understand can be grasped and distributed by the server to significantly improve the service to the audience.
In other words, the plurality of cameras for the server (fixed cameras and mobile cameras) can be used to track each player's position, the score, fouls, referee decisions, and other up-to-date conditions. Further, the data may be analyzed by the server based on information displayed on a large screen or the like. Therefore, the overall situation can be identified accurately, and information can be transmitted in a timely manner to terminals (camera terminals, smartphones, tablet computers, and the like) owned by professional photographers and spectators. The spectators can thus grasp the latest situation of the match in time. In particular, although professional photographers need to send captured photographs to a news editor or the like immediately, it is difficult to grasp the entire situation of a game accurately only by looking at the camera screen, because the field of view is small.
However, if the configuration of the present example is used, the situation of a game or the like can be quickly known, and a photo to be sent to a news editor or the like can be quickly selected.
Further, as a terminal (image processing apparatus) used by professional photographers and viewers, a digital camera, a smart phone, a configuration for connecting a camera and a smart phone, a tablet PC, a TV, or the like is conceivable. Viewers watching the game using terminals (image processing apparatuses) such as PCs, TVs, etc. at home can be provided with the same service through the internet or TV broadcasting, so that the viewers can grasp the situation of the game more accurately and enjoy the game with more fun.
In fig. 1, reference numerals 101 to 103 denote cameras for the server: 101 (fixed camera 1), 102 (fixed camera 2), and 103 (fixed camera 3). Together with 104 (large screen), 110 (server), 111 (input section), and 112 (base station), they perform video acquisition, audio acquisition, and the like to provide information to professional photographers and general viewers. Although the number of cameras for the server is three (cameras 101 to 103) in the present embodiment, one or more cameras may be used. In addition, these cameras for the server need not be fixed cameras and may be, for example, cameras mounted on a drone or the like. Besides video acquisition and audio acquisition, input information other than video (e.g., audio information) may be input from the input section to expand services for professional photographers, viewers, and the like.
Reference numeral 105 denotes a wired or wireless LAN or the internet, and 106 denotes a connection line for inputting information output from the input section 111 to the server 110. Reference numeral 107 denotes a connection line for transmitting/receiving signals to/from the base station 112, and 108 denotes an antenna unit of the base station for performing wireless communication.
In other words, the 100-series blocks are blocks for supporting a professional photographer, a general viewer, or the like to perform video shooting or the like.
Meanwhile, in fig. 1, reference numerals 401 (terminal 1), 402 (terminal 2), and 403 (terminal 3) denote terminals, for example, video display terminal devices such as cameras, smartphones, tablet PCs, and TVs with which professional photographers and viewers photograph and monitor. Here, reference numerals 404 (antenna), 405 (antenna), and 406 (antenna) denote the antennas of 401 (terminal 1), 402 (terminal 2), and 403 (terminal 3), respectively, for performing wireless communication.
For the server to detect the position of the focused athlete, for example, ID information of the focused athlete or the like is transmitted from the terminal to the server side, and various information such as position information about the athlete is transmitted from the server side to the terminal. Since the athletes are moving and the situation of the match is changing, the focused athlete must be detected within a short time. Therefore, the wireless communication in this case uses, for example, 5G or the like.
Further, 401 (terminal 1), 402 (terminal 2), and 403 (terminal 3) may have a configuration in which a camera, a smartphone, or the like is connected in combination. On the lower right side of fig. 1, reference numeral 301 denotes a smartphone that mainly controls communication with a server. In addition, if application software is installed in the smartphone, various video acquisition services can be implemented. In addition, reference numeral 300 denotes a (digital) camera, which is an image processing apparatus that allows a professional photographer or a viewer to take a photograph or monitor an image. Here, the camera 300 is connected to the smartphone 301 through USB or bluetooth (registered trademark). Reference numeral 320 denotes an antenna of the smartphone 301 for wireless communication with the base station 112.
Further, although the terminal exchanges video and control signals with the server wirelessly when the terminal is a smartphone or the like, the connection for communicating with the terminal may switch adaptively between wireless and wired communication. The connection may be controlled such that, for example, if the wireless communication environment is 5G, wireless communication is used, whereas if it is LTE, information with a large amount of data is communicated over the wire and control signals with a small amount of data are communicated wirelessly. Further, the connection may be switched to wired communication according to the degree of congestion of the wireless communication line.
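As an illustration, the adaptive link selection described above might look like the following Python sketch; the congestion metric, thresholds, and names are assumptions for illustration, not part of the disclosure.

```python
from enum import Enum, auto

class Network(Enum):
    FIVE_G = auto()
    LTE = auto()

def choose_transport(network: Network, payload_bytes: int,
                     congestion: float, threshold: float = 0.8,
                     bulk_cutoff: int = 1_000_000) -> str:
    """Pick a link for one message, following the policy in the text.

    congestion is a hypothetical 0.0-1.0 estimate of wireless line load;
    payloads above bulk_cutoff count as "a large amount of data".
    """
    if congestion > threshold:
        return "wired"                 # fall back when the radio link is busy
    if network == Network.FIVE_G:
        return "wireless"              # 5G carries both video and control
    if network == Network.LTE:
        # LTE: large video payloads go over the wire,
        # small control signals stay wireless.
        return "wired" if payload_bytes > bulk_cutoff else "wireless"
    return "wired"
```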
Next, the block configuration on the server side will be described in detail using fig. 2. The same reference numerals in fig. 2 as in fig. 1 denote the same constituent parts, and a description thereof will be omitted.
Reference numeral 201 denotes an ethernet (registered trademark) controller, and reference numeral 204 denotes a detection unit that detects a playing position from a player's role (so-called position). Here, the player's role (position) is set in advance by registration or the like. As roles (positions) in rugby, for example, numbers 1 and 3 are called props, 2 is called the hooker, 4 and 5 are called locks, 6 and 7 are called flankers, 8 is called the number eight, 9 is called the scrum-half, and 10 is called the fly-half. In addition, 11 and 14 are called wings, 12 and 13 are called centres, and 15 is called the full-back.
In terms of player positioning, when the team is engaged in an attack, the forwards are typically positioned at the front of the attack and the backs behind it.
In other words, since a player's approximate position is determined by the player's role (position), the focused player can be tracked efficiently and accurately by knowing that player's role (position) and tracking accordingly.
In addition, a player's role is typically identified by the uniform number. However, in certain situations, the number 10 player may be injured, the number 15 player may move to fly-half (into the number 10 position), and a replacement player may come into the number 15 position. Here, the replacement player's uniform number may be any of 16 to 23. Therefore, the position cannot always be confirmed from the uniform number alone. Thus, although the detection unit 204 detects a playing position from the players' preset roles and information on the detected playing position is input to the CPU 211 in the server 110, the preset roles may change owing to player substitutions or the like during the match.
Reference numeral 205 denotes a contour information detection unit. For example, while professional photographers and viewers shoot video from their own positions and angles at their cameras' magnifications and monitor the video on their terminals, the server 110 notifies each of the terminals 401 to 403 and the like of the position of the focused player. Further, when the server 110 also notifies each of the terminals 401 to 403 and the like of contour information of the focused player being photographed, each terminal can identify the focused player more reliably. The contour information detected by block 205 is sent to the CPU 211.
Reference numeral 206 denotes an athlete face recognition unit that finds an athlete from a video using AI, particularly an image recognition technique such as deep learning, based on previously registered face photograph information of the athlete concerned. Information of the face recognition result detected by the face recognition unit 206 is input to the CPU 211.
Reference numeral 207 denotes a physique recognition unit of an athlete which finds a focused athlete using the above-described image recognition technique based on physique information of the athlete registered in advance.
Reference numeral 208 denotes a uniform number detection unit that finds the focused player using the above-described image recognition technique based on a number registered in advance (a uniform number or the like). It goes without saying that when detecting a player's number, not only the number on the back of the uniform but also the number on the front can be detected. Reference numeral 209 denotes a position information creation unit that recognizes the position, direction, and angle of view of each camera from position information of the cameras 101, 102, and 103 obtained using GPS, together with information on each camera's direction and angle of view. Further, information on the absolute position of a player on the playing field is acquired from the video of each camera using triangulation.
The position information creation unit 209 may also acquire, from the video, the on-screen position of a reference index installed in advance in the field of play, such as a pole or a line of the field (e.g., a sideline or an end line), for reference position detection. The absolute position of the focused player on the field of the arena can then be obtained using the pole, line, or the like as reference coordinates.
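To make the triangulation step concrete, here is a minimal sketch (not the patent's implementation): given two server cameras whose field positions and bearing angles toward the player are known, the player's absolute position is the intersection of the two bearing rays. All names are illustrative.

```python
import math

def triangulate_2d(cam1, bearing1, cam2, bearing2):
    """Intersect two bearing rays on the field plane.

    cam1/cam2 are (x, y) camera positions in field coordinates, and
    bearing1/bearing2 are directions (radians) toward the player,
    derived from each camera's orientation and the player's offset in
    its image. Returns the player's absolute (X, Y), or None if the
    rays are nearly parallel.
    """
    d1 = (math.cos(bearing1), math.sin(bearing1))
    d2 = (math.cos(bearing2), math.sin(bearing2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]      # cross product of directions
    if abs(denom) < 1e-9:
        return None
    dx, dy = cam2[0] - cam1[0], cam2[1] - cam1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom      # distance along ray 1
    return (cam1[0] + t * d1[0], cam1[1] + t * d1[1])
```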
Reference numeral 210 denotes a camera position information/direction detection unit that detects the position of each terminal, the direction in which the camera of each terminal faces, and the angle of view from the position information, the direction information, and the angle of view information of each terminal transmitted from each of the terminals 401 to 403.
Reference numeral 211 denotes a Central Processing Unit (CPU) serving as a computer, which is a central arithmetic processing device that performs the control described in the following examples based on a computer program for control stored in a program memory 712 serving as a storage medium. The CPU also functions as a display control section that controls information displayed on a display unit 214 described below. Reference numeral 213 denotes a data memory that stores various data referred to by the CPU 211. The data memory 213 stores information on past matches, information on past players, information on today's match, information such as the number of spectators and the weather, information on focused players, the current situation of the players, and the like. The information on the focused players includes information on their faces, uniform numbers, builds, and the like.
Reference numeral 1101 denotes a data bus inside the server 110.
Next, the terminal side serving as an image processing apparatus will be described in detail using fig. 3 and 4. Fig. 3 and 4 are block diagrams showing a configuration example of a terminal, which shows the overall configuration of a digital camera 500 as an example of a terminal using two figures.
The digital cameras shown in fig. 3 and 4 can capture moving images and still images and record the captured information. Although both fig. 3 and 4 show a Central Processing Unit (CPU) 318, a program memory 319, and a data memory 320, these are the same blocks; there is only one CPU, one program memory, and one data memory.
In fig. 3, reference numeral 301 denotes an ethernet (registered trademark) controller. Reference numeral 302 denotes a storage medium that stores moving images and still images taken using a digital camera in a predetermined format.
Reference numeral 303 denotes an image sensor serving as an image device such as a CCD or a CMOS, which converts an optical signal of an optical image into an electric signal, and further converts analog information of the image information into digital data and outputs the data. Reference numeral 304 denotes a signal processing unit that performs various corrections such as white balance correction or gamma correction on the digital data output from the image sensor 303 and outputs the corrected data. Reference numeral 305 denotes a sensor driving unit that drives horizontal/vertical lines for reading information from the image sensor 303, and controls the timing at which the image sensor 303 outputs digital data, and the like.
Reference numeral 306 denotes an operation unit input section. The input is performed by selecting or setting various conditions for photographing with the digital camera, or according to a trigger operation for photographing, a selection operation for using a flash, an operation for replacing a battery, or the like. Further, the operation unit input section 306 may select/set whether or not to perform Autofocus (AF) on the focused player based on the position information from the server. Information for selecting/setting whether or not Autofocus (AF) is to be performed on the athlete of interest is output from the operation unit input section 306 to the bus 370.
Further, the operation unit input section 306 may select/set whether or not the focused athlete is to be automatically tracked based on the position information from the server. Information on which player is to be designated as the focused player (specific object), whether automatic tracking of the focused player is to be performed based on the position information from the server, and the like is generated by the operation unit input section 306 serving as the selection section. In other words, the operation unit input section 306 functions as a specification information generation section that generates specification information on a specific object.
Reference numeral 307 denotes a wireless communication unit that functions as a transmitting/receiving section for wireless communication between a camera terminal owned by a professional photographer, a general viewer, or the like and the server side. Reference numeral 308 denotes a magnification detection unit that detects the photographing magnification of the digital camera. Reference numeral 309 denotes an operation unit output section for displaying UI information such as a menu or setting information on an image display unit 380 that displays information captured by the digital camera or the like. Reference numeral 310 denotes a compression/decompression circuit: digital data (RAW data) from the image sensor 303 is developed by the signal processing unit 304, and the compression/decompression circuit 310 converts the developed data into a JPEG image file or an HEIF image file, or compresses the RAW data unchanged so as to produce a RAW image file.
Meanwhile, when a RAW image file is developed in the camera to generate a JPEG image file or an HEIF image file, a process of decompressing the compressed information to restore the RAW data is performed.
Reference numeral 311 denotes a face recognition unit that refers to face photograph information of a concerned player registered in advance in a server to find the player from a video by image recognition using AI (in particular, deep learning or the like). Information on the face recognition result detected by the face recognition unit 311 is input to the CPU 318 via the bus 370.
Reference numeral 312 is a physique recognition unit that refers to physique photograph information of the athlete of interest registered in advance in the server to find the athlete of interest from the video using the above-described image recognition technology.
Reference numeral 313 denotes an athlete uniform number detection unit that finds the focused athlete from the athlete's uniform number (which may also be the number on the front of the uniform) using the above-described image recognition technique. Reference numeral 314 denotes a direction detection unit that detects the direction in which the lens of the terminal faces. Reference numeral 315 denotes a position detection unit that detects position information of the terminal using, for example, GPS or the like.
Reference numeral 316 denotes a power management unit which detects a power state of the terminal and supplies power to the entire terminal after detecting that the power button is pressed in a state where the power switch is turned off. Reference numeral 318 denotes a CPU serving as a computer that performs control described in the following example based on a computer program for control stored in a program memory 319 serving as a storage medium. Further, the CPU also functions as a display control section that controls image information to be displayed on the image display unit 380. Further, the image display unit 380 is a display unit using liquid crystal, organic EL, or the like.
The data storage 320 stores setting conditions of the digital camera, and stores still images and moving images photographed, attribute information of the still images and the moving images, and the like.
In fig. 4, reference numeral 350 denotes a photographic lens unit including a first fixed group lens 351, a zoom lens 352, a diaphragm 355, a third fixed group lens 358, a focus lens 359, a zoom motor 353, a diaphragm motor 356, and a focus motor 360. The first fixed group lens 351, the zoom lens 352, the diaphragm 355, the third fixed group lens 358, and the focus lens 359 constitute a photographing optical system. Further, although each of the lenses 351, 352, 358, and 359 is illustrated as one lens, they may include a plurality of lenses.
Further, the photographing lens unit 350 may be configured as an interchangeable lens unit detachable from the digital camera.
The zoom control unit 354 controls the operation of the zoom motor 353 and changes the focal length (angle of view) of the photographic lens unit 350. The diaphragm control unit 357 controls the operation of the diaphragm motor 356 and changes the aperture diameter of the diaphragm 355.
The focus control unit 361 calculates the defocus amount and defocus direction of the photographing lens unit 350 based on the phase difference between a pair of focus detection signals (a image and B image) obtained from the image sensor 303. Further, the focus control unit 361 converts the defocus amount and the defocus direction into the driving amount and the driving direction of the focus motor 360. The focus control unit 361 controls the operation of the focus motor 360 based on the driving amount and the driving direction to drive the focus lens 359, thereby controlling focusing (focus adjustment) of the photographic lens unit 350.
As described above, the focus control unit 361 performs phase difference detection type Auto Focus (AF). Further, the focus control unit 361 may perform contrast detection type AF to search for a peak value of the contrast of the image signal obtained from the image sensor 303.
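As a rough illustration of the phase-difference computation (a sketch under stated assumptions, not the patent's actual algorithm): the shift between the A and B signals can be estimated with a correlation search, then converted into a defocus amount with a lens-dependent factor, here called k_defocus, which is assumed to be calibrated elsewhere.

```python
import numpy as np

def phase_difference_defocus(a_sig, b_sig, max_shift, k_defocus):
    """Estimate a defocus amount from a pair of 1-D focus-detection
    signals (the A image and B image). The shift that minimizes the
    mean absolute difference approximates the phase difference; the
    sign of the result gives the defocus direction."""
    n = len(a_sig)
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        lo, hi = max(0, s), min(n, n + s)   # overlapping sample range
        if hi - lo < 1:
            continue
        cost = float(np.mean(np.abs(a_sig[lo:hi] - b_sig[lo - s:hi - s])))
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return k_defocus * best_shift           # defocus amount (signed)
```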
Reference numeral 371 denotes a tracking unit for tracking the focused player by the digital camera itself. Tracking as referred to here means, for example, moving the display of a frame surrounding the focused player within the screen, keeping the focused player in focus, or adjusting the exposure for the player being tracked.
Next, an example of the focused player display start sequence will be described using fig. 5. The sequence is performed by the server 110 and the camera 500. Fig. 5 (a) shows a sequence in which the server 110 side answers a query (request) from the camera 500 side. The server 110 side provides information on the absolute position of the focused player to the camera 500 side.
The camera 500 notifies the server 110 of focused-player specifying information (ID information such as a uniform number or a player's name). At this time, the user may touch the position of the focused athlete on the screen of the terminal, or may keep a finger in contact with the screen and trace a circle around the focused athlete.
Alternatively, the user may touch the name of the focused player in a list of players in a menu displayed on the screen, or may call up a character input screen to input the player's name or uniform number.
At this time, the user may touch the face position of the focused player on the screen so that the face image or the uniform number is recognized, and the player's name, uniform number, and the like can then be transmitted. Alternatively, the terminal may transmit the face image to the server without performing image recognition itself, and the server side may recognize the image. In addition, if a predetermined password exists in this case, the password may be transmitted to the server.
The server side transmits information on the absolute position of the player to the camera based on the focused-player specifying information (the specified uniform number or the player's name), using the blocks that support video capture. If a password is transmitted from the camera, the content of the information to be transmitted to the camera is changed according to the password.
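Purely for illustration, a designation message of this kind might be assembled as follows; the field names and JSON encoding are assumptions, since the patent does not specify a wire format.

```python
import json

def build_designation_request(player_name=None, uniform_number=None,
                              face_image_bytes=None, password=None):
    """Assemble a focused-player designation message (hypothetical
    format). It carries exactly the elements described above: ID
    information such as a name or uniform number, optionally a face
    image for server-side recognition, and a password that lets the
    server vary the information it returns."""
    msg = {"type": "designate_focused_player"}
    if player_name is not None:
        msg["player_name"] = player_name
    if uniform_number is not None:
        msg["uniform_number"] = uniform_number
    if face_image_bytes is not None:
        # Server-side recognition path: send the raw face crop.
        msg["face_image_hex"] = face_image_bytes.hex()
    if password is not None:
        msg["password"] = password
    return json.dumps(msg)
```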
Fig. 5 (B) shows another focused player display start sequence. The camera notifies the server of position information of a camera currently being used by a professional photographer or general viewer for photographing, a direction of the camera, a magnification of the camera, and focused player specification information (a specified uniform number or a name of a player).
The server side creates a free viewpoint video using the position information of the camera, the direction of the camera, and the magnification of the camera. Further, the server side transmits position information indicating the position of the player in the video actually seen by the camera, and contour information of the player photographed by the camera to the camera based on the focused player specifying information (the specified uniform number or the name of the player).
The camera displays the concerned player more accurately and clearly on the screen of the display unit of the camera based on the position information and contour information transmitted from the server, and performs AF and AE for the concerned player.
Further, when a building or the like to be noticed is specified instead of focused-player specifying information (a specified uniform number or a player's name), the server may, for example, transmit outline information of the building to the camera.
Although a focused player display start sequence for finding the focused player has been briefly described, the terminal side may wish to keep tracking the player. Therefore, the focused player display tracking sequence will be described using (a) of fig. 6. In fig. 6 (a), the camera 500 serving as a terminal periodically makes queries (requests) to the server 110 to continuously check the position of the player.
In fig. 6 (a), for the position information of the athlete, ID information of the focused athlete is transmitted from the camera 500 to the server 110 to bring the focused athlete into the field of view of the camera for the moment. Thereafter, by continuously repeating the above-described "start of camera display of the focused player", "tracking of camera display of the focused player" can be realized. Specifically, the operation of identifying the position of the focused player is repeated by periodically transmitting a focused player display start sequence (A1, B1, ...) from the camera to the server and periodically receiving a focused player display start sequence (A2, B2, ...) from the server.
Further, in fig. 6 (B), for the position information of the player, the ID information of the focused player is transmitted from the camera 500 to the server 110, and the position information of the focused player is first acquired from the server. Then, after the focused athlete is placed in the field of view of the camera with reference to the position information, the camera 500 proceeds to track the focused athlete by itself using image recognition. In fig. 7 (a), the camera 500 likewise self-tracks the focused player by image recognition as in the focused player display tracking sequence of fig. 6 (B); however, when the athlete is subsequently lost from view, the camera side requests the position information of the athlete from the server.
Specifically, when the focused player is lost from the camera's view, the camera transmits a focused player display start sequence to the server (A1) and receives a focused player display start sequence from the server (A2), thereby identifying the position of the focused player. Fig. 7 (B) shows the case where the server 110 goes further and predicts that the focused athlete is likely to be lost from the view of the camera 500 in the focused athlete display tracking sequence of fig. 6 (B).
In other words, the figure shows push-type control in which, when it is predicted that tracking will fail, the position information of the focused player is delivered without waiting for a request from the camera 500. Thus, professional photographers and general viewers can continuously see the position of the focused player on the display unit of the camera, which is very easy to use, and the number of missed photo opportunities, for example, can be greatly reduced. Here, the cases in which a professional photographer or a general viewer cannot see the focused athlete include: a situation in which the player is in a maul, a ruck, or a scrum and is therefore not visible from the outside (the player is in a blind spot), or a situation in which the player is not visible from the direction of a certain camera.
Further, while an example of a service that assists professional photographers and viewers with photography has been described, this example can also be used for remote camera control. By sending information from the server, the athlete can be tracked and photographed at the decisive moment using a remote camera mounted on a motorized pan head.
In addition, although the present example is described using a photography assistant as an example, the terminal may be a home TV. In other words, a viewer watching TV designates the focused player, and the server transmits the position information of the focused player and the like to the TV, so that the focused player can be displayed clearly using a frame display or the like. Further, the focused athlete may be indicated by a cursor (e.g., an arrow) instead of a frame, or the color or brightness of the region at the focused athlete's position may be made different from other portions. If the focused player is outside the screen of the terminal, an arrow, characters, or the like may be used to indicate in which direction off the screen the player lies.
In addition, if the player is outside the displayed screen, the length or thickness of an arrow, a number, a scale, or the like may be used to indicate how far the player is from the terminal's current angle of view, how much the terminal needs to be rotated to bring the focused player onto the displayed screen, and so on.
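A sketch of how such guidance could be computed from the absolute position received from the server and the terminal's own GPS position and facing direction; the names and sign conventions are illustrative assumptions.

```python
import math

def off_screen_hint(player_xy, cam_xy, cam_yaw_deg, half_fov_deg):
    """Return (rotation_deg, distance) guidance for an off-screen player.

    player_xy is the player's absolute field position from the server;
    cam_xy/cam_yaw_deg are the terminal's position and facing direction.
    rotation_deg is the signed pan needed to center the player (positive
    is counter-clockwise in field coordinates); if abs(rotation_deg) is
    within half_fov_deg, the player is already on screen. distance can
    drive the arrow's length or a numeric scale.
    """
    dx = player_xy[0] - cam_xy[0]
    dy = player_xy[1] - cam_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))
    rotation = (bearing - cam_yaw_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return rotation, math.hypot(dx, dy)
```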
Further, if the focused player is inside the screen, control may be performed such that additional information is displayed on the screen; if the focused player moves outside the screen, the user may choose not to have the off-screen player indicated with an arrow or the like.
Alternatively, by automatically determining the situation of the match, the off-screen focused player may be left unindicated by an arrow or the like when, for example, the focused player has gone to the substitutes' bench, even though the player has moved out of the screen. Usability can be further improved if the user is allowed to choose between a mode in which the display of the additional information is turned off automatically and a mode in which it is not.
Next, details of the control sequence of the camera side of fig. 7 will be described using (a) and (B) of fig. 8. Fig. 8 (a) and (B) show the attention athlete display tracking control flow on the camera side.
In fig. 8 (a), S101 denotes initialization. In S102, it is determined whether shooting is selected; if so, the process proceeds to S103, and if not, to S101. In S103, camera setting information is acquired. In S104, it is determined whether shooting a (designated) focused athlete is selected; if so, the process proceeds to S105, and if not, to S110 to perform other processing. In S105, the information of the focused player (ID information of the focused player, etc.) and, if present, a password are transmitted from the camera to the server. The server side thereupon detects the position information of the focused player and transmits it to the camera. In S106, the position information of the focused player and the like are received from the server.
In S107, the camera tracks the concerned player by itself while referring to the position information transmitted from the server. Here, the camera tracks the athlete of interest by itself performing, for example, image recognition. In this case, the athlete is tracked based on the recognition result of any one of or a combination of the uniform number of the athlete, facial information of the athlete, physique of the athlete, and the like. In other words, an athlete of interest is tracked by identifying an image of a portion or the entire shape of the athlete. However, when the user's shooting position is not good, when the field of view of the camera is narrow, or when the athlete is hidden behind another object due to the shooting angle or the like, the athlete may not be seen, and if the athlete is not seen, a request for position information may be transmitted to the server again.
S107-2 shows an example of marker display as additional information for the focused player. In other words, as additional information, a cursor indicating the focused player is displayed, a frame is displayed at the position of the focused player, the color or brightness of the position of the focused player is changed to make it conspicuous, or a combination of these is displayed. Characters may be displayed in addition to a marker. Further, while the image display unit displays the live view image from the image sensor, additional information indicating the position may be superimposed on the focused player.
The flow of the marker display of S107-2 is illustrated in (B) of fig. 8 and will be described below. Further, the user may choose to skip the tracking operation of S107 described above so that tracking is not performed. Alternatively, a mode may be provided in which the tracking operation is performed while the focused player is on the screen and is not performed when the player is off the screen. Further, when the situation of the match is determined automatically, the tracking operation for the off-screen focused player (display of additional information such as an arrow) may be stopped automatically when, for example, the focused player goes to the substitutes' bench.
Alternatively, whether the player is inside or outside the screen, if the server knows that the focused athlete has gone to the substitutes' bench, the display of the focused athlete's position, autofocus on the focused athlete, and automatic exposure adjustment for the focused athlete on the screen may be stopped.
In S108, it is determined whether the continued tracking of the focused player is OK (successful); if successful, the process proceeds to S107 and the camera itself continues tracking the focused player, whereas if unsuccessful, the process proceeds to S109.
In S109, it is determined whether shooting of the focused athlete is ended; if so, the process proceeds to S101. If shooting of the focused player continues, the process proceeds to S105, the information of the focused player is transmitted to the server again, the information of the focused player is received from the server in S106, the position of the focused player is identified again, and shooting of the focused player continues. In other words, if tracking fails, the result of S108 is NO; in this case, to continue tracking, the process returns to S105 to request the position information from the server.
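The S104-S109 loop can be summarized in the following sketch; the five callables are hypothetical stand-ins for the camera and server operations named in the flow, not APIs from the patent.

```python
def focused_player_loop(shooting_selected, send_designation,
                        receive_position, show_marker, self_track):
    """Camera-side control loop mirroring S104-S109 of fig. 8 (a).

    shooting_selected() -> bool   : S104/S109, focused-player mode on?
    send_designation()            : S105, send ID info (and password)
    receive_position() -> tuple   : S106, position from the server
    show_marker(pos)              : S107-2, superimpose additional info
    self_track(pos) -> tuple|None : S107, one step of image-recognition
                                    tracking; None when the player is lost
    """
    while shooting_selected():
        send_designation()                    # ask the server (S105)
        pos = receive_position()              # player position (S106)
        while pos is not None and shooting_selected():
            show_marker(pos)                  # marker display (S107-2)
            pos = self_track(pos)             # camera self-tracks (S107/S108)
        # pos is None: tracking failed, fall back to the server (S108 -> S105)
```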
Fig. 8 (B) shows the flow of focused athlete marker display control on the camera side. In S120, the relative position of the focused player on the display unit is obtained by calculation. In S121, while the image display unit displays the live view image from the image sensor, additional information indicating the position or the like is superimposed on the focused player.
In the above example, the server 110 reads the video of the entire playing field and obtains its coordinates, and is therefore also able to grasp, from the videos taken by professional photographers and spectators, which part of the playing field they are shooting.
In other words, the server grasps the video of the entire game field from a plurality of cameras (fixed camera and mobile camera) for the server in advance. Therefore, it is possible to map absolute position information of the athlete of interest in the field into videos that professional photographers and viewers watch through the terminal and the digital camera.
In addition, when a terminal such as the camera of a professional photographer or a viewer receives information on the absolute position of the athlete from the server, that absolute position information can be mapped into the video currently being photographed or monitored.
Further, assume that the information on the absolute position of the focused player within the field provided by the server is (X, Y). This absolute position information must be converted into relative position information (X', Y') as viewed from each camera, based on the position information of the respective cameras. The conversion from absolute position to relative position may be performed on the camera side (as in S120), or it may be performed on the server side and the relative position information then transmitted to each terminal (camera, etc.).
If the conversion is performed by a terminal such as a camera, the information (X, Y) of the absolute position transmitted from the server is converted into relative position information (X', Y') according to the position information of each camera obtained using GPS or the like. The position information within the display screen on the camera side is set based on the relative position information.
On the other hand, if the conversion is performed by the server, the server converts the information (X, Y) of the absolute position into the relative position information (X', Y') of each camera based on the position information of each camera obtained using GPS or the like. The server transmits the relative position information to each camera, and the camera that has received the relative position information sets it as the position information within the display screen on its side.
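A flat-field sketch of this conversion, runnable on either the camera or the server side; a real implementation would include camera tilt, height, and lens distortion, which are omitted here, and all names are illustrative.

```python
import math

def field_to_screen(player_xy, cam_xy, cam_yaw_deg,
                    h_fov_deg, screen_width):
    """Convert an absolute field position (X, Y) into camera-relative
    coordinates (X', Y') and a horizontal screen coordinate."""
    yaw = math.radians(cam_yaw_deg)
    dx = player_xy[0] - cam_xy[0]
    dy = player_xy[1] - cam_xy[1]
    # Rotate the offset into the camera frame (forward and lateral axes).
    x_rel = dx * math.cos(yaw) + dy * math.sin(yaw)    # along the optical axis
    y_rel = -dx * math.sin(yaw) + dy * math.cos(yaw)   # lateral offset
    if x_rel <= 0:
        return (x_rel, y_rel), None          # behind the camera
    angle = math.degrees(math.atan2(y_rel, x_rel))     # off-axis angle
    if abs(angle) > h_fov_deg / 2:
        return (x_rel, y_rel), None          # outside the current angle of view
    u = screen_width * (0.5 + angle / h_fov_deg)       # pixel column
    return (x_rel, y_rel), u
```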
As described above, situations in which the focused athlete cannot be seen on the terminals of professional photographers and spectators, such as cameras, are reduced, and thus a good picture of the focused athlete can be taken without missing the opportunity.
Further, another example of the control sequence based on the camera side of fig. 7 is shown in fig. 9. Fig. 9 shows another example of the focused athlete display tracking control flow on the terminal side such as a camera. In fig. 9, S101, S102, S103, S104, S105, S106, S107-2, and S110 are used for the same control as in fig. 8, and thus a description thereof will be omitted.
It is determined in S131 whether continued tracking of the player of interest is OK (successful); if successful, the process proceeds to S134, and if unsuccessful, the process proceeds to S132. It is determined in S132 whether shooting of the player of interest has ended; if it has ended, the process proceeds to S133. If shooting of the player of interest continues, the process proceeds to S105, the information of the player of interest is transmitted to the server again, the information of the player of interest is received from the server in S106, the position of the player of interest is identified again, and shooting continues. It is determined in S133 whether the server has detected the position of the player of interest; if it has, the process proceeds to S106, and if it has not, the process returns to S101.
It is determined in S134 whether the server has detected the position of the player of interest; if it has, the process proceeds to S106, and if it has not, the process proceeds to S107.
Next, the tracking unit 371 of the digital camera will be described with reference to fig. 10.
Fig. 10 is a block diagram showing a functional configuration example of the tracking unit 371 of the digital camera. The tracking unit 371 includes a matching unit 3710, a feature extraction unit 3711, and a distance map generation unit 3712. The feature extraction unit 3711 specifies the image region to be tracked (the subject region) based on the position information transmitted from the server, and extracts a feature value from the image of the subject region.
The matching unit 3710 refers to the extracted feature value and searches each of the successively supplied captured frames for a region having a high similarity to the subject region of the previous frame, taking it as the new subject region. Further, the distance map generation unit 3712 can acquire subject distance information from the pair of parallax images (the A image and the B image) from the image sensor, which can improve the accuracy with which the matching unit 3710 specifies the subject region. However, the distance map generation unit 3712 need not necessarily be provided.
When the matching unit 3710 searches for a region having a high similarity to the subject region based on the feature value supplied from the feature extraction unit 3711, template matching, histogram matching, or the like is used, for example.
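As one concrete realization of the template matching named here (a sketch only; the grayscale inputs and the 0.6 threshold are assumptions, not values from the patent), OpenCV can be used:

    import cv2

    def track_by_template(frame_gray, template_gray, threshold=0.6):
        """Search the current frame for the region most similar to the
        previous frame's subject region (the template)."""
        result = cv2.matchTemplate(frame_gray, template_gray, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if max_val < threshold:
            return None  # similarity too low: treat tracking as failed and fall back to the server
        x, y = max_loc
        h, w = template_gray.shape[:2]
        return (x, y, w, h)  # new subject region in this frame

Returning None corresponds to the tracking-failure branch described above, where the camera requests the position information from the server again.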
Next, the flow of detection control for the player of interest on the server side will be described with reference to figs. 11 and 12.
The server performs image recognition of the player of interest based on the ID information of the player of interest or the like transmitted from a terminal such as a camera. The server detects the position information of the player based on the videos from the plurality of cameras for the server (fixed cameras, mobile cameras, etc.), and transmits the position information of the player to the camera terminals of professional photographers, spectators, and the like.
In particular, if the position information of the player of interest is provided from the server while professional photographers and spectators are shooting, the player of interest can be photographed reliably and without error. The information from the server is also important when the camera is tracking the player of interest and loses sight of the player due to a blind spot or the like. The server side continuously detects the position information of the player based on the videos from the plurality of cameras for the server.
Terminals such as cameras owned by professional photographers and general spectators transmit the ID information of the player of interest to the server and track the player of interest based on the position information acquired from the server. At the same time, such terminals can also detect the position of the player of interest by themselves.
Fig. 11 shows the main flow of the detection control for the player of interest on the server side.
In fig. 11, initialization is first performed in S201. Next, it is determined in S202 whether shooting is selected on the camera; if shooting is selected, the process proceeds to S203 to acquire the camera setting information. At this time, if the camera setting information includes a password, the password is acquired. If shooting is not selected, the process returns to S201. It is determined in S204 whether shooting of a (designated) player of interest is selected; if it is, the process proceeds to S205, and the server receives the ID information of the player of interest (e.g., the player's name, uniform number, or the like) from the camera. If shooting of a player of interest is not selected in S204, the process proceeds to S210 to perform other processing.
In S206, the server finds the player of interest in the screen by image recognition, based on the ID information of the player of interest and the videos from the plurality of cameras (fixed cameras, mobile cameras, and the like). In S207, the server tracks the player of interest based on the videos from the plurality of cameras. In S208, it is determined whether continued tracking of the player of interest is OK (successful); if successful, the process returns to S207 to continue tracking the player of interest based on the information from the plurality of cameras. If continued tracking is unsuccessful in S208, the process proceeds to S209.
It is determined in S209 whether shooting of the player of interest has ended; if it has ended, the process returns to S201, and if shooting continues, the process returns to S206. The server then searches the information from the plurality of cameras (fixed cameras and mobile cameras) based on the ID information of the player of interest to find the player of interest, and continues tracking the player of interest based on the videos from the plurality of cameras in S207.
Next, an example of the above-described method of finding the player of interest in S206 and tracking the player in S207 will be described with reference to fig. 12.
Fig. 12 shows the flow of detection control for the player of interest by the server using uniform number information. In fig. 12, in S401, the server acquires the uniform number from the data storage 213 based on the ID information of the player of interest, searches the video information of the plurality of cameras for the server for that uniform number by image recognition, and acquires the position information of the player having that uniform number. In S402, the information of the absolute position of the player of interest is acquired by combining the position information acquired from the videos of the plurality of cameras for the server.
Combining the information of the plurality of cameras for the server in this way improves the accuracy of the absolute position information of the player having a given uniform number. In S403, the absolute position of the player of interest detected in S402 is transmitted to terminals such as cameras owned by professional photographers and spectators. In S404, it is determined whether tracking of the player of interest is to be continued; if it is, the process returns to S401, and if it is not, the flow of fig. 12 ends.
Further, when the uniform number of the player of interest is found using the video from at least one of the plurality of cameras for the server, the size, angle, and background (playing field) shown are taken into account, and the position information of the player of interest can thereby be acquired. When the uniform number of the player of interest is likewise found using the videos from a plurality of cameras for the server, the same information is taken into account for each camera, and the accuracy of the position information of the player of interest can thereby be improved.
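The combination of per-camera positions in S402 could, for instance, be a confidence-weighted average. The weighting scheme and the function name below are assumptions of this sketch, not the patent's method:

    import numpy as np

    def fuse_positions(observations):
        """Fuse per-camera estimates of the same player into one absolute field position.
        observations: list of (x, y, confidence) tuples, one per camera
        that recognized the uniform number."""
        if not observations:
            return None
        pts = np.array([(x, y) for x, y, _ in observations], dtype=float)
        w = np.array([c for _, _, c in observations], dtype=float)
        return tuple(np.average(pts, axis=0, weights=w))  # fused (X, Y)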
Next, another detection method for detecting the position of the player of interest will be described with reference to fig. 13.
In this example, it is assumed that the player allows a position sensor to be attached to clothing such as the uniform, or wears the position sensor on a band around the arm, waist, leg, or the like. The position sensor transmits information wirelessly using its communication section; the server side (the plurality of cameras for the server, etc.) recognizes the signal from the player's position sensor to generate position information, and the server notifies the cameras owned by professional photographers and general spectators of the position information.
Fig. 13 shows the detailed flow of detection control for the player of interest performed on the server side using the information of the position sensor. In fig. 13, S301 is a process in which the server receives and acquires the information of the position sensor of the player of interest from the plurality of cameras. Each of the plurality of cameras includes a detection section that receives the radio waves from the position sensor, detects the direction and the level of the received radio waves, and acquires these as the information of the position sensor. The information of the position sensor therefore includes the direction of the received radio waves and the level of the received radio waves.
In S302, the absolute position of the player of interest is detected based on the position sensor information from the plurality of cameras. The absolute position of the player of interest is transmitted to the cameras in S303. It is determined in S304 whether tracking of the player of interest is to be continued; if it is, the process returns to S301, and if it is not, the control ends.
In this example, at least one of the plurality of cameras (fixed cameras and mobile cameras) has, in addition to acquiring images and sounds, a detection section that detects the information from the position sensor worn by the player.
At least one of the plurality of cameras can receive the information from the player's position sensor and identify the direction and the level of the received radio waves.
Although the position of the player could be detected from the detection result of only one camera as described above, in the present example the information of the player's position sensor is recognized by each of the plurality of cameras. Since the plurality of cameras receive the sensor information of the player of interest by radio, combining the information of the radio-wave directions and levels makes it possible to analyze the position information of the player more accurately.
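Combining the radio-wave directions measured at two cameras amounts to bearing triangulation. The following minimal 2D sketch (the function name and the bearing convention are assumptions) intersects the two measured directions:

    import math

    def intersect_bearings(cam_a, bearing_a_deg, cam_b, bearing_b_deg):
        """Estimate the sensor position from the radio-wave directions measured
        at two cameras. cam_a and cam_b are (x, y) camera positions; bearings
        are measured in degrees from the field's +X axis. Returns None when
        the two bearings are (nearly) parallel."""
        ax, ay = cam_a
        bx, by = cam_b
        ta = math.radians(bearing_a_deg)
        tb = math.radians(bearing_b_deg)
        # Solve cam_a + s*(cos ta, sin ta) == cam_b + t*(cos tb, sin tb) for s.
        denom = math.cos(ta) * math.sin(tb) - math.sin(ta) * math.cos(tb)
        if abs(denom) < 1e-9:
            return None
        s = ((bx - ax) * math.sin(tb) - (by - ay) * math.cos(tb)) / denom
        return (ax + s * math.cos(ta), ay + s * math.sin(ta))

With more than two cameras, such pairwise intersections can be averaged, and the received radio-wave levels can serve as weights, which matches the accuracy improvement described above.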
Next, fig. 14 shows the flow of detection control for the player of interest using face recognition information on the server side.
The data storage 213 of the server stores a plurality of pieces of face information, photographed in the past, of the players registered as members of the competing teams. Further, the server has a section for detecting the face information of a player based on the videos from the plurality of cameras for the server. The server then compares the face information detected from the plurality of cameras for the server with the stored face photographs of the registered players, recognizes the face using AI, for example, and detects the player of interest.
In fig. 14, in S501, the server acquires a plurality of pieces of face information of the player of interest from the data storage 213 based on the ID information of the player of interest, and acquires the position information of the player having that face information using the videos from the plurality of cameras for the server. If a player corresponding to the face information of the player of interest is found using the video from one of the plurality of cameras for the server, the size, angle, and background (playing field) shown are taken into account, and the position information of the player of interest can thereby be acquired. Similarly, if the player corresponding to the face information of the player of interest is found using a plurality of cameras for the server, the same information is taken into account for each camera, and the position information of the player of interest can be acquired with high accuracy.
In S502, the absolute position of the player of interest is detected based on the position information of the player of interest acquired from the videos of the plurality of cameras in S501. In S503, the absolute position information of the player of interest detected in S502 is transmitted to the cameras owned by professional photographers and general spectators. It is determined in S504 whether tracking of the player of interest is to be continued; if it is, the process returns to S501, and if it is not, the control ends.
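The AI-based face comparison described above could, for example, match embedding vectors by cosine similarity. The embedding model, the 0.5 threshold, and the data layout are assumptions of this sketch:

    import numpy as np

    def identify_player(face_embedding, registered):
        """Match a face detected in a server-camera video against the face data
        registered in advance. face_embedding: feature vector from any face-embedding
        model; registered: dict mapping player ID -> list of stored embeddings."""
        best_id, best_sim = None, -1.0
        for player_id, embeddings in registered.items():
            for ref in embeddings:
                sim = float(np.dot(face_embedding, ref) /
                            (np.linalg.norm(face_embedding) * np.linalg.norm(ref)))
                if sim > best_sim:
                    best_id, best_sim = player_id, sim
        return best_id if best_sim > 0.5 else None  # 0.5 is an assumed acceptance threshold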
Next, a method of detecting the position of a player using the player's physique (body type) will be described with reference to fig. 15.
The data storage 213 of the server stores a plurality of pieces of physique image information, photographed in the past, of the players registered as members of the competing teams. Further, the server has a means for detecting the physique information of a player based on the videos from the plurality of cameras for the server. The server then compares the physique information detected from the plurality of cameras for the server with the stored physique images of the registered players using AI, for example, and detects the player.
Fig. 15 shows the detailed flow of detection control for the player of interest performed by the server using physique (body type) recognition information. In S601 of fig. 15, the server acquires a plurality of pieces of physique image information from the data storage 213 based on the ID information of the player of interest, and acquires the position information of the player having that physique using the video information from the plurality of cameras for the server. If a player corresponding to the physique information of the player of interest is found using the video from one of the plurality of cameras for the server, the size, angle, and background (playing field) shown are acquired, and the position information of the player of interest can thereby be acquired.
If a player corresponding to the physique image of the player of interest is likewise found using the videos from a plurality of cameras for the server, the same information is acquired for each camera, and the position information of the player of interest can thereby be acquired. Combining the information of the plurality of cameras improves the accuracy of the physique-based position information of the player of interest.
In S602, the absolute position of the player of interest is detected based on the position information of the player having that physique acquired in S601.
In S603, the absolute position of the player of interest detected in S602 is transmitted to the camera terminals owned by professional photographers and general spectators. It is determined in S604 whether tracking of the player of interest is to be continued; if it is, the process returns to S601, and if it is not, the control of fig. 15 ends.
Although position sensor information, uniform number recognition, face recognition, physique recognition, and the like have been described, image recognition may also be performed on the uniform (design and color), the shoes, the player's hairstyle, the player's manner of movement, and the like, so that the accuracy of recognizing the player of interest can be improved.
Next, fig. 16 shows, as an auxiliary method supplementing the above detection methods, a flow of detecting the player of interest based on the player's basic role on the playing field (the so-called position).
The data storage 213 of the server stores information of each player's role (position) on the playing field. Further, since the place where a player of a given role is located changes according to the position of the ball, information of the ball position is also stored.
The server detects the current position of the ball from the videos of the plurality of cameras and identifies the state of the game (whether a team is attacking or defending). Using this information, the approximate location of a player can easily be estimated. In other words, the position of a player is estimated from the player's role by determining the situation of the game and considering that role. This determination is made mainly on the server side.
An example of the flow of detection control for the player of interest that takes the player's role into account is shown in fig. 16. In S701 of fig. 16, the server detects the position information of the ball based on the videos from the plurality of cameras, and roughly estimates the position of the player using the ball position information. Further, the area in which to search for the player using face information is narrowed down according to the player's role (such as forward or back) identified by the uniform number. In S702, the server acquires a plurality of pieces of face information of the player of interest from the data storage 213, compares them with the video information of the plurality of cameras, and acquires the position information of the player having that face information.
In S703, the absolute position of the player of interest is detected based on the position information of the player of interest acquired from the videos of the plurality of cameras in S702. In S704, the absolute position of the player of interest detected in S703 is transmitted to the cameras owned by professional photographers and general spectators. It is determined in S705 whether tracking of the player of interest is to be continued; if it is, the process returns to S701, and if it is not, the flow of fig. 16 ends.
Here, although the situation of the game (whether a given team is attacking or defending) is determined based on the position of the ball, the control based on the game situation is not limited to the ball position. For example, when one team commits a foul, the opposing team is awarded a penalty kick or the like. In that case, the team awarded the penalty is highly likely to advance play from the current position of the ball. Control may therefore be based on a game situation in which that team is predicted to advance; such a prediction can be made by taking the spot of the foul as the assumed position of the ball.
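The narrowing of the search area by role and ball position might look like the following sketch; the zone offsets and field length are purely illustrative values, not taken from the patent:

    FIELD_LENGTH = 100.0  # assumed field length in meters

    def search_zone(role, ball_x, attacking):
        """Return a rough range of the field (along its length) in which to search
        for a player, given the role, the ball position, and the game situation."""
        # Offsets in meters relative to the ball; illustrative values only.
        offsets = {
            "forward": (-10.0, 10.0),                             # forwards contest the ball directly
            "back": (-5.0, 30.0) if attacking else (-30.0, 5.0),  # backs spread out behind or ahead
        }
        lo, hi = offsets.get(role, (-20.0, 20.0))
        return (max(0.0, ball_x + lo), min(FIELD_LENGTH, ball_x + hi))

Restricting face recognition to such a zone reduces both the computation and the chance of confusing players with similar appearances.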
As described above, according to the present example, when professional photographers and general spectators take pictures with their cameras, or spectators watch a game on their terminals, they can be notified of the position information of the player of interest in a timely manner. Professional photographers and general spectators can therefore keep track of the position of the player of interest and satisfactorily photograph the player's best moments.
Next, examples of display methods for displaying the position information of the player of interest on a camera owned by a professional photographer or a general spectator will be described with reference to (A) to (D) of fig. 17.
In this example, in a case where the image display unit of a terminal such as a camera displays a live view image from the image sensor, when the position information of the player of interest is transmitted from the server, a mark, cursor, arrow, frame, or the like serving as additional information is displayed superimposed at the position of the player of interest. If the player of interest is outside the screen of the image display unit, the direction in which the player of interest is located is displayed on the peripheral portion of the screen of the display unit. By viewing this display, a professional photographer or general spectator looking at the screen of his or her terminal, such as a camera, can quickly recognize the direction in which the camera must be pointed to bring the player of interest into the shooting area.
Fig. 17 (A) shows a display example of the position information of the player of interest in the video of the display unit of the camera. Reference numeral 3201 denotes the display unit of the camera. If the player of interest is outside the display screen of the display unit and to the right of it, a rightward arrow is displayed near the right side of the screen, as shown at 3202. If the player of interest is outside the display area and below the display screen, a downward arrow is displayed near the lower side of the screen, as shown at 3203.
If the player of interest is outside the display area and to the left of the display screen, a leftward arrow is displayed near the left side of the screen, as shown at 3204. If the player of interest is outside the display area and above the display screen, an upward arrow is displayed near the upper side of the screen, as shown at 3205. Further, if the player of interest is diagonally up and to the right, an arrow pointing diagonally up and to the right is displayed near that position on the screen, as shown in fig. 17 (B), so that it can be seen that the player of interest is in the diagonally upper right direction.
These arrows help professional photographers and general spectators know the direction in which the camera must be moved to bring the player of interest into the camera's shooting area. Even when the player of interest has been lost from view, the player can quickly be brought back onto the shooting screen, and the player of interest can be photographed without missing the right shutter timing.
Next, fig. 17 (C) shows an example in which the direction and the length of the arrow indicate the direction and the amount by which the camera must be moved to bring the player of interest into the shooting area. Reference numeral 3401 denotes the display unit of the camera.
If the player of interest is outside the display area and to the right of the display screen, a rightward arrow is displayed near the right side of the screen, as shown at 3402. If the player of interest is outside the display area and below the display screen, a downward arrow is displayed near the lower side of the screen, as shown at 3403. If the player of interest is outside the display area and to the left of the display screen, a leftward arrow is displayed near the left side of the screen, as shown at 3404.
If the player of interest is outside the display area and above the display screen, an upward arrow is displayed near the upper side of the screen, as shown at 3405. Here, the degree to which the player is off the screen, in other words, the angle by which the camera must be rotated to capture the player of interest, is indicated by the length of the arrow: the further the position of the player of interest deviates from the field of view of the screen, the longer the arrow becomes.
In fig. 17 (C), since the rightward arrow indicated at 3402 is relatively short, it can be seen that the player of interest can be brought into the shooting area by rotating the camera rightward by only a small angle.
Likewise, since the upward arrow shown at 3405 is relatively short, the player of interest can be brought into the shooting area by rotating the camera upward by a small angle. Since the downward arrow shown at 3403 is of medium length, the camera must be rotated by an angle larger than for 3402 and 3405. Further, since the leftward arrow shown at 3404 is relatively long, the camera must be rotated toward the player of interest by an angle larger still than for 3403.
Professional photographers and general spectators can therefore easily bring the player of interest into the shooting area (within the display screen) and photograph the player without missing the right shutter timing.
Next, (D) of fig. 17 shows an example in which the thickness of the arrow is changed while its length is kept constant. In other words, the larger the rotation angle of the camera needed to bring the player of interest into the shooting area, the thicker the arrow is made. Reference numeral 3601 denotes the display unit of the camera. If the player of interest is outside the display area and to the right of the display screen, a rightward arrow is displayed at the peripheral portion on the right side of the screen, as shown at 3602. If the player of interest is outside the display area and below the display screen, a downward arrow is displayed at the peripheral portion on the lower side of the screen, as shown at 3603. If the player of interest is outside the display area and to the left of the display screen, a leftward arrow is displayed at the peripheral portion on the left side of the screen, as shown at 3604.
If the player of interest is outside the display area and above the display screen, an upward arrow is displayed at the peripheral portion on the upper side of the screen, as shown at 3605. Here, the rotation angle of the camera is indicated by the thickness of the arrow, which increases as the rotation angle increases. In fig. 17 (D), the downward arrow at 3603 and the leftward arrow at 3604 are thicker than the arrows at 3602 and 3605, from which it can be seen that the camera must be rotated by a relatively large angle to bring the player of interest into the shooting area.
With such a display, professional photographers and general spectators can quickly find a player of interest who is out of view, and can shoot the player's best moments without missing the right shutter timing.
Further, although the direction and the amount of the player of interest's deviation from the screen are indicated by an arrow and its length and thickness in the above example, the display is not limited thereto. For example, instead of an arrow, text may be used to display a message indicating that the player of interest is not within the screen, such as 'the player of interest is off the upper right of the screen'. In this case, a warning using sound, blinking, or the like may also be given. Alternatively, 'the position is off to the right' or 'the position is off 20 degrees to the right in the horizontal direction' may be displayed, a needle (e.g., a compass) pointing in the direction of the player of interest may be displayed at the edge of the screen, or the amount of deviation from the screen may be displayed at a corner of the screen using a number or a scale.
In other words, the amount of deviation may be displayed with a scale and a cursor at the corresponding scale position, or as a bar whose length varies along the scale according to the amount of deviation.
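The arrow displays of (A) to (D) of fig. 17 can all be derived from the off-screen offset of the player's projected position. In this sketch, the deg_per_px scale linking pixel offset to the required camera rotation is an assumption, as are the function and parameter names:

    import math

    def off_screen_arrow(player_px, screen_w, screen_h, deg_per_px):
        """Compute the arrow for a player of interest outside the display.
        player_px: player position in screen pixel coordinates, possibly outside
        the screen. Returns None when the player is on screen, else (direction
        in degrees, required camera rotation in degrees)."""
        x, y = player_px
        dx = (x - screen_w) if x >= screen_w else (x if x < 0 else 0)
        dy = (y - screen_h) if y >= screen_h else (y if y < 0 else 0)
        if dx == 0 and dy == 0:
            return None  # on screen: display a frame or mark instead of an arrow
        direction = math.degrees(math.atan2(dy, dx))  # 0 = right, 90 = down (screen coordinates)
        rotation = math.hypot(dx, dy) * deg_per_px    # how far the camera must turn
        return direction, rotation

The returned rotation can then be mapped to the arrow's length (fig. 17 (C)) or thickness (fig. 17 (D)) before drawing the arrow at the screen periphery.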
Next, fig. 18 shows another example of the display tracking control flow for the player of interest on the camera side.
In fig. 18, the steps having the same reference numerals as in fig. 8, i.e., the steps other than S3300, are the same as in fig. 8 and will not be described again. In S3300, the camera itself tracks the player of interest; when the player of interest is outside the area the camera is shooting, an arrow indicating the direction of the player is displayed on the display unit. The detailed flow of S3300 is shown in fig. 19.
In S3311 of fig. 19, the camera receives the absolute position information of the player of interest from the server. In S3312, the camera converts the absolute position information of the player of interest into relative position information based on the position, direction, magnification, and the like used for shooting. In S3313, the position of the player of interest is displayed on the display unit based on the relative position as seen from the camera. In S3314, it is determined whether the player of interest is currently outside the shooting area of the camera, that is, outside the screen of the display unit of the camera; if outside the screen, the process proceeds to S3316, and if inside the screen, the process proceeds to S3315.
In S3315, no arrow indicating the position of the player of interest is displayed on the display unit of the camera; instead, a mark such as a frame indicating the position of the player of interest is displayed. In S3316, the position of the player of interest is indicated by an arrow at the peripheral portion of the display unit of the camera. It is determined in S3317 whether tracking of the player of interest is to be continued; if it is, the process returns to S3311, and if tracking ends, the flow of S3300 ends.
Fig. 20 shows the flow, within S3300 of fig. 18, for the display of fig. 17 (C). In fig. 20, the steps S3311 to S3315 and S3317 are the same as in fig. 19, and a description thereof will be omitted. In S3516, the position of the player of interest is indicated by an arrow at the peripheral portion of the screen of the display unit of the camera. Here, the length of the arrow changes according to the rotation angle of the camera needed to bring the player into the display screen: the arrow becomes longer as the required rotation angle increases.
Fig. 21 shows the flow, within S3300 of fig. 18, for the display of fig. 17 (D). In fig. 21, the steps S3311 to S3315 and S3317 are the same as in figs. 19 and 20, and a description thereof will be omitted. In S3716, the position of the player of interest is indicated by an arrow at the peripheral portion of the screen of the display unit of the camera, and the thickness of the arrow changes according to the rotation angle of the camera needed to bring the player into the display screen: the arrow becomes thicker as the required rotation angle increases.
Further, although there is one player of interest in this example, there may be a plurality of players of interest. The player of interest may also be switched in the middle of the game, and the players of interest may be all the players participating in the game. It is also assumed that the videos and images include not only moving images but also still images. Tracking of a player of interest has mainly been described; however, beyond tracking a designated player of interest, information on the player holding or catching the ball may also be transmitted to professional photographers and spectators and displayed. In addition, although the example has been described in terms of tracking a player, the present invention can of course also be applied to a system that tracks a person such as a criminal using a plurality of monitoring cameras.
Alternatively, without being limited to tracking people, the present invention may be applied to a system for tracking a specific car or the like in motor racing, a system for tracking a horse in horse racing, and so on. Further, although an example in which the player of interest is designated from a camera terminal or the like has been described, the server side may also designate the player of interest.
Next, a flow for detecting whether the player of interest has committed a foul will be described based on fig. 22.
In the flow of fig. 22, in a rugby game for example, a sin-bin foul or the like is judged based on the videos of the plurality of cameras, information of players who have temporarily left the field is detected, and the server transmits the information to the cameras and other terminals owned by professional photographers and general spectators. A player sent to the sin bin is forced to leave the game for 10 minutes. The penalty varies according to the severity of the foul: a red card means immediate dismissal, while a sin-bin foul prohibits the player from participating for 10 minutes, at least temporarily taking the player out of the field.
Fig. 22 shows the detection control flow for the player of interest (on the server side) for detecting whether the player of interest has committed a foul.
In S1001 of fig. 22, the server detects the position information of the ball based on the videos from the plurality of cameras, and roughly estimates the position of the player using the ball position information. Further, the area in which to search for a player using face information is narrowed down according to the player's role (such as forward or back): for the starting players, the role is identified by the uniform number and name; for substitute players, by the player information registered in advance, namely each substitute's name, the uniform number worn that day, and the role. In S1002, the server identifies a plurality of pieces of face information of the players of interest, including the substitutes, and acquires the position information of the players having that face information from the video information of the plurality of cameras.
If the face information of a player of interest, including a substitute, is found using the video from one of the plurality of cameras, the size, angle, and background (playing field) shown are taken into account, and the position information of the player of interest, including substitutes, can thereby be acquired.
If the face information of a player of interest, including a substitute, is likewise found using the videos from a plurality of cameras, the same information is taken into account for each camera, and the position information of the player of interest, including substitutes, can be acquired with high accuracy. In S1003, the absolute position of the player of interest, including substitutes, is detected based on this input information.
In S1004, the absolute position of the player of interest detected in S1003 is transmitted to the camera terminals owned by professional photographers and general spectators. It is determined in S1005 whether tracking of the player of interest is to be continued; if it is, the process proceeds to S1006, and if it is not, the flow of fig. 22 ends. It is determined in S1006 whether a foul requiring temporary departure has occurred; if it has, the process proceeds to S1007, and if it has not, the process returns to S1005.
It is determined in S1007 whether the foul requiring departure involves a red card; if it is a red-card foul, the process proceeds to S1008, and if it is not, the process proceeds to S1009. The process also proceeds to S1009 when a sin-bin foul occurs. In S1008, the server identifies the player given the red card, excludes the player from the participating players, and updates the information list of the participating players.
In S1009, the server identifies the player sent to the sin bin, excludes the player from the participating players for 10 minutes (the 10-minute departure being an example), and updates the information list of the participating players. When the player sent to the sin bin returns to the field, the player is identified, the information list of the participating players is updated, and the process returns to S1001.
Further, although the situation of the game (whether a given team is attacking or defending) is determined from the position of the ball on the premise that a player's role corresponds to a place on the field, the control based on the game situation is not limited to the ball position.
For example, when one team commits a foul, the opposing team is awarded a penalty kick or the like. In that case, the team awarded the penalty is highly likely to advance play from the current position of the ball. Control may therefore be based on a game situation in which that team is predicted to advance, and the position of the ball can be predicted from the penalty.
Further, at least for the foul requiring temporary departure in S1006, the server may identify and detect, using the plurality of cameras, the player who leaves the field because of the foul, for example.
There is also a method of recognizing and detecting the calls of the referee and others as audio information. In addition, a foul can be detected from the foul information displayed on the large screen of the venue.
As described above, if professional photographers and spectators know the referee's decisions in real time, they can predict the next position of the ball. If the foul information is displayed on the display unit of each camera, the camera terminal side can view the display, predict the next expected position of the ball, and shoot at a more appropriate shutter timing.
Professional photographers and general spectators are also interested in the decisions made in the game. A three-referee system comprising one referee and two touch judges is used. There are situations that are particularly difficult to judge with the human eye alone, such as a scene in which a player appears to have scored a try, or a foul in the game. Therefore, when a decision is difficult to make with the naked eye, video judgment (by the Television Match Official, or TMO) is performed to support the referee's decision.
When professional photographers and general spectators photograph a try scene with their terminals such as cameras, they want to know immediately whether the image photographed at that moment was recognized as a try. Therefore, the decision is recognized accurately by following the subsequent judgment in the videos of the plurality of cameras, or by the server analyzing the information displayed on the electronic scoreboard. The referee's decision is then transmitted to the terminals (such as cameras) of professional photographers and general spectators, so that the user can correctly recognize the decision in a timely manner.
There are several methods of detecting whether a try has been scored. Fig. 23 shows the try determination control flow (on the server side), which will be described in detail below.
In fig. 23, S1101 denotes initialization; here, the TRY determination flag is cleared. It is determined in S1102 whether shooting is selected; if it is, the process proceeds to S1103, and if it is not, the process returns to S1101. In S1103, the camera setting information is acquired. In S1104, the ball used in the game is tracked in the videos of the plurality of cameras.
In S1105, it is determined whether the TRY determination flag is 0; if it is 0, the process proceeds to S1106, and if it is not 0, the process proceeds to S1107. It is determined in S1106 whether a try appears to have been scored; if so, the process proceeds to S1107, and if not, the process proceeds to S1112. Here, a try appearing to have been scored means a state in which a player has carried the ball into the scoring zone (the in-goal area).
For example, there are cases where the player knocks the ball forward (a knock-on) immediately before grounding it, or where a defender gets a hand or body under the ball to prevent it from being grounded; in other words, a state in which the try has not yet been confirmed. In S1107, the TRY determination flag is set to 1. In S1108, it is determined whether the try was scored.
Specific examples of the try determination are shown in figs. 24 (A) and (B), figs. 25 (A) and (B), and fig. 26, and will be described later.
It is determined in S1109 whether a determination result on the presence or absence of a try has been obtained in the control of S1108; if it has, the process proceeds to S1110, and if it has not, the process proceeds to S1112. In S1110, the TRY determination flag is set to 0. In S1111, the server transmits the information on whether a try was scored to terminals such as cameras owned by professional photographers and general spectators. It is determined in S1112 whether the game has ended; if it has, the process returns to S1101, and if it has not, the process returns to S1104.
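The role of the TRY determination flag in fig. 23 can be summarized as a small state machine. The callables passed in (looks_like_try, determine_try, send_result) are hypothetical stand-ins for S1106, S1108, and S1111:

    def try_determination_loop(events, looks_like_try, determine_try, send_result):
        """State machine following fig. 23: the TRY determination flag marks the
        interval between an apparent try (S1106) and the confirmed decision (S1108-S1110)."""
        try_flag = 0
        for event in events:                     # ball-tracking events (S1104)
            if try_flag == 0 and looks_like_try(event):
                try_flag = 1                     # S1107
            if try_flag == 1:
                decision = determine_try(event)  # S1108: any of the methods of figs. 24 to 26
                if decision is not None:         # S1109: a determination result was obtained
                    try_flag = 0                 # S1110
                    send_result(decision)        # S1111: notify the terminals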
The control for determining the presence or absence of a try has been described above. However, the CPU does not perform only this control; it may also perform, for example, the detection control for the player of interest shown in fig. 11 in parallel at the same time.
Further, the control operations performed by the server simultaneously and in parallel are not limited to these; other control operations may also be performed at the same time. The same applies to terminals such as cameras owned by professional photographers and general spectators, which can likewise perform a plurality of control operations at the same time.
When shooting a try scene, whether the try succeeded or failed should be recognized correctly. The server therefore checks the decision as to whether the try, or the subsequent conversion kick, was successful. A try has been described here as the example; however, the control is not limited to tries, and similar control may be performed for other scoring scenes.
When the videos of the plurality of cameras are analyzed and there is a scene in which a try appears to have been scored, the motion of the ball is analyzed using the plurality of cameras, and it is recognized whether the ball was actually grounded within the prescribed area. The server transmits the information on whether a try was scored, analyzed from the recognized movement of the ball, together with the player information, to terminals such as cameras owned by professional photographers and general spectators.
Fig. 24 (A) shows the flow of determining the presence or absence of a try using the movement of the ball on the server side.
In S1201 of (A) of fig. 24, the server detects the position of the ball from the videos of the plurality of cameras. In S1202, based on the motion of the ball, it is recognized whether a try was scored in the scene regarded as a try in the videos of the plurality of cameras.
Next, fig. 24 (B) shows the flow of determining the presence or absence of a try based on the actions of the referee.
When the videos of the plurality of cameras are analyzed and there is a scene in which a try appears to have been scored, the rule-based actions of the referee near the player of interest can then be analyzed by image recognition using the videos of the plurality of cameras, and the presence or absence of a try can be recognized from the referee's actions.
The server transmits the information on the presence or absence of a try analyzed from the recognized actions of the referee (the action recognition result), together with the player information, to terminals such as cameras owned by professional photographers and general spectators.
In S1301 of (B) of fig. 24, the server detects, from the videos of the plurality of cameras, the actions of a referee who is making a motion close to a try decision. In S1302, the presence or absence of a try in the scene regarded as a try in the videos from the plurality of cameras is recognized based on the referee's actions.
Fig. 31 shows the actions of a referee judging a try. Fig. 31 (A) shows the action taken by the referee when the try is successful, and fig. 31 (B) shows the action taken when the try is unsuccessful.
Further, the flow of recognizing the presence or absence of a try from the information displayed on the large screen of the stadium will be described using fig. 25.
When the videos of the plurality of cameras are analyzed and there is a scene in which a try appears to have been scored, the information projected on the large screen of the stadium is read from the plurality of cameras, and the presence or absence of a try can be recognized based on the information on the screen.
The server transmits the information on whether a try was scored, analyzed from the recognized information on the screen, together with the player information, to terminals such as cameras owned by professional photographers and general spectators.
Fig. 25 (A) shows the flow of determining, on the server side, the presence or absence of a try based on the decision result displayed on the screen. In S1401 of (A) of fig. 25, the server detects, from the videos of the plurality of cameras, the information of the decision result displayed on the screen after the motion that appears to be a try. In S1402, the presence or absence of a try in the scene regarded as a try in the videos from the plurality of cameras is recognized from the decision result displayed on the screen.
Next, the flow of recognizing the presence or absence of a try based on the score information displayed on the screen will be described using (B) of fig. 25.
When the videos of the plurality of cameras are analyzed and there is a scene in which a try appears to have been scored, the score information projected on the screen is read from the videos of the plurality of cameras, and the presence or absence of a try is recognized based on the score information on the screen. The server transmits the information on whether a try was scored, analyzed from the recognized score information on the screen, together with the player information, to terminals such as cameras owned by professional photographers and general spectators.
A try is worth five points, and if the subsequent conversion kick is successful, two more points are scored. In addition, a successful penalty goal or drop goal is worth three points. Whether a try was successful can therefore be recognized by comparing the score before the apparent try with the score after it.
Fig. 25 (B) shows the flow of the server determining the presence or absence of a try based on the score information on the screen.
In S1501 of (B) of fig. 25, the server detects, from the videos of the plurality of cameras, the score information displayed on the screen after the motion that appears to be a try. In S1502, the presence or absence of a try in the scene regarded as a try in the videos from the plurality of cameras is recognized based on the difference in the score information displayed on the screen. In S1503, a try, a conversion, a penalty goal, or a drop goal is identified based on the difference in the score information on the screen.
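Identifying the scoring event from the score difference (S1502 and S1503) reduces to a lookup over the point values cited above. Treating a single difference of 7 as a try plus conversion read in one step is an assumption of this sketch:

    def classify_score_event(before, after):
        """Infer the scoring event from the score difference read off the stadium
        screen (standard rugby union point values)."""
        delta = after - before
        return {
            5: "try",
            7: "try + conversion",  # both scores read as one difference
            2: "conversion",
            3: "penalty goal or drop goal",
            0: "no score (try not awarded)",
        }.get(delta, "unknown")

    # Example: the screen showed 10 before the apparent try and 15 after it.
    print(classify_score_event(10, 15))  # prints "try"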
Next, the flow of recognizing the presence or absence of a try from the audio information announced in the venue will be described using fig. 26.
When the audio information input from the microphones attached to the plurality of cameras (fixed cameras and mobile cameras) is analyzed and there is a scene in which a try appears to have been scored, the presence or absence of a try is recognized based on the audio information from the microphones. The server transmits the information on the presence or absence of the try, analyzed from the recognized audio information, together with the player information, to terminals such as cameras owned by professional photographers and general spectators.
The specific flow of the server determining the presence or absence of a try using audio information is shown in fig. 26. In S1601 of fig. 26, the server detects, from the microphones of the plurality of cameras, the audio information collected after the motion that appears to be a try. In S1602, the presence or absence of a try in the scene regarded as a try is recognized based on the audio information from the microphones of the plurality of cameras.
Although the determination of a try has been described in the above example, the conversion and penalty goals after the try may be considered in addition to the try itself.
Further, although fig. 23 shows the flow of detecting whether a try was scored on the server side, the try determination control on the terminal side, such as a camera, will now be described.
Fig. 27 shows the flow of try determination control on the terminal side such as a camera. The steps S101 to S107-2, S109, and S110 in fig. 27 are the same as in fig. 9, and a description thereof will be omitted.
It is determined in S1620 whether continued tracking of the player of interest is OK (successful); if successful, the process proceeds to S1621. It is determined in S1621 whether a try determination result has been transmitted from the server; if it has, the process proceeds to S1622, and if it has not, the process returns to S107, and the camera itself continues tracking the player of interest. In S1622, the try determination result is displayed on the display unit of the camera.
Further, if continued tracking of the player of interest is unsuccessful in S1620, the process proceeds to S109, where it is determined whether shooting of the player of interest has ended; if shooting has not ended, the process returns to S105. If shooting has ended, the process proceeds to S1623, where it is determined whether a try determination result has been transmitted from the server; if it has, the process proceeds to S1624, and the try determination result is displayed on the display unit of the camera terminal. If no try determination result has been transmitted from the server, the process returns to S101.
As described above, when a try appears to have been scored, the camera terminal side can display whether the try was successful.
Thus, for example, a general spectator or photographer can correctly assess the value of a photograph that has been taken. The photographer can recognize the decision simply by looking at the display unit of the camera, can therefore appropriately select the photographs to be sent to the publisher, and can prepare earlier for the next shot.
Next, the judgment of a player's foul will be described. The advantage given to the opposing team varies with the severity of the foul. If the foul is serious, the player is given a yellow card, sent to the sin bin, and prohibited from participating for 10 minutes.
Furthermore, if the foul is severe enough for a red card, the player must leave the field immediately. It is important that the server recognizes the foul using the plurality of cameras, transmits the information to terminals such as cameras owned by professional photographers and general spectators, and notifies them of the foul information together with the player information.
Fig. 28 shows an example of the player foul determination control flow on the server side, illustrating a method of detecting whether a foul has occurred.
In fig. 28, S1701 denotes initialization; here, the determination flag is cleared. Next, it is determined in S1702 whether shooting is selected; if it is, the process proceeds to S1703 to acquire the camera setting information, and if it is not, the process returns to S1701. In S1704, all players participating in the game are tracked using the plurality of cameras. It is determined in S1705 whether the determination flag is 0; if it is 0, the process proceeds to S1706, and if it is not 0, the process proceeds to S1707.
It is determined in S1706 whether a player appears to have committed a foul; if so, the process proceeds to S1707, where the determination flag is set to 1, and if not, the process proceeds to S1712. Here, a player appearing to have committed a foul means a player who may be penalized because the play may be judged a foul depending on how the player blocked or tackled the opposing player. Furthermore, even when a foul has occurred, the foul has a level of severity; in other words, at this point the level of the player's foul has not been confirmed. In S1708, it is determined whether a player has actually committed a foul. Figs. 29 (A) and (B) show specific examples of the flow of judging a player's foul, the details of which will be described later.
It is determined in S1709 whether a determination result on the presence or absence of a player's foul has been obtained in the control of S1708; if it has, the process proceeds to S1710, and if it has not, the process proceeds to S1712. In S1710, the determination flag is set to 0. In S1711, the server transmits the information on whether a player committed a foul, together with the information on the foul level when a foul occurred, to terminals such as cameras owned by professional photographers and general spectators.
It is determined in S1712 whether the game has ended; if it has, the process returns to S1701, and if it has not, the process returns to S1704.
The control to determine the presence or absence of a foul has been described above. However, this flow is not only for performing control, but other plural control operations may be performed simultaneously or in parallel. Meanwhile, the same is true for terminals such as cameras owned by professional photographers and general viewers, and a plurality of control operations can be performed simultaneously or in parallel on the terminal side.
When videos of a plurality of cameras are analyzed and there is a game in which a foul appears to occur, the motions of the referees are then analyzed using the plurality of cameras, and whether there is a foul can be recognized based on the motions of the referees. The server transmits information on the presence or absence of an offense analyzed according to the recognized action of the referee, together with the athlete information, to terminals such as cameras owned by professional photographers and general audiences.
Fig. 29 (A) shows an example of a server-side player foul determination flow based on the action of the referee. In S1801 of fig. 29 (A), the server detects, from the videos of the plurality of cameras, the action of the referee regarding the movement that seems to be a player foul.
In S1802, whether a player foul occurred in the scene regarded as a player foul in the videos of the plurality of cameras is identified based on the action of the referee.
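As one conceivable realization of S1801 and S1802, the referee's pose keypoints could be classified frame by frame. The sketch below is an assumption-laden stand-in for a real gesture classifier: it takes 2-D keypoints (for example, from an off-the-shelf pose estimator) and flags a foul signal when a wrist stays above the head for several consecutive frames. The joint indices are illustrative.

```python
import numpy as np

# keypoints: array of shape (frames, joints, 2); joint indices are assumptions.
HEAD, L_WRIST, R_WRIST = 0, 9, 10

def referee_signals_foul(keypoints: np.ndarray, min_frames: int = 15) -> bool:
    """Very rough stand-in for S1802: True if either wrist stays above the
    head (smaller y = higher in image coordinates) for min_frames frames."""
    above = (keypoints[:, L_WRIST, 1] < keypoints[:, HEAD, 1]) | \
            (keypoints[:, R_WRIST, 1] < keypoints[:, HEAD, 1])
    run = 0
    for flag in above:
        run = run + 1 if flag else 0
        if run >= min_frames:
            return True
    return False
```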
Next, an example of a flow of identifying the presence or absence of a foul from audio information announced in the venue will be described using fig. 29 (B).
When the audio information input from the microphones attached to the plurality of cameras is analyzed and a play where a foul appears to have occurred is found, the audio information from the microphones is analyzed to identify whether a foul occurred.
The server then transmits the information on the presence or absence of a foul analyzed from the recognized audio information, together with the athlete information, to terminals such as cameras owned by professional photographers and general audiences.
Fig. 29 (B) shows a server-side player foul determination flow based on audio information. In S1901 of fig. 29 (B), the server detects the audio information collected by the microphones of the plurality of cameras after the movement that seems to be a player foul. In S1902, whether a player foul occurred in the scene regarded as a player foul, and the level of the foul in the case of a foul, are identified based on the audio information.
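One plausible way to realize S1901 and S1902 is to look for the referee's whistle in the collected audio. The sketch below assumes mono PCM samples and an assumed whistle band of roughly 2-4 kHz; it reports a detection when that band dominates the spectral energy of a window. A production system would instead use a trained audio classifier.

```python
import numpy as np

def whistle_detected(samples: np.ndarray, rate: int = 48000,
                     band=(2000.0, 4000.0), ratio_threshold: float = 0.6) -> bool:
    """Rough sketch: True if the assumed whistle band holds most of the
    spectral energy of this audio window."""
    spectrum = np.abs(np.fft.rfft(samples)) ** 2          # power spectrum
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)   # bin frequencies
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = spectrum.sum()
    return total > 0 and spectrum[in_band].sum() / total >= ratio_threshold
```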
In this way, when a foul occurs, it is recognized using the plurality of cameras (fixed cameras and mobile cameras), and the information is transmitted to and displayed on terminals such as cameras owned by professional photographers and general audiences.
Here, fig. 30 shows a camera-side foul determination control flow in which a foul is displayed on the camera side; it is based on the camera-side hit determination control flow.
Only S7421 to S7424 of fig. 30 differ from fig. 27. In other words, S1621, S1622, S1623, and S1624 of fig. 27 are replaced with S7421, S7422, S7423, and S7424, and each description of "hit determination" is changed to "foul determination".
According to the present example described above, the server analyzes information on the surroundings other than the attention athlete (other than the specific object) and transmits the analysis result to an image processing apparatus such as a camera, so that the terminal side can grasp the real-time situation of the game, such as hits, scores, and fouls. Photographers and the like can thus obtain very advantageous information, especially when selecting photos and sending them to the press in time during the game. In addition, the athlete's movements may be stored as big data in the server, and the athlete's actions may be predicted using AI based on that big data. Further, although only one attention athlete is specified in this example, a plurality of attention athletes may be specified.
Further, the attention athlete may be switched in the middle of the game, and the attention athletes may be all players participating in the game. It is assumed that videos and images include not only moving images but also still images. Furthermore, although tracking of an attention athlete is primarily described here, information on the player holding or catching the ball, rather than only the attention athlete, may be transmitted to professional photographers and viewers and displayed.
Further, although an example in which an athlete is tracked is described here, it goes without saying that the present invention can be applied to a system or the like in which a plurality of monitoring cameras is used to track a person such as a criminal. The present invention is not limited to tracking a person, and may also be applied to a system for tracking a specific car in motor racing, a system for tracking a horse in horse racing, or the like. Further, although an example in which the attention athlete is specified from a camera terminal or the like has been described, the server side may specify the attention athlete.
Furthermore, if the determination when identifying an athlete's face additionally focuses on the athlete's role, including that of a replacement athlete, the time the server takes to detect the athlete may be shortened, and the accuracy of detecting athletes, including replacement athletes, may be further improved.
An example of the flow of the attention athlete detection control including a replacement athlete in this case is shown in fig. 32.
In S801 of fig. 32, the server detects position information of the ball from the videos of the plurality of cameras and roughly estimates the position of the attention athlete using it. Further, the area in which face information is searched for is identified according to the player's role (such as forward or defender): for the players who start the game, the role is identified by the uniform number, and for a replacement player, the name, uniform number, and role are identified from the player information registered in advance and the uniform numbers of the players playing that day.
In S802, the server identifies face information of the attention athlete, including a replacement athlete, in the area identified in S801, and acquires position information of the athlete having that face information from the input video information of the plurality of cameras.
If the face information of the attention athlete, including a replacement athlete, is found in the video from one of the plurality of cameras, the size, angle, and background (field) shown there are input, and the position information of the athlete can be acquired. If the face information is likewise found in the videos from the other cameras, that information is input as well, and the accuracy of the position information can be improved.
In S803, the absolute position of the attention athlete is detected based on the positions of the attention athlete acquired from the video information of the plurality of cameras in S802. In S804, the absolute position of the attention athlete detected in S803 is transmitted to camera terminals owned by professional photographers and general viewers.
It is determined in S805 whether the attention athlete continues to be tracked; if so, the process proceeds to S801, and if not, the flow of fig. 32 ends.
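The narrowing of the face-search area in S801 can be viewed as intersecting a region around the ball with the region implied by the player's registered role. A minimal sketch follows, assuming field coordinates in meters; the role table and all values are hypothetical.

```python
# Sketch of S801: restrict the face search to a region derived from the
# ball position and the player's registered role (hypothetical values).
ROLE_ZONES = {
    "forward":  (0.0, 50.0),    # assumed x-range of the field the role occupies
    "defender": (50.0, 100.0),
}

def face_search_region(ball_xy, role, radius=15.0):
    """Return (x_min, y_min, x_max, y_max): a box around the ball,
    clipped to the zone associated with the player's role."""
    bx, by = ball_xy
    x_lo, x_hi = ROLE_ZONES[role]
    return (max(bx - radius, x_lo), max(by - radius, 0.0),
            min(bx + radius, x_hi), by + radius)

region = face_search_region((42.0, 30.0), "forward")
print(region)  # (27.0, 15.0, 50.0, 45.0) -> search faces only in this box
```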
Here, although the situation of the game (whether a certain team is attacking or defending) is determined based on the position of the ball, control based on the game situation is not limited to the position of the ball. For example, when a team commits a foul, the opposing team is awarded a penalty kick or the like. In this case, the game is highly likely to advance from the current position of the ball toward the side of the team awarded the penalty. Thus, control may be based on the predicted game situation in which that team advances; in other words, the position of the ball can be predicted based on the foul.
Information on athlete changes, and thus information on the departure and entry of athletes (including athlete positions), can be identified based on the videos of the plurality of cameras used by the server. The server transmits this information to camera terminals owned by professional photographers and general viewers. When an athlete change occurs, the position of the entrance where the athlete enters the field is tracked, and the camera terminals owned by professional photographers and general audiences are notified of that position.
A method of supporting athlete detection during athlete changes based on replacement athlete detection control will be described using fig. 33. Fig. 33 shows a flow in which face recognition information is used to detect an attention athlete, including a replacement athlete, taking the athlete's role (position) into account when an athlete change occurs.
The same reference numerals in fig. 33 as in fig. 32 denote the same steps, and a description thereof will be omitted.
It is determined in S905 whether the attention athlete continues to be tracked; if so, the process proceeds to S906, and if not, the control ends. It is determined in S906 whether an athlete change is made; if so, the process proceeds to S907, and if not, the process proceeds to S901. In S907, the server identifies the athlete change and updates the list of information on the participating players.
Here too, as described above, the control based on the game situation is not limited to the position of the ball; when a foul results in a penalty kick or the like, the position of the ball can be predicted from the foul. Further, for the athlete change in S906, the server may identify and detect athlete departures and athlete entries using the plurality of cameras, for example.
In addition, there is a method of identifying and detecting the calls of the referee or others as audio information. There is also a method of detecting an athlete change based on athlete change information displayed on the large screen in the venue, or from the player list displayed on the large screen.
When detecting an attention athlete, information on a player who has committed a foul punished with a temporary suspension and who has temporarily left the game can also be detected from the videos of the plurality of cameras used by the server.
In addition, the server transmits this information to camera terminals owned by professional photographers and general viewers. A player given such a temporary suspension is prohibited from playing for 10 minutes.
In the example of fig. 33, information on athlete changes, and thus information on athlete departure and athlete entry (including athlete positions), is identified based on the videos of the plurality of cameras. The server transmits this information to camera terminals owned by professional photographers and general viewers. When an athlete change occurs, the position of the entrance where the athlete enters the field is tracked, and the camera terminals owned by professional photographers and general audiences are notified at the same time.
However, there are situations other than athlete changes in which a player must leave the field. When a player commits a foul, the penalty varies according to its severity: a foul involving a red card means an immediate departure, whereas a foul punished with a 10-minute suspension causes the player to leave the game temporarily. Thus, for methods of supporting athlete detection during athlete changes and of detecting fouling athletes based on capturing replacement athletes, an attention athlete detection control flow as in fig. 22 is used.
In this example, the attention athlete is registered in advance, the position of the attention athlete is displayed with a mark on the display unit of the camera, and autofocus (AF) is further adjusted for the attention athlete. This makes it easy for professional photographers and general audiences to quickly photograph the attention athlete.
Fig. 34 shows a display unit on the camera side with autofocus (AF) on the attention athlete. In fig. 34, the player performing a hand pass at the center of fig. 34 (A) is registered as the attention athlete, and the camera performs autofocus (AF) on that athlete. The video the photographer sees on the display unit of the camera is fig. 34 (B); since autofocus (AF) is performed on the attention athlete, photographing can be performed without missing a photo opportunity. In addition, the exposure may be automatically adjusted for the attention athlete at this time.
Next, fig. 35 and 36 show an AF flow for the attention athlete in the attention athlete display tracking control on the camera side. The same reference numerals as in fig. 8 denote the same steps, and a description thereof will be omitted.
In S3807 of fig. 35, the camera itself tracks the attention athlete based on the position information of the attention athlete. At this time, autofocus (AF) is performed on the attention athlete, and a mark is attached to the attention athlete on the display unit. It is determined in S3808 whether the continued tracking of the attention athlete is successful; if it is, the process proceeds to S3807 and the camera itself continues the tracking, and if it is not, the process proceeds to S109.
Fig. 36 shows details of the flow of S3807. In S3811 of fig. 36, the camera receives the absolute position information of the attention athlete from the server. In S3812, the camera converts the absolute position information of the attention athlete into relative position information based on the position, direction, magnification, and the like used for camera photography. In S3813, information on the attention athlete is displayed on the display unit based on the relative position as seen from the camera. In S3814, information from the operation unit input section 906 is input to determine whether a mode for performing autofocus (AF) on the attention athlete based on the position information from the server is selected.
If autofocus (AF) on the attention athlete is selected, the process proceeds to S3815, and if it is not selected, the process proceeds to S3816. In the case where autofocus (AF) on the attention athlete is not selected, AF or AE is performed according to a frame displayed at the center of the display screen of the camera or the like, regardless of the position information of the attention athlete.
A well-known method may be applied to the autofocus (AF) of S3815, and a description thereof will be omitted; the exposure may also be adjusted for the attention athlete in S3815. It is determined in S3816 whether the attention athlete continues to be tracked; if so, the process proceeds to S3811, and if the tracking ends, the flow of fig. 36 ends.
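The conversion of S3812 and the display and AF-point placement of S3813 to S3815 can be sketched as follows, assuming the camera's own field position, heading, and horizontal field of view are known; all parameter names are illustrative.

```python
import math

def to_screen_x(player_xy, cam_xy, cam_heading_deg, hfov_deg, width_px):
    """Sketch of S3812/S3813: map an absolute field position to a horizontal
    screen coordinate; returns None when the player is outside the view."""
    dx, dy = player_xy[0] - cam_xy[0], player_xy[1] - cam_xy[1]
    bearing = math.degrees(math.atan2(dy, dx))                  # absolute bearing
    off = (bearing - cam_heading_deg + 180.0) % 360.0 - 180.0   # relative angle
    if abs(off) > hfov_deg / 2.0:
        return None                                             # off-screen case
    return width_px * (0.5 + off / hfov_deg)                    # pixel column

x = to_screen_x((60.0, 20.0), (0.0, 0.0), 15.0, 40.0, 4000)
print(x)  # ~2343: column at which to draw the mark and set the AF point
```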
Through the above control, the camera terminals of professional photographers and general audiences can not only identify the attention athlete but also rapidly perform AF and AE on the attention athlete, and can therefore photograph in time.
Further, a section for selecting an automatic tracking mode for the attention athlete may be provided on the camera side. If the attention athlete automatic tracking mode is selected, the camera places the attention athlete on the screen of the display unit using an automatic zoom function, making the mode easier for professional photographers and general audiences to use.
Fig. 37 shows a display example of the camera display unit at the time of automatic tracking.
In fig. 37, 3901 denotes the display unit of a camera. In 3901, eight athletes, A, B, C, D, E, F, G, and H, are within the shooting area of the camera of a professional photographer or general audience member. Here, the attention athlete is K, who is outside the shooting area of the camera.
3902 denotes the zoomed-out state of the display unit of the camera when the automatic tracking mode is turned on. Using the zoom function, the camera automatically shifts to a wide angle, and control is performed to place the attention athlete K within the shooting area. Fig. 38 is a diagram showing a more specific display example. In (A) of fig. 38, while a live view image from the image sensor is displayed, an attention athlete outside the display screen is represented by a superimposed arrow pointing in the direction of the athlete. Fig. 38 (B) shows the case where the zoom has become wide due to the automatic tracking mode; here, the arrow indicates the position of the attention athlete within the screen. Since the attention athlete is positioned within the display screen, the situation of the game can be easily grasped, and an image the user wants to take can be easily obtained.
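The automatic widening shown in 3902 can be derived from the same relative angle: the zoom is moved toward the wide-angle side until the attention athlete's angular offset fits within the horizontal field of view. A minimal sketch, using the simplifying assumption that the field of view is inversely proportional to focal length (constants are illustrative):

```python
def focal_length_to_include(offset_deg, f_current_mm,
                            hfov_at_50mm=40.0, margin_deg=2.0, f_min_mm=24.0):
    """Sketch of the auto-zoom control: shortest focal length (widest angle
    needed) at which the attention athlete's angular offset fits on screen."""
    needed_hfov = 2.0 * (abs(offset_deg) + margin_deg)   # symmetric view needed
    f_needed = 50.0 * hfov_at_50mm / needed_hfov         # hfov ~ 1/f approximation
    return max(min(f_current_mm, f_needed), f_min_mm)    # clamp to lens range

print(focal_length_to_include(offset_deg=35.0, f_current_mm=200.0))
# -> ~27 mm: the lens must widen considerably to bring athlete K on screen
```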
Fig. 39 and 40 show the attention athlete display tracking control flow on the camera side, in other words, the flow when the attention athlete automatic tracking mode is used.
The same reference numerals as in fig. 8 denote the same steps in fig. 39, and a description thereof will be omitted. In S4007 of fig. 39, the camera itself tracks the attention athlete. If the automatic tracking mode is selected via the operation unit, automatic tracking of the attention athlete is performed; if it is not selected, automatic tracking is not performed. When a camera terminal owned by a professional photographer or general audience member automatically tracks the attention athlete and the athlete is not within the camera area, the zoom magnification is automatically controlled toward the wide-angle side so that the athlete is placed within the screen of the display unit. The control of S4007 is shown in detail in fig. 40 and will be described below.
It is determined in S4008 whether the continued tracking of the attention athlete is successful; if it is, the process proceeds to S4007 and the camera itself continues tracking the attention athlete, whereas if it is not, the process proceeds to S109.
Next, S4007 will be described in detail based on fig. 40. In S4011 of fig. 40, the camera receives the absolute position information of the attention athlete from the server. In S4012, the camera converts the absolute position information of the attention athlete into relative position information based on the position, direction, magnification, and the like used for camera shooting.
In S4013, information on the attention athlete is displayed on the display unit based on the relative position information as seen from the camera. It is determined in S4014 whether the attention athlete is outside the shooting area of the camera. If the attention athlete is outside the shooting area (outside the displayed image), the process proceeds to S4015, and if the attention athlete is inside the shooting area (inside the displayed image), the process proceeds to S4018. In S4015, information from the operation unit input section 906 is input to determine whether the user has selected the attention athlete automatic tracking mode.
If the attention athlete automatic tracking mode is selected, the process proceeds to S4016, and if it has not been selected, the process proceeds to S4018. In S4016, the zoom is shifted to the wide-angle side until the attention athlete is displayed on the display unit of the camera. In S4017, autofocus (AF) is performed on the attention athlete; at this time, AE is also performed so that the attention athlete is properly exposed. It is determined in S4018 whether the attention athlete continues to be tracked; if so, the process proceeds to S4011, and if the tracking ends, the flow of fig. 40 ends.
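Putting S4011 to S4018 together, the camera-side loop might look like the following sketch; every method name on cam and server is a hypothetical placeholder, not an actual camera API.

```python
# Sketch of the fig. 40 loop; all method names are hypothetical.
def auto_track_loop(cam, server):
    while cam.keep_tracking():                     # S4018: continue tracking?
        abs_pos = server.recv_position()           # S4011: absolute position
        rel = cam.to_relative(abs_pos)             # S4012: camera-relative
        cam.display_marker(rel)                    # S4013: mark on display unit
        if cam.is_off_screen(rel):                 # S4014: outside shooting area
            if cam.auto_tracking_mode_on():        # S4015: mode selected?
                cam.zoom_wider_until_visible(rel)  # S4016: shift to wide angle
                cam.run_af_ae(rel)                 # S4017: AF plus AE on athlete
```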
As described above, the video of the entire game field is read by the server, and the position at which shooting is started is grasped from the video shot by professional photographers and general audiences. The server may acquire video of the entire venue from the plurality of cameras and may map the position information of the venue into the video viewed by professional photographers and general viewers. In addition, when the cameras of professional photographers and audiences receive the absolute position information of the attention athlete from the server, that information can be mapped into the video currently being shot. In other words, the cameras of professional photographers and general audiences can recognize the attention athlete and take a picture in time.
Here, if the attention athlete is not within the shooting area of the camera, the zoom is adjusted to a wide angle, and control is performed so that the attention athlete is placed within the shooting area. In addition, since the camera automatically adjusts the focus and exposure for the attention athlete, the cameras of professional photographers and general audiences can quickly and reliably capture video in which the attention athlete is in focus.
Further, since auto exposure (AE) is performed automatically in addition to autofocus (AF), an optimum image can be obtained without waiting for the user to make adjustments. Only AE may be performed without AF, and the user can selectively turn off either AF or AE control using a selection switch, not shown.
As described above, in a game of football, soccer, or the like, players evade opponents with steps that are difficult to predict, making it very difficult to keep tracking an attention athlete. The advantage of this example is that tracking can be maintained in such cases, and even if the attention athlete is lost from view, re-detection and re-tracking can be done very quickly. Furthermore, in football, soccer, and the like, the ball holder changes from moment to moment, and the athlete a photographer wants to capture changes accordingly. In this case, for example, there is a method of tracking the ball-holding player.
The server may keep track of the current situation in the venue and predict events that may occur next, then transmit the prediction information to camera terminals owned by professional photographers and general viewers, where it is displayed. By viewing this information, professional photographers and general viewers can obtain photo opportunities with greater certainty.
As a specific example of the prediction function, the case of predicting an athlete change will be described.
The server determines (analyzes) the game situation using the plurality of cameras, predicts what will happen next, and transmits information based on the prediction to camera terminals owned by professional photographers and general audiences. In football, the possibility of a player change is high when a player is injured or the like. Thus, fig. 41 shows an attention athlete change detection control flow in which the timing of an athlete change is predicted based on the readiness of a replacement athlete.
The same reference numerals in fig. 41 as in fig. 11 denote the same steps, and a description thereof will be omitted. In S4107, the attention athlete is tracked; a specific flow will be described using fig. 42. In S4108, replacement athletes are identified; details of S4108 will be described with fig. 43. It is determined in S4109 whether the attention athlete is continuously tracked; if so, the process proceeds to S4107, and if not, the process proceeds to S4110. It is determined in S4110 whether the photographing of the attention athlete has ended; if it has, the process proceeds to S4111, and if it has not, the process proceeds to S206. It is determined in S4111 whether there is a movement of a replacement athlete; if there is, the process proceeds to S201, and if there is not, the process proceeds to S4108.
Next, the flow of S4107 is described using fig. 42.
In this example, it is assumed that a position sensor is installed in the athlete's clothing such as the uniform, or that the athlete wears the position sensor on a band around the arms, waist, legs, or the like. The position sensor wirelessly transmits information using its communication section; the server recognizes the signal from the athlete's position sensor to generate position information, and notifies terminals such as cameras owned by professional photographers and general viewers of that position information.
In S4201 of fig. 42, the server acquires information of the position sensor of the attention athlete from the plurality of cameras. Each of the plurality of cameras includes a detection section that receives radio waves from the position sensor, detects the direction and level of the received radio waves, and outputs these as position sensor information. In S4202, the absolute position of the attention athlete is detected based on the position sensor information from the plurality of cameras. In S4203, the absolute position information of the attention athlete is transmitted to the camera terminals owned by professional photographers and general audiences. It is determined in S4204 whether the attention athlete is injured; if so, the process proceeds to S4206, and the fact that the attention athlete is injured is stored in a storage section such as the data storage 213.
If the attention athlete is not injured, the process proceeds to S4205. It is determined in S4205 whether the attention athlete is continuously tracked; if so, the process proceeds to S4201, and if not, the flow of fig. 42 ends.
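One way to realize the position detection of S4202 is bearing-only triangulation: each camera reports the direction of the received radio wave, and the athlete's absolute position is estimated as the intersection of the bearing lines. A minimal two-camera sketch follows; a real system would combine all cameras in a least-squares fit and could weight each bearing by the received radio level.

```python
import numpy as np

def intersect_bearings(p1, deg1, p2, deg2):
    """Sketch of S4202: intersect two bearing rays (field coordinates, angles
    in degrees from the x-axis); returns None for near-parallel bearings."""
    d1 = np.array([np.cos(np.radians(deg1)), np.sin(np.radians(deg1))])
    d2 = np.array([np.cos(np.radians(deg2)), np.sin(np.radians(deg2))])
    a = np.column_stack((d1, -d2))          # solve p1 + t*d1 = p2 + s*d2
    if abs(np.linalg.det(a)) < 1e-9:
        return None                         # bearings nearly parallel
    t, _ = np.linalg.solve(a, np.array(p2) - np.array(p1))
    return np.array(p1) + t * d1

print(intersect_bearings((0.0, 0.0), 45.0, (100.0, 0.0), 135.0))  # -> [50. 50.]
```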
The replacement athlete identification control flow of S4108 is shown in fig. 43.
In S4301 of fig. 43, the server acquires information of the position sensors of the replacement athletes from the plurality of cameras. The position sensor information likewise includes the direction and level of the received radio waves.
In S4302, the absolute position of the replacement athlete is detected based on the position sensor information from the plurality of cameras. In S4303, the movement of the replacement athlete is monitored; in particular, if the replacement athlete is an attention athlete, his or her movements are watched closely. It is determined in S4304 whether there is a movement of a replacement athlete; if there is, the flow of fig. 43 ends, and if there is not, the process proceeds to S4301.
If athlete changes and the like can be predicted, photo opportunities can be seized reliably. In other words, professional photographers and general audiences can take high-value photographs if they can capture unexpected movements of athletes.
In addition, in baseball for example, player changes may be predicted based on statistical data, such as an increased likelihood of a pitcher change when the pitcher is being hit.
The athlete's movements may be stored as big data in a server to predict the athlete's movements using AI based on the big data.
Although the number of attention athletes is one in this example, the number may be plural. Further, the attention athlete may be switched in the middle of the game.
In addition, in the above description, it is assumed that video includes not only moving images but also still images.
In the above example, the position of the attention athlete can be displayed in time on the terminal side, such as a camera, so that audiences and photographers can shoot the attention athlete without missing a photo opportunity.
Further, the designation of the attention athlete may be switched in the middle of the game, and the attention athletes may be all players participating in the game. It is assumed that videos and images include not only moving images but also still images. Furthermore, although tracking of an attention athlete is primarily described, information on the player holding or catching the ball, rather than only the attention athlete, may be transmitted to professional photographers and viewers and displayed.
In addition, although an example in which a football player or the like is tracked is described here, other athletes may be tracked, and needless to say, the present invention may be applied to a system or the like in which a plurality of monitoring cameras is used to track a specific person such as a criminal. The present invention is not limited to tracking a person, and may also be applied to a system for tracking a specific car in motor racing, a system for tracking a horse in horse racing, or the like. Further, although an example in which the attention athlete is specified from a camera terminal or the like has been described, the server side may specify the attention athlete.
In addition, since privileges are often given to certain audiences, sponsors, and the like in international competitions such as the Olympic Games, the level of added value provided may be changed according to the privilege or contract level in the present example. Control according to such a level can be achieved by inputting a password or the like: a professional photographer with a special contract can, by inputting a password, acquire high-value video and various information from inside and outside the stadium and thereby take better photographs.
Although the illustrative examples of the present invention have been described above, the present invention is not limited thereto, and various modifications and changes can be made within the scope of the gist of the present invention.
In addition, a program (software) that realizes the functions of the above-described examples for part or all of the control of the present invention may be supplied to the image pickup apparatus and the information processing apparatus via a network or various storage media, and a computer (CPU, MPU, or the like) of the image pickup apparatus and the information processing apparatus may read and execute the program. In this case, the program and the storage medium storing the program constitute the present invention.
(Cross-reference to related applications)
The present application claims priority from Japanese patent applications 2018-209469, 2018-209480, and 2018-209494, filed on November 7, 2018. The entire contents of these Japanese patent applications are incorporated by reference into the present specification.
List of reference numerals
101, 102, 103 camera
401, 402, 403 terminal
110 server
371 tracking cell
380 image display unit.

Claims (53)

1. An image processing apparatus, comprising:
a display section configured to display an image;
a selection section configured to select a specific object from the image displayed on the display section;
a specification information generation section configured to generate specification information of the specific object selected by the selection section;
a transmission section configured to transmit the designation information generated by the designation information generation section to a server;
an acquisition section configured to acquire, from the server, position information of the specific object based on the specification information; and
a control section configured to cause the display section to display additional information based on the positional information of the specific object acquired by the acquisition section.
2. The image processing apparatus according to claim 1, wherein the control section causes the position of the specific object within the display screen of the display section to be displayed as the additional information based on the position information.
3. The image processing apparatus according to claim 1 or 2, wherein the server identifies the specific object in the video, generates position information of the specific object based on the identification result, and transmits the position information to the image processing apparatus.
4. The image processing apparatus according to claim 2, wherein the server recognizes the image of the specific object, generates position information based on a recognition result, and transmits the position information to the image processing apparatus.
5. The image processing apparatus according to any one of claims 2 to 4, wherein the server generates the position information of the specific object based on a result obtained by recognizing a signal from a position sensor worn by the specific object.
6. The image processing apparatus according to claim 5, wherein the additional information includes at least one of a cursor and an area having a different color or brightness.
7. The image processing apparatus according to any one of claims 1 to 6, wherein, in a case where the specific object is outside a screen, the additional information indicates a direction in which the specific object is located as viewed from the screen.
8. The image processing apparatus according to any one of claims 1 to 7, wherein the additional information indicates a degree to which the specific object deviates from a screen.
9. The image processing apparatus according to claim 8, wherein the additional information indicates how far the specific object deviates from the screen using a length or thickness of an arrow.
10. The image processing apparatus according to claim 8, wherein the additional information indicates how far the specific object deviates from the screen with a number or a scale.
11. The image processing apparatus according to claim 4, wherein the server identifies a number worn by the specific object and a part or all of a shape of the specific object.
12. The image processing apparatus according to any one of claims 1 to 11, wherein the server generates position information of the specific object based on a result obtained by recognizing an image of the specific object of videos of a plurality of cameras, and transmits the position information to the image processing apparatus.
13. The image processing apparatus according to any one of claims 1 to 12, further comprising:
a tracking section configured to track the specific object after acquiring the position information of the specific object from the server.
14. The image processing apparatus according to any one of claims 1 to 13, wherein the transmission section requests transmission of the position information to the server in a case where the tracking by the tracking section fails.
15. The image processing apparatus according to claim 14, wherein in a case where it is predicted that the tracking of the specific object by the image processing apparatus will fail, the server notifies the image processing apparatus of the position information of the specific object without waiting for a request from the image processing apparatus.
16. The image processing apparatus according to any one of claims 1 to 15, wherein the server acquires a video of an entire field where the specific object exists in advance, and generates the position information.
17. The image processing apparatus according to claim 16, wherein the server generates relative position information when the specific object is viewed from the image processing apparatus, based on position information of the specific object in the field.
18. The image processing apparatus according to claim 17, wherein the server transmits first position information of the specific object in the field to the image processing apparatus, and the image processing apparatus generates relative position information when the specific object is viewed from the image processing apparatus based on the first position information.
19. The image processing apparatus according to any one of claims 1 to 18, wherein the selection section selects a plurality of specific objects.
20. The image processing apparatus according to any one of claims 1 to 19, wherein the image displayed on the display section is a live view image obtained by a photographing section, and the additional information is superimposed and displayed on the live view image based on the position information.
21. An image processing apparatus, comprising:
a display section configured to display an image;
a selection section configured to select a specific object from the image displayed on the display section;
a specification information generation section configured to generate specification information of the specific object selected by the selection section;
a transmission section configured to transmit the designation information generated by the designation information generation section to a server;
an acquisition section configured to acquire, from the server, position information of the specific object based on the specification information; and
a control section configured to, in a case where the position of the specific object is outside a screen of the display section, cause a display indicating that the specific object is outside the screen, based on the position information of the specific object acquired by the acquisition section.
22. The image processing apparatus according to claim 21, wherein in a case where the specific object is outside the screen of the display section, the control section causes a direction in which the specific object is outside the screen to be displayed.
23. The image processing apparatus according to claim 21 or 22, wherein in a case where the specific object is outside a screen of the display section, the control section causes a degree to which the specific object deviates from the screen to be displayed.
24. The image processing apparatus according to any one of claims 1 to 23, wherein the server analyzes a video from a camera, recognizes movement of the specific object, generates a movement recognition result, and transmits the movement recognition result to the image processing apparatus.
25. The image processing apparatus according to claim 24, wherein the movement recognition result includes a determination result of movement of the specific object in a predetermined motion according to a predetermined rule.
26. The image processing apparatus according to claim 24, wherein the movement recognition result includes a recognition result of a movement in a predetermined motion that is related to a score.
27. The image processing apparatus according to claim 24, wherein the movement recognition result includes a recognition result of a movement in a predetermined motion that is related to an infraction.
28. The image processing apparatus according to any one of claims 1 to 27, wherein the server analyzes surrounding information other than the specific object based on a video from a camera, and transmits an analysis result to the image processing apparatus.
29. The image processing apparatus according to claim 28, wherein the server analyzes a video, generates a movement recognition result based on a result obtained by recognizing movement of an object other than the specific object, and transmits the movement recognition result to the image processing apparatus.
30. The image processing apparatus according to claim 29, wherein the movement recognition result includes a recognition result regarding a movement of a referee in a predetermined game.
31. The image processing apparatus according to any one of claims 1 to 30, wherein the server analyzes surrounding information other than the specific object based on a sound accompanying a video, and transmits an analysis result to the image processing apparatus.
32. The image processing apparatus according to any one of claims 1 to 31, wherein the selection section selects the specific object through image recognition when a user selects an image of the specific object from the images displayed on the display section.
33. The image processing apparatus according to claim 32, wherein the specification information generation section generates the specification information of the specific object based on a result obtained by identifying the image of the specific object selected by the selection section.
34. The image processing apparatus according to any one of claims 1 to 33, wherein the server generates information of the position of the specific object based on a predetermined reference index in videos of a plurality of cameras and transmits the information to the image processing apparatus.
35. The image processing apparatus according to claim 34, wherein the reference index includes a bar or a line set in advance in an arena.
36. The image processing apparatus according to any one of claims 1 to 35, wherein the control section controls at least one of exposure adjustment and focus adjustment for the specific object based on the position information of the specific object acquired by the acquisition section.
37. The image processing apparatus according to any one of claims 1 to 36, wherein the server recognizes a signal from a position sensor worn by the specific object, and generates position information of the specific object based on a recognition result.
38. The image processing apparatus according to any one of claims 1 to 37, further comprising:
an estimating section configured to estimate a position of a specific player, which is the specific object in a predetermined game, according to a preset role of the specific player.
39. The image processing apparatus according to claim 38, wherein the estimating section further estimates the position of the specific player with reference to a role of a replacement player.
40. The image processing apparatus according to claim 38 or 39, wherein the estimating section estimates the position of the specific player based on a result of analysis of a game situation.
41. The image processing apparatus according to any one of claims 38 to 40, wherein the estimating section identifies a player change and estimates the position of the specific player.
42. The image processing apparatus according to any one of claims 1 to 41, wherein the control section is capable of selecting a mode for controlling at least one of exposure adjustment and focus adjustment for the specific object based on the positional information of the specific object acquired by the acquisition section, and a mode for controlling at least one of exposure adjustment and focus adjustment for the specific object without being based on the positional information of the specific object acquired by the acquisition section.
43. The image processing apparatus according to any one of claims 1 to 42, wherein the control portion is capable of selecting a mode for displaying the additional information in a case where the position of the specific object is outside a display screen of the display portion, and a mode for not displaying the additional information.
44. The image processing apparatus according to any one of claims 1 to 43, wherein the specific object is a specific player in a predetermined race, and wherein the control portion is capable of selecting a mode for displaying the additional information and a mode for not displaying the additional information based on an analysis result of a race situation.
45. The image processing apparatus according to claim 13, wherein the control section is capable of selecting whether or not to operate the tracking section when the position of the specific object is outside the screen of the display section.
46. The image processing apparatus according to claim 13, wherein the specific object is a specific player in a predetermined race, and wherein the control section is capable of selecting whether to operate the tracking section based on a race situation.
47. An image processing method, comprising:
a display step of displaying an image;
a selection step of selecting a specific object from the image displayed in the display step;
a specifying information generating step of generating specifying information of the specific object selected in the selecting step;
a transmission step of transmitting the designation information generated in the designation information generation step to a server;
an acquisition step of acquiring, from the server, position information of the specific object generated by the server based on the specification information; and
a control step of controlling at least one of exposure adjustment and focus adjustment for the specific object based on the position information of the specific object acquired in the acquisition step.
48. An image processing server, comprising:
a receiving section configured to receive designation information of a specific object transmitted from the image processing apparatus;
a generating section configured to search for the specific object from a video based on the specification information received by the receiving section to generate data on a position of the specific object, and analyze the video and recognize movement of the specific object to generate a movement recognition result; and
a transmission section configured to transmit the position information of the specific object generated by the generation section and the information on the movement recognition result to the image processing apparatus.
49. An image processing server, comprising:
a receiving section configured to receive designation information of a specific object transmitted from the image processing apparatus;
a generating section configured to search for the specific object from a video based on the designation information received by the receiving section to generate data on a position of the specific object; and
a transmission section configured to transmit the data on the position of the specific object generated by the generation section to the image processing apparatus.
50. A computer program that causes a computer to function as each section of the image processing apparatus according to any one of claims 1 to 46.
51. A computer-readable storage medium storing a computer program according to claim 50.
52. A computer program for causing a computer to function as each section of the image processing server according to claim 48 or 49.
53. A computer-readable storage medium storing a computer program according to claim 52.
CN201980088091.XA 2018-11-07 2019-10-17 Image processing apparatus, image processing server, image processing method, computer program, and storage medium Pending CN113273171A (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
JP2018209469A JP7289630B2 (en) 2018-11-07 2018-11-07 Image processing device
JP2018-209494 2018-11-07
JP2018209494A JP7233887B2 (en) 2018-11-07 2018-11-07 Image processing device
JP2018209480A JP7233886B2 (en) 2018-11-07 2018-11-07 Image processing device
JP2018-209469 2018-11-07
JP2018-209480 2018-11-07
PCT/JP2019/040874 WO2020095647A1 (en) 2018-11-07 2019-10-17 Image processing device, image processing server, image processing method, computer program, and storage medium

Publications (1)

Publication Number Publication Date
CN113273171A true CN113273171A (en) 2021-08-17

Family

ID=70612398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980088091.XA Pending CN113273171A (en) 2018-11-07 2019-10-17 Image processing apparatus, image processing server, image processing method, computer program, and storage medium

Country Status (3)

Country Link
US (1) US20210258496A1 (en)
CN (1) CN113273171A (en)
WO (1) WO2020095647A1 (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4750158B2 (en) * 2001-09-28 2011-08-17 富士フイルム株式会社 Shooting support device
JP2008227877A (en) * 2007-03-13 2008-09-25 Hitachi Ltd Video information processor
JP2010198104A (en) * 2009-02-23 2010-09-09 Nec Corp Image display system, portable terminal system, portable terminal equipment, server, and image display method thereof
GB2489454A (en) * 2011-03-29 2012-10-03 Sony Corp A method of annotating objects in a displayed image
JP2013168854A (en) * 2012-02-16 2013-08-29 Nikon Corp Imaging device, server device, and management system
US10091411B2 (en) * 2014-06-17 2018-10-02 Lg Electronics Inc. Mobile terminal and controlling method thereof for continuously tracking object included in video
EP3176756A4 (en) * 2014-07-28 2017-08-09 Panasonic Intellectual Property Management Co., Ltd. Augmented reality display system, terminal device and augmented reality display method
US10536622B2 (en) * 2018-05-15 2020-01-14 Sony Corporation Camera depth prediction using generative adversarial network

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050018045A1 (en) * 2003-03-14 2005-01-27 Thomas Graham Alexander Video processing
CN101686352A (en) * 2008-09-22 2010-03-31 索尼株式会社 Display control apparatus, display control method, and program
US20150382076A1 (en) * 2012-07-02 2015-12-31 Infomotion Sports Technologies, Inc. Computer-implemented capture of live sporting event data
CN103731600A (en) * 2012-10-12 2014-04-16 索尼公司 Image processing device, image processing system, image processing method, and program
JP2015046756A (en) * 2013-08-28 2015-03-12 株式会社ニコン System, server, electronic apparatus, and program
US20160182814A1 (en) * 2014-12-19 2016-06-23 Microsoft Technology Licensing, Llc Automatic camera adjustment to follow a target
CN106878044A (en) * 2015-09-28 2017-06-20 佳能株式会社 Remote support system, message processing device, image processing equipment and control method
WO2017134706A1 (en) * 2016-02-03 2017-08-10 パナソニックIpマネジメント株式会社 Video display method and video display device
US20180350084A1 (en) * 2017-06-05 2018-12-06 Track160, Ltd. Techniques for object tracking

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210166497A1 (en) * 2019-12-01 2021-06-03 Active Track, Llc Artificial intelligence-based timing, imaging, and tracking system for the participatory athletic event market
US11501582B2 (en) * 2019-12-01 2022-11-15 Active Track, Llc Artificial intelligence-based timing, imaging, and tracking system for the participatory athletic event market

Also Published As

Publication number Publication date
US20210258496A1 (en) 2021-08-19
WO2020095647A1 (en) 2020-05-14

Similar Documents

Publication Publication Date Title
US20200221014A1 (en) Image pickup device and method of tracking subject thereof
WO2018030206A1 (en) Camerawork generating method and video processing device
JP6460105B2 (en) Imaging method, imaging system, and terminal device
JP7132730B2 (en) Information processing device and information processing method
US10110850B1 (en) Systems and methods for directing content generation using a first-person point-of-view device
JP4835898B2 (en) Video display method and video display device
KR20160031992A (en) Method for providing real-time video and device thereof as well as server and terminal device
US9615015B2 (en) Systems methods for camera control using historical or predicted event data
JP2005100367A (en) Image generating apparatus, image generating method and image generating program
JP2020086983A (en) Image processing device, image processing method, and program
JP4121974B2 (en) Image capturing system and image capturing method
JP2008005208A (en) Camera automatic control system for athletics, camera automatic control method, camera automatic control unit, and program
US20210258505A1 (en) Image processing apparatus, image processing method, and storage medium
WO2021124750A1 (en) Information processing device, information processing method, and program
CN113273171A (en) Image processing apparatus, image processing server, image processing method, computer program, and storage medium
JP4121973B2 (en) Scene extraction system and scene extraction method
JP7282519B2 (en) Image processing device or image processing server
JP7289630B2 (en) Image processing device
WO2021200184A1 (en) Information processing device, information processing method, and program
JP7233886B2 (en) Image processing device
JP7235098B2 (en) Information distribution device, information distribution method, information distribution program
JP2006174124A (en) Video distributing and reproducing system, video distribution device, and video reproduction device
JP2015217122A (en) Image notification device, imaging device, image notification method, and image notification program
JP7233887B2 (en) Image processing device
CN111586281B (en) Scene processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination