WO2004093015A1 - Image Recognition Device and Image Recognition Program - Google Patents
- Publication number
- WO2004093015A1 (PCT/JP2003/004672)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- time
- unit
- concealment
- tool
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30221—Sports video; Sports image
Definitions
- The present invention relates to an image recognition apparatus capable of appropriately recognizing image content, which has conventionally been difficult to recognize, in sports content such as a broadcast sports program. Background Art
- A method for recognizing image content such as "successful passing" or "successful smashing" is needed. For example, the image content can be recognized by manually inputting which section of the video information corresponds to "successful passing" or "successful smashing", together with the positions of the ball, the players, and the court lines.
- Alternatively, a method of recognizing the image content by comprehensively judging the temporal change of these spatial relative relationships can be considered.
- the present invention employs the following means.
- The present invention is an image recognition device that recognizes the movements of players, in a sport played between areas partitioned by an obstacle such as a net, from content such as a program being broadcast for the sport or material video recorded on a recording medium such as a VTR before broadcast.
- The device comprises: a video information acquisition unit that acquires, from the content, video information showing the action of at least one player during play; and a concealment state determination unit that determines whether a tool in use, such as a ball, which moves between the areas and by which the score of the sport is counted, is concealed by a predetermined target object in the video information acquired by the video information acquisition unit.
- It further comprises a hitting time information specifying unit that specifies the hitting time at which the ball was hit, based on the concealment start time, when the ball is determined to have changed from a state of not being concealed by the object to a state of being concealed, on the concealment release time, when the ball is determined to have changed from the concealed state back to the unconcealed state, and on rule information for playing the sport.
- With this configuration, the hitting time information specifying unit specifies the time at which the ball was hit from the concealment start time and the concealment release time determined by the concealment state determination unit, and, based on the specified hitting time, the video information showing the players' actions during play, and the rule information, the image content recognition unit reliably identifies the players' actions.
- The players' actions are therefore identified so that, for example, misrecognition of a forehand swing, backhand swing, or overhead swing due to overlap or occlusion does not occur.
- An image recognition device excellent in image recognition can thus be provided.
- The concealment state determination unit preferably includes a distance determination unit that determines whether or not the ball is within a predetermined distance of the target object,
- and a concealment start/release time specifying unit that specifies, as the concealment start time, the time at which the distance determination unit determines that the ball is within the predetermined distance of the object and the ball changes from a state of not being concealed by the object to a state of being concealed, and that specifies, as the concealment release time, the time at which the distance determination unit determines that the ball is within the predetermined distance of the object and the ball changes from the state of being concealed by the object back to a state of not being concealed.
- The video information acquisition unit preferably includes a domain element extraction unit that extracts, from the video information, used-facility information indicating an obstacle such as the net and boundary lines marking the areas and their outer boundaries, player position information indicating the positions of the players, and used-tool information indicating the ball that moves between the areas and by which the score of the sport is counted.
- The player position information is preferably position information indicating an area that includes the player and the equipment that the player always uses during play.
- The domain element extraction unit may extract the used-tool information from the video information based on the used-facility information and the player position information that it has extracted.
- The used-facility information, the player position information, the used-tool information, and the rule information may be anything that is based on knowledge of the sporting event subjected to image extraction.
- The device preferably further includes an acoustic information acquisition unit that acquires, from the content, acoustic information synchronized with the video information, and the hitting time information specifying unit preferably specifies the hitting time based on the combination of the concealment start time and the concealment release time and on the acoustic information acquired by the acoustic information acquisition unit.
- For example, the hitting time information specifying unit may specify, as the hitting time, the time at which the acoustic information indicates a value greater than a predetermined level.
- The acoustic information acquisition unit is preferably provided with a filter unit that passes a predetermined frequency band, and the acoustic information is preferably sound that has passed through this filter unit; in particular, environmental sounds such as the squeak of the players' shoes against the court during play, wind noise, and other noise are suitably removed.
- The filter section may be constituted by a band-pass filter.
- The hitting time information specifying unit preferably specifies the hitting time based on hitting sound candidate data of a predetermined length extracted from the acoustic information so as to include the hitting sound.
- A plurality of hitting sound candidate data may be extracted from the acoustic information such that the candidate data of one time and the candidate data of the next time overlap each other in time, and the hitting time information specifying unit may specify the hitting time based on the plurality of hitting sound candidate data.
- The plurality of hitting sound candidate data may be configured to have the same data length and be extracted from the acoustic information at regular time intervals. With this configuration, the hitting sound can be extracted efficiently.
- The device preferably further includes a hitting sound pattern information storage unit that stores hitting sound pattern information obtained by patterning the hitting sound, and the hitting time information specifying unit preferably specifies the hitting time based on the hitting sound pattern information stored in the hitting sound pattern information storage unit and on the acoustic information.
- Another aspect of the present invention is an image recognition device that recognizes the actions of players, in a sport played between areas partitioned by an obstacle such as a net, from content such as a program being broadcast for the sport or material video recorded on a recording medium such as a VTR before broadcast.
- The device comprises: a video information acquisition unit that acquires, from the content, video information showing the action of at least one player during play; and a concealment state determination unit that determines whether the ball, which moves between the areas included in the video information acquired by the video information acquisition unit and by which the score of the sport is counted, is concealed by a predetermined target object.
- It further comprises a hitting time information specifying unit that specifies the hitting time of the ball based on the concealment start time, when the ball is determined to have changed from a state of not being concealed by the object to a state of being concealed, and the concealment release time, when the ball is determined to have changed from the concealed state back to the unconcealed state,
- and an image content recognition unit that recognizes the image content, including the actions of the players shown in the video information, based on the video information acquired by the video information acquisition unit and the position of the ball at the hitting time specified by the hitting time information specifying unit.
- The device may further include an acoustic information acquisition unit that acquires, from the content, acoustic information synchronized with the video information and containing the hitting sound generated when the ball is hit, and the hitting time information specifying unit may specify the hitting time based on the set of the concealment start time and the concealment release time and on the acoustic information acquired by the acoustic information acquisition unit.
- FIG. 1 is a device configuration diagram of an image recognition device according to an embodiment of the present invention.
- FIG. 2 is a functional block diagram according to the first embodiment.
- FIG. 3 is a diagram showing a court model used for extracting court lines from video information in the embodiment.
- FIG. 4 is a diagram showing a net model used for extracting net lines from video information in the embodiment.
- FIG. 5 is a diagram showing the court lines and net lines extracted from video information in the embodiment.
- FIG. 6 is a diagram showing detection of a player area in the embodiment.
- FIG. 7 is a diagram showing detection of a ball area in the embodiment.
- FIG. 8 is a diagram showing tracking of the ball position in the embodiment.
- FIG. 9 is a diagram showing a storage mode of a rule information storage unit in the embodiment.
- FIG. 10 is a diagram showing a mode of identifying the action of the player in the embodiment.
- FIG. 11 is a flowchart showing a process of performing image recognition from video information in the embodiment.
- FIG. 12 is a diagram showing the relationship between the coefficient a and the F-measure of the overall detection accuracy.
- FIG. 13 is a functional block diagram of an image recognition device according to another embodiment of the present invention.
- FIG. 14 is a flowchart showing a process of performing image recognition from video information in the embodiment.
- FIG. 15 is a functional block diagram of an image recognition device according to another embodiment of the present invention. BEST MODE FOR CARRYING OUT THE INVENTION
- FIG. 1 is a device configuration diagram showing a device configuration of an image recognition device according to the present embodiment.
- FIG. 2 is a functional block diagram in the embodiment.
- The image recognition device recognizes the characteristic actions of players during a game from sports-related content in a broadcast program or recorded on a recording medium and displayed using a recording/reproducing device such as a television receiver or a VTR.
- Its main components include a CPU 14 that operates so that the device functions as the image recognition device 1, and a user interface 15, such as a keyboard and a mouse, that receives input from the user.
- Here, the content refers to video in which the movements of the players and the court are captured from diagonally above.
- The video includes shots taken from such angles, shots of the referees and the audience, and audio from commentators.
- a tennis program will be described as an example of “contents”.
- Through the operation of the CPU 14 and the like, the device functions as a domain element extraction unit 101, a rule information storage unit 102, a concealment state determination unit 201, a hitting time information specifying unit 105, an image content recognition unit 106, and so on, as shown in FIG. 2.
- The domain element extraction unit 101 extracts, from the video information displayed on the television receiver, used-facility information such as an obstacle like the net, the court as the partitioned area, and the court lines as boundary lines indicating the court boundary; player position information indicating the positions of the players; and the ball, which moves between the courts and by which the score of the sport is counted. It is configured as part of the function of a video information acquisition unit that acquires, from the content, video information showing the action of at least one player during play.
- In the present embodiment, the used-facility information to be extracted consists of the court lines and the net lines,
- the extracted player position information is the position information of player 1 and player 2, who compete against each other,
- and the used-tool information to be extracted is the tennis ball (hereinafter referred to as the "ball").
- The used-facility information, player position information, and used-tool information extracted by the domain element extraction unit 101 are hereinafter collectively referred to as domain elements.
- The used-facility information is extracted by referring to a court model that defines court feature points Pc1, …, Pc14 (hereinafter collectively referred to as "Pc") and court lines Lc1, …, Lc9 (hereinafter collectively referred to as "Lc").
- Likewise, the net lines are extracted from the video information by referring to a net model that defines net feature points Pn1, …, Pn3 (hereinafter collectively referred to as "Pn"), indicating representative points of the net, and net lines Ln1 and Ln2 (hereinafter collectively referred to as "Ln").
- As a method of giving the initial feature points Pc(0) as input, either a method in which an operator inputs them using the user interface 15 or a method in which the device 1 automatically detects and inputs the initial feature points Pc(0) may be used.
- The binary image B(t) of the original image is ANDed with the neighborhood of the court lines Lc(t−1) to generate a court-line binary image Bc(t) consisting only of the neighborhood of the court. This image is then Hough-transformed line by line, peak detection is performed within the range limited by each detection window Wc(t−1), and the court feature points Pc(t) are updated; the court lines Lc(t) are obtained by a further Hough transform and the detection windows Wc(t) are updated. In this way the court lines are extracted from the video information.
- For the net lines, initial lines Ln(0) and detection windows Wn(0) are likewise prepared.
- An image Bn(t) = B(t) − Bc(t), obtained by removing the court-line binary image from the binary image of the original image, is generated as a net-line binary image; Hough transform and peak detection within the detection windows are performed on it, the feature points Pn(t) are updated, and the net lines are thereby extracted from the video information.
- In this way, court lines and net lines can be extracted as shown in FIG. 5.
- The player position information is extracted by specifying the region of maximum overlap between two binarized frame differences in the binary image from which the court lines and net lines have been removed from the video information.
- Specifically, B1(t) = BIN(I(t) − I(t−s)) and B2(t) = BIN(I(t+s) − I(t)) are used,
- where BIN is a function indicating that the argument in parentheses is binarized.
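The frame-differencing step just described can be sketched as follows; the threshold value and function name are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def player_mask(prev, cur, nxt, thresh=10):
    """Sketch of the player-region step: binarize two frame
    differences, B1 = BIN(I(t) - I(t-s)) and B2 = BIN(I(t+s) - I(t)),
    and keep only pixels that moved in both, which suppresses the
    static background and leaves the moving player region."""
    b1 = np.abs(cur.astype(int) - prev.astype(int)) > thresh
    b2 = np.abs(nxt.astype(int) - cur.astype(int)) > thresh
    return b1 & b2
```

The player area p of the embodiment would then be the bounding box of the largest connected region in this mask.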
- The ball is extracted by switching between a detection mode and a tracking mode according to the distance from the player position information extracted in this way.
- In the detection mode, the set of the single ball candidate Ba remaining in the area can be identified as the ball trajectory BW of that time section.
- The template Tb(x, y) is a tool provided to extract the ball from the video information; in this embodiment it is set as a template of size bx × by, slightly enlarged outward from the size of the ball as it appears, expanded, in the video.
- In the tracking mode, the ball trajectory BW is tracked by template matching with this template Tb(x, y).
- Since the ball trajectory BW can be regarded as almost a straight line over a short time, the search is performed using, as the predicted center in the current frame, the position obtained by adding the previously detected amount of movement to the previous position.
- If tracking fails, the detection mode is executed; otherwise, the tracking mode is repeated.
- In this way, the ball trajectory BW in an arbitrary time interval can be obtained.
- The ball trajectory BW may be displayed superimposed on the video information at an arbitrary time so that it can be grasped easily.
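The tracking mode described above can be sketched as a small sum-of-squared-differences search around the predicted center. The search radius and the SSD score are illustrative choices; the patent does not fix a particular matching criterion.

```python
import numpy as np

def track_ball(frame, template, predicted_center, search=8):
    """Sketch of the tracking mode: search for the ball template
    Tb around the predicted center (previous position plus the
    previous motion, since the trajectory is nearly straight over
    a short time) and return the best-matching (y, x) position."""
    th, tw = template.shape
    py, px = predicted_center
    h, w = frame.shape
    best, best_pos = np.inf, predicted_center
    for y in range(max(0, py - search), min(h - th, py + search) + 1):
        for x in range(max(0, px - search), min(w - tw, px + search) + 1):
            patch = frame[y:y + th, x:x + tw]
            score = np.sum((patch.astype(int) - template.astype(int)) ** 2)
            if score < best:
                best, best_pos = score, (y, x)
    return best_pos
```

When the best score exceeds some failure threshold, the caller would fall back to the detection mode, as the text describes.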
- The rule information storage unit 102 stores the rule information necessary for playing the sport and is formed in a predetermined area of the external storage device 12 or the internal memory 13. More specifically, as shown in FIG. 9, for example, it stores the rule information index "service" in association with rule information such as: "Immediately before starting the service, the server shall stand with both feet on the ground behind the baseline, between the center mark and the imaginary extension of the sideline, toss the ball by hand in any direction into the air, and hit it with the racket before the ball falls to the ground. The service is deemed to have been completed at the moment the racket and ball meet."
- Similarly, the rule information index "ball on the court line" is associated with rule information such as: "A ball falling on a court line is deemed to have fallen within the court bounded by that line."
- The concealment state determination unit 201 determines whether or not the ball extracted by the domain element extraction unit 101 is in a state of being concealed by the predetermined target object, here the player area p. In the present embodiment, the concealment state determination unit 201 comprises a distance determination unit 201a that determines whether or not the ball extracted by the domain element extraction unit 101 is within a predetermined distance of the player area p, and a concealment start/release time specifying unit 201b that specifies, as the concealment start time, the time at which the distance determination unit 201a determines that the ball is within the predetermined distance of the player area p and the ball changes from a state of not being concealed by the player area p to a state of being concealed, and that specifies, as the concealment release time, the time at which the ball changes from the state of being concealed by the player area p back to the unconcealed state.
- Suppose the detected ball positions at the times when the ball is determined to be within the predetermined range of the player area p are b(1) to b(7). Then, among the ball positions b(1) to b(7), the concealment start/release time specifying unit 201b specifies the time of the ball position immediately before the ball is concealed by the player area p as the concealment start time t0, and the time of the ball position immediately after the ball reappears from the player area p as the concealment release time t1.
- Here, both the state in which the ball is hidden behind the player area p
- and the state in which the ball overlaps in front of the player area p are referred to as the "concealed state".
- Note that the predetermined target object causing the concealment is not limited to the player area p; it may also be used-facility information such as a net line or a court line.
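The start/release logic above can be sketched as a scan over per-frame ball visibility near the player area. The input format is an illustrative assumption, not the patent's data structure.

```python
def concealment_times(detections):
    """Sketch of the concealment start/release step.
    detections: time-ordered list of (time, ball_visible) pairs for
    frames where the ball is within the predetermined distance of
    the player area p. Returns (t0, t1): the concealment start time
    (last visible frame before the gap) and the concealment release
    time (first visible frame after the gap), or None when absent."""
    t0 = t1 = None
    for i in range(1, len(detections)):
        t_prev, vis_prev = detections[i - 1]
        t_cur, vis_cur = detections[i]
        if vis_prev and not vis_cur and t0 is None:
            t0 = t_prev          # ball just disappeared into p
        if t0 is not None and not vis_prev and vis_cur:
            t1 = t_cur           # ball just reappeared from p
            break
    return t0, t1
```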
- The hitting time information specifying unit 105 specifies the hitting time ta based on the concealment start time t0 and the concealment release time t1 specified by the concealment start/release time specifying unit 201b.
- Specifically, the concealment start time t0 and the concealment release time t1 specified by the concealment start/release time specifying unit 201b are substituted into the following equation to obtain the hitting time ta: ta = a × t0 + (1 − a) × t1 … (Equation 1)
- The value of the coefficient a is not limited to a single setting; for example, different values may be set for the player on the near side and the player on the far side.
- The value of the hitting time ta obtained in this way may be, for example, approximated to an integer by an appropriate method or rounded to a number within the range of significant figures.
- The number of significant digits can be set appropriately according to the embodiment.
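Equation 1 and the rounding step can be written directly; the default a = 0.5 here is only an illustrative value, since the text notes that a may differ for the near-side and far-side players.

```python
def hit_time(t0, t1, a=0.5):
    """Equation 1 of the embodiment: estimate the hitting time from
    the concealment start time t0 and release time t1 as
    ta = a*t0 + (1 - a)*t1, then approximate to an integer frame."""
    return round(a * t0 + (1 - a) * t1)
```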
- The image content recognition unit 106 recognizes the image content, including the actions of the players shown in the video information, based on the court lines and net lines, the player position information, and the ball position extracted by the domain element extraction unit 101, on the ball position at the hitting time ta specified by the hitting time information specifying unit 105, and on the rule information stored in the rule information storage unit 102.
- More specifically, as shown in FIG. 10, it obtains the ball position Pi(ta) at the hitting time ta specified by the hitting time information specifying unit 105, and compares this ball position Pi(ta) with the position of the player.
- For example, if the ball at the hitting time ta is above an identification line set at the top of the circumscribed rectangle surrounding the player, the action is judged to be "overhead_swing"; otherwise, depending on whether the ball is on the forehand side or the backhand side, the player's action is judged to be "forehand_swing" or "backhand_swing", respectively.
- The identification line is set at the upper part of the player area, at a position determined at a fixed ratio according to the vertical length of the player's circumscribed rectangle.
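The action-identification rule above can be sketched as follows. The identification-line ratio and the handedness flag are illustrative assumptions; the patent only states that the line sits at a fixed ratio of the bounding-box height.

```python
def classify_swing(ball_x, ball_y, player_box,
                   line_ratio=0.2, right_handed=True):
    """Sketch of the action rule: if the ball at the hitting time ta
    is above the identification line near the top of the player's
    circumscribed rectangle, classify as an overhead swing;
    otherwise forehand or backhand depending on which side of the
    player the ball is on."""
    x0, y0, x1, y1 = player_box          # top-left and bottom-right
    ident_line = y0 + line_ratio * (y1 - y0)
    if ball_y < ident_line:              # image y grows downward
        return "overhead_swing"
    cx = (x0 + x1) / 2
    on_right = ball_x > cx
    return "forehand_swing" if on_right == right_handed else "backhand_swing"
```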
- In operation, court lines and net lines are first extracted from the video information showing the movements of the players during play (step S101),
- and the player position information is extracted using the binary image from which these court lines and net lines have been removed (step S102).
- Next, the ball is extracted from the video information based on the extracted player position information (step S103).
- Then, the hitting time information specifying unit 105 specifies the hitting time ta based on the concealment start time t0 and the concealment release time t1 determined by the concealment start/release time specifying unit 201b (step S105).
- Thus, even in cases where image recognition would otherwise fail, such as when the ball overlaps or is concealed by a player, the player's action is recognized as "forehand_swing", indicating a forehand swing motion, "backhand_swing", indicating a backhand swing motion, or "overhead_swing", indicating an overhead swing motion (step S106).
- As described above, the hitting time information specifying unit 105 specifies the time at which the ball was hit based on the concealment start time, when the concealment state determination unit 201 determines that the ball has changed from a state of not being concealed by the object to a state of being concealed, and the concealment release time, when the concealment state determination unit 201 determines that the ball has changed from the concealed state back to the unconcealed state. Further, based on the specified hitting time, the video information showing the players' actions during play, and the rule information for playing the sport, the image content
- recognition unit 106 reliably identifies the players' actions. Therefore, an image recognition device that can, for example, avoid misidentifying a forehand swing, backhand swing, or overhead swing due to overlap or concealment
- can be provided relatively inexpensively. It goes without saying that image recognition is also suitably performed when the ball and the player are not overlapping or concealed.
- In the present embodiment, the content is a tennis program and the used-facility information among the domain elements extracted from the video information consists of the court lines and the net lines, but the invention is not limited to these.
- Likewise, the medium carrying the content subjected to image recognition is not limited to a recording/reproducing device such as a television or a VTR.
- In the present embodiment, the image content including the players' motions shown in the video information is recognized as one of three types of motion: "forehand_swing", indicating a forehand swing motion, "backhand_swing", indicating a backhand swing motion, and "overhead_swing", indicating an overhead swing motion. It is also possible to recognize "stay", which indicates that the player stays in place, and "move", which indicates the player's movement.
- If the rule information stored in the rule information storage unit 102 is defined and stored in a more elaborate manner covering various actions of the players, the image content recognition unit 106 can also recognize more complex player movements.
- Although the ball is extracted from the video information using the template Tb(x, y) of size bx × by containing the ball, the ball may instead be extracted in advance without using this template.
- In another embodiment, the image recognition device recognizes the characteristic actions of players during a game from sports-related content in a broadcast program or recorded on a recording medium and displayed using a recording/reproducing device such as a television receiver or a VTR. Since the device configuration of this image recognition apparatus is the same as that of the first embodiment, its description is omitted.
- the image recognition device 1 will be described in terms of function.
- The device comprises the domain element extraction unit 101, the rule information storage unit 102, the concealment state determination unit 201 including the distance determination unit 201a and the concealment start/release time specifying unit 201b, an acoustic information acquisition unit 103, a hitting sound pattern information storage unit 104, the hitting time information specifying unit 105, the image content recognition unit 106, and so on.
- the domain element extraction unit 101, the rule information storage unit 102, and the concealment state determination unit 201 are the same as those in the first embodiment, and thus description thereof is omitted.
- The acoustic information acquisition unit 103 acquires, from the content, acoustic information including the hitting sound generated when the ball is hit.
- In the present embodiment, the acoustic information is sampled at 16-bit resolution with a sampling rate of 44.1 kHz.
- A filter unit (not shown) is provided in the acoustic information acquisition unit 103 so that acoustic information other than the hitting sound, such as the squeak produced when a player's shoes rub against the court during play, wind noise,
- and other noise, is filtered out, and only the hitting sound can be suitably extracted.
- This filter section is a band-pass filter that passes a predetermined frequency band, implemented with digital circuits such as an FIR filter or an IIR filter.
- In the present embodiment, it is set to pass signal components in the frequency band of 100 Hz to 150 Hz.
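As a sketch of such a filter section, a band-pass can be built digitally as the difference of two windowed-sinc low-pass FIR kernels (one FIR realization among those the text mentions). The band edges and tap count below are illustrative, not the embodiment's values; a production system would more likely use a library routine such as scipy.signal.firwin.

```python
import numpy as np

def bandpass_fir(lo_hz, hi_hz, fs, n_taps=101):
    """Windowed-sinc FIR band-pass filter: the difference of two
    Hamming-windowed low-pass kernels, keeping components between
    lo_hz and hi_hz at sampling rate fs."""
    n = np.arange(n_taps) - (n_taps - 1) / 2

    def sinc_lowpass(fc):
        h = np.sinc(2.0 * fc / fs * n) * np.hamming(n_taps)
        return h / h.sum()  # normalize to unity gain at DC

    return sinc_lowpass(hi_hz) - sinc_lowpass(lo_hz)
```

Convolving the 44.1 kHz acoustic signal with these taps would suppress the out-of-band environmental sounds before hitting-sound detection.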
- The hitting sound pattern information storage unit 104 stores hitting sounds classified into stroke types, such as the hitting sound of a smash or of a forehand stroke, reflecting how the sound changes with the contact conditions between the ball and the racket; each pattern is stored in a predetermined area of the external storage device 12 or the internal memory 13 in association with its amplitude values. In addition, sounds other than the sound of the ball hitting the racket, such as the sound of the ball bouncing off the court, may also be stored as patterns.
- The hitting time information specifying unit 105 specifies the hitting time ta based on the concealment start time t0 and the concealment release time t1 specified by the concealment start/release time specifying unit 201b (method M1), and based on the hitting sound pattern information stored in the hitting sound pattern information storage unit 104 and the acoustic information acquired by the acoustic information acquisition unit 103 (method M2). More specifically, the time when the ball approaches within a certain distance of the player area p is defined as td0, and the time when the ball moves away from the player beyond that distance as td1.
- Within this interval, the hitting time is first sought using the acoustic information of method M2, and if a hitting time is detected, that value is adopted as the hitting time ta.
- If it is not detected, the hitting time ta is specified by method M1 as ta = approx(a × t0 + (1 − a) × t1),
- where approx(x) represents a function that approximates x in a suitable way.
- A cause of such "missed detection" is that the acoustic information needed to specify the hitting time cannot be obtained well because of microphone installation conditions, mixing conditions in broadcasting, or the conditions of the data transmission path. Furthermore, if the hitting time obtained by method M2 and the hitting time obtained by method M1 agree, that time may be specified as the hitting time, which can significantly improve the accuracy of the specified hitting time.
- Next, the method M2 will be described in detail.
- the striking time information specifying unit 105 stores the acoustic information acquired by the acoustic information acquiring unit 103 in units of 208 points ( ⁇ 0.046 seconds) and 1 2 FFT processing is performed while sequentially shifting the start time at points (0.0209 seconds), and the frequency characteristic pattern of the acoustic information converted into the frequency domain at each time is stored in the hitting sound pattern information storage unit 1. It is set to match with the batting sound pattern information stored in 04. Their to, the results of these checking, if match over the frequency response pattern and the impact sound pattern information of the audio information to identify the matching time and blow time t a of this pole, the specific The operation is performed so as to output the hitting time t a to the image content recognition unit 106.
- The agreement between the frequency characteristic pattern of the acoustic information and the striking sound pattern information is determined using a correlation function; when the correlation function indicates a value larger than a preset threshold value, the patterns are regarded as matching.
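The threshold-based correlation check can be illustrated with a short sketch. The patent does not specify the correlation function or the threshold value; normalized cross-correlation and the 0.8 cutoff used here are assumptions.

```python
import math

def correlation(p, q):
    """Normalized cross-correlation of two equal-length frequency patterns.

    Returns a value in [-1, 1]; 1 means the spectra have identical shape
    regardless of overall gain or offset.
    """
    mp = sum(p) / len(p)
    mq = sum(q) / len(q)
    num = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) *
                    sum((b - mq) ** 2 for b in q))
    return num / den if den else 0.0

def is_strike(frame_spectrum, strike_pattern, threshold=0.8):
    """Regard the frame as containing a strike when the correlation
    exceeds a preset threshold (0.8 is an assumed value)."""
    return correlation(frame_spectrum, strike_pattern) > threshold
```

Because the correlation is normalized, a striking sound recorded louder or quieter than the stored pattern still matches, which is the practical reason for preferring correlation over a direct spectral distance here.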
- The operation of the image content recognizing unit 106 is the same as in the first embodiment, and its description is therefore omitted.
- First, a court line and a net line are respectively extracted from the video information showing the movement of the player during play (step S201), and the player position information is extracted using a binary image from which these court lines and net lines have been removed (step S202).
- Next, a ball is extracted from the video information based on the extracted player position information (step S203). If the ball is within a predetermined range of the player area p (step S204), the filter unit filters, from the content, the acoustic information including the striking sound generated when the ball is hit (step S205), and FFT processing is performed on the filtered acoustic information while the start time is sequentially shifted at predetermined intervals (step S206).
- The frequency characteristic pattern of the striking sound candidate data obtained at each time by conversion into the frequency domain through the FFT processing is compared with the striking sound pattern stored in the striking sound pattern information storage unit 104 (step S207). If they match (step S208), the matching time is identified as the striking time ta of the ball (step S209); if they do not match (step S208), the frequency characteristic pattern of the striking sound candidate data at the next time is compared with the striking sound pattern (step S207).
- If they fail to match a predetermined number of times (step S210), the striking time information specifying unit 105 identifies the striking time ta based on the concealment start time t0 and the concealment release time t1 obtained by the concealment start/release time specifying unit 201b (step S211).
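The retry loop of steps S206 through S211 can be sketched as follows; the function and parameter names are assumptions introduced for illustration.

```python
def find_strike_time(frames, matches_pattern, t0, t1, max_tries, a=0.5):
    """Walk the sequence of FFT frames (steps S206-S210).

    frames          -- iterable of (time, frequency pattern) per FFT hop
    matches_pattern -- predicate implementing the step S207/S208 comparison
    t0, t1          -- concealment start/release times used as the
                       step S211 fallback when no frame matches in time
    """
    for tries, (time, spectrum) in enumerate(frames, start=1):
        if matches_pattern(spectrum):      # step S208: pattern matched
            return time                    # step S209: this is ta
        if tries >= max_tries:             # step S210: give up on audio
            break
    return a * t0 + (1 - a) * t1           # step S211: concealment fallback
```

The retry limit bounds how long the unit keeps listening for a striking sound before the concealment-based estimate takes over.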
- Based on the player position and the rule information at the striking time ta thus specified, the player's action is recognized as "forehand-swing", representing a forehand swing, "backhand-swing", representing a backhand swing, or "overhead-swing", representing an overhead swing (step S212), even when, as shown in Fig. 10, the ball overlaps or is concealed by the player, a situation that would otherwise cause image recognition to fail.
- As described above, when it is difficult to specify the position of the tool because the tool used in the video overlaps or is concealed by the players or by obstacles such as the net, or when it is difficult to recognize the image using acoustic information, the concealment state determination unit 201 determines whether the tool used is concealed by the target body. Based on the concealment start time, at which the concealment state determination unit 201 determines that the state has changed from the non-concealed state to the concealed state, and the concealment release time, at which the concealment state determination unit 201 determines that the state has changed from the concealed state to the non-concealed state, the striking time information specifying unit 105 specifies the striking time at which the tool was hit. Further, based on the specified striking time, the video information reflecting the player's action during play, and the rule information, the image content recognition unit 106 can reliably identify the player's action. An image recognition device can therefore be provided relatively inexpensively that avoids recognition errors which could not be avoided with video information alone, such as misidentification of the forehand swing, backhand swing, and overhead swing caused by the overlap or concealment. Needless to say, image recognition can also be suitably performed when the ball and the player neither overlap nor are concealed. In addition, if the striking time is specified using both the striking time obtained from the acoustic information including the striking sound acquired by the acoustic information acquisition unit 103 (method M2) and the striking time obtained by method M1, a still more accurate image recognition device can be provided.
- Furthermore, since noise can be appropriately filtered out by the filter unit, robust image recognition with a high recognition rate can be achieved. Since a plurality of striking sound candidate data are obtained from the acoustic information and the striking time is specified based on them, the exact striking time can be specified. In this case, since the plurality of striking sound candidate data are set so that each overlaps in time with the preceding and following candidate data, the problem of failing to specify the striking time can be prevented.
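The overlap between successive candidate windows follows directly from the hop (1288 points) being smaller than the analysis window (2048 points); a minimal sketch, with the helper name assumed:

```python
def candidate_windows(n_samples, window=2048, hop=1288):
    """Start/end sample indices of successive striking-sound candidates.

    Because hop < window, each candidate overlaps the next by
    window - hop samples, so a strike falling near a window boundary
    is still fully contained in at least one candidate.
    """
    return [(s, s + window) for s in range(0, n_samples - window + 1, hop)]

wins = candidate_windows(6 * 1288 + 2048)   # a short stretch of audio
overlap = wins[0][1] - wins[1][0]           # samples shared by neighbours
```

With the embodiment's values the neighbouring candidates share 2048 - 1288 = 760 samples, about 17 ms at 44.1 kHz.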
- In the above embodiment, the content is a tennis program, and the facility information extracted from the video information consists of the court lines and the net line. It goes without saying that the facility information to be extracted would be changed if another sports program were targeted instead; the player position information and equipment information would be changed in the same manner.
- The medium carrying the content on which image recognition, such as recognizing the behavior of a player during a game, is performed is not limited to that of the present embodiment.
- In the above embodiment, the image content including the player's motion indicated by the video information was recognized as one of three types of movements: "forehand-swing", indicating a forehand swing motion, "backhand-swing", indicating a backhand swing motion, and "overhead-swing", indicating an overhead swing motion. However, based on the relationship with the ball position, the player position, and so on, it is also possible to recognize "stay", indicating that the player stays on the spot, and "move", indicating the player's moving action.
- If the rule information stored in the rule information storage unit 102 is defined and stored in a more complex form including various actions of the players, the image content recognition unit 106 can also recognize more complicated player movements.
- In the above embodiment, a given template Tb(x, y) of size bx × by containing the ball was used to extract the ball from the video information; however, the ball may be extracted without using this template.
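Template-based ball extraction can be illustrated by sliding a bx × by template over the frame and picking the best-matching position. The patent does not specify the matching criterion; the sum of absolute differences used here is one common choice and an assumption.

```python
def match_template(frame, template):
    """Slide the template over a 2-D grayscale frame (list of rows) and
    return the top-left (x, y) with the smallest sum of absolute
    differences. Brute force, for illustration only."""
    fh, fw = len(frame), len(frame[0])
    th, tw = len(template), len(template[0])
    best, best_pos = None, None
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            sad = sum(
                abs(frame[y + j][x + i] - template[j][i])
                for j in range(th) for i in range(tw)
            )
            if best is None or sad < best:
                best, best_pos = sad, (x, y)
    return best_pos
```

Restricting the search to the neighbourhood of the extracted player area p, as the embodiment does, would shrink the two outer loops considerably.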
- In the above embodiment, the acoustic information acquisition unit 103 is provided with a filter unit composed of a bandpass filter, but an embodiment using a filter other than a bandpass filter is also conceivable, and the frequency band to be passed is not limited to 100 Hz to 1500 Hz.
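The effect of the 100 Hz to 1500 Hz pass band can be illustrated with a crude DFT-masking filter. A real implementation would use a proper FIR/IIR bandpass design; this O(n²) sketch exists only to show which components survive, and the function name is an assumption.

```python
import cmath

def bandpass(samples, rate, lo=100.0, hi=1500.0):
    """Crude bandpass: forward DFT, zero every bin outside [lo, hi] Hz,
    inverse DFT. O(n^2); for illustration, not for real audio."""
    n = len(samples)
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) * rate / n   # treat conjugate bins alike
        if not (lo <= freq <= hi):
            spectrum[k] = 0
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]
```

Feeding in a mixture of a 50 Hz hum and a 500 Hz tone, only the 500 Hz component survives the filter, which is the behaviour the embodiment relies on to isolate the striking sound from low-frequency crowd and venue noise.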
- In the above embodiment, the acoustic information acquisition unit 103 was set to acquire the acoustic information including the striking sound generated when the ball is struck by sampling the content with a resolution of 16 bits and a sampling rate of 44.1 kHz, but the resolution and sampling rate settings are not limited to these.
- In the above embodiment, the acoustic information acquired by the acoustic information acquisition unit 103 is subjected to FFT processing by the striking time information specifying unit 105 in units of 2048 points (approximately 0.046 seconds), with the start time sequentially shifted by 1288 points (approximately 0.029 seconds). However, the numbers of points used for the FFT processing are not limited to these values and may be set otherwise.
- In the above embodiment, the agreement between the frequency characteristic pattern of the acoustic information and the striking sound pattern is determined using a correlation function, a match being recognized when the correlation function indicates a value larger than a preset threshold value.
- Furthermore, the image recognition device 1 may be configured to recognize the image content including the player's action indicated by the video information based on the video information acquired by the video information acquisition unit and the position of the tool used at the striking time identified by the striking time information identification unit 105. In this case, the system can be realized with a simple configuration and can be applied, for example, to a device in which no rules are set, expanding its versatility.
- According to the present invention described in detail above, the striking time information identifying unit identifies the striking time at which the striking sound occurred based on the acoustic information including the striking sound acquired by the acoustic information acquiring unit, and the image content recognition unit can reliably identify the player's actions based on the identified striking time, the video information showing the player's actions during play, and the rule information. An image recognition device that avoids misidentification of the forehand swing, backhand swing, and overhead swing caused by overlap and concealment, recognition errors that were unavoidable with video information alone, can therefore be provided at relatively low cost.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03717571A EP1617374A4 (en) | 2003-04-11 | 2003-04-11 | PICTURE IDENTIFIER AND PICTURE IDENTIFICATION PROGRAM |
JP2004570857A JP4482690B2 (ja) | 2003-04-11 | 2003-04-11 | 画像認識装置及び画像認識プログラム |
US10/552,143 US7515735B2 (en) | 2003-04-11 | 2003-04-11 | Image recognition system and image recognition program |
PCT/JP2003/004672 WO2004093015A1 (ja) | 2003-04-11 | 2003-04-11 | 画像認識装置及び画像認識プログラム |
AU2003227491A AU2003227491A1 (en) | 2003-04-11 | 2003-04-11 | Image recognizing device and image recognizing program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2003/004672 WO2004093015A1 (ja) | 2003-04-11 | 2003-04-11 | 画像認識装置及び画像認識プログラム |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004093015A1 true WO2004093015A1 (ja) | 2004-10-28 |
Family
ID=33193208
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2003/004672 WO2004093015A1 (ja) | 2003-04-11 | 2003-04-11 | 画像認識装置及び画像認識プログラム |
Country Status (5)
Country | Link |
---|---|
US (1) | US7515735B2 (ja) |
EP (1) | EP1617374A4 (ja) |
JP (1) | JP4482690B2 (ja) |
AU (1) | AU2003227491A1 (ja) |
WO (1) | WO2004093015A1 (ja) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005303566A (ja) * | 2004-04-09 | 2005-10-27 | Tama Tlo Kk | ブロック分割領域における動きベクトルの分布を利用した特定シーン抽出方法及び装置 |
DE102009037316A1 (de) * | 2009-08-14 | 2011-02-17 | Karl Storz Gmbh & Co. Kg | Steuerung und Verfahren zum Betreiben einer Operationsleuchte |
JP5536491B2 (ja) * | 2010-03-01 | 2014-07-02 | ダンロップスポーツ株式会社 | ゴルフスイングの診断方法 |
GB2496429B (en) | 2011-11-11 | 2018-02-21 | Sony Corp | A method and apparatus and program |
JP6148480B2 (ja) | 2012-04-06 | 2017-06-14 | キヤノン株式会社 | 画像処理装置、画像処理方法 |
CN103390174A (zh) * | 2012-05-07 | 2013-11-13 | 深圳泰山在线科技有限公司 | 基于人体姿态识别的体育教学辅助系统和方法 |
US10223580B2 (en) * | 2013-03-26 | 2019-03-05 | Disney Enterprises, Inc. | Methods and systems for action recognition using poselet keyframes |
US9230366B1 (en) * | 2013-12-20 | 2016-01-05 | Google Inc. | Identification of dynamic objects based on depth data |
CN104688237B (zh) * | 2015-02-11 | 2017-06-20 | 深圳泰山体育科技股份有限公司 | 体质检测的测时方法及系统 |
WO2017103674A1 (en) * | 2015-12-17 | 2017-06-22 | Infinity Cube Ltd. | System and method for mobile feedback generation using video processing and object tracking |
KR102565485B1 (ko) * | 2016-01-11 | 2023-08-14 | 한국전자통신연구원 | 도시 거리 검색 서비스 제공 서버 및 방법 |
TWI584228B (zh) * | 2016-05-20 | 2017-05-21 | 銘傳大學 | 場線之擷取重建方法 |
CN107948716A (zh) * | 2017-11-28 | 2018-04-20 | 青岛海信宽带多媒体技术有限公司 | 一种视频播放方法、装置及机顶盒 |
US10719712B2 (en) * | 2018-02-26 | 2020-07-21 | Canon Kabushiki Kaisha | Classify actions in video segments using play state information |
KR101973655B1 (ko) * | 2018-03-05 | 2019-08-26 | 주식회사 디아이블 | 스포츠 코트 자동인식 및 그에 따른 인/아웃 판단 방법 및 장치 |
US20220270367A1 (en) * | 2019-03-24 | 2022-08-25 | Dibl Co., Ltd. | Method and device for automatically recognizing sport court and determining in/out on basis of same |
US11704892B2 (en) * | 2021-09-22 | 2023-07-18 | Proposal Pickleball Inc. | Apparatus and method for image classification |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0820788A2 (en) * | 1996-05-27 | 1998-01-28 | K.K. Asobou's | System and method for confirming and correcting offensive and/or defensive postures in a team ball game |
JPH11339009A (ja) * | 1998-05-26 | 1999-12-10 | Sony Corp | 解析データ生成装置 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3902266B2 (ja) | 1996-05-27 | 2007-04-04 | データスタジアム株式会社 | 団体球技における攻撃、守備体勢を確認し矯正するための 方法と装置 |
EP0905644A3 (en) * | 1997-09-26 | 2004-02-25 | Matsushita Electric Industrial Co., Ltd. | Hand gesture recognizing device |
US6072494A (en) * | 1997-10-15 | 2000-06-06 | Electric Planet, Inc. | Method and apparatus for real-time gesture recognition |
US6141041A (en) * | 1998-06-22 | 2000-10-31 | Lucent Technologies Inc. | Method and apparatus for determination and visualization of player field coverage in a sporting event |
US6816185B2 (en) | 2000-12-29 | 2004-11-09 | Miki Harmath | System and method for judging boundary lines |
US6567536B2 (en) * | 2001-02-16 | 2003-05-20 | Golftec Enterprises Llc | Method and system for physical motion analysis |
-
2003
- 2003-04-11 WO PCT/JP2003/004672 patent/WO2004093015A1/ja active Application Filing
- 2003-04-11 AU AU2003227491A patent/AU2003227491A1/en not_active Abandoned
- 2003-04-11 JP JP2004570857A patent/JP4482690B2/ja not_active Expired - Lifetime
- 2003-04-11 EP EP03717571A patent/EP1617374A4/en not_active Withdrawn
- 2003-04-11 US US10/552,143 patent/US7515735B2/en not_active Expired - Fee Related
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0820788A2 (en) * | 1996-05-27 | 1998-01-28 | K.K. Asobou's | System and method for confirming and correcting offensive and/or defensive postures in a team ball game |
JPH11339009A (ja) * | 1998-05-26 | 1999-12-10 | Sony Corp | 解析データ生成装置 |
Non-Patent Citations (2)
Title |
---|
MIYAMORI H.: "Improvement of Behavior Identification Accuracy for Content-based Retrieval by Collaborating Audio and Visual Information", INFORMATION PROCESSING SOCIETY OF JAPAN KENKYU HOKOKU, vol. 2002, no. 26, 8 March 2002 (2002-03-08), pages 89 - 94, XP002956678 * |
See also references of EP1617374A4 * |
Also Published As
Publication number | Publication date |
---|---|
AU2003227491A1 (en) | 2004-11-04 |
JP4482690B2 (ja) | 2010-06-16 |
US7515735B2 (en) | 2009-04-07 |
EP1617374A4 (en) | 2008-08-13 |
JPWO2004093015A1 (ja) | 2006-07-06 |
US20070104368A1 (en) | 2007-05-10 |
EP1617374A1 (en) | 2006-01-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2004093015A1 (ja) | 画像認識装置及び画像認識プログラム | |
Assfalg et al. | Soccer highlights detection and recognition using HMMs | |
Cheng et al. | Fusion of audio and motion information on HMM-based highlight extraction for baseball games | |
JP2008511186A (ja) | フレームシーケンスを含むビデオにおけるハイライトセグメントを識別する方法 | |
JPH10136297A (ja) | デジタルビデオデータから索引付け情報を抽出する方法と装置 | |
WO2006009521A1 (en) | System and method for replay generation for broadcast video | |
JP6649231B2 (ja) | 検索装置、検索方法およびプログラム | |
KR101128521B1 (ko) | 오디오 데이터를 이용한 이벤트 검출 방법 및 장치 | |
TWI408950B (zh) | 分析運動視訊之系統、方法及具有程式之電腦可讀取記錄媒體 | |
WO2004012150A1 (ja) | 画像認識装置及び画像認識プログラム | |
JP4271930B2 (ja) | 複数の状態に基づいて連続した圧縮映像を解析する方法 | |
US8768945B2 (en) | System and method of enabling identification of a right event sound corresponding to an impact related event | |
JP4546762B2 (ja) | 映像イベント判別用学習データ生成装置及びそのプログラム、並びに、映像イベント判別装置及びそのプログラム | |
WO2004013812A1 (ja) | 画像認識装置及び画像認識プログラム | |
Kijak et al. | Temporal structure analysis of broadcast tennis video using hidden Markov models | |
Chen et al. | Motion entropy feature and its applications to event-based segmentation of sports video | |
Kim et al. | Extracting semantic information from basketball video based on audio-visual features | |
KR100963744B1 (ko) | 축구 동영상의 이벤트 학습 및 검출방법 | |
JP2010081531A (ja) | 映像処理装置及びその方法 | |
Chen et al. | Event-based segmentation of sports video using motion entropy | |
Assfalg et al. | Detection and recognition of football highlights using HMM | |
JP4098551B2 (ja) | 複数のフレームを含む圧縮されているビデオを分析する方法およびシステム | |
Bertini et al. | Common visual cues for sports highlights modeling | |
Bertini et al. | Soccer videos highlight prediction and annotation in real time | |
Chen et al. | Sports video analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AU CN JP KR US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR |
|
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2003717571 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2004570857 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2007104368 Country of ref document: US Ref document number: 10552143 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 2003717571 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 10552143 Country of ref document: US |