WO2008004331A1 - Image-linked sound output method and device - Google Patents
- Publication number
- WO2008004331A1 (application PCT/JP2007/000441)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- player
- unit
- voice
- camera
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/50—Controlling the output signals based on the game progress
- A63F13/54—Controlling the output signals based on the game progress involving acoustic signals, e.g. for simulating revolutions per minute [RPM] dependent engine sounds in a driving game or reverberation against a virtual wall
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/20—Input arrangements for video game devices
- A63F13/21—Input arrangements for video game devices characterised by their sensors, purposes or types
- A63F13/213—Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/55—Controlling game characters or game objects based on the game progress
- A63F13/56—Computing the motion of game characters with respect to other game characters, game objects or elements of the game scene, e.g. for simulating the behaviour of a group of virtual soldiers or for path finding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F13/00—Video games, i.e. games using an electronically generated display having two or more dimensions
- A63F13/40—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
- A63F13/44—Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment involving timing of operations, e.g. performing an action within a time slot
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/10—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
- A63F2300/1087—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/6063—Methods for processing data by generating or executing the game program for sound processing
- A63F2300/6081—Methods for processing data by generating or executing the game program for sound processing generating an output signal, e.g. under timing constraints, for spatialization
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/63—Methods for processing data by generating or executing the game program for controlling the execution of the game in time
- A63F2300/638—Methods for processing data by generating or executing the game program for controlling the execution of the game in time according to the timing of operation or a time limit
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/66—Methods for processing data by generating or executing the game program for rendering three dimensional images
- A63F2300/6607—Methods for processing data by generating or executing the game program for rendering three dimensional images for animating game characters, e.g. skeleton kinematics
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/66—Methods for processing data by generating or executing the game program for rendering three dimensional images
- A63F2300/6623—Methods for processing data by generating or executing the game program for rendering three dimensional images for animating a group of characters
Definitions
- the present invention relates to a technology for outputting sound in accordance with the movement of an object operated by a player.
- the present invention has been made in view of these problems, and its object is to provide a technology for realizing an interface that is easy for a player to use, in an apparatus that uses the motion of the player as an input interface.
- the movement time required for an object operated by the player to reach a contact surface is calculated from images taken by a camera, and, with reference to that movement time, the output timing of audio is adjusted so that the player hears the audio substantially simultaneously with the object's contact with the contact surface. This is an audio output method linked to an image.
- the time for the object to reach the contact surface is calculated, and the audio is output ahead of time to allow for the delay of the audio.
- the difference between the player's recognition through visual perception and the timing at which the player listens to the audio is reduced, which reduces the sense of discomfort to the player.
- Another aspect of the present invention is an audio output device linked to an image.
- This device includes a velocity vector calculation unit that calculates the velocity vector of the movement of the object operated by the player toward the contact surface, using images of the player's movement captured by the camera.
- a movement time calculation unit that uses the velocity vector and the distance between the object and the contact surface to calculate the movement time required for the object to reach the contact surface, and an audio control unit that outputs audio from a speaker when the object contacts the contact surface.
- a delay time acquisition unit that acquires a delay time until the sound emitted from the speaker reaches the player.
- the audio control unit causes the player to listen to the audio substantially simultaneously with the contact of the object with the contact surface, based on the time obtained by subtracting the delay time from the movement time.
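The timing rule described above (start the audio earlier by the delay time, so that it is heard at the moment of contact) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names, the units (metres and seconds), and the handling of an object that is not approaching the surface are all assumptions.

```python
def movement_time(distance, speed_toward_surface):
    """Time in seconds for the object to reach the contact surface.

    distance: distance from the object to the contact surface (metres).
    speed_toward_surface: component of the velocity vector pointing at the
    surface (metres per second). Returns None if the object is not approaching.
    """
    if speed_toward_surface <= 0:
        return None
    return distance / speed_toward_surface


def audio_start_wait(distance, speed_toward_surface, delay_time):
    """How long to wait before starting audio output (seconds).

    The wait is the movement time minus the delay time; it is clamped at
    zero when the delay is longer than the remaining movement time
    (the clamp is an assumption of this sketch).
    """
    t = movement_time(distance, speed_toward_surface)
    if t is None:
        return None
    return max(0.0, t - delay_time)


# Object 0.5 m from the surface, approaching at 2.0 m/s, audio delay 0.125 s.
wait = audio_start_wait(0.5, 2.0, 0.125)
print(wait)  # 0.125
```

With these numbers the object needs 0.25 s to reach the surface, so starting the audio after 0.125 s lets it arrive at the player at about the moment of contact.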
- FIG. 1 is a diagram showing an overall configuration of a three-dimensional position identification apparatus according to a first embodiment.
- FIG. 2 is a diagram schematically illustrating the hardware configuration of the camera and the image processing apparatus.
- FIG. 3 is a plan view showing the structure of a reflector.
- FIG. 4 is a diagram showing a detailed configuration of a processing unit.
- FIG. 5 (a) is a view showing the positional relationship between an object and an entry area
- FIG. 5 (b) is a view showing a screen displayed on a display and recognized by a player.
- FIG. 6 (a) is a view showing the positional relationship between an object and an entry area
- FIG. 6 (b) is a view showing a screen displayed on a display and recognized by a player.
- FIG. 7 (a) is a view showing a positional relationship between an object and an entry area, and (b) is a view showing a screen displayed on a display and recognized by a player.
- FIG. 8 (a) is a view showing a positional relationship between an object and an entry area, and (b) is a view showing a screen displayed on a display and recognized by a player.
- FIG. 9 A three-dimensional positioning apparatus according to Embodiment 1, which is a flow chart for executing the application described in FIG. 5 to FIG.
- FIG. 10 (a) is a diagram showing the positional relationship between the object and the entry area, and (b) is a diagram showing how the character image is displayed according to the hand that is an object.
- FIG. 11 (a) is a diagram showing the positional relationship between the object and the entry area, and (b) is a diagram showing how the character image is displayed according to the hand that is an object.
- FIG. 12 (a) and (b) are diagrams showing an application example in which a character image is displayed according to the mouth which is an object.
- FIG. 13 is a diagram showing a configuration of a three-dimensional position specifying device according to Embodiment 3.
- FIG. 14 is a view showing a screen displayed on the display and recognized by the player in the state shown in FIG. 13;
- FIG. 15 is a cross-sectional view of a plane perpendicular to the depth direction of the reflector.
- FIG. 16 is a diagram showing the configuration of an image processing apparatus according to a third embodiment.
- FIG. 17 is a flowchart showing a procedure for executing a calculator application similar to that shown in FIGS. 5 to 8 in Embodiment 3.
- FIG. 18 is a diagram showing a configuration of a three-dimensional position specifying device according to Embodiment 4.
- FIG. 19 is a diagram showing a configuration of an image cooperation audio control unit in an image processing apparatus according to a fourth embodiment.
- FIG. 20 is a diagram showing the configuration of a three-dimensional position specifying device according to Embodiment 5.
- FIG. 21 is a view for explaining the principle of a method of calculating the velocity vector of an object from one frame taken by a camera.
- FIG. 22 is a diagram showing a configuration of an image cooperation voice control unit in a fifth embodiment.
- FIG. 23 is a flowchart showing a process of outputting sound in cooperation with an image in the fifth embodiment.
- FIG. 1 shows the overall configuration of a three-dimensional position determination apparatus 10 according to an embodiment of the present invention.
- the three-dimensional positioning apparatus of the present embodiment captures an object operated by the player with a single camera, identifies the three-dimensional position of the object by image processing, and displays a screen on the display in accordance with the identified three-dimensional position.
- a typical example of an application using the three-dimensional positioning device 10 is an action game in which a character or the like displayed on the screen is operated by the action of the player, but it can also be applied to other forms of games, simple business applications, digital photo album display, and music data playback applications.
- the three-dimensional position determination device 10 includes a display 40, a camera 20 installed on the upper side of the display, an image processing device 30, and a reflector 50.
- the display 40 is preferably arranged in front of the player 72.
- the player 72 controls the object while looking at his or her own image taken by the camera 20.
- the camera 20 captures an object 70 operated by the player 72 and outputs a frame to the image processing apparatus 30 at a predetermined frame rate. It is preferable that the frame rate be as high as possible in order to accelerate the response of image detection.
- the camera 20 is placed above the display 40.
- the shooting range 26 of the camera 20 is set to capture at least the object 70 operated by the player 72.
- the player 72 can operate the object 70 with the front facing the display 40 side.
- the camera 20 may be placed below or to the side of the display 40, or may be installed at a place different from the direction in which the player 72 looks at the display 40.
- the frame output from the camera 20 is displayed on the display 40 via the image processing device 30.
- the photographed frame is subjected to mirror processing by the image processing device 30 and a mirror image of the player 72 is displayed on the display 40.
- the image processing apparatus 30 may display the screen as it is captured on the display 40 without performing mirror processing.
- the image processing device 30 may display a vertically inverted screen on the display 40.
- the image processing apparatus 30 has a function of loading and executing application software stored in an external storage medium.
- the image processing device 30 not only performs the above-described mirror processing on the frame output from the camera 20, but also performs processing such as detecting an image of an object in the frame, displaying a predetermined image superimposed on the object, and giving instructions to the application according to the player's action.
- the specular image subjected to the predetermined processing by the image processing device 30 is output to the display 40.
- the image processing apparatus 30 is typically a dedicated machine such as a game console, but may be a general-purpose personal computer or server provided with an image input / output function. A more detailed function and configuration of the image processing apparatus 30 will be described later.
- the display 40 may include a speaker 42.
- the speaker 42 reproduces the audio and accompaniment output from the image processing device 30 in accordance with the image displayed on the display 40 and other images.
- the speaker 42 is preferably integrally formed with the display 40 and disposed near the display 40. However, the speaker 42 and the display 40 may not be integrated, and may be disposed apart from each other.
- the reflector 50 is disposed between the player 72 and the display 40 and the camera 20, and has a role of causing the camera 20 to capture a reflection image of the object 70.
- "object" is a generic term for anything operated by the player 72 within the shooting range 26 of the camera 20.
- it includes parts of the body such as arms, hands, feet, and mouth, items such as rods, sheets, and boxes operated by parts of the player's body (e.g., hands, feet, and mouth), and devices such as controllers.
- an object moved at the player's will, including the case where the object is part of the body, is expressed as an "object operated by the player".
- in FIG. 1, the player's finger is shown as the object 70.
- the reflected image by the reflector 50 is also taken by the camera 20.
- the camera 20 will include both the direct image and the reflection image of the object 70 in one frame.
- the reflector 50 is provided with two reflecting surfaces 52, 54, each of which reflects the object 70, and the reflected images are taken by the camera 20. Therefore, the reflecting surfaces 52, 54 are angled so that the reflection image of the object 70 is formed by the lens of the camera 20. Also, the installation location of the reflector 50 is limited to a position separated from the camera 20 by a predetermined distance.
- entry areas 62 and 64, which are the areas from which a reflection image of the object 70 can be projected toward the camera 20, extend above the reflecting surfaces 52 and 54, respectively. The spread of the entry areas 62 and 64 is determined by the degree of inclination of the reflecting surfaces 52 and 54, and is the range that the object 70 is assumed to enter.
- the entry areas 62 and 64 are set so as not to intersect each other.
- when the object 70 is in the entry area 62, the image reflected by the reflecting surface 52 is taken by the camera 20, and when the object 70 is in the entry area 64, the image reflected by the reflecting surface 54 is taken by the camera 20.
- when the object 70 has a certain length in the depth direction of the reflector 50, like a finger or a stick, the object 70 exists simultaneously in both entry areas 62 and 64.
- FIG. 2 is a diagram illustrating the hardware configuration of the camera 20 and the image processing apparatus 30 in a simplified manner.
- the camera 20 includes an image sensor 22 as an imaging element and an image processing unit 24.
- the image sensor 22 is generally a CCD sensor or a CMOS sensor, and records an image by capturing, with a light receiving element, the image formed by a lens (not shown). The captured image is temporarily stored in a memory (not shown) such as RAM.
- the configuration of the camera 20 is well known, and thus further detailed description is omitted.
- the image processing unit 24 includes circuits such as an ASIC, and performs necessary processes on the image data output from the image sensor 22, such as A/D conversion, demosaicing, white balance processing, noise removal, contrast enhancement, color difference enhancement, and gamma processing.
- the image data processed by the image processing unit 24 is transferred to the image processing apparatus 30 via a communication interface (not shown).
- the image data passed from the image processing unit 24 to the image processing device 30 may be RAW data obtained by digitizing the output signal from the image sensor 22, or may be compressed data such as JPEG.
- in the latter case, an image decoding unit for decoding the compressed data is disposed in the image processing device 30 at a stage prior to the processing unit 32.
- the image processing apparatus 30 comprises a processing unit 32, an image output unit 34 for outputting the image data passed from the processing unit 32 to the display 40, and an audio output unit 36 for outputting the audio data passed from the processing unit 32 to the speaker 42.
- the image processing apparatus 30 also has a function of loading application software stored in an arbitrary recording medium such as a CD-ROM or DVD-ROM, and an application execution unit that executes the application. Since these functions are naturally provided in dedicated machines such as game consoles and in personal computers, further detailed description is omitted.
- FIG. 3 is a plan view showing the structure of the reflector 50.
- the reflector 50 is a thin plate as a whole, and as described above, has the first reflecting surface 52 and the second reflecting surface 54 that are spaced apart in the depth direction.
- the reflecting surfaces 52 and 54 are, in one example, mirrors, and may be mirror-finished metal, plastic, glass deposited with metal, or the like.
- the first reflection surface 52 and the second reflection surface 54 are disposed in parallel, with their major axes substantially perpendicular to the optical axis of the camera 20. As shown in FIG. 1, the first reflecting surface 52 and the second reflecting surface 54 are set at an angle at which they reflect an object above the reflecting surface and project the reflected image toward the lens of the camera 20.
- markers 56 for making the image processing device 30 recognize the position of the reflector 50 are arranged at both ends of the reflector 50.
- the marker 56 may be a colored portion, a predetermined pattern such as a check, or a two-dimensional code.
- Light sources such as LEDs may be embedded at both ends. In short, any form can be used as long as it can add information required to specify the position of the reflector 50 in the frame output from the camera 20.
- the reflector 50 has a predetermined width in the depth direction and a plurality of reflecting surfaces in the depth direction, so that a plurality of entry areas 62 and 64 can be set in the depth direction.
- each of the reflecting surfaces 52, 54 projects toward the camera 20 a reflection image of a different entry area that the object is expected to enter, and causes the camera 20 to capture a reflection image of the object. By doing this, as described later, it is possible to detect the movement of the object in the depth direction.
- FIG. 4 is a diagram showing a detailed configuration of the processing unit 32. These configurations are realized by a CPU, a memory, a program loaded into the memory, etc., but are drawn here as functional blocks realized by their cooperation. Therefore, it is understood by those skilled in the art that these functional blocks can be realized in various forms by hardware only, software only, or a combination thereof.
- the image acquisition unit 102 acquires the frames output from the camera 20 one by one, and sends them to the image inversion unit 104 and the image-linked audio control unit 150.
- the image inverting unit 104 subjects the frame received from the image acquiring unit 102 to mirror processing (that is, image horizontal inversion processing) to generate a mirror image.
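Mirror processing as described here is a horizontal flip of each frame. A minimal sketch follows, assuming a frame is represented as a list of pixel rows; the function names are hypothetical and no particular imaging library is implied.

```python
def mirror_frame(frame):
    """Return a horizontally inverted copy of a frame (a list of pixel rows)."""
    return [list(reversed(row)) for row in frame]


def mirror_x(x, width):
    """Map a column index in the captured frame to its column in the mirror image."""
    return width - 1 - x


frame = [
    [1, 2, 3],
    [4, 5, 6],
]
print(mirror_frame(frame))  # [[3, 2, 1], [6, 5, 4]]
print(mirror_x(0, 3))       # 2
```

The coordinate mapping matters for any unit that compares positions found in the unflipped frame with positions in the mirror image shown to the player.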
- the specular image is sent to the three-dimensional localization unit 110 and the on-screen display unit 144.
- the three-dimensional localization unit 110 specifies the three-dimensional position of the object using the frame captured by the camera 20 and received from the image inversion unit 104.
- the three-dimensional position refers to the two-dimensional position corresponding to the position of the object in the frame, ie, the position on the screen, and the position in the depth or z direction in FIG.
- the object in the screen is recognized, and by specifying the depth position of the object, a specific action of the player is detected.
- the three-dimensional localization unit 110 includes a reflection surface area identification unit 112, a depth localization unit 122, a frame internal localization unit 114, and a reference image storage unit 120.
- the reflecting surface area specifying unit 112 identifies, from the frame captured by the camera 20, the reflecting surface areas, which are the areas corresponding to the first reflecting surface 52 and the second reflecting surface 54 of the reflector 50.
- the reflecting surface area specifying unit 112 detects the markers 56 in the frame and identifies the area sandwiched between them as the reflecting surface area.
- the depth localization unit 122 specifies the position of the object in the depth direction by detecting the reflection image in the reflecting surface area specified by the reflecting surface area specifying unit 112. Specifically, the depth localization unit 122 compares the reflecting surface areas among a plurality of frames and detects the difference. If there is no reflection image in the reflecting surface area in one frame and a reflection image appears in the reflecting surface area in a subsequent frame, it can be determined that the object is located in the entry area corresponding to that reflecting surface.
- the depth localization unit 122 may obtain a default image of the reflecting surface area before starting the process of specifying the three-dimensional position. It may then determine that the object is located in the entry area when a difference is detected between the default image and the reflecting surface area of an arbitrary frame.
- the depth localization unit 122 performs the same process on the reflecting surface areas corresponding to the first reflecting surface 52 and the second reflecting surface 54 to determine whether the object has entered each of the first entry region 62 and the second entry region 64. The result of this determination is sent to the input control unit 130.
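The default-image comparison described above can be sketched as a per-pixel difference with thresholds. This is an illustrative sketch only: it assumes grayscale regions stored as lists of rows, and the threshold values and function names are hypothetical choices, not values taken from the source.

```python
def region_differs(default_region, current_region,
                   pixel_threshold=30, min_changed=5):
    """True if enough pixels differ between the default and current region.

    Both regions are equally sized lists of rows of grayscale values (0-255).
    pixel_threshold and min_changed are assumed tuning parameters.
    """
    changed = sum(
        1
        for d_row, c_row in zip(default_region, current_region)
        for d, c in zip(d_row, c_row)
        if abs(d - c) > pixel_threshold
    )
    return changed >= min_changed


def entry_state(default_first, default_second, current_first, current_second):
    """Return (in_first_entry_area, in_second_entry_area) for one frame."""
    return (
        region_differs(default_first, current_first),
        region_differs(default_second, current_second),
    )


empty = [[0] * 4 for _ in range(4)]
with_reflection = [
    [0, 0, 0, 0],
    [200, 200, 200, 0],
    [200, 200, 200, 0],
    [0, 0, 0, 0],
]
# A reflection appears only in the area for the first reflecting surface.
print(entry_state(empty, empty, with_reflection, empty))  # (True, False)
```

Requiring a minimum number of changed pixels, rather than any single changed pixel, is one simple way to keep sensor noise from being mistaken for a reflection.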
- the in-frame localization unit 114 identifies the in-frame position of the object.
- the in-frame localization unit 114 includes an object detection unit 116.
- the object detection unit 116 performs well-known pattern matching using an object reference image (template) on the frame received from the image inversion unit 104 to identify the position of the object within the frame.
- the target to be matched may be the frame itself received from the image inverting unit 104, or may be the frame with the reflecting surface area specified by the reflecting surface area specifying unit 112 excluded.
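The source does not say which pattern matching method is used. As one example of the general idea, the sketch below slides a template over a grayscale frame and picks the position with the smallest sum of absolute differences (SAD); the choice of SAD, the data layout, and the function name are assumptions made for illustration.

```python
def match_template(frame, template):
    """Find the best match of template in frame by exhaustive SAD search.

    frame and template are lists of rows of grayscale values.
    Returns (score, x, y) for the top-left corner of the best match,
    or None if the template does not fit inside the frame.
    """
    frame_h, frame_w = len(frame), len(frame[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for y in range(frame_h - tmpl_h + 1):
        for x in range(frame_w - tmpl_w + 1):
            sad = sum(
                abs(frame[y + j][x + i] - template[j][i])
                for j in range(tmpl_h)
                for i in range(tmpl_w)
            )
            if best is None or sad < best[0]:
                best = (sad, x, y)
    return best


frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 6],
]
print(match_template(frame, template))  # (0, 2, 1)
```

An exhaustive search like this is slow on full-size frames; excluding the reflecting surface area from the search, as the text allows, reduces the number of positions to test.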
- the reference image storage unit 120 stores a reference image for specifying an object.
- a reference image prepared in advance for the object whose in-frame position is to be specified may be stored, or, as described later, the object to be specified may be photographed by the camera 20, and the area where the object is expected to exist may be cut out from the frame and stored as a reference image in the reference image storage unit 120.
- a reference image created by averaging images of several dozen to several thousand hands may be stored, or multiple reference images classified according to the player's age, gender, physical constitution, etc. may be stored. Any matching technique using a reference image can be used. These are well known to those skilled in the art and will not be described in further detail.
- the input control unit 130 gives instructions to the application execution unit (not shown), which executes applications including games, based on the information obtained by image processing of the frame photographed by the camera 20.
- the input control unit 130 includes an action identifying unit 132, a display control unit 134, and an image storage unit 136.
- the action identifying unit 132 detects movement of the object 70 in the depth direction between the first entry region 62 and the second entry region 64, based on the determination of the depth position by the depth localization unit 122, and identifies the player's action. The action identifying unit 132 may identify movement of the object 70 toward the camera 20 in the depth direction and movement away from the camera 20 as different player actions. The action identifying unit 132 gives the identified action to the application execution unit (not shown) and the display control unit 134. The application execution unit receives the given action as an input and performs a predetermined function.
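One way to turn successive depth determinations into the two player actions mentioned above is to compare a coarse depth level between frames. The sketch below assumes that the second entry area lies nearer the camera than the first, and that an elongated object such as a finger occupies both areas when pushed in fully; the function names and action labels are hypothetical.

```python
def depth_level(state):
    """Convert (in_first_area, in_second_area) into a coarse depth level.

    0: in neither entry area, 1: first entry area only,
    2: reaches the second entry area (assumed here to be nearer the camera).
    """
    in_first, in_second = state
    if in_second:
        return 2
    if in_first:
        return 1
    return 0


def identify_action(previous_state, current_state):
    """Return 'toward_camera', 'away_from_camera', or None if depth is unchanged."""
    before = depth_level(previous_state)
    after = depth_level(current_state)
    if after > before:
        return "toward_camera"
    if after < before:
        return "away_from_camera"
    return None


# A finger already in the first entry area is pushed into the second one.
print(identify_action((True, False), (True, True)))  # toward_camera
# The finger is pulled back out of the second entry area.
print(identify_action((True, True), (True, False)))  # away_from_camera
```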
- the display control unit 134 causes the display to show an image for achieving a predetermined purpose, superimposed on the direct image of the object taken by the camera 20.
- the display control unit 134 may display images of different display modes when the object is positioned in the first entry area 62 corresponding to the first reflecting surface 52 and when it is positioned in the second entry area 64 corresponding to the second reflecting surface 54.
- the display control unit 134 searches the image storage unit 136 for an image corresponding to the position of the object, and outputs the image to the on-screen display unit 144.
- the image storage unit 136 stores the above-described image displayed superimposed on the direct image of the object. Examples of this image include characters used in the game, pointers such as cursors, tools such as musical instruments and weapons, marks such as stars and the sun, images of parts of the body such as hands and feet, and images of input devices such as a keypad or calculator. These images may be input images held by the OS to receive input from the user, or may be application images read out from the application software being executed.
- the image storage unit 136 may hold images of a plurality of modes so that the display control unit 134 can display images of different modes according to the in-frame position of the object, and may also hold data needed to change an image.
- the on-screen display unit 144 displays the image output from the display control unit 134 on-screen over the specular image obtained from the image inverting unit 104, and sends the result to the image output unit 34.
- the image output unit 34 displays the screen superimposed on the mirror image of the player on the display 40.
- the image-linked audio control unit 150 controls the audio output unit 36 so that audio is output in conjunction with the position of the object detected by the three-dimensional localization unit 110 or the player's action detected by the input control unit 130. The specific configuration of the image-linked audio control unit 150 will be described in detail in the third and fourth embodiments.
- the placement instructing unit 142 shows on the display 40, together with the image of the player taken by the camera 20, an indication of the position where the player should place the reflector 50.
- the position where the reflector 50 should be arranged is limited to a certain range. Therefore, for example, a frame line is displayed on the display 40 so that the player can place the reflector 50 in the correct position.
- the position of the reflector 50 is adjusted so that the reflector 50 photographed by the camera 20 fits inside the frame.
- the arrangement confirmation unit 140 refers to the frame photographed by the camera 20 and confirms whether or not the reflector 50 is installed at an appropriate position. Specifically, the in-frame positions of the markers 56 at both ends of the reflector 50 are detected by the reflecting surface area specifying unit 112, and the arrangement confirmation unit 140 determines whether or not the positions of the markers 56 are inside the frame line displayed by the placement instructing unit 142. If the markers are inside the frame line, it displays on the display 40 an indication that the reflector has been properly placed, and instructs the placement instructing unit 142 to stop displaying the frame line. The localization processing by the three-dimensional localization unit 110 may be withheld until the markers are placed inside the frame line.
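The placement check reduces to testing whether both detected marker positions fall inside the displayed frame line. A minimal sketch, assuming the frame line is an axis-aligned rectangle in pixel coordinates and that exactly two markers are expected; the function name and coordinate layout are hypothetical.

```python
def markers_inside_guide(markers, guide):
    """True if exactly two markers were found and both lie inside the guide.

    markers: list of (x, y) in-frame positions of the detected markers.
    guide: (left, top, right, bottom) of the displayed frame line, in pixels.
    """
    left, top, right, bottom = guide
    return len(markers) == 2 and all(
        left <= x <= right and top <= y <= bottom for x, y in markers
    )


guide = (100, 280, 540, 330)
print(markers_inside_guide([(120, 300), (520, 305)], guide))  # True
print(markers_inside_guide([(90, 300), (520, 305)], guide))   # False
```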
- This application is a calculator, and the player can input numbers by pushing the image of a calculator key displayed on the screen.
- FIG. 5 (a) shows a state in which the object 70 operated by the player 72, that is, the player's finger, is located on the near side of the first entry area 62, which extends above the first reflection surface 52 of the reflector 50. In this state, the application executed on the image processing device 30 is waiting for some action by the player.
- FIG. 5 (b) shows the screen 44 displayed on the display 40 at this time and seen by the player. As shown, direct images of the player 72, the object 70 and the reflector 50 are displayed on the screen 44.
- The reflection surface area specifying unit 112 identifies the reflection surface area 50′ by detecting the markers 56 in the frame.
- A default image of the reflection surface area 50′ may be stored.
- In the standby state, nothing except the background is present above the first entry area 62 and the second entry area 64. Therefore, if the default image is stored, the difference that arises when the object enters the first entry area 62 or the second entry area 64 is easily taken, so that detection of the object's reflection image in the reflection surface area becomes robust.
- In the conventional object detection method based on the motion difference between frames, when the object stops moving in the screen, the difference disappears and nothing can be recognized.
- If the default image is stored in advance as in the present embodiment, the difference from the default image can still be obtained while the object remains stationary inside the entry area, so the depth position of the object can be recognized continuously.
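The contrast between inter-frame differencing and differencing against a stored default image can be illustrated with a small sketch. The pixel values, the threshold and the function name `region_changed` are assumptions chosen for illustration; the original text does not specify how the difference is computed.

```python
def region_changed(reference, current, threshold=30, min_pixels=2):
    """Return True if enough pixels differ between two grayscale regions.

    Both arguments are flat lists of pixel intensities covering the
    reflection surface area (an assumed representation).
    """
    changed = sum(1 for r, c in zip(reference, current) if abs(r - c) > threshold)
    return changed >= min_pixels


# Default image of the reflection surface area: background only.
default_image = [10, 10, 10, 10, 10, 10]
# The object enters the entry area, then stays still in the next frame.
frame_enter = [10, 10, 200, 210, 10, 10]
frame_hold = list(frame_enter)

# Inter-frame difference: sees the entry, but loses the stationary object.
interframe_on_enter = region_changed(default_image, frame_enter)
interframe_on_hold = region_changed(frame_enter, frame_hold)

# Difference against the stored default image: still sees the object.
default_on_hold = region_changed(default_image, frame_hold)
```

Here the stationary object is invisible to the inter-frame comparison but remains detectable against the default image, which is the property the text relies on.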
- FIG. 6 (a) shows a state in which the object 70 operated by the player 72 has entered the first entry area 62.
- FIG. 6 (b) shows a screen 44 displayed on the display 40 at this time and recognized by the player.
- a reflection image 70a of the object 70 is shown in the area corresponding to the first reflection surface 52 of the screen 44.
- the depth localization unit 122 detects this reflection image 70a by taking the difference of the reflection surface area between the frames.
- When the action identification unit 132 learns that the object 70 has entered the first entry area 62, it instructs the display control unit 134 to display, on the display 40, an application image 80 for executing the calculator application.
- the application image 80 contains multiple key areas for entering numbers or symbols.
- the application image 80 is preferably a line drawing or semi-transparent so as not to prevent visual observation of the superimposed player's operation, but it may be an opaque image.
- the action identification unit 132 instructs the application execution unit (not shown) to start the calculator application. After that, the in-frame localization unit 114 continues tracking the object 70 by specifying the in-frame position of the object 70 by matching. In FIG. 6 (b), the object 70 is at the key position corresponding to "5" of the application image 80.
- FIG. 7 (a) shows a state in which the player 72 has moved the object 70 within the first approach area 62, that is, in a plane perpendicular to the depth direction.
- FIG. 7 (b) shows a screen 44 that is displayed on the display 40 at this time and is recognized by the player.
- the player 72 moves the object 70 from the position of "5" of the application image 80 to the position of "1".
- the position of the reflection image 70a of the object in the reflection surface area also changes.
- no instruction is issued by the action identification unit 132, and accordingly the appearance of the application image 80 does not change.
- the in-frame localization unit 114 continues to track which key of the application image 80 the object 70 is over.
- FIG. 8 (a) shows a state in which the object 70 operated by the player 72 has entered the second approach area 64 beyond the first approach area 62.
- FIG. 8 (b) shows a screen 44 displayed on the display 40 at this time and recognized by the player.
- In the reflection surface area, the reflection image 70a is displayed in the portion corresponding to the first reflection surface 52, and the reflection image 70b is displayed in the portion corresponding to the second reflection surface 54.
- the depth localization unit 122 detects these reflection images by taking the difference of the reflection surface area between the frames.
- Based on the information from the depth localization unit 122, the action identification unit 132 recognizes that the object 70 has moved beyond the first entry area 62 into the second entry area 64, and determines that the player 72 has performed the action of moving the object 70 toward the camera in the depth direction. Accordingly, the action identification unit 132 notifies the application execution unit that the key of the application image 80 corresponding to the current in-frame position of the object 70 has been pressed.
- the action identification unit 132 also sends the current in-frame position of the object 70 to the display control unit 134.
- the color of the key "1" corresponding to the in-frame position of the object 70 is changed (see 80a in the figure).
- the change of the display mode may be, for example, blinking or lighting of the key, or the mode in which the key is pressed, in addition to the change of the color. In this way, the player can input numbers into the calculator application by operating the object.
- a keyboard may be displayed as the application image and used as an input device for a word processor.
- the action identification unit 132 detects this action and determines that the key corresponding to “1” in the application image 80 has been deselected.
- the action identification unit 132 instructs the display control unit 134 to return the display mode of the key corresponding to “1” to the original state.
- the action identification unit 132 detects this action and determines that the player has stopped operating the calculator application.
- the action identification unit 132 tells the application execution unit to stop the calculator application, and instructs the display control unit 134 to hide the application image 80.
- the screen returns to the screen shown in Fig. 5 (b) again.
- FIG. 9 is a flow chart for executing the application described in FIG. 5 to FIG. 8 in the three-dimensional positioning apparatus 10 according to the present embodiment.
- the camera 20 photographs the object 70 and the reflector 50.
- the image acquisition unit 102 acquires a frame including the direct image and the reflection image of the object 70 (S10).
- the reflecting surface area specifying unit 112 detects a marker 56 in the frame given by the image inverting unit 104 to specify a reflecting surface area (S 12).
- the depth localization unit 122 detects the difference between the frames in the reflective surface area to identify the position of the object in the depth direction (S14).
- the action identification unit 132 determines whether the object 70 has entered the first entry area 62 according to the information from the depth localization unit 122 (S 16). The application is not executed unless the object 70 enters the first entry area 62 (N in S16).
- the action identification unit 132 instructs the application execution unit to start the application. Further, the in-frame localization unit 114 identifies the in-frame position of the object by matching, and the display control unit 134 displays a predetermined application image superimposed at the in-frame position of the object (S18). The in-frame localization unit 114 continues tracking the object as long as the object 70 is inside the first entry area 62 (S20). The depth localization unit 122 also detects the difference between the frames in the reflection surface area corresponding to the first reflection surface 52 and the second reflection surface 54, and specifies the position of the object in the depth direction (S22).
- the action identification unit 132 determines whether the object 70 has entered the second entry area 64 according to the information from the depth localization unit 122 (S24). As long as the object 70 does not enter the second entry area 64 (N in S24), the processes of S18 to S22 are repeated. When the object 70 enters the second entry area 64 (Y in S24), the action identification unit 132 determines that a key of the application image 80 has been pushed, and notifies the application execution unit and the display control unit 134 of this. In response, the application executes the processing according to the in-frame position of the object 70, and the display mode of the application image 80 changes (S26).
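The flow from S16 to S26 can be sketched as a small loop over per-frame observations. This is a simplified illustration, not the patented implementation: the key layout, the cell size, and the names `key_at`, `depth_state` and `run` are hypothetical, and each frame is assumed to be already reduced to two entry-area flags plus an in-frame position.

```python
# Hypothetical 3x3 key layout of the application image, 40-pixel cells.
KEYS = [["7", "8", "9"],
        ["4", "5", "6"],
        ["1", "2", "3"]]
CELL = 40


def key_at(pos):
    """Map an in-frame position to the key of the application image under it."""
    x, y = pos
    return KEYS[y // CELL][x // CELL]


def depth_state(in_first, in_second):
    """Classify the depth position from the two entry-area flags (S16, S24)."""
    if in_second:
        return "press"
    if in_first:
        return "select"
    return "idle"


def run(frames):
    """Return the keys pressed over a sequence of per-frame observations."""
    pressed = []
    previous = "idle"
    for in_first, in_second, pos in frames:
        state = depth_state(in_first, in_second)
        # A key is entered once, when the object first reaches the second area.
        if state == "press" and previous != "press":
            pressed.append(key_at(pos))
        previous = state
    return pressed


frames = [
    (False, False, (50, 50)),  # standby
    (True, False, (50, 50)),   # enters first entry area over "5"
    (True, False, (10, 90)),   # moves to "1" within the first entry area
    (True, True, (10, 90)),    # pushes into the second entry area
    (True, True, (10, 90)),    # holds
    (True, False, (10, 90)),   # pulls back
    (False, False, (10, 90)),  # leaves
]
entered = run(frames)
```

Moving within the first entry area only changes the selected key, while crossing into the second entry area registers one key press, which mirrors the select and press behaviour of the flow chart.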
- the entry of the object into a predetermined entry area is detected using the reflection image from the reflector, so that actions in the depth direction, such as the player pushing the object forward, can be detected.
- In conventional object detection based on inter-frame differences, it is difficult to detect movement of an object along the depth direction, that is, along the optical axis of the camera.
- In the present embodiment, a reflection image seen from a direction intersecting the optical axis of the camera is used, so that the movement of the object in the depth direction can be accurately detected.
- the function of a specific application is turned on or off, or the image is displayed or not displayed.
- the switching of functions can be easily realized only by manipulating objects.
- the action of the player can have a plurality of meanings. That is, while the object 70 is in the first entry area 62, the operation of the object 70 by the player 72 corresponds to the “select” operation in the application image 80. Therefore, moving the object moves the key selected in the application image 80. Then, with the desired key selected, when the player 72 pushes the object 70 further, the object 70 enters the second entry area 64, which gives a “determine” operation. As described above, one of the features of the present embodiment is that it can detect the stroke of the object.
- the movement for pushing the hand toward the camera may correspond to turning on a specific function
- the movement for pulling out the hand from the camera may correspond to turning off the specific function.
- An application is also possible in which the shape of the cursor displayed on the screen changes when the hand is pushed forward, moving the hand in that state draws a line on the screen, and pulling the hand back restores the shape of the cursor so that moving the hand no longer draws a line on the screen.
- a line will be drawn each time the hand is moved.
- the player can easily switch on and off the functions through simple actions.
- the number of reflective surfaces may be one or three or more.
- When there is only one reflective surface, an action of the player such as pushing or pulling the object cannot be identified, but it can at least be determined whether the object is positioned within the entry area corresponding to the reflective surface. When there are three or more reflecting surfaces, an entry area is set for each of them, and the depth localization unit 122 determines whether or not an object has entered each entry area in the same way as described above. By increasing the number of reflective surfaces, it is possible to identify more complex player actions, and thus to give the application more diverse instructions.
- the first embodiment has described that the application image of the calculator is displayed superimposed on the position in the frame of the object.
- the second embodiment an example of displaying a character operable by the player will be described.
- FIG. 10 (a) shows the overall configuration of a three-dimensional position determination device 12 according to Embodiment 2.
- the arrangement of the camera 20, the image processing device 30, the display 40 and the reflector 50 is the same as that of the first embodiment.
- the object 76 operated by the player is the player's entire hand.
- the action identification unit 132 identifies the player's action.
- the object detection unit 116 uses, as reference images for matching with the object 76, a reference image representing a state in which the palm is open and a reference image representing a state in which the palm is closed.
- the object detection unit 116 can detect the opening and closing of the hand as well as the in-frame position of the object.
- the action identification unit 132 instructs the display control unit 134 to display a character image with an open mouth at the in-frame position of the object when the hand is open, and to display a character image with a closed mouth at the in-frame position of the object when the hand is closed.
- FIG. 10 (b) shows a screen 44 in a state where the character image 82 corresponding to the state of FIG. 10 (a) is displayed. Since the hand that is the object 76 in FIG. 10 (a) is closed, in FIG. 10 (b), the character image 82 with the mouth closed is displayed superimposed on the object 76.
- the object detection unit 116 detects the state in which the hand is open.
- the character image 82 with the mouth open is superimposed on the object 76 and displayed.
- the action identification unit 132 may cause the voice output unit 36 to output voice in accordance with the change in the character's mouth. For example, no sound may be emitted when the mouth is closed, and sound may be emitted when the mouth is opened. In this way, an application in which the character speaks as the player opens and closes his or her hand in the first entry area 62 can be realized.
- The object detection unit 116 may hold a plurality of reference images covering the range from the closed state to the open state of the hand, and detect the degree of openness of the hand by matching against them. In this case, the action identification unit 132 may instruct the display control unit 134 to change the opening degree of the mouth of the character image in accordance with the degree of opening of the hand. In addition, the action identification unit 132 may instruct the voice output unit 36 to change the volume, pitch and tone of the voice according to the opening degree of the character's mouth. In this case, a plurality of voice data are prepared in a voice data storage unit (not shown), and the voice output unit 36 retrieves and outputs the appropriate voice data according to the instruction from the action identification unit 132.
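A minimal sketch of how a degree of openness might be derived from several reference images and mapped to the character's mouth and voice is shown below. The matching scores, the linear mapping, and the names `openness_from_scores` and `voice_params` are assumptions for illustration; the original text does not specify the matching measure or the mapping.

```python
def openness_from_scores(scores):
    """Estimate hand openness in the range 0.0 (closed) to 1.0 (open).

    `scores` holds one matching score per reference image, ordered from the
    closed hand to the fully open hand (at least two images are assumed).
    The best-matching reference image determines the openness.
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    return best / (len(scores) - 1)


def voice_params(openness, max_volume=1.0):
    """Map openness to the character's mouth opening and voice volume."""
    return {"mouth_opening": openness, "volume": max_volume * openness}


# Hypothetical matching scores against five reference images.
scores = [0.1, 0.2, 0.9, 0.3, 0.1]
openness = openness_from_scores(scores)
params = voice_params(openness)
```

A real system could equally select among prepared voice data by openness band rather than scaling a single volume value, as the text suggests with its voice data storage unit.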
- the first approach area 62 and the second approach area 64 may be used to turn on / off a specific function.
- the display of the character image 82 is started, and sound may be emitted in response to the opening and closing of the hand only when the object 76 is located in the second entry area 64. When the object 76 is positioned in the first entry area 62, even if the player 72 opens and closes the hand that is the object 76, the mouth of the character image 82 on the screen opens and closes according to the operation, but no sound is emitted.
- the object on which the character images are superimposed may be part of another body of the player 72, for example, a mouth.
- FIG. 12 (a) and (b) show this situation.
- the object detection unit 116 detects the player's lip 78 by matching and specifies its position in the frame.
- the object detection unit 116 detects the degree of mouth opening by executing matching using a plurality of reference images.
- the display control unit 134 displays a character image 84 in the shape of an upper lip and a lower lip, superimposed on the lips of the player.
- the display control unit 134 changes the distance between the upper and lower lip of the character image 84 in accordance with the opening degree.
- the action identification unit 132 instructs the voice output unit 36 to output a voice in accordance with the opening / closing timing of the mouth. Using this, it is possible to realize an application that outputs a voice different from the player's voice, such as animal voices and voices of famous people, in accordance with the player's mouth movement.
- Another character image may be displayed at a position not overlapping the mirror image of the player, and the application may be one in which the player moves his or her mouth in imitation of the character moving its mouth. If the player's mouth movement matches that of the character, the voice of the character may be output.
- the object used for depth localization and the object whose in-frame position is specified by matching may be separate objects.
- In that case, depth localization using the reflector can be used solely as a switch to turn on / off specific functions that use object matching.
- In Embodiments 1 and 2, the technology for identifying the three-dimensional position of the object using the reflector 50 in which two reflecting surfaces are spaced apart in the depth direction has been described.
- In the first and second embodiments, matching using a reference image is performed in order to specify the in-frame position of an object. For this reason, it is necessary to store the reference image of the object in advance in the reference image storage unit 120.
- the reference image may be stored in advance, but when a part of the player's body is used as the object, it is desirable to obtain a reference image of the object for each player in order to improve the recognition accuracy.
- The third embodiment differs in that, instead of the reflector having two reflecting surfaces arranged in the depth direction used in the first and second embodiments, it uses a reflector with a first reflecting surface and a second reflecting surface that are angled so that the normals of the surfaces intersect on the side where the object is present and that reflect the object simultaneously.
- FIG. 13 shows a three-dimensional position identification device 14 according to a third embodiment.
- FIG. 14 shows a screen 44 displayed on the display 40 and recognized by the player in the state shown in FIG.
- FIG. 15 is a cross-sectional view of a plane perpendicular to the depth direction, that is, the z-direction, of the reflector 170.
- the configuration of the reflector 170 is different from that of the first embodiment.
- the reflector 170 has a first reflection surface 172 and a second reflection surface 174. As shown in FIG. 15, the first reflection surface 172 and the second reflection surface 174 are angled so that their respective normals 172d and 174d intersect on the side where the object is present, and are arranged so as to reflect two reflection images of the object simultaneously toward the camera 20.
- the first reflection surface 172 and the second reflection surface 174 each consist of a plurality of strip-shaped reflecting surfaces 178a to 178d arranged in the depth direction. In FIG. 14, one reflective surface is made up of four strip-shaped reflective surfaces. Markers 176 for identification are also provided at both ends in the major axis direction of the reflector 170, in the same manner as for the reflector 50.
- Like the reflector 50, the first reflection surface 172 and the second reflection surface 174 are made of a mirror, mirror-finished metal, plastic, glass on which metal is deposited, or the like.
- Alternatively, a planar microprism mirror formed by arranging minute prisms in a plane may be used.
- By configuring the reflective surface with a microprism mirror, it is possible to reduce the thickness of the reflector 170, facilitating installation and saving space.
- It should be noted that although the first reflection surface 172, the second reflection surface 174, and the strip-shaped reflection surfaces 178a to 178d are drawn at an angle, a microprism mirror can in fact give such a reflection angle even when it is flat.
- FIG. 16 shows the configuration of an image processing apparatus 30 according to the third embodiment.
- the functions of the image acquisition unit 102, the image inversion unit 104, the arrangement confirmation unit 140, the arrangement instruction unit 142, and the on-screen display unit 144 are the same as in FIG. Their explanation is therefore omitted.
- the reflecting surface area specifying unit 112 specifies a reflecting surface area based on the position of the markers 176 in the frame received from the image inverting unit 104.
- the in-frame localization unit 114 includes a stereo image analysis unit 118 in addition to the object detection unit 116.
- the stereo image analysis unit 118 identifies the in-frame position of the object, in accordance with a known technique, using the two reflection images 70c and 70d in the reflection surface area specified by the reflection surface area specifying unit 112. The in-frame position of the object 70 can be roughly determined from the positions at which the reflection images 70c and 70d appear and from the difference in size between the reflection images 70c and 70d.
- the reference image storage unit 120 extracts an image of a predetermined range centered on the intra-frame position of the object 70 identified by the stereo image analysis unit 118 from the frame, and stores it as a reference image.
- In the frame, the object 70 should be near the point where the normals of the first reflective surface 172 and the second reflective surface 174, extended from the locations of the two reflection images of the object 70, intersect. Therefore, it is possible to obtain a reference image of the object by cutting out the area in the frame corresponding to the circle 180 in FIG.
- the accuracy of the in-frame position of the object determined from the stereo image of the reflection images is not very high, but this low accuracy can be compensated for by cutting out an image of a wider range than the object to be detected.
- the size of the range from which the reference image is cut may be determined appropriately through experiments.
- the stereo image analysis unit 118 stores the image of the predetermined range in the reference image storage unit 120 as a reference image.
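The geometric idea of locating the object near the intersection of the two normals and cutting out a surrounding region can be sketched as follows. This is an illustrative assumption, not the known stereo technique the text refers to: the coordinates, the normal directions, and the names `estimate_position` and `crop_box` are hypothetical, and a square region stands in for the circle 180.

```python
def cross(a, b):
    """2D cross product of two vectors."""
    return a[0] * b[1] - a[1] * b[0]


def estimate_position(p1, n1, p2, n2):
    """Intersect two lines p1 + t*n1 and p2 + s*n2 in frame coordinates.

    p1, p2 are the locations of the two reflection images and n1, n2 the
    in-frame directions of the surface normals. Returns None if the
    normals are parallel.
    """
    denom = cross(n1, n2)
    if denom == 0:
        return None
    r = (p2[0] - p1[0], p2[1] - p1[1])
    t = cross(r, n2) / denom
    return (p1[0] + t * n1[0], p1[1] + t * n1[1])


def crop_box(center, half_size):
    """Return (left, top, right, bottom) of a square region around center."""
    cx, cy = center
    return (int(cx - half_size), int(cy - half_size),
            int(cx + half_size), int(cy + half_size))


# Hypothetical example: image y axis points down, normals lean inward.
center = estimate_position((100, 400), (1, -1), (300, 400), (-1, -1))
box = crop_box(center, 40)
```

Choosing `half_size` generously corresponds to the text's remark that cutting out a wider range than the object compensates for the limited accuracy of the estimated position.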
- the object detection unit 116 can then perform in-frame localization and tracking of the object 70 using the reference image in the reference image storage unit 120.
- the depth localization unit 122 can specify the position of the object 70 in the depth direction by detecting, based on the difference between frames, on which of the reflecting surfaces the reflection images 70c and 70d of the object 70 appear.
- FIG. 17 is a flow chart showing a procedure for executing the same calculator application as that shown in FIGS. 5 to 8 in the third embodiment.
- the camera 20 captures the object 70 and the reflector 170, and the image acquisition unit 102 acquires a frame that contains the direct image and the reflection image of the object 70 (S40).
- the reflecting surface area specifying unit 112 identifies the reflective surface area by detecting the markers 176 in the frame given from the image inverting unit 104 (S42).
- Before the object 70 enters the entry area 182, the reflection surface area specifying unit 112 may acquire an image of the reflection surface area as a default image for taking a difference with the reflection image of the object 70.
- the depth localization unit 122 detects that the object has entered the entry area 182 by detecting the difference between the frames in the reflective surface area.
- the stereo image analysis unit 118 identifies the rough position of the object 70 in the frame based on the two reflection images 70c and 70d of the object appearing in the reflection surface area specified by the reflection surface area specifying unit 112 (S44).
- the stereo image analysis unit 118 extracts an image within a predetermined range 180 centered on the identified in-frame position from the frame as a reference image for use in matching, and stores the image in the reference image storage unit 120 (S46). From this point onward, it is the same as S14 and later in Fig.
- the in-frame localization unit 114 uses the reference image stored in the reference image storage unit 120 to position the object within the frame.
- the depth localization unit 122 identifies the position in the depth direction of the object by detecting the difference between the frames in the reflective surface area.
- If the stereo image analysis unit 118 cannot accurately determine the in-frame position of the object, the extracted reference image is inappropriate and the object detection unit 116 cannot detect the object by matching. In this case, the player may be notified so that the reference image of the object is cut out again.
- a reflector provided with a first reflecting surface and a second reflecting surface that are angled so that their normals intersect on the side where the object is present is used to obtain a stereo image of the reflection image of the object.
- By detecting the difference between the default image of the background in the reflective surface area and the image when the object enters, it is possible to determine the timing of acquiring a stereo image of the object for extracting the reference image.
- Storing default images also increases the robustness of difference detection. By analyzing the stereo image, the rough in-frame position of the object can be identified without performing object matching, so that part of the frame can be extracted as a reference image.
- the action of storing the reference image by the player is omitted, which contributes to the quick start of the application.
- the procedure for acquiring the reference image is made invisible to the player.
- matching is performed to detect the in-frame position of the object with high accuracy.
- one of the features of the third embodiment is that it achieves both a quick start of the application and high positional accuracy through matching.
- By using the above-mentioned reflector 170, it is possible to identify the three-dimensional position of the object from the reflection images alone, even if there is no direct image of the object in the frame taken by the camera.
- the in-frame position of the object is specified by matching using the reference image.
- More complex applications can be realized by increasing the number of strip-shaped reflective surfaces of the reflector 170 and enhancing the detection accuracy of the object movement in the depth direction.
- virtual surgery can be considered.
- a three-dimensional image of the surgical site is displayed on the liquid crystal stereoscopic display, and the player operates by holding a stick-like object instead of a surgical tool such as a scalpel.
- the three-dimensional localization unit 110 identifies the three-dimensional position of the object, and the three-dimensional image of the surgical site displayed on the liquid crystal stereoscopic display is changed according to that position.
- an image is displayed in which the surgical site is incised when the object is moved in a certain direction.
- LEDs may be mounted at multiple locations on the object, and the trajectory of the LED when the object is moved may be detected in the frame to obtain the motion vector of the object.
- the angle of view can be adjusted by controlling the curvature of the uneven surface of the mirror. Therefore, the entrance area for determining the entrance of the object is not limited to the vertical upper side of the reflecting surface as shown in FIG. 13, and can be fanned out or conversely narrowed. If the entry area is expanded, although the position accuracy is lowered, the range in which the movement of the object in the depth direction can be detected can be broadened.
- the reflector is used to specify the three-dimensional position of the object operated by the player, thereby specifying the action and activating the application function. All of these can change the display mode of the application image displayed on the screen to notify the player that the function has been recognized and the specific function has been enabled or disabled.
- FIG. 18 shows a configuration of a three-dimensional position specifying device 16 according to Embodiment 4.
- a camera 20, an image processing device 30, a display 40 and a reflector 170 are the same as those shown in FIG.
- a reflector 170 is the same as that shown in FIG.
- Player 72 manipulates object 70.
- The depth localization unit 122 detects that the object 70 has entered the entry area corresponding to the strip-like reflective surface 178d, and the action identification unit 132 identifies the movement of the object 70 in the direction of the camera and reports it to the application execution unit and the display control unit 134.
- the display mode of the selected area of the application image is changed, and a digit corresponding to the selected area is input to the calculator application.
- a predetermined sound effect is output from the speaker 42 along with the change in the display mode of the application image.
- the player 72 can have a stronger sense of operating the application through the object by listening to the audio as the display mode of the application image changes.
- the player can be made aware that a virtual contact surface (assumed contact surface) W exists in a portion corresponding to the strip-like reflective surface 178d shown in FIG.
- FIG. 19 shows an image-linked voice control unit in an image processing device 30 according to a fourth embodiment.
- the velocity vector calculation unit 160 uses the frames captured by the camera 20 to calculate the velocity vector of the movement of the object 70 operated by the player 72 toward the assumed contact surface W. Specifically, the velocity vector of the object is calculated based on the difference in the reflection image between multiple frames. The time difference tf between the frame in which the depth localization unit 122 determined that the object 70 entered the entry area corresponding to the strip-like reflective surface 178a and the frame in which it determined that the object entered the entry area corresponding to the strip-like reflective surface 178b or 178c is calculated with reference to the frame rate of the camera 20.
- The movement time calculation unit 156 uses the velocity v and the distance li between the object and the assumed contact surface W to calculate the movement time tm = li / v until the object 70 reaches the assumed contact surface W.
- the delay time acquisition unit 154 acquires a delay time td until the sound emitted from the speaker 42, which is disposed apart from the player, reaches the player 72.
- the exact distance L from the speaker 42 to the player 72 is unknown because it differs depending on the player, but since the position where the reflector 170 should be placed is determined, in practice L can be treated as a constant without any problem.
- the delay time td is given by a constant.
- the delay time acquisition unit 154 may use td, which is a constant.
- the voice synchronization unit 158 outputs a voice synchronized with the action of the player from the speaker 42 with reference to the movement time tm and the delay time td. Specifically, the voice synchronization unit 158 causes the voice to be output when a time equal to the movement time tm minus the delay time td has elapsed from the shooting time of the frame used to calculate the velocity v. As a result, the player 72 hears the sound emitted from the speaker 42 at substantially the same time as the object reaches the assumed contact surface W.
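The timing calculation can be worked through with a short sketch. Several quantities here are assumptions that the original text does not give: the spacing between the strip-shaped reflective surfaces (used to turn tf into a velocity), a speed of sound of 343 m/s used to derive td from the distance L, and the function name `output_wait` with all of its parameter names.

```python
def output_wait(frames_between, fps, strip_spacing_m, dist_to_surface_m,
                speaker_dist_m, sound_speed=343.0):
    """Return how long to wait before emitting the sound, in seconds.

    tf: time between the two frames in which the object entered successive
        entry areas, derived from the camera frame rate.
    v:  approach velocity, assuming a known spacing between the strips.
    tm: movement time until the object reaches the assumed contact surface W.
    td: delay until sound from the speaker reaches the player.
    """
    tf = frames_between / fps
    v = strip_spacing_m / tf
    tm = dist_to_surface_m / v
    td = speaker_dist_m / sound_speed
    # Never schedule in the past if the delay exceeds the movement time.
    return max(0.0, tm - td)


# Hypothetical numbers: 3 frames at 30 fps, strips 5 cm apart,
# 10 cm left to the surface W, speaker 3.43 m from the player.
wait = output_wait(frames_between=3, fps=30, strip_spacing_m=0.05,
                   dist_to_surface_m=0.10, speaker_dist_m=3.43)
```

With these example values, tf is 0.1 s, v is 0.5 m/s, tm is 0.2 s and td is 0.01 s, so the sound would be emitted about 0.19 s after the frame used for the velocity estimate.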
- the sound to be output may be a completely different type of sound from the sound actually generated by the contact of the object with the surface.
- As described above, the time the object needs to reach a virtual or real contact surface is calculated before the object reaches it, and the sound is output ahead of time in consideration of the sound delay. This lets the player know, through both sight and hearing, that the action has been recognized.
- the voice synchronization unit 1 58 is particularly effective when the moving speed of the object is relatively slow.
- The delay time td acquired by the delay time acquisition unit 154 may be left out of consideration.
- When the action identification unit 132 identifies some action of the player, for example a selection action, the voice synchronization unit 158 may cause the corresponding click sound or sound effect to be output from the speaker 42.
- The voice synchronization unit 158 may output the sound at a timing earlier than the measured movement time.
- The velocity vector calculation unit 160 may calculate the opening and closing speed of the mouth using the difference information of the mouth between a plurality of frames. The voice synchronization unit 158 may then use the opening and closing speed and the delay time to adjust the output timing of the sound so that the opening and closing of the player's mouth and the sound emitted from the speaker are synchronized.
- The following describes a technique for estimating the velocity vector of an object using only the frames captured by the camera, without a reflector.
- With this technique, the movement component of the object in the depth direction cannot be detected, and only the movement component within the frame is considered.
- FIG. 20 shows the configuration of a three-dimensional position specifying device 18 according to the fifth embodiment.
- The arrangement of the camera 20, the image processing device 30 and the display 40 is the same as in the above-described embodiments.
- The fifth embodiment does not use a reflector; instead, the player operates an object 74 fitted with a light-emitting element such as an LED.
- FIG. 21 is a diagram for explaining the principle of a method of calculating the velocity vector of an object from one frame captured by the camera 20.
- An image sensor such as a CCD or CMOS sensor outputs a signal according to the amount of light accumulated in each element, but scanning the entire sensor takes a certain amount of time. More specifically, light collection starts at the top row of elements and proceeds to the bottom row, and then readout of the accumulated light quantity starts. Readout proceeds in the same way as light collection, starting from the top row and moving row by row to the bottom at the same speed. One frame therefore contains information spanning the time difference between the moment a pixel row starts collecting light and the moment it starts being read out.
- In a CMOS sensor, the light collection start time differs for each row, so when the object moves fast the image is distorted between the upper part, which is read out first, and the lower part, which is read out last (moving-body distortion). Because a CMOS sensor reads data one line at a time, if one screen is read in 1/15 seconds, a time difference of 1/15 seconds arises between the start and the end of readout.
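The row-by-row timing can be made concrete with a short Python sketch. It assumes a constant scan speed across rows; the function name and the 480-row sensor are hypothetical, while the 1/15 second readout time is the figure used in the text.

```python
from typing import List


def row_readout_offsets(num_rows: int, readout_time_s: float) -> List[float]:
    """Offset (seconds) of each row's readout start relative to the top row.

    Assumption: rows are scanned top to bottom at a constant speed, so the
    whole sensor is covered in readout_time_s.
    """
    if num_rows <= 0:
        raise ValueError("num_rows must be positive")
    return [readout_time_s * r / num_rows for r in range(num_rows)]


# Hypothetical 480-row sensor read out in 1/15 s.
offsets = row_readout_offsets(480, 1.0 / 15.0)
middle_row_offset = offsets[240]  # roughly half of the readout time
```

A fast-moving object is therefore sampled at slightly different instants in the upper and lower parts of the same frame, which is the source of the distortion described above.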
- FIG. 22 shows the configuration of the image cooperation voice control unit 150 according to the fifth embodiment.
- The light collection time acquisition unit 152 acquires the light collection time te of the image sensor 22 employed in the camera 20. This information may be input in advance or may be acquired by communicating with the camera 20.
- The trajectory measurement unit 164 receives from the image inverting unit 104 the frame in which the trajectory 75 remains, and measures the length p of the trajectory contained in it and its direction.
- The movement time calculation unit 156 uses the calculated velocity v and the distance li between the object 74 and the assumed contact surface W to calculate the movement time tm of the object 74 to the assumed contact surface W.
- The assumed contact surface W is set virtually in the frame, and the distance li from the object 74 to the assumed contact surface W is calculated by analyzing the frame.
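A minimal Python sketch of this calculation follows. It assumes that the trajectory of length p was drawn during the sensor's light collection time, so that the speed is p divided by that time, and that the object heads straight for W; the exact formula is not given in this passage, and the function name, units (pixels), and values are hypothetical.

```python
import math
from typing import Tuple


def velocity_from_trajectory(length_p: float, direction_rad: float,
                             te: float) -> Tuple[float, float]:
    """Velocity vector (vx, vy) in pixels per second.

    Assumption: the light trail of length p was traced during time te,
    so speed = p / te along the measured direction.
    """
    if te <= 0:
        raise ValueError("te must be positive")
    speed = length_p / te
    return speed * math.cos(direction_rad), speed * math.sin(direction_rad)


# Hypothetical example: 30-pixel trail along +x, te = 1/30 s, W 90 pixels away.
vx, vy = velocity_from_trajectory(30.0, 0.0, 1.0 / 30.0)
tm = 90.0 / math.hypot(vx, vy)  # movement time, assuming motion straight toward W
```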
- The delay time acquisition unit 154 and the voice synchronization unit 158 are the same as in the fourth embodiment.
- FIG. 23 is a flowchart of processing to output sound in cooperation with an image in the fifth embodiment.
- the camera 20 captures an image of the operation of the object 74 having a light emitting element such as an LED (S60).
- the captured frame is passed from the image acquisition unit 102 to the trajectory measurement unit 164 in the image-linked voice control unit 150.
- Trajectory measurement unit 164 detects trajectory 75 generated by the LED in the frame, measures the length and direction of this trajectory, and passes the result to velocity vector calculation unit 160 (S 62).
- The velocity vector calculation unit 160 calculates the velocity v of the object using the length and direction of the trajectory and the light collection time of the image sensor (S64).
- The movement time calculation unit 156 calculates the movement time tm of the object to the contact surface using the distance li to the assumed contact surface W and the velocity v (S66).
- The voice synchronization unit 158 calculates the sound output timing by subtracting the delay time td from the movement time tm, taking the capture time of the frame used to calculate the velocity vector as the starting point (S68). The voice output unit 36 then outputs a predetermined sound according to this output timing (S70). As a result, the player 72 hears the sound emitted from the speaker 42 at substantially the same time as the object reaches the assumed contact surface W.
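The trajectory measurement in step S62 could be sketched as follows in Python. This is a deliberately simple stand-in, not the method of the specification: it thresholds a grayscale frame and takes the trajectory to be the segment joining the two bright pixels that are farthest apart. The function name, the threshold, and the frame layout (a list of rows of intensities) are assumptions.

```python
import math
from itertools import combinations
from typing import List, Tuple


def measure_trajectory(frame: List[List[int]], threshold: int = 200) -> Tuple[float, float]:
    """Return (length in pixels, direction in radians) of the bright trail.

    Simplification: the trail is approximated by the segment between the two
    most distant bright pixels, so the sign of the direction (which end the
    light emitter reached first) is not resolved here.
    """
    bright = [(x, y) for y, row in enumerate(frame)
              for x, value in enumerate(row) if value >= threshold]
    if len(bright) < 2:
        return 0.0, 0.0
    (x0, y0), (x1, y1) = max(combinations(bright, 2),
                             key=lambda pair: math.dist(pair[0], pair[1]))
    return math.hypot(x1 - x0, y1 - y0), math.atan2(y1 - y0, x1 - x0)


# Hypothetical 5x5 frame with a horizontal trail on the middle row.
frame = [[0] * 5 for _ in range(5)]
for x in range(1, 5):
    frame[2][x] = 255
length_p, direction = measure_trajectory(frame)
```

The pairwise search is quadratic in the number of bright pixels, which is acceptable for a sketch but would need a faster method (for example, fitting a line to the bright region) at video rates.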
- As described above, the object is photographed while it is moved with the light emitter attached, and the velocity of the object can be calculated using the light collection time of the camera's image sensor and the information on the trajectory of the light emitter in the frame output from the image sensor.
- The fifth embodiment is characterized in that the velocity information of an object can be obtained by measuring the trajectory in a single frame, without relying on differences among a plurality of frames.
- the light emitter attached to the object emits light and its locus remains in the frame.
- The embodiments describe applications that project a mirror image of the player and the object on the display, but the moving image captured by the camera need not be displayed on the display.
- It is desirable to use a combination of a camera capable of capturing moving images at a sufficiently high frame rate, an image processing device with the computing and drawing capability to process such a high frame rate, and a display capable of displaying images at a high frame rate.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- User Interface Of Digital Computer (AREA)
- Position Input By Displaying (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/294,966 US8113953B2 (en) | 2006-07-06 | 2007-04-23 | Image-linked sound output method and device |
| EP07737098A EP2055361A1 (en) | 2006-07-06 | 2007-04-23 | Voice outputting method and device linked to images |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2006-186797 | 2006-07-06 | ||
| JP2006186797A JP4627052B2 (ja) | 2006-07-06 | 2006-07-06 | 画像に連携した音声出力方法および装置 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2008004331A1 (fr) | 2008-01-10 |
Family
ID=38894306
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2007/000441 Ceased WO2008004331A1 (fr) | 2006-07-06 | 2007-04-23 | Procédé et dispositif d'émission vocale, liés à des images |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US8113953B2 |
| EP (1) | EP2055361A1 |
| JP (1) | JP4627052B2 |
| WO (1) | WO2008004331A1 |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2005091117A1 (ja) * | 2004-03-22 | 2005-09-29 | Nintendo Co., Ltd. | 情報処理装置、情報処理プログラム、情報処理プログラムを記憶した記憶媒体およびウインドウ制御方法 |
| US9652030B2 (en) | 2009-01-30 | 2017-05-16 | Microsoft Technology Licensing, Llc | Navigation of a virtual plane using a zone of restriction for canceling noise |
| US9383823B2 (en) * | 2009-05-29 | 2016-07-05 | Microsoft Technology Licensing, Llc | Combining gestures beyond skeletal |
| JP2012181704A (ja) * | 2011-03-01 | 2012-09-20 | Sony Computer Entertainment Inc | 情報処理装置および情報処理方法 |
| CN102684257A (zh) * | 2012-05-03 | 2012-09-19 | 友达光电股份有限公司 | 太阳能系统、太阳能模块及供电方法 |
| US20140080593A1 (en) * | 2012-09-19 | 2014-03-20 | Wms Gaming, Inc. | Gaming System and Method With Juxtaposed Mirror and Video Display |
| JP5664877B2 (ja) * | 2012-09-27 | 2015-02-04 | 株式会社コナミデジタルエンタテインメント | サービス提供装置、それに用いる制御方法及びコンピュータプログラム |
| JP6102330B2 (ja) | 2013-02-22 | 2017-03-29 | 船井電機株式会社 | プロジェクタ |
| RU2015148842A (ru) | 2013-06-14 | 2017-07-19 | Интерконтинентал Грейт Брендс Ллк | Интерактивные видеоигры |
| US11826648B2 (en) | 2019-01-30 | 2023-11-28 | Sony Group Corporation | Information processing apparatus, information processing method, and recording medium on which a program is written |
| US11341456B2 (en) * | 2020-08-25 | 2022-05-24 | Datalogic Usa, Inc. | Compact and low-power shelf monitoring system |
| DE102022208956B3 (de) * | 2022-08-30 | 2024-01-25 | Siemens Healthcare Gmbh | Computerimplementiertes Verfahren zur Visualisierung einer Verzögerungszeit in einem übertragenen Bild und Vorrichtung zur Ausgabe eines Bildes |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH05232908A (ja) * | 1992-02-19 | 1993-09-10 | Toshiba Corp | 指示入力装置 |
| JPH0855235A (ja) * | 1994-08-11 | 1996-02-27 | Sharp Corp | 音声及び動作の制御装置並びに音声及び画像の出力装置 |
| WO1999060522A1 (fr) * | 1998-05-19 | 1999-11-25 | Sony Computer Entertainment Inc. | Dispositif et procede de traitement d'images, et support associe |
| JP2003085571A (ja) * | 2001-09-07 | 2003-03-20 | Tomy Co Ltd | 塗り絵玩具 |
| JP2004500657A (ja) * | 2000-02-11 | 2004-01-08 | カネスタ インコーポレイテッド | 仮想入力装置を用いたデータ入力方法および装置 |
| JP2004166246A (ja) * | 2002-10-25 | 2004-06-10 | Sony Computer Entertainment Inc | 画像生成方法および画像生成装置 |
| JP2005031799A (ja) * | 2003-07-08 | 2005-02-03 | Sony Computer Entertainment Inc | 制御システムおよび制御方法 |
| JP2005051660A (ja) * | 2003-07-31 | 2005-02-24 | Onkyo Corp | 映像信号および音声信号の再生システム |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6522395B1 (en) | 1999-04-30 | 2003-02-18 | Canesta, Inc. | Noise reduction techniques suitable for three-dimensional information acquirable with CMOS-compatible image sensor ICS |
| US6710770B2 (en) * | 2000-02-11 | 2004-03-23 | Canesta, Inc. | Quasi-three-dimensional method and apparatus to detect and localize interaction of user-object and virtual transfer device |
2006
- 2006-07-06 JP JP2006186797A patent/JP4627052B2/ja active Active
2007
- 2007-04-23 EP EP07737098A patent/EP2055361A1/en not_active Withdrawn
- 2007-04-23 US US12/294,966 patent/US8113953B2/en active Active
- 2007-04-23 WO PCT/JP2007/000441 patent/WO2008004331A1/ja not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JP2008012102A (ja) | 2008-01-24 |
| EP2055361A1 (en) | 2009-05-06 |
| US8113953B2 (en) | 2012-02-14 |
| US20100222144A1 (en) | 2010-09-02 |
| JP4627052B2 (ja) | 2011-02-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2008004331A1 (fr) | Procédé et dispositif d'émission vocale, liés à des images | |
| TWI343208B (https=) | ||
| JP5806469B2 (ja) | 画像処理プログラム、画像処理装置、画像処理システム、および画像処理方法 | |
| US8241122B2 (en) | Image processing method and input interface apparatus | |
| JP5456832B2 (ja) | 入力された発話の関連性を判定するための装置および方法 | |
| CA2786681C (en) | Voice-body identity correlation | |
| JP5246790B2 (ja) | 音データ処理装置、及び、プログラム | |
| CN106464793B (zh) | 摄像装置和摄像辅助方法 | |
| WO2014066192A1 (en) | Augmenting speech recognition with depth imaging | |
| US20180196503A1 (en) | Information processing device, information processing method, and program | |
| JP2004312733A (ja) | 網膜トラッキングを組み込んだ装置及び方法 | |
| JP2022548804A (ja) | 画像処理方法、電子機器、記憶媒体及びコンピュータプログラム | |
| CN108702458A (zh) | 拍摄方法和装置 | |
| KR20210124313A (ko) | 인터랙티브 대상의 구동 방법, 장치, 디바이스 및 기록 매체 | |
| CN116489451B (zh) | 运镜信息的确定方法、场景画面的显示方法及装置 | |
| CN115830231A (zh) | 生成手部3d模型的方法、装置及电子设备 | |
| JP4409545B2 (ja) | 三次元位置特定装置および方法、奥行位置特定装置 | |
| JP2009239346A (ja) | 撮影装置 | |
| CN115171175B (zh) | 人脸识别方法、装置、设备及可读存储介质 | |
| JP2009177480A (ja) | 撮影装置 | |
| JP2009239349A (ja) | 撮影装置 | |
| CN114450730A (zh) | 信息处理系统及方法 | |
| CN120726688A (zh) | 一种动态手势识别方法、装置、电子设备、芯片及介质 | |
| CN116386639A (zh) | 语音交互方法及相关装置、设备、系统和存储介质 | |
| JP2006121264A (ja) | 動画像処理装置、動画像処理方法およびプログラム |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 07737098; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2007737098; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 12294966; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | NENP | Non-entry into the national phase | Ref country code: RU |