WO2018084051A1 - Information processing device, head-mounted display, information processing system, and information processing method - Google Patents


Info

Publication number
WO2018084051A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
information processing
data
unit
position coordinates
Application number
PCT/JP2017/038524
Other languages
French (fr)
Japanese (ja)
Inventor
大場 章男 (Akio Ohba)
Original Assignee
Sony Interactive Entertainment Inc. (株式会社ソニー・インタラクティブエンタテインメント)
Application filed by Sony Interactive Entertainment Inc.
Publication of WO2018084051A1

Classifications

    • G PHYSICS; G06 COMPUTING; CALCULATING OR COUNTING; G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; G06T 3/00 Geometric image transformation in the plane of the image
    • H ELECTRICITY; H04 ELECTRIC COMMUNICATION TECHNIQUE; H04N PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60 Control of cameras or camera modules
    • H04N 5/00 Details of television systems; H04N 5/64 Constructional details of receivers, e.g. cabinets or dust covers

Definitions

  • The present invention relates to an information processing apparatus that performs information processing using captured images, a head-mounted display having an image capturing function, an information processing system that displays images using captured images, and an information processing method that uses captured images.
  • A game is known in which part of the user's body, such as the head, is photographed with a video camera, a predetermined region such as the eyes, mouth, or hands is extracted, and that region is replaced with another image and shown on a display. User interface systems are also known that accept mouth and hand movements captured by a video camera as operation instructions for an application. Technologies that capture the real world and display a virtual world reacting to its movement, or that perform some kind of information processing on it, are thus used in a wide range of fields regardless of scale, from small mobile terminals to leisure facilities.
  • Immediacy from shooting to output of results is important for realizing realistic image expression and for performing information processing with high accuracy. Even a slight processing delay produces discomfort and poor usability, particularly when a subject's movement must be reflected immediately in the displayed image, or when an imaging device mounted on a head-mounted display worn by the user supplies an image corresponding to the user's field of view. Conversely, if the amount of data transfer and the processing load are reduced in pursuit of immediacy, sufficient accuracy may not be obtained for the information processing.
  • The present invention has been made in view of these problems, and an object thereof is to provide a technique that achieves both immediacy and processing accuracy in information processing and display using captured images.
  • An aspect of the present invention relates to an information processing apparatus.
  • The information processing apparatus includes: a captured image acquisition unit that acquires data of a captured moving image from an imaging apparatus equipped with a rolling shutter, which captures an image with a time lag for each row of pixels; a correction unit that corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; and an analysis processing unit that performs image analysis using the corrected position coordinates and reflects the result in output data.
  • Another aspect of the present invention relates to a head-mounted display. This head-mounted display includes: an imaging unit equipped with a rolling shutter that captures an image with a time lag for each row of pixels and outputs the captured image data sequentially, row by row, as shooting of each row completes; and a display unit that acquires the image data for each row of pixels and displays it sequentially from the rows whose acquisition has completed.
  • Still another aspect of the present invention relates to an information processing system.
  • This information processing system includes: an imaging unit equipped with a rolling shutter that captures an image with a time lag for each row of pixels and outputs captured image data sequentially from the rows whose shooting has completed; a correction unit that acquires the moving image data from the imaging unit and corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; an analysis processing unit that performs image analysis using the corrected position coordinates; an output data generation unit that generates display image data using the result of the image analysis and outputs it row by row; and a display unit that displays the display image sequentially from the output rows.
  • Still another aspect of the present invention relates to an information processing method.
  • In this information processing method, an information processing device acquires captured moving image data from an imaging device equipped with a rolling shutter that captures an image with a time lag for each row of pixels; corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; performs image analysis using the corrected position coordinates; and outputs data reflecting the analysis result.
  • FIG. 1 shows a configuration example of an information processing system according to the present embodiment.
  • The information processing system 1 includes an imaging device 12 that captures real space, an information processing device 10 that performs information processing based on the captured image, and a display device 16 that displays images output by the information processing device 10. The information processing apparatus 10 may be connectable to a network 18 such as the Internet.
  • The information processing apparatus 10, the imaging apparatus 12, the display apparatus 16, and the network 18 may be connected by wired cables, or wirelessly by a wireless LAN (Local Area Network) or the like. Any two, or all, of the imaging device 12, the information processing device 10, and the display device 16 may be combined and provided as one unit. For example, the information processing system 1 may be realized as a portable terminal or a head-mounted display equipped with all of them. In any case, the external shapes of the imaging device 12, the information processing device 10, and the display device 16 are not limited to those illustrated.
  • The imaging device 12 includes an image sensor, such as a CMOS (Complementary Metal Oxide Semiconductor) sensor, that captures a subject at a predetermined frame rate, and an image processing mechanism that applies demosaicing, lens distortion correction, color correction, and the like to the sensor output to generate captured image data. The image processing mechanism may include a mechanism that generates image data at multiple resolutions by reducing the image. The imaging device 12 may also be a so-called stereo camera in which two cameras are arranged left and right at a known interval.
  • The imaging device 12 transmits the captured and generated image data to the information processing device 10 in stream format, in order from the top pixel row of the image. When the imaging device 12 generates image data at multiple resolutions, it may transmit only the data of the resolution and region requested by the information processing device 10. The information processing apparatus 10 performs image analysis on the data transmitted from the imaging apparatus 12 and carries out information processing based on the result, or reflects the result in data requests to the imaging apparatus 12. The information processing apparatus 10 also transmits output data, such as a display image and audio, to the display device 16.
  • The content of the output data is not particularly limited, and may vary depending on the function the user requests of the system and on the content of the launched application. For example, the information processing apparatus 10 may transmit the captured image data received from the imaging apparatus 12 to the display device 16 as-is, so that the captured image is displayed immediately. Alternatively, it may acquire the position and orientation of an object in the captured image by image analysis and apply some processing to the captured image based on the result, or advance an electronic game based on the analysis result and generate a game screen. Typical examples of such modes include virtual reality (VR) and augmented reality (AR).
  • The display device 16 includes a display, such as a liquid crystal, plasma, or organic EL panel, that outputs images, and a speaker that outputs audio, and it outputs the data transmitted from the information processing device 10 as images and sound. The display device 16 may be a television receiver, any of various monitors, the display screen of a mobile terminal, or the electronic viewfinder of a camera, or it may be a head-mounted display that is worn on the user's head and displays images in front of the user's eyes.
  • FIG. 2 shows an example of the external shape when the display device 16 is a head-mounted display. In this example, the head-mounted display 100 includes an output mechanism unit 102 and a mounting mechanism unit 104. The mounting mechanism unit 104 includes a mounting band 106 that, when worn by the user, goes around the head to fix the device.
  • The output mechanism unit 102 includes a housing 108 shaped to cover the left and right eyes when the user wears the head-mounted display 100, with a display panel inside that directly faces the eyes when worn. The housing 108 may further contain lenses that sit between the display panel and the user's eyes when the head-mounted display 100 is worn and that widen the user's viewing angle. The head-mounted display 100 may also include speakers or earphones at positions corresponding to the user's ears when worn.
  • In this example, the head-mounted display 100 includes a stereo camera 110 on the front surface of the housing 108 as the imaging device 12, and captures moving images of the surrounding real space with a field of view corresponding to the user's line of sight. The information processing apparatus 10 can then identify the position and posture of the user's head relative to the surrounding environment by analyzing the captured images using SLAM (Simultaneous Localization and Mapping) techniques. Using this information to determine, for example, the field of view into a virtual world and to generate and display left-eye and right-eye display images realizes VR in which the virtual world appears to spread out before the user's eyes. The information processing apparatus 10 may be an external apparatus that can establish communication with the head-mounted display 100, or it may be built into the head-mounted display 100.
  • Since the information processing system 1 can thus be applied in various modes, the configuration and appearance of each device may be determined accordingly. The following description focuses on a technique that achieves both immediacy, from shooting the real space to displaying the image, and accuracy in analyzing the captured image. For this purpose, the present embodiment adopts, as the imaging device 12, a rolling shutter camera in which the shooting timing is shifted for each row.
  • FIG. 3 is a diagram for explaining the relationship between the order in which image data is acquired by a rolling shutter and the image plane. The upper part of the figure shows the exposure time 142 and the data read time 144 of each row of horizontally (x-axis) aligned pixels, on a plane 140 formed by the vertical direction (y-axis) of the imaging surface and the time axis. A rolling shutter camera is a "line-exposure, sequential-readout" camera: as shown in the figure, the exposure period is shifted for each row of the imaging surface, and one frame's worth of data is obtained by reading out each row immediately after its exposure completes.
  • In general, the row-by-row data acquired in this way is expanded onto the plane 146 formed by the x-axis and the y-axis and is analyzed and displayed as an image frame of a single time. Strictly speaking, however, an object appearing near the top of the captured image and an object appearing near the bottom are actually observed at different times, corresponding to the shift in exposure time, so analysis results can contain errors caused by that difference; the sketch below makes this row-dependent timing concrete. The error grows as the motion of the object, or of the imaging surface itself, becomes faster. To eliminate such errors, one could adopt a global shutter camera using an image sensor such as a CCD.
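  • As a minimal sketch of that row-dependent observation time, the function below uses the linear-scan model that appears in the formulas later in this description: with frame period d and image height V in rows, row y of frame n is observed roughly at time d·n + d·y/V. The code itself is illustrative and not part of the patent.

```python
def observation_time(n, y, d, V):
    """Approximate observation time of image row y in frame n for a
    rolling shutter that scans top to bottom at a constant rate.
    d: frame period in seconds; V: image height in rows.
    Assumes the top row's exposure defines the frame's reference
    time d * n, as in the description below."""
    return d * n + d * (y / V)

# Example: 60 fps (d ~= 16.7 ms), 1080-row image.
d, V = 1 / 60, 1080
top = observation_time(10, 0, d, V)       # reference time of frame 10
bottom = observation_time(10, 1079, d, V)
print((bottom - top) * 1000)  # ~16.6 ms observation spread in one frame
```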
  • A global shutter camera is a "simultaneous-exposure, batch-readout" camera: all rows are exposed at the same timing, so there is no difference in the observation times of objects appearing in one frame. FIG. 4 is a diagram for explaining the resulting difference in image analysis when the subject moves. It shows the observation time and the y-axis position over several consecutive frames when the same subject (or a feature point on it) is photographed by (a) a global shutter camera and (b) a rolling shutter camera; each smallest solid-line shape (a rectangle in (a), a parallelogram in (b)) represents the exposure of one frame. With the global shutter of (a), every row is exposed in the same period, so there is no position-dependent gap between the reference times t0, t1, t2, ... defined for data processing and the actual observation time. With the rolling shutter of (b), the exposure timing differs by row, so the actual observation time deviates from each frame's reference time depending on the y-axis position. Since the figure takes the exposure time of the top row as the reference time, the error grows the lower the subject appears, and analyzing such images collectively as data of the reference times produces errors, shown as white circles, in which the subject is recognized as moving faster or slower than it actually does depending on its direction of movement. Depending on the analysis, errors in position or posture, or misrecognition of a detection target, are also conceivable.
  • On the other hand, even with a global shutter, data readout and transfer are performed sequentially in units of rows, which can be disadvantageous for the immediacy of image analysis and display. FIG. 5 is a diagram for explaining the passage of time from shooting to display when the transfer time is taken into account. (a) shows the timing of one frame's worth of data processing when an image shot with a global shutter camera is displayed on the display device 16, and (b) shows the same for an image shot with a rolling shutter camera. The information processing apparatus 10 may be interposed in the data transfer path, but it is omitted from the figure.
  • In the global shutter case of (a), all rows are exposed simultaneously at time tg0, as indicated by the thick line in rectangle 110a. The output from the imaging device 12, however, proceeds in order from the top row because of transmission bandwidth limits, as indicated by the dotted line in rectangle 110a. For example, an upper row of the image is output at the early timing indicated by arrow a, while a lower row is output at the later timing indicated by arrow a'; that is, the lower row must wait Δt before its data is output. As a result, the imaging device 12 takes time tg1 - tg0 to output the data of all rows.
  • The data output in this way is stored in the frame buffer of the display device 16 via the information processing device 10 and then displayed. The dotted line in rectangle 112a represents the timing at which each row's data is stored in the display device 16: an upper row of the image is stored at the early timing indicated by arrow a, and a lower row at the later timing indicated by arrow a'; that is, the upper rows must wait Δt' until the data of all rows has been stored. As a result, the display device 16 takes time tg3 - tg2 to store all rows of data in the frame buffer. Storage in the frame buffer is thus completed at time tg3, and in a display device premised on a frame buffer, a further adjustment time corresponding to the drive method elapses before the actual display.
  • The illustrated example is the simple mode in which the captured image is displayed as-is, but even if some processing or drawing is performed in the information processing apparatus 10, as long as the transmission bandwidth is limited, at least tg3 - tg0 is required from shooting to display, as the time to transmit the data sequentially and to store it in the frame buffer. The same waiting time also arises when the information processing apparatus 10 itself has a frame buffer and performs image analysis or the like on whole frames.
  • In the rolling shutter case of (b), by contrast, each row can be output immediately after its exposure completes, so a line-buffer-compatible display, which can output each row immediately without waiting for a full frame to accumulate in a frame buffer, is preferably adopted as the display device 16. Then, as indicated by rectangle 112b, the display progresses sequentially from time tr2 to tr3 in synchronization with the output timing from the imaging device 12, and the waiting time Δt' that arises in a frame-buffer-based display does not occur. Consequently, the time difference from imaging to display is shortest for the combination, shown in (b), of a rolling shutter camera and a line-buffer-compatible display device: given that scanning of the display screen inherently introduces a time difference, providing a matching observation time difference for each line allows the most recent information to be displayed. A field emission display (FED: Field Emission Display) is one example of a display suited to such line-level operation. The figure shows the processing timing of the imaging device 12 and the display device 16 as the most easily understood example.
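  • As a rough illustration of the figure's timing argument, the sketch below compares the photon-to-display latency of the two pipelines under simplified assumptions (uniform per-row transfer time, two transfer hops, no processing cost). The numbers are hypothetical; the sketch only reproduces the ordering tg3 - tg0 versus the per-row transfer time discussed above.

```python
def gs_framebuffer_latency(rows, t_row):
    """Global shutter + frame buffer: all rows are exposed at once,
    but display waits until every row has crossed both hops
    (camera -> processor -> display) and been buffered."""
    return 2 * rows * t_row

def rs_linebuffer_latency(t_row):
    """Rolling shutter + line-buffer display: each row is exposed,
    transferred, and displayed in a pipeline, so any row's latency is
    just its own transfer time through the two hops."""
    return 2 * t_row

rows, t_row = 1080, 15e-6  # hypothetical: 1080 rows, 15 us per row per hop
print(gs_framebuffer_latency(rows, t_row))  # ~32.4 ms worst case
print(rs_linebuffer_latency(t_row))         # ~30 us for every row
```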
  • Transfer between actual devices involves more factors, but in any case, the shorter the time each device spends waiting for input data in a memory or the like, the newer the data, that is, the information, that can be handled. Realizing such coordinated operation among the imaging device 12, the information processing apparatus 10 that performs the processing, and the display device 16, so that the time spent waiting for input data in memory is shortened as much as possible, has a remarkable effect on the immediacy and accuracy of display and information processing. An effect of saving memory capacity is also obtained.
  • For these reasons, in the present embodiment a rolling shutter camera is adopted for shooting, and the imaging device 12, the information processing device 10, and the display device 16 are basically configured to output data immediately, row by row. At the same time, for image analysis, the data is corrected so as to eliminate the observation time difference within one frame, making immediate use of observation data compatible with analysis accuracy. The correction is basically performed by estimating, from the time and position at which a feature point was observed, its position at each frame's reference time. A specific calculation method will be described later.
  • FIG. 6 shows the internal circuit configuration of the information processing apparatus 10.
  • The information processing apparatus 10 includes a CPU (Central Processing Unit) 23, a GPU (Graphics Processing Unit) 24, and a main memory 26. These units are connected to one another via a bus 30. An input/output interface 28 is further connected to the bus 30. Connected to the input/output interface 28 are: a communication unit 32 comprising a peripheral device interface such as USB or IEEE 1394 and a wired or wireless LAN network interface; a storage unit 34 such as a hard disk drive or nonvolatile memory; an output unit 36 that outputs data to the display device 16; an input unit 38 that receives data from the imaging device 12 or an input device (not shown); and a recording medium driving unit 40 that drives a removable recording medium such as a magnetic disk, optical disk, or semiconductor memory.
  • The CPU 23 controls the entire information processing apparatus 10 by executing the operating system stored in the storage unit 34. The CPU 23 also executes various programs read from the removable recording medium and loaded into the main memory 26, or downloaded via the communication unit 32. The GPU 24 has the functions of a geometry engine and a rendering processor, performs drawing processing in accordance with drawing commands from the CPU 23, and outputs the result to the output unit 36. The main memory 26 comprises a RAM (Random Access Memory) and stores the programs and data necessary for processing.
  • FIG. 7 shows the functional block configuration of the information processing apparatus 10. Each functional block shown in the figure can be implemented, in hardware, by the various circuits shown in FIG. 6 and, in software, by programs loaded from a recording medium into the main memory that provide functions such as image analysis, information processing, image drawing, and data input/output. Therefore, it is understood by those skilled in the art that these functional blocks can be realized in various forms by hardware alone, software alone, or combinations thereof, and they are not limited to any one of these.
  • The information processing apparatus 10 includes a captured image acquisition unit 52 that acquires captured image data from the imaging device 12, an image analysis unit 54 that analyzes the acquired image, and an output data generation unit 56 that generates the data to be output using the analysis results.
  • The captured image acquisition unit 52 is realized by the input unit 38, the CPU 23, the main memory 26, and the like of FIG. 6, and sequentially acquires the frame data of the captured image from the imaging device 12. Specifically, as described above, it acquires the data in stream format, in order from the rows whose exposure has completed within each frame. The acquired data is supplied to the image analysis unit 54 and the output data generation unit 56. The captured image acquisition unit 52 may also request captured image data from the imaging apparatus 12 by designating the data to be acquired based on the results of image analysis by the image analysis unit 54.
  • The image analysis unit 54 is realized by the CPU 23, the GPU 24, the main memory 26, and the like of FIG. 6; it performs predetermined image analysis using the captured image data and supplies the result to the output data generation unit 56. The content of the analysis performed by the image analysis unit 54 is not particularly limited: the SLAM or object tracking processing mentioned above may be performed, or any commonly performed image analysis, such as object detection, object recognition, or depth map acquisition, may be used. In any case, the accuracy of the analysis can be improved by the correction that eliminates the difference in observation time within a frame, as described above.
  • The image analysis unit 54 includes a feature extraction unit 60, a correction unit 62, a correction data storage unit 64, and an analysis processing unit 66. The feature extraction unit 60 extracts the features used for image analysis from the captured image. The specific extraction targets and processing algorithms vary depending on the image analysis to be performed; examples include edge detection, corner detection, contour detection, and region division by texture. Since such feature extraction can be performed with common techniques, a detailed description is omitted.
  • The correction unit 62 corrects the points, lines, or region boundaries extracted as features so that they accurately represent their positions at each frame's reference time. The data necessary for the correction is stored in the correction data storage unit 64 and referred to as appropriate. Examples of such data include the position information of features extracted up to the previous frame, and a two-dimensional map that associates a parameter related to the observation-time shift with discrete positions on the image plane. The parameter may be calculated taking into account the lens distortion correction performed by the imaging device 12. Parameters that can be calculated in advance, based on the lens distortion correction, the progress speed of the exposure, and so on, are prepared beforehand in the form of a two-dimensional map or lookup table, which makes the correction processing more efficient.
  • The analysis processing unit 66 performs predetermined image analysis, as exemplified above, using the feature data whose positions have been corrected.
  • The output data generation unit 56 is realized by the CPU 23, the GPU 24, the main memory 26, the output unit 36, and the like of FIG. 6; it generates the display image and audio data to be output and outputs them to the display device 16. The kind of data generated may vary depending on the purpose for which the information processing apparatus 10 is used and on the application selected by the user. As described above, the data stream acquired by the captured image acquisition unit 52 may be output as-is; even when some processing is applied to the captured image, it is desirable to suppress the delay time by completing the processing row by row and outputting each row immediately, as the skeleton below pictures.
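  • The row-streaming behavior described here can be pictured with the following skeleton: rows pass straight through to the display while feature data is tapped off for analysis on the side. All names are invented for illustration; the patent specifies the functional blocks, not this code.

```python
# Illustrative skeleton of the row-wise pipeline (hypothetical names).
def run_pipeline(camera_rows, display, analyzer):
    """camera_rows: iterator yielding (frame_idx, row_idx, pixels) in
    exposure order; display.show_row() presents one row immediately,
    matching a line-buffer-compatible display."""
    for frame_idx, row_idx, pixels in camera_rows:
        display.show_row(row_idx, pixels)          # immediate output path
        analyzer.feed(frame_idx, row_idx, pixels)  # analysis tap; results
        # influence later frames or data requests, never this row's display
```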
  • FIG. 8 is a diagram for explaining the correction processing technique performed by the correction unit 62. In the figure, the horizontal direction is the time axis and the vertical direction is the vertical axis (y-axis) of the captured image; each smallest solid-line parallelogram represents the exposure processing of one frame by the rolling shutter camera, and the black circles represent the observation times and positions of feature points. Two feature points are shown in each frame: for example, in the n-th frame f_n, two feature points are observed at positions y_n and y_n' at the exposure timings indicated by the dotted lines. The position information of an object is of course a two-dimensional coordinate consisting of the horizontal axis (x-axis) and the y-axis of the image plane, although the figure shows only the y-axis against time.
  • Suppose the exposure time of the top row is taken as the reference time of each frame, and the interval between the reference times of successive frames is d. The reference times of the frames f_{n-1}, f_n, f_{n+1}, f_{n+2}, ... are then d·(n-1), d·n, d·(n+1), d·(n+2), .... The interval d is inversely proportional to the rate at which the exposure progresses in the y-axis direction, and it can also be regarded as the frame shooting period. For a feature point observed at position y_n in frame f_n, the correction unit 62 obtains its position at the reference time by interpolation, based on the amount of movement from the previous frame f_{n-1}; the corrected positions are indicated by white squares in the figure. Letting V be the length of the image in the vertical direction, the exposure progresses from the top row at a constant rate, so the delay R_n of the observation time from the reference time d·n is expressed as Equation 1:

    R_n = d · y_n / V

In other words, the feature point 120 is observed at a time (d - R_{n-1}) before the reference time d·n (that is, R_{n-1} after the previous reference time d·(n-1)), and again at a time R_n after it. Linearly interpolating the two observed positions to the reference time d·n gives the corrected position coordinates as Equation 2:

    xc_n = x_{n-1} + (x_n - x_{n-1}) · (d - R_{n-1}) / (d - R_{n-1} + R_n)
    yc_n = y_{n-1} + (y_n - y_{n-1}) · (d - R_{n-1}) / (d - R_{n-1} + R_n)

The correction unit 62 corrects the position coordinates of the feature points of each frame composing the captured image to their values at the frame's reference time using Equation 2. Even when the feature is a line or a region, it can be corrected to its shape at the reference time by correcting the position coordinates of the points that compose the line or the boundary. Image analysis can thereby be performed accurately, based on an image whose time is unified within each frame.
  • Since the position coordinates of the same feature point in the immediately preceding frame are used for the correction, the correction unit 62 stores, in the correction data storage unit 64, the identification information of each feature point in association with at least its position coordinates in the immediately preceding frame. In the above description, the change in the position coordinates of a feature point observed in two successive frames is linearly interpolated, but the interpolation algorithm is not limited to this: position coordinates observed in three or more frames may be taken into account and used to interpolate with a curve, as in spline interpolation.
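  • A minimal sketch of this per-feature correction, directly transcribing Equations 1 and 2 as reconstructed above (variable names follow the text; the function itself is illustrative, not the patent's implementation):

```python
def correct_to_reference_time(p_prev, p_cur, d, V):
    """Correct a feature point's position to the reference time of the
    current frame (Equations 1 and 2 in the text).
    p_prev = (x_{n-1}, y_{n-1}), p_cur = (x_n, y_n): observed positions
    of the same feature point in frames f_{n-1} and f_n.
    d: frame period; V: image height in rows."""
    (xp, yp), (xc, yc) = p_prev, p_cur
    r_prev = d * yp / V   # Equation 1: delay of the previous observation
    r_cur = d * yc / V    # Equation 1: delay of the current observation
    # Fraction of the interval between the two observations at which
    # the reference time d*n falls.
    w = (d - r_prev) / (d - r_prev + r_cur)
    return (xp + (xc - xp) * w, yp + (yc - yp) * w)

# Example: a feature point moving downward between two frames.
print(correct_to_reference_time((100.0, 200.0), (100.0, 260.0),
                                d=1 / 60, V=1080))
```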
  • The reference time can be set freely, for example to the exposure time of the center row or the bottom row. In that case, the observation time of a feature point located above the row exposed at the reference time is earlier than the reference time, but if the "delay time" described so far is read as the "deviation time" from the reference time, the correction can be realized by the same calculation. The same applies to the calculations that follow.
  • The above calculation presumes that the orthogonal two-dimensional array of the image sensor corresponds directly to the pixel array of the captured image: under that premise the observation time is proportional to the position in the y-axis direction, and the delay time R_n from the reference time is determined as a linear function of y_n, as described above. In practice, however, when the imaging device 12 corrects lens distortion, each pixel of the output captured image is shifted in the x-axis and y-axis directions from the position of the imaging element at which its value was observed, and the shift amount depends on the position in the image.
  • The correction unit 62 may therefore perform the correction taking into account the lens distortion correction in the imaging device 12. Specifically, the inverse correction M of the lens distortion correction is applied to each position (x, y) on the image plane to obtain the pre-correction coordinates (x_m, y_m), and the ratio m(x, y) = y_m / V of their y-coordinate y_m to V is calculated. By applying this to, for example, each position of a discrete mesh on the image plane, a delay map representing the parameter m(x, y) on a two-dimensional plane can easily be generated, as the sketch below illustrates.
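  • A sketch of that delay-map construction, assuming some inverse-distortion function is available (here a hypothetical inverse_distort; the text only requires that the inverse correction M of the device's lens distortion correction be applied):

```python
import numpy as np

def build_delay_map(width, height, inverse_distort, step=16):
    """Build a discrete delay map m(x, y) on a mesh of the image plane.
    inverse_distort(x, y) -> (x_m, y_m) is assumed to undo the camera's
    lens distortion correction, giving the sensor position that was
    actually observed; m is that position's y-coordinate divided by V."""
    V = height
    xs = np.arange(0, width, step)
    ys = np.arange(0, height, step)
    m = np.empty((len(ys), len(xs)))
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):
            _, y_m = inverse_distort(x, y)
            m[j, i] = y_m / V   # delay ratio at mesh point (x, y)
    return xs, ys, m
```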
  • At the time of correction, the correction unit 62 refers to the delay map stored in the correction data storage unit 64, based on the position coordinates (x_n, y_n) of a feature point extracted from frame f_n, interpolating as necessary, and acquires the parameter m(x_n, y_n) indicating the delay ratio at the time the feature point was observed. The delay R_n of the feature point's observation time from the reference time d·n of frame f_n is then obtained as

    R_n = d · m(x_n, y_n)

Accordingly, defining

    L_n = 1 / (1 - m(x_{n-1}, y_{n-1}) + m(x_n, y_n))

the corrected position coordinates (xc_n, yc_n) are obtained, in place of Equation 2, as Equation 3:

    xc_n = x_{n-1} + (x_n - x_{n-1}) · (1 - m(x_{n-1}, y_{n-1})) · L_n
    yc_n = y_{n-1} + (y_n - y_{n-1}) · (1 - m(x_{n-1}, y_{n-1})) · L_n
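  • Continuing the sketch above (numpy imported there), the map lookup and Equation 3 can be realized as follows; bilinear interpolation over the mesh is one reasonable reading of "interpolating as necessary", not a requirement of the text.

```python
def lookup_m(xs, ys, m, x, y):
    """Bilinearly interpolate the delay map at (x, y)."""
    i = max(min(np.searchsorted(xs, x) - 1, len(xs) - 2), 0)
    j = max(min(np.searchsorted(ys, y) - 1, len(ys) - 2), 0)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    top = m[j, i] * (1 - tx) + m[j, i + 1] * tx
    bot = m[j + 1, i] * (1 - tx) + m[j + 1, i + 1] * tx
    return top * (1 - ty) + bot * ty

def correct_with_delay_map(p_prev, p_cur, delay_map):
    """Equation 3: correction using delay ratios m from the map instead
    of the linear y/V model, so lens distortion is accounted for."""
    xs, ys, m = delay_map
    (xp, yp), (xc, yc) = p_prev, p_cur
    m_prev = lookup_m(xs, ys, m, xp, yp)
    m_cur = lookup_m(xs, ys, m, xc, yc)
    L = 1.0 / (1.0 - m_prev + m_cur)
    w = (1.0 - m_prev) * L
    return (xp + (xc - xp) * w, yp + (yc - yp) * w)
```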
  • FIG. 9 is a diagram for explaining the method by which the correction unit 62 obtains a corrected optical flow. The format of the figure and the movement of the feature points are the same as in FIG. 8. An optical flow, indicating the movement vectors of objects or feature points, is important information for detecting and tracking moving objects and for identifying shape changes. In the figure, the vector 130 indicating the optical flow is shown in the two-dimensional space formed by the y-axis and the time axis, but it actually expresses the amount of movement per unit time on the image plane composed of the x-axis and the y-axis. The correction unit 62 obtains the optical flow accurately using the feature point position coordinates (xc_n, yc_n) corrected by Equation 2 or 3. Specifically, since the reference times of successive frames are exactly d apart, the optical flow of a feature point in frame f_n, that is, the vector (Vx_n, Vy_n) on the image plane, is obtained as Equation 4:

    (Vx_n, Vy_n) = ((xc_n - xc_{n-1}) / d, (yc_n - yc_{n-1}) / d)
  • Equation 4 obtains the vector after first correcting the feature point position coordinates (x_n, y_n) by Equation 2 or 3, but the vector 132 may instead be obtained directly from the position coordinates of the same feature point in the preceding and following frames, dividing by the actual time between the two observations to simplify the processing:

    (Vx_n, Vy_n) ≈ (x_n - x_{n-1}, y_n - y_{n-1}) / (d · (1 - m(x_{n-1}, y_{n-1}) + m(x_n, y_n)))

Here the parameter m is used for the delay ratio of the observation time, but a linear expression such as y_n / V may be substituted for it. The optical flow may also be obtained by the analysis processing unit 66 as needed in the course of image analysis.
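  • The same two variants in code, continuing the sketch (the exact form via corrected coordinates and the direct approximation):

```python
def optical_flow_exact(pc_prev, pc_cur, d):
    """Equation 4: flow from positions already corrected to the two
    frames' reference times, which are exactly d apart."""
    return ((pc_cur[0] - pc_prev[0]) / d, (pc_cur[1] - pc_prev[1]) / d)

def optical_flow_approx(p_prev, p_cur, m_prev, m_cur, d):
    """Approximation: divide the raw displacement by the actual time
    between the two observations, d * (1 - m_prev + m_cur)."""
    dt = d * (1.0 - m_prev + m_cur)
    return ((p_cur[0] - p_prev[0]) / dt, (p_cur[1] - p_prev[1]) / dt)
```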
  • In the description so far, the exposure time of each row has been assumed equal to the interval d of the reference times, that is, to the frame shooting period. More strictly, the exposure time can vary with the shooting environment and other conditions, with the shooting period as its upper limit; FIG. 10 illustrates the correction for this case. A shutter correction value sh_n that takes the exposure time e_n in frame f_n into account is defined as

    sh_n = (1 - e_n / d) / 2

that is, half the fraction of the frame period during which the row is not being exposed. The corrected position coordinates are then obtained by incorporating sh_{n-1} and sh_n into the interpolation described above, and the shutter correction values sh_{n-1} and sh_n may themselves be approximated to simplify the calculation.
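  • A sketch of the shutter correction value follows. How sh enters the interpolation is stated only qualitatively above, so the adjustment below (shifting each observation's delay ratio by its sh, i.e. treating the center of the exposure as the observation time) is an assumption made for illustration only.

```python
def shutter_correction(e, d):
    """sh = (1 - e/d) / 2: half the non-exposed fraction of the frame
    period d, for an actual exposure time e (e <= d)."""
    return (1.0 - e / d) / 2.0

def correct_with_exposure(p_prev, p_cur, m_prev, m_cur, e_prev, e_cur, d):
    """ASSUMPTION: each observation is taken at the center of its
    exposure, so the effective delay ratio becomes m + sh; the text
    defines sh but the exact placement in the formula is not
    recoverable from this copy of the patent."""
    mp = m_prev + shutter_correction(e_prev, d)
    mc = m_cur + shutter_correction(e_cur, d)
    L = 1.0 / (1.0 - mp + mc)
    w = (1.0 - mp) * L
    (xp, yp), (xc, yc) = p_prev, p_cur
    return (xp + (xc - xp) * w, yp + (yc - yp) * w)
```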
  • FIG. 11 schematically shows the temporal relationships among the captured image, the image used for image analysis, and the display image in the present embodiment. The left side of the figure shows the images captured by the imaging device 12 and the right side the images displayed by the display device 16, for the three frames f_{n-1}, f_n, and f_{n+1}. Because a rolling shutter is used in the imaging device 12, arranging the row data observed and read out at each time along the time axis, taken as the vertical direction in the figure, produces a direct correspondence with the image plane, as illustrated. In this example a black ball 200 is photographed moving downward, with a stationary object 202 on its right, and the time at which the ball 200 is actually observed in each frame deviates from the frame's reference time d·(n-1), d·n, d·(n+1), which is the exposure time of the frame's top row.
  • The information processing apparatus 10 acquires such captured image data from the imaging apparatus 12 in stream format, in order from the top row, and sequentially sends the acquired stream on to the display apparatus 16. If the display device 16 is a line-buffer-compatible display, the data is displayed immediately, from the top of the screen downward, in the order it is output. The delay time from shooting to display is then uniform across all rows, at the time ΔT required for the transfer, so the most recent data possible is displayed.
  • For the image analysis, in the center of the figure, the images of the ball 200 observed before and after the reference times d·n and d·(n+1) are indicated by dotted lines, and the corrected images obtained by interpolating them temporally are shown, with the corrected position of the ball shaded. In the present embodiment, the correction targets are limited to the feature points necessary for the image analysis, which suppresses delays caused by the correction processing itself.
  • The results of analysis performed accurately in this way, using feature point positions at a time unified within each frame, may be reflected in subsequent display images or used for data requests to the imaging device 12. For example, a region in which a target object is captured may be predicted and notified to the imaging device 12; the imaging device 12 then transmits high-resolution image data for the notified region and low-resolution image data for the remaining region, and these are composited for display or used for further analysis. In this way the overall data transfer amount can be suppressed. Also, in a mode such as a head-mounted display, the movement of the field of view can be obtained accurately using SLAM or the like, and VR or AR without a sense of incongruity can be realized.
  • In the present embodiment, only the feature points used for image analysis are targeted for correction, and the image itself is basically output with the time differences left in. That is, the handling of the captured image is independent between the display processing and the analysis processing. Exploiting this property, depending on the type and purpose of the image analysis, the correction of feature points and the frequency of the analysis processing that uses them can be made lower than the frame rate of shooting and display, reducing the overall processing load; a trivial sketch of that decimation follows.
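  • Because display and analysis are decoupled, the analysis side can run at a divided rate while rows still stream to the display every frame. The divisor below is hypothetical.

```python
ANALYZE_EVERY = 4  # hypothetical divisor: analyze at 1/4 the display rate

def maybe_analyze(frame_idx, features, analyzer):
    """Rows always stream to the display; feature correction and
    analysis run only on every ANALYZE_EVERY-th frame."""
    if frame_idx % ANALYZE_EVERY == 0:
        analyzer.process(frame_idx, features)
```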
  • As described above, in the present embodiment a rolling shutter camera is used as the imaging device, shortening the time required from actual observation to output, and the captured image is corrected so as to eliminate the observation time differences within each frame. This makes it possible to perform image analysis with high accuracy while using conventional algorithms as they are. Since the correction is limited to the features used for image analysis, and the parameters used for the correction are calculated in advance, the correction processing is made efficient and its influence on the overall time is kept small. Furthermore, by preparing the parameter calculation results as a two-dimensional map corresponding to discrete positions on the image plane, random access becomes possible, and easy and precise correction can be realized even for an image that has undergone a specific operation such as lens distortion correction.
  • The present invention can be used for information processing devices such as game machines and personal computers, head-mounted displays, imaging devices, and information processing systems including any of these.

Abstract

An image capturing device 12 uses a rolling shutter camera to capture moving images (frames fn-1, fn, fn+1) in which the observation time differs for each row within a frame. A display device 16 displays the captured image data immediately, row by row, in the order of acquisition. An information processing device 10 corrects the position coordinates of feature points extracted in each frame to their position coordinates at the respective reference times and performs analysis on them (images 204a, 204b).

Description

Information processing apparatus, head-mounted display, information processing system, and information processing method
 The present invention relates to an information processing apparatus that performs information processing using captured images, a head-mounted display having an image capturing function, an information processing system that displays images using captured images, and an information processing method that uses captured images.
 A game is known in which part of the user's body, such as the head, is photographed with a video camera, a predetermined region such as the eyes, mouth, or hands is extracted, and that region is replaced with another image and shown on a display (see, for example, Patent Document 1). User interface systems are also known that accept mouth and hand movements captured by a video camera as operation instructions for an application. Technologies that capture the real world and display a virtual world reacting to its movement, or that perform some kind of information processing on it, are thus used in a wide range of fields regardless of scale, from small mobile terminals to leisure facilities.
[Patent Document 1] European Patent Application Publication No. 0999518
 Immediacy from shooting to output of results is important for realizing realistic image expression and for performing information processing with high accuracy. Even a slight processing delay produces discomfort and poor usability, particularly when a subject's movement must be reflected immediately in the displayed image, or when an imaging device mounted on a head-mounted display worn by the user supplies an image corresponding to the user's field of view. Conversely, if the amount of data transfer and the processing load are reduced in pursuit of immediacy, sufficient accuracy may not be obtained for the information processing.
 The present invention has been made in view of these problems, and an object thereof is to provide a technique that achieves both immediacy and processing accuracy in information processing and display using captured images.
 An aspect of the present invention relates to an information processing apparatus. The information processing apparatus includes: a captured image acquisition unit that acquires data of a captured moving image from an imaging apparatus equipped with a rolling shutter, which captures an image with a time lag for each row of pixels; a correction unit that corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; and an analysis processing unit that performs image analysis using the corrected position coordinates and reflects the result in output data.
 Another aspect of the present invention relates to a head-mounted display. This head-mounted display includes: an imaging unit equipped with a rolling shutter that captures an image with a time lag for each row of pixels and outputs the captured image data sequentially, row by row, as shooting of each row completes; and a display unit that acquires the image data for each row of pixels and displays it sequentially from the rows whose acquisition has completed.
 Still another aspect of the present invention relates to an information processing system. This information processing system includes: an imaging unit equipped with a rolling shutter that captures an image with a time lag for each row of pixels and outputs captured image data sequentially from the rows whose shooting has completed; a correction unit that acquires the moving image data from the imaging unit and corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; an analysis processing unit that performs image analysis using the corrected position coordinates; an output data generation unit that generates display image data using the result of the image analysis and outputs it row by row; and a display unit that displays the display image sequentially from the output rows.
 Still another aspect of the present invention relates to an information processing method. In this information processing method, an information processing device acquires captured moving image data from an imaging device equipped with a rolling shutter that captures an image with a time lag for each row of pixels; corrects the position coordinates of feature points in a frame of the moving image to the position coordinates at a reference time of that frame; performs image analysis using the corrected position coordinates; and outputs data reflecting the analysis result.
 Note that any combination of the above components, and conversions of the expression of the present invention between a method, an apparatus, a system, a computer program, a recording medium on which a computer program is recorded, and the like, are also effective as aspects of the present invention.
 According to the present invention, immediacy and processing accuracy can both be achieved in information processing and display using captured images.
FIG. 1 is a diagram showing a configuration example of the information processing system of the present embodiment.
FIG. 2 is a diagram showing an example of the external shape when the display device of the present embodiment is a head-mounted display.
FIG. 3 is a diagram for explaining the relationship between the order in which captured data is acquired by a rolling shutter and the image plane in the present embodiment.
FIG. 4 is a diagram for explaining the difference in analysis of captured images between a global shutter and a rolling shutter.
FIG. 5 is a diagram for explaining the passage of time from shooting to display when transfer time is taken into account.
FIG. 6 is a diagram showing the internal circuit configuration of the information processing apparatus in the present embodiment.
FIG. 7 is a diagram showing the functional block configuration of the information processing apparatus in the present embodiment.
FIG. 8 is a diagram for explaining the correction processing technique performed by the correction unit in the present embodiment.
FIG. 9 is a diagram for explaining the method by which the correction unit obtains a corrected optical flow in the present embodiment.
FIG. 10 is a diagram for explaining the correction processing when the pure exposure time is taken into account in the present embodiment.
FIG. 11 is a diagram schematically showing the temporal relationships among the captured image, the image used for image analysis, and the display image in the present embodiment.
 FIG. 1 shows a configuration example of an information processing system according to the present embodiment. The information processing system 1 includes an imaging device 12 that captures real space, an information processing device 10 that performs information processing based on the captured image, and a display device 16 that displays images output by the information processing device 10. The information processing apparatus 10 may be connectable to a network 18 such as the Internet.
 The information processing apparatus 10, the imaging apparatus 12, the display apparatus 16, and the network 18 may be connected by wired cables, or wirelessly by a wireless LAN (Local Area Network) or the like. Any two, or all, of the imaging device 12, the information processing device 10, and the display device 16 may be combined and provided as one unit. For example, the information processing system 1 may be realized as a portable terminal or a head-mounted display equipped with all of them. In any case, the external shapes of the imaging device 12, the information processing device 10, and the display device 16 are not limited to those illustrated.
 The imaging device 12 includes an image sensor, such as a CMOS (Complementary Metal Oxide Semiconductor) sensor, that captures a subject at a predetermined frame rate, and an image processing mechanism that applies demosaicing, lens distortion correction, color correction, and the like to the sensor output to generate captured image data. The image processing mechanism may include a mechanism that generates image data at multiple resolutions by reducing the image. The imaging device 12 may also be a so-called stereo camera in which two cameras are arranged left and right at a known interval.
 The imaging device 12 transmits the captured and generated image data to the information processing device 10 in stream format, in order from the top pixel row of the image. When the imaging device 12 generates image data at multiple resolutions, it may transmit only the data of the resolution and region requested by the information processing device 10. The information processing apparatus 10 performs image analysis on the data transmitted from the imaging apparatus 12 and carries out information processing based on the result, or reflects the result in data requests to the imaging apparatus 12. The information processing apparatus 10 also transmits output data, such as a display image and audio, to the display device 16.
 The content of the output data is not particularly limited, and may vary depending on the function the user requests of the system and on the content of the launched application. For example, the information processing apparatus 10 may transmit the captured image data received from the imaging apparatus 12 to the display device 16 as-is, so that the captured image is displayed immediately. Alternatively, it may acquire the position and orientation of an object in the captured image by image analysis and apply some processing to the captured image based on the result, or advance an electronic game based on the analysis result and generate a game screen. Typical examples of such modes include virtual reality (VR) and augmented reality (AR).
 The display device 16 includes a display, such as a liquid crystal, plasma, or organic EL panel, that outputs images, and a speaker that outputs audio, and it outputs the data transmitted from the information processing device 10 as images and sound. The display device 16 may be a television receiver, any of various monitors, the display screen of a mobile terminal, or the electronic viewfinder of a camera, or it may be a head-mounted display that is worn on the user's head and displays images in front of the user's eyes.
 FIG. 2 shows an example of the external shape when the display device 16 is a head-mounted display. In this example, the head-mounted display 100 includes an output mechanism unit 102 and a mounting mechanism unit 104. The mounting mechanism unit 104 includes a mounting band 106 that, when worn by the user, goes around the head to fix the device.
 The output mechanism unit 102 includes a housing 108 shaped to cover the left and right eyes when the user wears the head-mounted display 100, with a display panel inside that directly faces the eyes when worn. The housing 108 may further contain lenses that sit between the display panel and the user's eyes when the head-mounted display 100 is worn and that widen the user's viewing angle. The head-mounted display 100 may also include speakers or earphones at positions corresponding to the user's ears when worn.
 In this example, the head-mounted display 100 includes a stereo camera 110 on the front surface of the housing 108 as the imaging device 12, and captures moving images of the surrounding real space with a field of view corresponding to the user's line of sight. The information processing apparatus 10 can then identify the position and posture of the user's head relative to the surrounding environment by analyzing the captured images using SLAM (Simultaneous Localization and Mapping) techniques. Using this information to determine, for example, the field of view into a virtual world and to generate and display left-eye and right-eye display images realizes VR in which the virtual world appears to spread out before the user's eyes. The information processing apparatus 10 may be an external apparatus that can establish communication with the head-mounted display 100, or it may be built into the head-mounted display 100.
 このように本実施の形態の情報処理システム1は、様々な態様への適用が可能であるため、各装置の構成や外観形状もそれに応じて適宜決定してよい。以後、実空間の撮影から画像表示までの即時性と、撮影画像の解析精度を両立させる手法に主眼を置いて説明する。その目的において本実施の形態では、撮像装置12として、行ごとに撮影のタイミングがずれるローリングシャッターカメラを採用する。 As described above, since the information processing system 1 according to the present embodiment can be applied to various modes, the configuration and appearance of each device may be appropriately determined accordingly. In the following, a description will be given with a focus on a technique that achieves both immediacy from shooting in real space to image display and analysis accuracy of the shot image. For this purpose, in the present embodiment, a rolling shutter camera in which shooting timing is shifted for each row is employed as the imaging device 12.
 図3は、ローリングシャッターによる撮影データの取得順と画像平面の関係を説明するための図である。同図上段は、撮像面の縦方向(y軸)と時間軸とがなす平面140に、横方向(x軸)の画素列からなる各行の露光時間142とデータ読み出し時間144を表している。ローリングシャッターカメラは「ライン露光順次読み出し」方式のカメラであり、図示するように撮像面の一行ごとに露光時間をずらし、露光完了直後に各行のデータを読み出すことにより1フレーム分のデータを取得する。 FIG. 3 is a diagram for explaining the relationship between the image data acquisition order by the rolling shutter and the image plane. The upper part of the figure shows an exposure time 142 and a data read time 144 for each row of pixel columns in the horizontal direction (x axis) on a plane 140 formed by the vertical direction (y axis) of the imaging surface and the time axis. The rolling shutter camera is a “line exposure sequential readout” type camera. As shown in the figure, the exposure time is shifted for each row of the imaging surface, and data for each frame is obtained by reading the data of each row immediately after the exposure is completed. .
 一般的にはそのように取得した行ごとのデータを、x軸およびy軸がなす平面146に展開し、同一時刻の画像フレームとして扱うことにより解析や表示がなされる。しかしながら厳密には、撮影画像の上方に写る物と下方に移る物とでは、実際の観測時刻が露光時間のずれに応じて異なるため、解析結果にはそれに起因した誤差が含まれ得る。当該誤差は、対象物あるいは撮像面の動きが速くなるほど大きくなる。このような誤差を解消するため、CCDなどの撮像素子を用いたグローバルシャッターカメラを採用することが考えられる。 Generally, analysis and display are performed by expanding the data for each row acquired in this way on a plane 146 formed by the x-axis and the y-axis and handling them as image frames at the same time. Strictly speaking, however, the actual observation time differs depending on the difference in exposure time between the object that appears above the photographed image and the object that moves down, so that the analysis result may include an error due to the difference. The error becomes larger as the movement of the object or the imaging surface becomes faster. In order to eliminate such an error, it is conceivable to employ a global shutter camera using an image sensor such as a CCD.
 A global shutter camera is a "simultaneous exposure, batch readout" camera: because all rows are exposed at the same timing, objects captured within one frame have no difference in observation time. FIG. 4 illustrates the difference this makes to image analysis when the subject is moving. In the figure, (a) shows the observation time and y-axis position of the same subject (or a feature point on it) over several consecutive frames when photographed with a global shutter camera, and (b) shows the same for a rolling shutter camera. That is, each minimal solid-line quadrilateral (rectangle or parallelogram) represents the exposure of one frame.
 First, in the case of the global shutter shown in (a), all rows are exposed during the same period, so there is no y-position-dependent deviation between the reference times t_0, t_1, t_2, ... defined for data processing and the actual observation times. In the case of the rolling shutter shown in (b), on the other hand, the exposure timing differs from row to row as described above, so the actual observation time deviates from the reference time t_0, t_1, t_2, ... of each frame in a manner that depends on the position along the y-axis.
 In the figure, the exposure time of the topmost row of the imaging surface is taken as the reference time, so the error grows the lower the subject is in the frame. If such captured images are each analyzed collectively as data of the reference times t_0, t_1, t_2, ..., an error arises in which, as shown by the white circles, the subject is recognized as moving faster or slower than it actually is, depending on its direction of movement. Depending on the content of the image analysis, errors may arise not only in the estimated speed but also in the analyzed position and orientation, and detection targets may even be misrecognized.
 On the other hand, even with a global shutter, data readout and transfer proceed row by row in order, which can be disadvantageous from the standpoint of immediacy in image analysis and display. FIG. 5 illustrates the passage of time from capture to display when the transfer time is taken into account. (a) shows the timing of one frame's worth of data processing when an image captured with a global shutter camera is displayed on the display device 16, and (b) shows the same for an image captured with a rolling shutter camera. The information processing apparatus 10 may be interposed in the data transfer path, but it is omitted from the figure.
 In the case of the global shutter shown in (a), all rows are exposed simultaneously at time tg_0, as indicated by the thick line in rectangle 110a. The output from the imaging device 12, however, proceeds in order from the top row because of transmission-band limitations, as indicated by the dotted line within rectangle 110a. For example, a row in the upper part of the image is output at the early timing indicated by arrow a, while a lower row is output at the later timing indicated by arrow a'. In other words, the lower row must wait for Δt before its data is output. As a result, it takes the imaging device 12 a time of tg_1 - tg_0 to output the data of all rows.
 The data output in this way is stored in the frame buffer of the display device 16 via the information processing apparatus 10 and then displayed. The dotted line within rectangle 112a represents the timing at which each row's data is stored in the display device 16. Here too, rows in the upper part of the image are stored at the early timing indicated by arrow a, and lower rows at the later timing indicated by arrow a'. In other words, the upper rows must wait for Δt' until all of the data has been stored. As a result, it takes the display device 16 a time of tg_3 - tg_2 to store the data of all rows in the frame buffer.
 In the figure, the time at which storage in the frame buffer completes is tg_3, but in a display device that presupposes a frame buffer, an additional adjustment time corresponding to the drive method elapses before the actual display. The illustrated example is a simple mode in which the captured image is displayed as-is, but even when the information processing apparatus 10 performs some processing or rendering, as long as the transmission band is limited, at least a time of tg_3 - tg_0 is required from capture to display, being the time to send the data in order plus the time to accumulate it in the frame buffer. A similar waiting time arises when the information processing apparatus 10 is provided with a frame buffer and performs image analysis and the like.
 In the case of the rolling shutter shown in (b), on the other hand, the exposure itself proceeds in order from the top row, so exposing all rows takes a time of tr_1 - tr_0, as indicated by the thick line in rectangle 110b. However, because data is output immediately from each row as soon as its exposure is complete, the time from capture to output does not depend on the row, as exemplified by arrows b and b'. That is, the waiting time Δt that arose with the global shutter does not occur.
 Given these characteristics of the rolling shutter, the display device 16 is preferably a line-buffer-capable display that can output each row immediately, without waiting for one frame's worth of data to accumulate in a frame buffer. Then, as shown by rectangle 112b, display progresses sequentially over the time from tr_2 to tr_3, synchronized with the output timing from the imaging device 12. As a result, the waiting time Δt' that arose in a frame-buffer-based display device does not occur.
 Taking these waiting times Δt and Δt' into account, the time difference from capture to display is shortest for the combination of the rolling shutter camera and the line-buffer-capable display device shown in (b). That is, given the time differences inherent in scanning the display screen, deliberately introducing a matching observation time difference per row on the capture side allows the most recent information to be displayed as an image. A field emission display (FED) may also be adopted as a display that displays input data immediately.
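 As a rough, hypothetical illustration of this timing argument, the following sketch models the per-row capture-to-display latency of the two pipelines. The frame period, row count, and per-row transfer delay are assumed numbers, not values from the embodiment.

```python
# Toy model of per-row capture-to-display latency (all values assumed).
FRAME = 1 / 60      # assumed frame period d, in seconds
ROWS = 1080         # assumed number of pixel rows
DT = 0.002          # assumed per-row transfer delay of the rolling pipeline

def global_shutter_latency(row):
    # All rows are exposed at t = 0, but the display must wait until the
    # whole frame has been streamed into the frame buffer (tg_3 - tg_0),
    # so every row sees roughly one full frame-transfer time of latency.
    return FRAME

def rolling_shutter_latency(row):
    # Row `row` is exposed at t = row/ROWS * FRAME and shown as soon as it
    # has crossed the link, so the latency is the same small DT per row.
    return DT

for y in (0, ROWS // 2, ROWS - 1):
    print(f"row {y:4d}: global {global_shutter_latency(y)*1e3:5.1f} ms, "
          f"rolling {rolling_shutter_latency(y)*1e3:5.1f} ms")
```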
 Although the figure shows the processing timing of the imaging device 12 and the display device 16 as the easiest example to understand, the desirability of processing newer data, that is, newer information, applies equally to the information processing apparatus 10 that performs information processing using the captured images. In other words, having each device operate in concert so as to minimize the time that input data waits in memory and the like yields a marked improvement in the immediacy and accuracy of display and information processing. At the same time, it also saves memory capacity.
 Based on this particular insight, the present embodiment adopts a rolling shutter camera for capture and, as a rule, has the imaging device 12, the information processing apparatus 10, and the display device 16 output data immediately. For image analysis, on the other hand, the data is corrected so as to eliminate the observation time differences within a frame, thereby reconciling immediate use of observation data with analysis accuracy. The correction is basically performed by estimating, from the time and position at which a feature point was observed, its position at the reference time of each frame. The specific calculation method is described later.
 FIG. 6 shows the internal circuit configuration of the information processing apparatus 10. The information processing apparatus 10 includes a CPU (Central Processing Unit) 23, a GPU (Graphics Processing Unit) 24, and a main memory 26. These units are connected to one another via a bus 30. An input/output interface 28 is further connected to the bus 30. Connected to the input/output interface 28 are a communication unit 32 comprising a peripheral interface such as USB or IEEE 1394 and a wired or wireless LAN network interface, a storage unit 34 such as a hard disk drive or nonvolatile memory, an output unit 36 that outputs data to the display device 16, an input unit 38 that inputs data from the imaging device 12 and an input device (not shown), and a recording medium driving unit 40 that drives a removable recording medium such as a magnetic disk, optical disc, or semiconductor memory.
 The CPU 23 controls the whole of the information processing apparatus 10 by executing the operating system stored in the storage unit 34. The CPU 23 also executes various programs read from a removable recording medium and loaded into the main memory 26, or downloaded via the communication unit 32. The GPU 24 has the functions of a geometry engine and a rendering processor; it performs drawing processing in accordance with drawing commands from the CPU 23 and outputs the result to the output unit 36. The main memory 26 is composed of RAM (Random Access Memory) and stores the programs and data necessary for processing.
 FIG. 7 shows the functional block configuration of the information processing apparatus 10. In hardware terms, each functional block shown in the figure can be realized by the various circuits shown in FIG. 6; in software terms, by a program, loaded from a recording medium into the main memory, that provides functions such as image analysis, information processing, image drawing, and data input/output. Those skilled in the art will therefore understand that these functional blocks can be realized in various forms by hardware alone, software alone, or a combination thereof, and they are not limited to any one of these.
 The information processing apparatus 10 includes a captured image acquisition unit 52 that acquires captured image data from the imaging device 12, an image analysis unit 54 that analyzes the acquired images, and an output data generation unit 56 that generates the data to be output, for example by using the analysis results. The captured image acquisition unit 52 is realized by the input unit 38, the CPU 23, the main memory 26, and so on of FIG. 6, and sequentially acquires the frame data of captured images from the imaging device 12. Specifically, as described above, it acquires the data in stream format, in order from the rows of a frame whose exposure has completed. The acquired data is supplied to the image analysis unit 54 and the output data generation unit 56.
 Here too, the data is preferably supplied sequentially, row by row as it is acquired. Depending on the content of the image analysis performed by the image analysis unit 54, however, the data may instead be stored in the main memory 26 or the like as two-dimensional image data so that the image analysis unit 54 can refer to it as appropriate. The captured image acquisition unit 52 may also, based on the results of image analysis by the image analysis unit 54, request captured image data from the imaging device 12 by specifying the data to be acquired.
 The image analysis unit 54 is realized by the CPU 23, the GPU 24, the main memory 26, and so on of FIG. 6; it performs predetermined image analysis using the captured image data and supplies the results to the output data generation unit 56. The content of the analysis performed by the image analysis unit 54 is not particularly limited. It may be analysis that presupposes motion from the outset, such as the aforementioned SLAM or object tracking, or any commonly performed image analysis such as object detection, object recognition, or depth map acquisition. In any case, as described above, the accuracy of the analysis can be improved by a correction that eliminates the differences in observation time within a frame.
 In detail, the image analysis unit 54 includes a feature extraction unit 60, a correction unit 62, a correction data storage unit 64, and an analysis processing unit 66. The feature extraction unit 60 extracts the features used for image analysis from the captured image. The specific extraction targets and processing algorithms vary with the image analysis to be performed; examples include edge detection, corner detection, contour detection, and region segmentation based on texture and the like. Since such feature extraction can use common techniques, detailed description is omitted.
 The correction unit 62 corrects the points, lines, or region boundaries extracted as features so that they represent their accurate positions at the reference time of each frame. The data necessary for the correction is stored in the correction data storage unit 64 and referred to as appropriate. Examples of such data include the position information of features extracted up to the previous frame, and a two-dimensional map that associates parameters related to the observation time shift with discrete positions on the image plane. These parameters may be calculated taking into account the lens distortion correction performed in the imaging device 12.
 By preparing in advance, in the form of a two-dimensional map, lookup table, or the like, the parameters that can be calculated beforehand from the lens distortion correction, the progress speed of the exposure, and so on, the correction processing can be made more efficient. The analysis processing unit 66 performs a predetermined analysis, from among the kinds of image analysis exemplified above, using the feature data whose positions have been corrected.
 The output data generation unit 56 is realized by the CPU 23, the GPU 24, the main memory 26, the output unit 36, and so on of FIG. 6; it generates the display image and audio data to be output and outputs them to the display device 16. What data is generated may vary with the purpose of the information processing apparatus 10, the application selected by the user, and so on. In a mode in which the captured image is displayed as-is, the data stream acquired from the captured image acquisition unit 52 may be output unchanged. Even when some processing is applied to the captured image, it is desirable to keep the delay time down, for example by completing the processing for each row and outputting it immediately.
 FIG. 8 illustrates the correction method performed by the correction unit 62. As in FIG. 4, the horizontal direction is the time axis and the vertical direction is the vertical axis (y-axis) of the captured image, and each minimal solid-line parallelogram represents the exposure of one frame by the rolling shutter camera. The black circles represent the observation times and positions of feature points; in the figure, two feature points appear in each frame. For example, in the n-th frame f_n, two feature points are observed at positions y_n and y_n' at the exposure timings indicated by the dotted lines.
 Although the figure shows the two-dimensional space formed by the y-axis and the time axis, the position information of an object is of course a two-dimensional coordinate on the image plane formed by the horizontal axis (x-axis) and the y-axis. If the exposure time of the topmost row is taken as the reference time and the interval between reference times of successive frames is d, the reference times of the frames f_{n-1}, f_n, f_{n+1}, f_{n+2}, ... are, as illustrated, d·(n-1), d·n, d·(n+1), d·(n+2), .... Here the interval d is inversely proportional to the progress speed of the exposure in the y-axis direction and can also be regarded as the frame capture period. Based on the amount by which a feature point has moved since the immediately preceding frame f_{n-1}, the correction unit 62 obtains by interpolation the corrected position at the reference time.
 In the figure, the corrected positions are indicated by white squares. Focusing, for example, on the feature point 120 extracted in frame f_n, the delay time R_n of the time at which it is observed, relative to the reference time d·n, depends on its vertical position y_n on the image plane and is given by

 R_n = d·y_n/V

 where V is the vertical length of the image. Considering the immediately preceding frame f_{n-1} as well, the feature point 120 is observed at a time (d - R_{n-1}) before the reference time d·n and at a time R_n after it.
 If the position coordinates of the feature point 120 observed at those times are (x_{n-1}, y_{n-1}) and (x_n, y_n), respectively, its position coordinates (xc_n, yc_n) at the reference time d·n between them are obtained as follows.
 (xc_n, yc_n) = (x_{n-1}, y_{n-1}) + {(d - R_{n-1}) / (d - R_{n-1} + R_n)} · {(x_n, y_n) - (x_{n-1}, y_{n-1})}    ... (Equation 1)
 If we put L_n = 1/(V - y_{n-1} + y_n), Equation 1 can be expressed as follows.
 (xc_n, yc_n) = (x_{n-1}, y_{n-1}) + L_n · (V - y_{n-1}) · {(x_n, y_n) - (x_{n-1}, y_{n-1})}    ... (Equation 2)
 Using Equation 2, the correction unit 62 corrects the position coordinates of the feature points of each frame constituting the captured video to their values at the reference time of that frame. Even when a feature is a line or a region, it can be corrected to its shape at the reference time by correcting the position coordinates of the points constituting the line or boundary. This allows image analysis to be performed accurately, based on an image whose time is unified within the frame.
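 For illustration, the following is a minimal sketch in Python of the correction of Equation 2. It is not the embodiment's implementation; the image height and the sample coordinates are assumed values.

```python
def correct_to_reference_time(p_prev, p_curr, V):
    """Correct a rolling-shutter feature point to the reference time of
    the current frame (Equation 2).

    p_prev: (x, y) of the same feature point in the previous frame
    p_curr: (x, y) observed in the current frame
    V:      vertical length of the image, in pixels
    """
    (x0, y0), (x1, y1) = p_prev, p_curr
    L = 1.0 / (V - y0 + y1)          # L_n = 1 / (V - y_{n-1} + y_n)
    w = L * (V - y0)                 # interpolation weight at time d*n
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

# Example: a point falling through an image 1080 rows tall.
print(correct_to_reference_time((640.0, 200.0), (640.0, 260.0), V=1080.0))
```

 With these assumed values, the point observed at y = 260 is pulled back to roughly y ≈ 246, the position it would have occupied at the frame's reference time.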
 Since the above correction uses the position coordinates of the same feature point in the immediately preceding frame, the correction unit 62 stores in the correction data storage unit 64 the identification information of each feature point in association with at least its position coordinates in the immediately preceding frame. In the example of FIG. 8, the change in a feature point's position coordinates observed in two consecutive frames is linearly interpolated, but the interpolation algorithm is not limited to this. That is, the position coordinates observed in three or more frames may be taken into account, and they may be used to interpolate with a curve, such as spline interpolation, as sketched below.
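 Where observations from three or more frames are kept, one way such higher-order interpolation could look is sketched here; the polynomial fit is an illustrative stand-in for a true spline, and all names and values are hypothetical.

```python
import numpy as np

def correct_by_fit(times, xs, ys, t_ref, deg=2):
    """Estimate a feature point's position at reference time t_ref from
    three or more (time, x, y) observations by fitting a low-order
    polynomial per axis (a stand-in for spline interpolation)."""
    cx = np.polyfit(times, xs, deg)
    cy = np.polyfit(times, ys, deg)
    return float(np.polyval(cx, t_ref)), float(np.polyval(cy, t_ref))

# Observations from three consecutive frames (assumed values).
t = [0.0, 1 / 60, 2 / 60]
print(correct_by_fit(t, xs=[100.0, 104.0, 110.0],
                     ys=[200.0, 230.0, 265.0], t_ref=2 / 60))
```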
 Further, in the example of FIG. 8 the exposure time of the topmost row was taken as the reference time, so the observation times of roughly all feature points are assumed to be later than it. However, the reference time can be set freely, for example to the exposure time of the center row or the bottom row. In that case the observation times of feature points located above the row exposed at the reference time are naturally earlier than the reference time, but the correction can be realized by the same calculation if the "delay time" described so far is replaced by a "shift time" relative to the reference time. The same applies to the calculations below.
 The above calculation presupposes that the orthogonal two-dimensional array of the image sensor corresponds to the pixel array of the captured image. In that case, as shown by the dotted lines in FIG. 8, the position in the y-axis direction and the observation time are proportional, so the delay time R_n from the reference time is obtained as a linear function of y_n, as described above. On the other hand, when lens distortion correction is performed in the imaging device 12, each pixel of the output captured image is shifted in the x-axis and y-axis directions from the position of the sensor element at which the pixel value was observed, and the amount of shift differs depending on the position in the image.
 As a result, the relationship between the position coordinates of a feature point in the image after lens distortion correction and its actual capture time also varies with the position on the image plane. For this reason, the correction unit 62 may perform the correction taking into account the lens distortion correction in the imaging device 12. Specifically, the parameter m(x, y) corresponding to y_n/V, the component of the above delay time R_n = d·y_n/V that represents the ratio of the delay to one frame's exposure period, is calculated in advance for a plurality of positions (x, y) on the image plane.
 Specifically, as shown below, the inverse correction M of the lens distortion correction is applied to each position (x, y) to obtain the pre-correction coordinates (x_m, y_m), and then the ratio of the y-coordinate y_m to V is calculated.

 (x_m, y_m) = (x, y)·M
 m(x, y) = y_m/V
 When a correction map in which correction amounts are set for an orthogonal mesh on the image plane is used for lens distortion correction, a delay map that represents the parameter m(x, y) on a two-dimensional plane can easily be generated for the same mesh. Based on the position coordinates (x_n, y_n) of a feature point extracted from frame f_n, the correction unit 62 refers to the delay map stored in the correction data storage unit 64 and, interpolating as necessary, acquires the parameter m(x_n, y_n) indicating the delay ratio of the time at which that feature point was observed.
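 A minimal sketch of how such a delay map could be generated and sampled with bilinear interpolation follows. The mesh step and the inverse-distortion callable are assumptions for illustration, not the embodiment's actual map.

```python
import numpy as np

def build_delay_map(W, H, inverse_distortion, step=32):
    """Precompute m(x, y) = y_m / V on a coarse mesh, where (x_m, y_m) is
    the sensor position obtained by undoing lens distortion correction."""
    xs = np.arange(0, W + 1, step)
    ys = np.arange(0, H + 1, step)
    m = np.empty((len(ys), len(xs)))
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):
            _, y_m = inverse_distortion(x, y)   # assumed callable
            m[j, i] = y_m / H
    return xs, ys, m

def lookup_delay(xs, ys, m, x, y):
    """Bilinearly interpolate m(x, y) from the mesh."""
    i = max(min(int(np.searchsorted(xs, x)) - 1, len(xs) - 2), 0)
    j = max(min(int(np.searchsorted(ys, y)) - 1, len(ys) - 2), 0)
    tx = (x - xs[i]) / (xs[i + 1] - xs[i])
    ty = (y - ys[j]) / (ys[j + 1] - ys[j])
    top = (1 - tx) * m[j, i] + tx * m[j, i + 1]
    bot = (1 - tx) * m[j + 1, i] + tx * m[j + 1, i + 1]
    return (1 - ty) * top + ty * bot
```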
 Then the delay time R_n of the observation time of that feature point from the reference time d·n of frame f_n is obtained as follows.

 R_n = d·m(x_n, y_n)

 In this case, with L_n = 1/{1 - m(x_{n-1}, y_{n-1}) + m(x_n, y_n)}, the corrected position coordinates (xc_n, yc_n) are obtained as follows, in place of Equation 2.
 (xc_n, yc_n) = (x_{n-1}, y_{n-1}) + L_n · {1 - m(x_{n-1}, y_{n-1})} · {(x_n, y_n) - (x_{n-1}, y_{n-1})}    ... (Equation 3)
 FIG. 9 illustrates how the correction unit 62 obtains a corrected optical flow. The format of the figure and the movement of the feature points are the same as in FIG. 8. An optical flow indicating the movement vectors of objects or feature points is, in general, important information for detecting and tracking moving objects and for identifying shape changes. In FIG. 9 the vector 130 representing the optical flow is drawn in the two-dimensional space formed by the y-axis and the time axis, but in reality it represents the amount of movement per unit time on the image plane formed by the x-axis and the y-axis.
 The correction unit 62 obtains the optical flow accurately using the position coordinates (xc_n, yc_n) of the feature points corrected by Equation 2 or Equation 3. Specifically, the optical flow of a feature point in frame f_n, that is, its vector (Vx_n, Vy_n) on the image plane, is obtained as follows.
 (Vx_n, Vy_n) = {(xc_n, yc_n) - (xc_{n-1}, yc_{n-1})} / d    ... (Equation 4)
 Equation 4 is a method that first corrects the position coordinates (x_n, y_n) of a feature point by Equation 2 or Equation 3 and then obtains the vector; however, the processing may be simplified by approximating it with the vector 132 obtained directly from the position coordinates of the same feature point in the preceding and current frames, as follows.
 (Vx_n, Vy_n) ≈ {(x_n, y_n) - (x_{n-1}, y_{n-1})} / [d · {1 - m(x_{n-1}, y_{n-1}) + m(x_n, y_n)}]    ... (Equation 5)
 In Equation 5 the parameter m is used for the ratio of the observation-time delay, but a linear expression such as y_n/V may be substituted for it. The optical flow may also be obtained by the analysis processing unit 66 as needed in the course of image analysis. In the examples described so far, the exposure time of each row was taken to be equal to the reference time interval d, that is, to the frame capture period; strictly speaking, however, the exposure time can vary with the shooting environment and so on, with the capture period as its upper limit.
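 The following hypothetical sketch contrasts the exact flow of Equation 4 with the approximation of Equation 5; here `m` stands for the delay-ratio lookup described above and is an assumed callable, not part of the embodiment.

```python
def optical_flow_exact(pc_prev, pc_curr, d):
    """Equation 4: flow from positions already corrected to reference
    times, which are exactly one frame period d apart."""
    return ((pc_curr[0] - pc_prev[0]) / d,
            (pc_curr[1] - pc_prev[1]) / d)

def optical_flow_approx(p_prev, p_curr, d, m):
    """Equation 5: flow taken directly from the raw observations, rescaled
    because the two observations are d*(1 - m_prev + m_curr) apart."""
    L = 1.0 / (1.0 - m(*p_prev) + m(*p_curr))
    return (L * (p_curr[0] - p_prev[0]) / d,
            L * (p_curr[1] - p_prev[1]) / d)
```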
 FIG. 10 illustrates the correction processing when the pure exposure time is taken into account. The format of the figure is the same as in FIG. 8, but the exposure time e_n of each row is shorter than the reference time interval d. In this case the exposure progresses down the image earlier than in the examples described so far, so the delay time R_n for the same observed position becomes shorter. Specifically:

 R_n = d·{m(x_n, y_n) - sh_n}
 Here sh_n is a shutter correction value that takes into account the exposure time e_n in frame f_n, and is defined as follows.

 sh_n = (1 - e_n/d)/2

 Substituting this delay time R_n into Equation 3, the corrected position coordinates are obtained as follows.
 (xc_n, yc_n) = (x_{n-1}, y_{n-1}) + L_n · {1 - m(x_{n-1}, y_{n-1}) + sh_{n-1}} · {(x_n, y_n) - (x_{n-1}, y_{n-1})},
 where L_n = 1 / {1 - m(x_{n-1}, y_{n-1}) + sh_{n-1} + m(x_n, y_n) - sh_n}    ... (Equation 6)
 However, if the difference between the exposure times e_{n-1} and e_n of consecutive frames, and hence between the shutter correction values sh_{n-1} and sh_n, is negligible, the following approximation may be used.
 (xc_n, yc_n) ≈ (x_{n-1}, y_{n-1}) + L_n · {1 - m(x_{n-1}, y_{n-1}) + sh_n} · {(x_n, y_n) - (x_{n-1}, y_{n-1})},
 with L_n = 1 / {1 - m(x_{n-1}, y_{n-1}) + m(x_n, y_n)} as in Equation 3    ... (Equation 7)
 Furthermore, if the change in the delay-time ratio between consecutive frames is negligible, the approximation

 L_n ≈ 1

 may also be used. These are reasonable approximations as long as no situation arises in which the exposure time changes drastically or the optical flow is conspicuously large. For the same reason, if an average value of the expected shutter correction values is set as a constant, the correction processing can be performed easily, even when the pure exposure time is taken into account, using the position coordinates of a feature point on the captured image and the parameter m(x_n, y_n) indicating the delay-time ratio determined by them.
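 As a sketch of this exposure-time adjustment, the functions below fold the shutter correction value into the delay time of a feature point; the frame period and exposure values are hypothetical.

```python
def shutter_correction(e, d):
    """sh_n = (1 - e_n/d) / 2 for row exposure time e and frame period d."""
    return (1.0 - e / d) / 2.0

def delay_time(d, m_xy, e):
    """R_n = d * (m(x, y) - sh_n): observation delay from the reference
    time when the exposure time e is shorter than the frame period d."""
    return d * (m_xy - shutter_correction(e, d))

d = 1 / 60                                # assumed 60 Hz frame period
print(delay_time(d, m_xy=0.5, e=d / 4))   # mid-frame point, 1/4-period exposure
```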
 FIG. 11 schematically shows the temporal relationships among the captured image, the image used for image analysis, and the display image in the present embodiment. The left side of the figure shows images captured by the imaging device 12, and the right side shows images displayed by the display device 16, for the three frames f_{n-1}, f_n, and f_{n+1}. When a rolling shutter is used in the imaging device 12, if the row data observed and read out at each time is arranged along the time axis, taking the vertical direction of the figure as the time axis, it corresponds to the image plane as illustrated.
 In the figure, as a simple example, a black ball 200 is being captured as it moves downward. To make the position of the ball 200 clear, a stationary object 202 is placed to its right. As described above, the times at which the ball 200 is observed in the respective frames are delayed by R_{n-1}, R_n, and R_{n+1} from the reference times d·(n-1), d·n, and d·(n+1), which are the exposure times of the topmost rows of the frames. The information processing apparatus 10 acquires such captured image data from the imaging device 12 in stream format, in order from the top row.
 In the simple mode of displaying the captured image as-is, the information processing apparatus 10 sequentially outputs the acquired stream to the display device 16. If the display device 16 is a display that works from a line buffer, the data is displayed immediately, from the top of the screen downward, in the order in which it is output. In principle, this unifies the delay time from capture to display across all rows to the time ΔT required for transfer. As a result, the newest data possible is displayed.
 Even when the information processing apparatus 10 performs some processing or image generation, a newer image can be displayed by proceeding with the processing from the top of the image plane and outputting immediately. When the information processing apparatus 10 performs image analysis, on the other hand, correcting the feature point positions and optical flows as preprocessing makes it possible to use, as they are, general algorithms that treat one frame as data of a single time. That is, images 204a and 204b matched to each reference time are generated in a pseudo manner.
 In the illustrated images 204a and 204b, the images of the ball 200 observed before and after the reference times d·n and d·(n+1) are shown by dotted lines, and the corrected ball positions obtained by interpolating them in time are shown shaded. In practice, however, the correction targets are limited to the feature points necessary for the image analysis, thereby suppressing delays caused by the correction processing. The results of accurate analysis using the feature point positions at a unified time in this way may be reflected in subsequent display images, or may be used in data requests to the imaging device 12.
 For example, by accurately acquiring the position and motion of a predetermined object, the region in which the object will appear may be predicted and reported to the imaging device 12, as sketched below. If, in response, the imaging device 12 transmits high-resolution image data for the reported region and low-resolution image data for the other regions, and these are composited for display or used for further analysis, the overall amount of data transferred can be reduced. Alternatively, the motion of the field of view can be determined accurately by SLAM or the like, realizing VR or AR free of any sense of incongruity.
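 As one hypothetical illustration of such a region request, the helper below extrapolates a tracked point one frame period ahead and builds a rectangular region of interest around it; the margin, frame size, and notification protocol are assumptions, not part of the embodiment.

```python
def predict_roi(pos, flow, d, margin=32, size=(1920, 1080)):
    """Extrapolate a tracked point one frame period d ahead using its
    optical flow, and build a clamped square region of interest around it."""
    x = min(max(pos[0] + flow[0] * d, 0), size[0] - 1)
    y = min(max(pos[1] + flow[1] * d, 0), size[1] - 1)
    left = max(int(x) - margin, 0)
    top = max(int(y) - margin, 0)
    return left, top, 2 * margin, 2 * margin
```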
 In the present embodiment, the feature points used for image analysis are the correction targets, and the image itself is basically output still containing the time differences. That is, the handling of the captured image is independent between the display processing and the analysis processing. By exploiting this property, depending on the type and purpose of the image analysis, the frequency of feature point correction and of the analysis processing that uses it can be kept lower than the frame rate of capture and display, reducing the overall processing load.
 According to the present embodiment described above, a rolling shutter camera is used as the imaging device, shortening the time required from actual observation to output, and when the captured images are analyzed, a correction is applied that eliminates the observation time differences arising within a frame. This enables accurate image analysis while using conventional algorithms as they are.
 Furthermore, by limiting the correction to the features used in image analysis and by calculating the parameters used for the correction in advance, the correction processing is made efficient and its impact on timing is reduced. By preparing the calculated parameters as a two-dimensional map associated with discrete positions on the image plane, random access becomes possible, and easy yet rigorous correction can be realized even for images that have undergone device-specific operations such as lens distortion correction.
 In addition, by constructing the system as a combination of a rolling-shutter imaging device and a display device structured to display input pixel rows immediately, the stagnation of data along the path from capture to display is minimized, so that the most recent image can always be displayed. By introducing the image analysis method described above into such a system, the immediacy of information processing and display using captured images can be reconciled with processing accuracy.
 The present invention has been described above based on an embodiment. Those skilled in the art will understand that the above embodiment is illustrative, that various modifications are possible in the combinations of its constituent elements and processing processes, and that such modifications are also within the scope of the present invention.
 1 information processing system, 10 information processing apparatus, 12 imaging device, 16 display device, 23 CPU, 24 GPU, 26 main memory, 32 communication unit, 34 storage unit, 36 output unit, 38 input unit, 40 recording medium driving unit, 52 captured image acquisition unit, 54 image analysis unit, 56 output data generation unit, 60 feature extraction unit, 62 correction unit, 64 correction data storage unit, 66 analysis processing unit.
 As described above, the present invention is applicable to information processing apparatuses such as game devices and personal computers, head-mounted displays, imaging devices, and information processing systems including them.

Claims (12)

  1.  An information processing apparatus comprising:
      a captured image acquisition unit that acquires data of a captured moving image from an imaging device equipped with a rolling shutter that captures an image with a time shift for each row of pixels;
      a correction unit that corrects position coordinates of a feature point in a frame of the moving image to position coordinates at a reference time of the frame; and
      an analysis processing unit that performs image analysis using the corrected position coordinates and reflects the results in output data.
  2.  The information processing apparatus according to claim 1, wherein the correction unit corrects the position coordinates by identifying the shift time, from the reference time, of the time at which the feature point to be corrected was observed, based on the vertical progress speed of capture when one frame is captured in the imaging device and on the position coordinates of that feature point.
  3.  The information processing apparatus according to claim 2, further comprising a correction data storage unit that stores a two-dimensional map in which a predetermined parameter used for identifying the shift time is associated with discrete positions on the image plane, wherein the correction unit identifies the shift time by referring to the two-dimensional map and acquiring the parameter corresponding to the position coordinates of the feature point to be corrected.
  4.  The information processing apparatus according to claim 3, wherein the predetermined parameter includes a component for a correction that undoes the lens distortion correction performed in the imaging device.
  5.  The information processing apparatus according to any one of claims 2 to 4, wherein the correction unit corrects the position coordinates by interpolating, based on the shift time, between the position coordinates of the feature point in a frame preceding the frame to be corrected and its position coordinates in the frame to be corrected.
  6.  The information processing apparatus according to any one of claims 2 to 5, wherein the correction unit adjusts the shift time in accordance with changes in the exposure time of each frame.
  7.  The information processing apparatus according to any one of claims 1 to 6, wherein the captured image acquisition unit acquires the data of the moving image sequentially from the rows whose capture has completed in the imaging device, and
      the information processing apparatus further comprises a data output unit that outputs the data of the moving image to a display device sequentially from the rows acquired by the captured image acquisition unit.
  8.  The information processing apparatus according to claim 7, wherein the data output unit outputs to the display device data obtained by processing the moving image based on the analysis results of the analysis processing unit.
  9.  A head-mounted display comprising:
      an imaging unit that includes a rolling shutter that captures an image with a time shift for each row of pixels, and that outputs captured image data sequentially from the rows whose capture has completed; and
      a display unit that acquires the captured image data for each row of pixels and displays it sequentially from the rows whose acquisition has completed.
  10.  An information processing system comprising:
      an imaging unit that includes a rolling shutter that captures an image with a time shift for each row of pixels, and that outputs captured image data sequentially from the rows whose capture has completed;
      a correction unit that acquires the data of the captured moving image from the imaging unit and corrects position coordinates of a feature point in a frame of the moving image to position coordinates at a reference time of the frame;
      an analysis processing unit that performs image analysis using the corrected position coordinates;
      an output data generation unit that generates display image data using the results of the image analysis and outputs it row by row; and
      a display unit that displays the display image sequentially from the output rows.
  11.  An information processing method performed by an information processing apparatus, comprising the steps of:
      acquiring data of a captured moving image from an imaging device equipped with a rolling shutter that captures an image with a time shift for each row of pixels;
      correcting position coordinates of a feature point in a frame of the moving image to position coordinates at a reference time of the frame;
      performing image analysis using the corrected position coordinates; and
      outputting data reflecting the analysis results.
  12.  A computer program for causing a computer to realize:
      a function of acquiring data of a captured moving image from an imaging device equipped with a rolling shutter that captures an image with a time shift for each row of pixels;
      a function of correcting position coordinates of a feature point in a frame of the moving image to position coordinates at a reference time of the frame;
      a function of performing image analysis using the corrected position coordinates; and
      a function of outputting data reflecting the analysis results.
PCT/JP2017/038524 2016-11-01 2017-10-25 Information processing device, head-mounted display, information processing system, and information processing method WO2018084051A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-214654 2016-11-01
JP2016214654A JP6645949B2 (en) 2016-11-01 2016-11-01 Information processing apparatus, information processing system, and information processing method

Publications (1)

Publication Number Publication Date
WO2018084051A1 true WO2018084051A1 (en) 2018-05-11

Family

ID=62076171

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/038524 WO2018084051A1 (en) 2016-11-01 2017-10-25 Information processing device, head-mounted display, information processing system, and information processing method

Country Status (2)

Country Link
JP (1) JP6645949B2 (en)
WO (1) WO2018084051A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012160887A (en) * 2011-01-31 2012-08-23 Toshiba Alpine Automotive Technology Corp Imaging device and motion vector detection method
JP2014511606A (en) * 2011-02-25 2014-05-15 フオトニス・ネザーランズ・ベー・フエー Real-time image acquisition and display
JP2014115824A (en) * 2012-12-10 2014-06-26 Canon Inc Image processing system, image processing method, and program

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5240328B2 (en) * 2011-08-08 2013-07-17 カシオ計算機株式会社 Imaging apparatus and program
JP2015185936A (en) * 2014-03-20 2015-10-22 カシオ計算機株式会社 Imaging controller, imaging control method and program
JP6477193B2 (en) * 2015-04-20 2019-03-06 株式会社ソシオネクスト Image processing apparatus and image processing method
JP6646361B2 (en) * 2015-04-27 2020-02-14 ソニーセミコンダクタソリューションズ株式会社 Image processing apparatus, imaging apparatus, image processing method, and program


Also Published As

Publication number Publication date
JP6645949B2 (en) 2020-02-14
JP2018074486A (en) 2018-05-10

Similar Documents

Publication Publication Date Title
WO2015081870A1 (en) Image processing method, device and terminal
US11024082B2 (en) Pass-through display of captured imagery
KR20190046845A (en) Information processing apparatus and method, and program
WO2019171522A1 (en) Electronic device, head mounted display, gaze point detector, and pixel data readout method
JP2019030007A (en) Electronic device for acquiring video image by using plurality of cameras and video processing method using the same
US10678325B2 (en) Apparatus, system, and method for accelerating positional tracking of head-mounted displays
JP2014150443A (en) Imaging device, control method thereof, and program
US10349040B2 (en) Storing data retrieved from different sensors for generating a 3-D image
JP2012222743A (en) Imaging apparatus
JP7150134B2 (en) Head-mounted display and image display method
US20230236425A1 (en) Image processing method, image processing apparatus, and head-mounted display
WO2021261248A1 (en) Image processing device, image display system, method, and program
TW201824178A (en) Image processing method for immediately producing panoramic images
US11128814B2 (en) Image processing apparatus, image capturing apparatus, video reproducing system, method and program
JP7142762B2 (en) Display device and image display method
JP6768933B2 (en) Information processing equipment, information processing system, and image processing method
US20210397005A1 (en) Image processing apparatus, head-mounted display, and image displaying method
WO2018084051A1 (en) Information processing device, head-mounted display, information processing system, and information processing method
JP7429515B2 (en) Image processing device, head-mounted display, and image display method
US11106042B2 (en) Image processing apparatus, head-mounted display, and image displaying method
JP2020167659A (en) Image processing apparatus, head-mounted display, and image display method
JP6930011B2 (en) Information processing equipment, information processing system, and image processing method
JP6439412B2 (en) Image processing apparatus and image processing method
JP2023105524A (en) Display control device, head-mounted display, and display control method
JP2020167658A (en) Image creation device, head-mounted display, content processing system, and image display method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17867295

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17867295

Country of ref document: EP

Kind code of ref document: A1