WO2023281593A1 - Information processing device, control method, and storage medium - Google Patents

Information processing device, control method, and storage medium

Info

Publication number
WO2023281593A1
Authority
WO
WIPO (PCT)
Prior art keywords
field
candidate
image
coordinate system
feature point
Prior art date
Application number
PCT/JP2021/025338
Other languages
English (en)
Japanese (ja)
Inventor
亮介 坂井
康敬 馬場崎
Original Assignee
日本電気株式会社
Priority date
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to PCT/JP2021/025338 (WO2023281593A1)
Priority to JP2023532892A (JPWO2023281593A5)
Publication of WO2023281593A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 - Manipulating 3D models or images for computer graphics
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G06T7/73 - Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • the present invention relates to the technical field of an information processing device, a control method, and a storage medium that perform processing related to spatial comprehension in Augmented Reality (AR).
  • Patent Literature 1 describes a technique of estimating the position of each feature point in a field and performing AR calibration based on the estimation results in order to realize AR in watching sports.
  • one of the main purposes of the present disclosure is to provide an information processing device, a control method, and a storage medium that can suitably suppress errors in estimating a target field.
  • One aspect of the information processing device includes: feature point information acquisition means for acquiring feature point information regarding a plurality of feature points of a target field, determined based on an image including at least part of the target field; candidate field generation means for generating, based on the feature point information, a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image; and consistency determination means for determining a non-matching candidate field, which is a candidate field having no consistency, based on a plurality of candidate fields corresponding to a plurality of pieces of feature point information of a plurality of images.
  • One aspect of the control method is a control method executed by a computer, including: obtaining feature point information for a plurality of feature points of a target field, determined based on an image including at least a portion of the target field; generating, based on the feature point information, a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image; and determining a non-matching candidate field, which is a candidate field that does not have consistency, based on a plurality of candidate fields corresponding to a plurality of pieces of feature point information of a plurality of images.
  • One aspect of the storage medium is a storage medium storing a program for causing a computer to execute processing of: obtaining feature point information for a plurality of feature points of a target field, determined based on an image including at least a portion of the target field; generating, based on the feature point information, a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image; and determining a non-matching candidate field, which is a candidate field having no consistency, based on a plurality of candidate fields corresponding to a plurality of pieces of feature point information of a plurality of images.
  • FIG. 1 is a schematic configuration diagram of a display device according to a first embodiment.
  • FIG. 2 shows an example of the data structure of structure data.
  • FIG. 3 is a block diagram showing a functional configuration of a control unit.
  • FIG. 4(A) is a first label definition example of structural feature points when the target field is a tennis court, and FIG. 4(B) is a second label definition example of structural feature points when the target field is a tennis court.
  • FIG. 5(A) is a first label definition example of structural feature points when the target field is a pool, and FIG. 5(B) is a second label definition example of structural feature points when the target field is a pool.
  • FIG. 6(A) shows a captured image of a part of the target field, and FIG. 6(B) shows a converted captured image generated by inverting the captured image.
  • FIG. 7 is a functional block diagram of a field estimation unit showing an outline of its processing.
  • FIG. 8 is a diagram showing the relationship between a device coordinate system and a field coordinate system.
  • FIG. 9 is a diagram representing candidate fields based on feature point information of one image in the device coordinate system.
  • FIG. 10 shows the result of clustering six generated candidate fields in the device coordinate system.
  • FIG. 11(A) is a diagram in which a candidate field based on feature point information is superimposed on a captured image on which feature extraction processing has been appropriately performed, and FIG. 11(B) is a diagram in which a candidate field based on feature point information is superimposed on a captured image in which an error has occurred in the feature extraction processing.
  • FIG. 12 is a diagram showing an overview of the process of determining an estimated field from matching candidate fields.
  • FIG. 13 is an example of a flowchart showing an outline of processing related to display of a virtual object executed by a control unit in the first embodiment.
  • FIG. 14 is an example of a flowchart showing a detailed processing procedure of calibration processing.
  • FIG. 15 is a block configuration diagram of a field estimation unit in a modified example.
  • FIG. 16 shows the configuration of a display system according to a second embodiment.
  • FIG. 17 is a block diagram of a server device in the second embodiment.
  • FIG. 18 is an example of a flowchart showing a processing procedure executed by a control unit of the server device in the second embodiment.
  • FIG. 19 shows a schematic configuration of an information processing device according to a third embodiment.
  • FIG. 20 is an example of a flowchart in the third embodiment.
  • FIG. 1 is a schematic configuration diagram of a display device 1 according to the first embodiment.
  • the display device 1 is a device that can be worn by a user, and is, for example, a see-through type configured in the form of spectacles, and is configured to be wearable on the user's head.
  • the display device 1 realizes Augmented Reality (AR) by superimposing and displaying visual information on actual scenery when watching a sports game or a play (including a concert).
  • the above visual information is a two-dimensional or three-dimensional virtual object, hereinafter also referred to as a "virtual object".
  • the display device 1 may display the virtual object only for one eye of the user, or may display the virtual object for both eyes.
  • A field or structure where sports, plays, etc. are held is hereafter also referred to as the "target field".
  • a virtual object that serves as additional information for assistance is superimposed on or around the target field.
  • The target field is, for example, a field targeted for watching sports (for example, a tennis court, a swimming pool, a stadium, etc.) or a field targeted for watching a play (for example, a theater, a concert hall, a multipurpose hall, various stages, etc.).
  • the target field has a plurality of structural (that is, characteristic in shape) feature points (also called “structural feature points").
  • the target field functions as a reference in calibrating the display device 1 .
  • Virtual objects include, for example, a score board displayed above the tennis court in the case of tennis, a world record line superimposed in real time on the swimming pool in the case of swimming, and a virtual performer displayed superimposed on the stage in the case of a theater.
  • the display device 1 includes a light source unit 10, an optical element 11, a communication section 12, an input section 13, a storage section 14, a camera 15, a position/orientation detection sensor 16, and a control section 17.
  • the light source unit 10 has a light source such as a laser light source or an LCD (Liquid Crystal Display) light source, and emits light based on a drive signal supplied from the control section 17 .
  • The optical element 11 has a predetermined transmittance, transmits at least part of external light so that it enters the user's eyeball, and reflects at least part of the light from the light source unit 10 toward the user's eyeball. As a result, the virtual image corresponding to the virtual object formed by the display device 1 is superimposed on the scenery and viewed by the user.
  • the optical element 11 may be a half mirror whose transmittance and reflectance are substantially equal, or may be a mirror whose transmittance and reflectance are not equal (a so-called beam splitter).
  • The communication unit 12 exchanges data with external devices. For example, when the user uses the display device 1 for watching sports or a play, the communication unit 12 receives information about the virtual objects to be displayed by the display device 1 from a server device managed by the promoter, based on the control of the control unit 17.
  • the input unit 13 generates an input signal based on the user's operation and transmits it to the control unit 17 .
  • the input unit 13 is, for example, a button, a cross key, a voice input device, or the like for the user to give an instruction to the display device 1 .
  • Under the control of the control unit 17, the camera 15 generates an image of the scene in front of the display device 1 and supplies the generated image (also referred to as the "captured image Im") to the control unit 17.
  • The position/orientation detection sensor 16 is a sensor (or sensor group) that detects the position and orientation of the display device 1, and includes, for example, a positioning sensor such as a GPS (Global Positioning System) receiver, and an orientation detection sensor, such as a gyro sensor, an acceleration sensor, or an IMU (Inertial Measurement Unit), that detects changes in the relative orientation of the display device 1.
  • the position/orientation detection sensor 16 supplies the generated detection signal regarding the position and orientation of the display device 1 to the control unit 17 .
  • the control unit 17 detects the amount of change in the position and orientation from when the display device 1 is started, based on the detection signal supplied from the position and orientation detection sensor 16 .
  • Instead of detecting the position of the display device 1 with the positioning sensor, the control unit 17 may specify the position of the display device 1 based on signals received from, for example, a beacon terminal or a wireless LAN device provided at the venue. In another example, the control unit 17 may identify the position of the display device 1 based on a known position estimation technique using AR markers. In these cases, the position/orientation detection sensor 16 need not include a positioning sensor.
  • The control unit 17 has a processor such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit) and a volatile memory that functions as a working memory of the processor, and performs overall control of the display device 1.
  • At the display timing of the virtual object, the control unit 17 performs calibration processing for associating the real-world space with the space recognized by the display device 1, based on the structural feature points of the target field recognized from the captured image Im.
  • In the calibration processing, the control unit 17 generates information for converting the coordinate system of the three-dimensional space based on the display device 1 (also referred to as the "device coordinate system") into the coordinate system of the three-dimensional space based on the target field (also referred to as the "field coordinate system").
  • the device coordinate system is an example of the "first coordinate system”
  • the field coordinate system is an example of the "second coordinate system”. Details of the calibration process will be described later.
  • Then, the control unit 17 generates a drive signal for driving the light source unit 10 based on the coordinate transformation information and the like, and supplies the drive signal to the light source unit 10 so that the light for displaying the virtual object (also referred to as "display light") is emitted from the light source unit 10 to the optical element 11.
  • the storage unit 14 is a non-volatile memory that stores various information necessary for the control unit 17 to control the display device 1 .
  • the storage unit 14 may include a removable storage medium such as flash memory.
  • the storage unit 14 also stores a program executed by the control unit 17 .
  • the storage unit 14 also has a sensor data storage unit 20, a parameter storage unit 21, and a structure data storage unit 22.
  • The sensor data storage unit 20 stores the captured image Im generated by the camera 15 together with the amount of change in the position and orientation of the display device 1 (also referred to as the "position/orientation change amount Ap") from the time the device coordinate system was set (for example, when the display device 1 was activated) to the time the captured image Im was generated.
  • For example, the control unit 17 constantly calculates the amount of change of the current position and orientation relative to the position and orientation at the time the device coordinate system was set, based on the detection signal of the position/orientation detection sensor 16. Then, when storing the captured image Im generated by the camera 15 in the sensor data storage unit 20, the control unit 17 associates the position/orientation change amount Ap calculated at the time the captured image Im was generated with the captured image Im and stores them in the sensor data storage unit 20.
  • control unit 17 causes the sensor data storage unit 20 to store a combination of the most recent captured image Im and the position/orientation change amount Ap for a predetermined time period or a predetermined number. Information stored in the sensor data storage unit 20 is used in calibration processing.
  • The parameter storage unit 21 stores parameters of an inference device (also referred to as the "feature extractor") used in the calibration processing to extract, from the captured image Im, the position information of the structural feature points of the target field and the classification information of those structural feature points. For example, the feature extractor is a learning model trained so that, when the captured image Im is input, it outputs positional information of the structural feature points in the image for each classification (that is, label) of the structural feature points to be extracted.
  • The positional information described above may be map information on the image that indicates, for each coordinate value, the reliability of being a structural feature point, or may be coordinate values that indicate the positions of the structural feature points in the image in units of pixels or subpixels.
  • In the former case, for example, for each classification of structural feature points, the top m positions (m is an integer equal to or greater than 1) whose reliability exceeds a certain threshold are adopted as the positions of the structural feature points.
  • the integer m is "1" in the examples of FIGS. 4A and 5A described later, and is "2" in the examples of FIGS. 4B and 5B described later.
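  • As an illustrative sketch only (the array layout, function name, and values below are assumptions, not taken from the disclosure), the top-m selection from per-label reliability maps described above could be implemented as follows:

```python
import numpy as np

def select_feature_point_candidates(reliability_maps, threshold=0.5, m=1):
    """Pick up to m candidate positions per label from per-label reliability maps.

    reliability_maps: assumed array of shape (num_labels, H, W) with values in [0, 1].
    Returns: dict mapping label -> list of (x, y, reliability) tuples.
    """
    candidates = {}
    for label, rmap in enumerate(reliability_maps):
        # Sort all pixels by reliability (descending) and keep at most m of them.
        flat_idx = np.argsort(rmap, axis=None)[::-1][:m]
        points = []
        for idx in flat_idx:
            y, x = np.unravel_index(idx, rmap.shape)
            if rmap[y, x] > threshold:  # adopt only sufficiently reliable peaks
                points.append((int(x), int(y), float(rmap[y, x])))
        candidates[label] = points
    return candidates

# Example with random maps for a field with 12 labels (m = 2, cf. FIGS. 4(B)/5(B)).
maps = np.random.rand(12, 240, 320)
print(select_feature_point_candidates(maps, threshold=0.9, m=2)[0])
```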
  • the learning model used for learning the feature extractor may be a learning model based on a neural network, may be another type of learning model such as a support vector machine, or may be a combination thereof. .
  • When a neural network is used, the parameter storage unit 21 stores various parameters such as the layer structure, the neuron structure of each layer, the number and size of filters in each layer, and the weight of each element of each filter.
  • the parameter storage unit 21 stores parameters related to the camera 15 necessary for displaying the virtual object, such as the focal length of the camera 15, internal parameters, principal points, and size information of the captured image Im.
  • the structure data storage unit 22 stores structure data that is data relating to the structure of the target field.
  • FIG. 2 shows an example of the data structure of structure data.
  • the structure data has size information and registered feature point information.
  • the size information is information about the size of the target field.
  • the registered feature point information is information relating to the structural feature points of the target field, and includes individual information for each structural feature point measured in advance.
  • The registered feature point information includes information corresponding to each of the N structural feature points of the target field (first registered feature point information to N-th registered feature point information).
  • the registered feature point information includes at least a label indicating the classification of the target structural feature point and registered position information indicating the position of the target structural feature point in the field coordinate system.
  • the registered position information is coordinate information represented by the field coordinate system, and is set so that, for example, the position of any structural characteristic point becomes the origin.
  • the registered feature point information is used in calibration processing. Note that the length between any two structural feature points can be calculated based on the coordinate values indicated by the registered position information.
  • the structural data may include information specifying a structural characteristic point as the origin in the field coordinate system and information specifying each direction of each of the three axes of the field coordinate system.
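  • For illustration only (the concrete schema is not specified in the disclosure and the values below are hypothetical), structure data for a simple rectangular field could be held as follows, with registered positions expressed in the field coordinate system:

```python
from dataclasses import dataclass
import math

@dataclass
class RegisteredFeaturePoint:
    label: int       # classification of the structural feature point
    position: tuple  # (x, y, z) registered position in the field coordinate system [m]

# Hypothetical structure data for a 10 m x 20 m rectangular field whose corner
# with label 0 is taken as the origin of the field coordinate system.
structure_data = {
    "size": {"width_m": 10.0, "length_m": 20.0},
    "registered_feature_points": [
        RegisteredFeaturePoint(label=0, position=(0.0, 0.0, 0.0)),
        RegisteredFeaturePoint(label=1, position=(10.0, 0.0, 0.0)),
        RegisteredFeaturePoint(label=2, position=(0.0, 0.0, 20.0)),
        RegisteredFeaturePoint(label=3, position=(10.0, 0.0, 20.0)),
        # ... one entry per structural feature point (N entries in total)
    ],
}

# The length between any two structural feature points can be calculated from
# the registered position information, e.g. between labels 0 and 1:
p0 = structure_data["registered_feature_points"][0].position
p1 = structure_data["registered_feature_points"][1].position
print(math.dist(p0, p1))  # -> 10.0
```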
  • the configuration of the display device 1 shown in FIG. 1 is an example, and various modifications may be made to this configuration.
  • the display device 1 may further include a speaker that outputs audio under the control of the controller 17 .
  • the display device 1 may also include a line-of-sight detection camera for changing whether or not a virtual object is displayed and the display position of the virtual object according to the line-of-sight position of the user.
  • the storage unit 14 does not have to have the sensor data storage unit 20 .
  • In this case, the control unit 17 performs the calibration processing using the captured image Im obtained directly from the camera 15 and the position/orientation change amount Ap calculated based on the detection signal of the position/orientation detection sensor 16.
  • In yet another example, the display device 1 need not detect its own position with the position/orientation detection sensor 16 or the like. In this case, the position/orientation detection sensor 16 is configured by a sensor that detects the orientation of the display device 1, and the control unit 17 may calculate only the amount of change in the orientation of the display device 1 since the device coordinate system was set as the position/orientation change amount Ap.
  • FIG. 3 is a block diagram showing the functional configuration of the control section 17.
  • The control unit 17 functionally includes an image conversion unit 40, a virtual object acquisition unit 41, a feature extraction unit 42, a coordinate transformation information generation unit 43, a reflection unit 44, and a light source control unit 45.
  • the blocks that exchange data are connected by solid lines, but the combinations of blocks that exchange data are not limited to those shown in FIG. The same applies to other functional block diagrams to be described later.
  • the image conversion unit 40 generates an image (also referred to as "converted captured image Ima") by performing predetermined image conversion on the captured image Im acquired by the feature extraction unit 42 from the sensor data storage unit 20.
  • the image conversion unit 40 may generate the converted captured image Ima by any data augmentation method.
  • For example, the image conversion unit 40 may generate an image obtained by inverting the captured image Im as the converted captured image Ima, or may generate an image obtained by cropping (cutting out part of) the captured image Im as the converted captured image Ima. In the latter case, the image conversion unit 40 may generate a plurality of cropped images as converted captured images Ima by randomly or regularly changing the size and location of the cropping.
  • the image conversion unit 40 generates one or more converted captured images Ima for one captured image Im.
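  • A minimal data-augmentation sketch in the spirit of the above (the exact transforms and parameters used by the image conversion unit 40 are implementation-dependent; the function below is an assumption):

```python
import numpy as np

def augment(captured_image, num_crops=2, rng=None):
    """Generate converted captured images (Ima) from one captured image (Im)."""
    rng = rng or np.random.default_rng()
    h, w = captured_image.shape[:2]
    converted = []

    # 1) Left-right inversion (cf. FIG. 6(B)).
    converted.append(captured_image[:, ::-1].copy())

    # 2) Random crops with varying size and location.
    for _ in range(num_crops):
        ch, cw = int(h * rng.uniform(0.6, 0.9)), int(w * rng.uniform(0.6, 0.9))
        top = rng.integers(0, h - ch + 1)
        left = rng.integers(0, w - cw + 1)
        converted.append(captured_image[top:top + ch, left:left + cw].copy())

    return converted

image_im = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder captured image
images_ima = augment(image_im)
print(len(images_ima))  # -> 3 converted captured images for one captured image
```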
  • the virtual object acquisition unit 41 acquires information (also referred to as "designated display information Id") that designates a virtual object that is to be superimposed and displayed as a virtual object on the scenery and its display position.
  • the virtual object may be information for drawing a two-dimensional object (two-dimensional drawing information) or information for drawing a three-dimensional object (three-dimensional drawing information).
  • For example, the virtual object acquisition unit 41 acquires, as the designated display information Id, distribution information delivered in a push or pull manner from the server device at a predetermined timing.
  • the specified display information Id includes information specifying the display position (for example, information indicating coordinate values in the field coordinate system) in addition to the virtual object.
  • information indicating combinations of virtual objects, display positions, and their display conditions may be stored in advance in the storage unit 14 .
  • When the virtual object acquisition unit 41 determines that a stored display condition is satisfied, it acquires the combination of the virtual object and the display position corresponding to the satisfied display condition as the designated display information Id.
  • the feature extraction unit 42 generates feature point information "IF" for each image from the captured image Im (for example, the latest captured image Im) obtained from the sensor data storage unit 20 and the converted captured image Ima.
  • the feature extraction unit 42 inputs the above-described image to a feature extractor configured based on the parameters extracted from the parameter storage unit 21, and extracts feature point information IF for the input image from the information output by the feature extractor.
  • For example, the feature extractor outputs the position (for example, a coordinate value) of a structural feature point in the input image for each label indicating the classification of the structural feature point, and the feature extraction unit 42 generates the feature point information IF from these outputs. When the feature extractor outputs, as the coordinate values of the structural feature points, values normalized so as not to depend on the image size, the feature extraction unit 42 calculates the feature point candidate positions by multiplying those coordinate values by the image size of the input image.
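  • As a small illustration (names and output convention assumed), converting normalized extractor outputs into pixel-unit feature point candidate positions could look like this:

```python
def to_feature_point_info(normalized_points, image_width, image_height):
    """Build feature point information IF from a normalized extractor output.

    normalized_points: dict mapping label -> (u, v) with u, v in [0, 1]
                       (an assumed output convention of the feature extractor).
    Returns: dict mapping label -> (x, y) in pixel units of the input image.
    """
    return {
        label: (u * image_width, v * image_height)
        for label, (u, v) in normalized_points.items()
    }

# Example: label 0 detected at the normalized position (0.25, 0.5) in a 640 x 480 image.
print(to_feature_point_info({0: (0.25, 0.5)}, 640, 480))  # {0: (160.0, 240.0)}
```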
  • The coordinate transformation information generation unit 43 generates coordinate conversion information "Ic" between the device coordinate system and the field coordinate system, based on the structure data extracted from the structure data storage unit 22, the feature point information IF, the position/orientation change amount Ap at the time of generation of the captured image Im on which the feature extraction was performed, the parameters of the camera 15, and the like.
  • the coordinate conversion information Ic is, for example, a combination of rotation matrix and translation vector generally used for coordinate conversion between three-dimensional spaces.
  • The coordinate conversion information Ic is not limited to information used when converting from the field coordinate system to the device coordinate system, and may be information used when converting from the device coordinate system to the field coordinate system.
  • This is because the rotation matrix and translation vector for transforming from the field coordinate system to the device coordinate system can be converted into the rotation matrix (the inverse of the above rotation matrix) and translation vector (the above translation vector with its sign inverted, expressed through the inverse rotation) for the reverse transformation.
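  • The relationship between the forward and inverse transforms can be written compactly; the sketch below (illustrative only, with arbitrary example values) converts a point between the field and device coordinate systems using a rotation matrix R and a translation vector t:

```python
import numpy as np

def field_to_device(p_field, R, t):
    """Transform a point from the field coordinate system to the device coordinate system."""
    return R @ p_field + t

def device_to_field(p_device, R, t):
    """Inverse transform: the rotation becomes R^T (the inverse of R) and the
    translation becomes -R^T t, i.e. the sign-inverted translation after the inverse rotation."""
    return R.T @ (p_device - t)

# Example: a 90-degree rotation about the vertical axis and a 1 m offset.
theta = np.pi / 2
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.0])

p_field = np.array([2.0, 0.0, 3.0])
p_device = field_to_device(p_field, R, t)
print(np.allclose(device_to_field(p_device, R, t), p_field))  # True
```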
  • the coordinate transformation information generation unit 43 also has a field estimation unit 46 .
  • the field estimation unit 46 estimates the target field (position of the target field) in the device coordinate system. The estimation method by the field estimation unit 46 will be described later.
  • The coordinate transformation information generation unit 43 generates the coordinate conversion information Ic based on the target field in the device coordinate system estimated by the field estimation unit 46 (also referred to as the "estimated field") and the target field in the field coordinate system indicated by the structure data stored in the structure data storage unit 22.
  • The reflection unit 44 generates a display signal "Sd" representing the virtual object to be projected onto the optical element 11, by reflecting the coordinate transformation information Ic supplied from the coordinate transformation information generation unit 43 in the designated display information Id supplied from the virtual object acquisition unit 41.
  • the reflecting unit 44 matches the device coordinate system with the field coordinate system by the coordinate conversion information Ic, and then generates the display signal Sd based on the designated display information Id.
  • Based on the display signal Sd supplied from the reflection unit 44, the light source control unit 45 generates a drive signal that instructs the drive timing and light amount for driving the light sources of the light source unit 10 (for example, the light sources corresponding to RGB), and supplies the generated drive signal to the light source unit 10.
  • Each process of the reflection unit 44 and the light source control unit 45 may be performed by any method that displays the virtual object superimposed at the desired position in the scenery. Documents disclosing such technology include Japanese Patent Laid-Open No. 2015-116336 and Japanese Patent Laid-Open No. 2016-525741. Based on these techniques, the display device 1 performs user line-of-sight detection and the like, and performs control so that the virtual object is appropriately viewed.
  • Each component may be configured by an FPGA (Field-Programmable Gate Array), an ASSP (Application Specific Standard Product), an ASIC (Application Specific Integrated Circuit), or a quantum processor (quantum computer control chip).
  • FIG. 4(A) shows a first label definition example of structural feature points when the target field is a tennis court, and FIG. 4(B) shows a second label definition example of structural feature points when the target field is a tennis court. In FIGS. 4(A) and 4(B), the positions of the structural feature points are marked with circles and corresponding label numbers.
  • In the first label definition example shown in FIG. 4(A), serial numbers from 0 to 13 are attached as labels to the 14 structural feature points in order from the corners. In the second label definition example shown in FIG. 4(B), the same label is attached to structural feature points at symmetric positions; for example, each of the labels 0 to 5 is attached to two structural feature points at symmetrical positions.
  • FIG. 5(A) shows a first label definition example of structural feature points when the target field is a pool, and FIG. 5(B) shows a second label definition example of structural feature points when the target field is a pool.
  • the positions of the structural feature points are marked with circles and corresponding label numbers.
  • floats of a predetermined color provided at predetermined intervals on ropes defining the course are selected as structural feature points.
  • In the first label definition example shown in FIG. 5(A), serial numbers from 0 to 24 are attached as labels to the 25 structural feature points in order from the corner. In the second label definition example shown in FIG. 5(B), the same label is attached to structural feature points at symmetric positions; for example, each of the labels 0 to 11 is attached to two structural feature points at symmetrical positions.
  • FIG. 6A shows a captured image Im obtained by photographing a part of the target field.
  • FIG. 6B shows a converted captured image Ima generated by inverting the captured image Im of FIG. 6A.
  • the target field is assumed to be a grid-like field, and each vertex of the grid is assumed to be a structural feature point.
  • labels "0" to "11" are assigned to the 12 structural feature points, and for convenience of explanation, the label numbers for the structural feature points are Illustrated.
  • the virtual lines representing the target field outside the imaging range of the captured image Im are indicated by dashed lines, and the labels for the structural feature points outside the imaging range are also clearly shown.
  • the structural feature points on the right side of the target field (the structural feature points labeled 9 to 11) do not appear on the captured image Im.
  • The converted captured image Ima shown in FIG. 6(B), which is obtained by left-right inversion of the captured image Im, is an image in which the structural feature points on the right side of the target field (here, the structural feature points labeled 9 to 11) that did not appear in the captured image Im appear as a result of the inversion. Note that the structural feature points on the left side (here, the structural feature points labeled 1 to 3) do not appear in the converted captured image Ima.
  • the image conversion unit 40 generates an inverted image of the captured image Im as the converted captured image Ima, thereby suitably generating an image representing the target field separately from the captured image Im. The same is true when a cropped image of the captured image Im is generated as the converted captured image Ima. In this manner, the image conversion unit 40 performs data augmentation so as to obtain a plurality of feature extraction results for the target field.
  • the field estimator 46 functionally includes a candidate field generator 51 , a consistency determiner 52 , and an estimated field determiner 53 .
  • Each component of the field estimation unit 46 uses, as necessary, the position/orientation change amount Ap stored in the sensor data storage unit 20 and the parameters of the camera 15 (internal parameters, including the size of the captured image Im).
  • the candidate field generation unit 51 executes processing for generating candidates (also called “candidate fields") representing the target field in the device coordinate system based on the feature point information IF for each image.
  • the candidate field generator 51 identifies, as candidate fields, for example, the estimated position of the structural feature point in the device coordinate system for each label and the estimated surface of the target field in the device coordinate system.
  • For example, when there are k (k is an integer equal to or greater than 2) captured images Im and converted captured images Ima, the candidate field generation unit 51 generates n (n is an integer equal to or less than k) candidate fields. Since a candidate field cannot be generated for a converted captured image Ima that does not include any part of the target field, the number of generated candidate fields may be less than the total number of captured images Im and converted captured images Ima.
  • In addition, when generating a candidate field from a certain image, if the value of the least-squares evaluation function is greater than a predetermined threshold, the candidate field generation unit 51 regards the reliability of the candidate field to be generated as low and does not generate a candidate field for that image. In this case as well, the number of generated candidate fields is smaller than the total number of captured images Im and converted captured images Ima. A specific example of the candidate field generation method will be described later.
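  • A simplified sketch of the candidate field generation loop described above; estimate_candidate_field is a hypothetical helper that fits the field model to one image's feature point information and returns the fit together with its least-squares residual, and the threshold value is a placeholder:

```python
def generate_candidate_fields(feature_point_infos, estimate_candidate_field,
                              residual_threshold=0.05):
    """Generate up to n candidate fields from k images (n <= k).

    feature_point_infos: list of per-image feature point information IF
                         (an entry may be empty when no target field is visible).
    estimate_candidate_field: hypothetical function returning
                              (candidate_field, residual) for one IF.
    """
    candidate_fields = []
    for info in feature_point_infos:
        if not info:                        # no target field in this image
            continue
        candidate, residual = estimate_candidate_field(info)
        if residual > residual_threshold:   # low-reliability fit: discard
            continue
        candidate_fields.append(candidate)
    return candidate_fields
```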
  • The consistency determination unit 52 determines candidate fields that are not consistent with the other candidate fields (also called "non-matching candidate fields"). Then, the consistency determination unit 52 supplies the candidate fields other than the non-matching candidate fields (also called "matching candidate fields") to the estimated field determination unit 53.
  • the estimated field determination unit 53 determines estimated fields based on matching candidate fields. In this case, the estimated field determining unit 53 generates an estimated field by, for example, integrating matching candidate fields. In other words, the estimated field determination unit 53 determines the estimated position of the structure feature point of each label of the target field (and the estimated plane of the target field) in the device coordinate system.
  • In addition, the estimated field determination unit 53 may determine whether or not to generate an estimated field based on the determination result of the consistency determination unit 52. In this case, the estimated field determination unit 53 sets a condition for determining an estimated field (also referred to as the "estimated field determination condition"), and generates an estimated field only when the estimated field determination condition is satisfied.
  • For example, the estimated field determination unit 53 sets the estimated field determination condition to a condition based on the ratio of non-matching candidate fields to all candidate fields (in other words, a condition based on the ratio of matching candidate fields). In this case, the estimated field determination unit 53 determines that the estimated field determination condition is satisfied when the ratio of non-matching candidate fields to all candidate fields is less than a predetermined ratio (for example, 30%), and generates an estimated field. On the other hand, when the ratio of non-matching candidate fields to all candidate fields is equal to or greater than the predetermined ratio, the estimated field determination unit 53 determines that the estimated field determination condition is not satisfied and does not generate an estimated field.
  • the field estimation unit 46 regenerates candidate fields based on the feature point information IF of the newly generated captured image Im and the converted captured image Ima generated from the captured image Im.
  • the predetermined ratio described above is set to, for example, a suitable value stored in advance in the storage unit 14 .
  • the estimated field determination unit 53 can preferably suppress generation of estimated fields based on candidate fields with low reliability.
  • the estimated field determination conditions are not limited to those described above.
  • For example, the estimated field determination unit 53 may determine whether or not to generate an estimated field based on the number of candidate fields in the first cluster, which is the cluster with the largest number of candidate fields obtained by clustering the n candidate fields, and on the ratio of the total number of candidate fields in the first cluster and the second cluster to the total number of candidate fields. A specific example of this will be described later with reference to FIG. 10.
  • Here, when the target field has regularly arranged structural feature points and only part of the target field appears in the image, a structural feature point may be erroneously extracted as an adjacent structural feature point, and as a result all the structural feature points may be extracted with a regular shift.
  • Such errors in the feature extraction results occur almost consistently (consistent errors) when only part of the target field is captured in the image, and cannot be dealt with by general error detection and correction methods for feature extraction results.
  • In this case, the feature extraction accuracy is lowered, and as a result the field estimation accuracy is also lowered.
  • In view of the above, the field estimation unit 46 determines non-matching candidate fields and matching candidate fields from the candidate fields based on the k images, including the images generated by data augmentation, and determines the estimated field from the matching candidate fields.
  • As a result, the field estimation unit 46 suitably mitigates the deterioration in estimation accuracy caused by blind spots that are not imaged when estimating a target field having regularly arranged structural feature points, and can accurately estimate the target field in the device coordinate system.
  • The feature extractor used in the feature extraction unit 42 is assumed to be an inference device mainly based on a neural network, and the feature extraction results it outputs may therefore differ even with minor changes in the input image. This variation is more likely to occur when accurate feature extraction is difficult. Therefore, the field estimation unit 46 can suitably generate a plurality of candidate fields based on the feature point information IF generated from the captured image Im and the converted captured images Ima.
  • FIG. 8 is a diagram showing the relationship between the device coordinate system and the field coordinate system.
  • the field coordinate system is a coordinate system adopted in structure data, and is a coordinate system based on the target field.
  • In the example of FIG. 8, the target field is a tennis court, and the field coordinate system has respective axes along the lateral direction and the longitudinal direction of the tennis court and along the vertical direction.
  • Hereafter, the three axes of the device coordinate system are denoted by x, y, and z, and the three axes of the field coordinate system are denoted by x', y', and z'.
  • the device coordinate system is a coordinate system set by the display device 1 at the time of activation or the like, and the shooting position (including the shooting direction) of the camera 15 in the device coordinate system is specified based on the position/orientation change amount Ap. Then, the candidate field generation unit 51 generates a candidate field representing the target field in the device coordinate system based on the structure data adopting the field coordinate system and the feature point information IF.
  • Note that the candidate field generation unit 51 may recognize the vertical direction in the device coordinate system based on the output signal of the acceleration sensor included in the position/orientation detection sensor 16, and may adjust the device coordinate system based on the recognized vertical direction so that the height-direction axes of the device coordinate system and the field coordinate system (the y-axis and the y'-axis) are parallel.
  • In this way, since the candidate field generation unit 51 can align one axis of the device coordinate system with one axis of the field coordinate system, it is possible to reduce the calculation cost of identifying the feature point candidate positions in the device coordinate system and generating the candidate fields, which will be described later.
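  • One possible way (an assumption for illustration, not the disclosed method) to align the height axes is to rotate the device coordinate system so that the gravity direction measured by the acceleration sensor maps onto the negative y-axis:

```python
import numpy as np

def rotation_aligning_gravity(accel, target=np.array([0.0, -1.0, 0.0])):
    """Return a rotation matrix mapping the measured gravity direction onto `target`.

    accel: 3-vector from the acceleration sensor (dominated by gravity at rest).
    """
    g = accel / np.linalg.norm(accel)
    v = np.cross(g, target)
    c = float(np.dot(g, target))
    if np.isclose(c, 1.0):            # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):           # exactly opposite: rotate 180 degrees about x
        return np.diag([1.0, -1.0, -1.0])
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    # Rodrigues-style formula for the rotation taking one unit vector to another.
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

R_align = rotation_aligning_gravity(np.array([0.2, -9.7, 0.5]))
g_unit = np.array([0.2, -9.7, 0.5]) / np.linalg.norm([0.2, -9.7, 0.5])
print(np.round(R_align @ g_unit, 3))  # approximately [0., -1., 0.]
```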
  • FIG. 9 is a diagram representing candidate fields based on the feature point information IF of one image in the device coordinate system.
  • Here, the target field is the grid-like field imaged in the captured image Im of FIG. 6(A), and a total of nine structural feature points are extracted by the feature extraction unit 42.
  • the candidate field generator 51 generates the candidate field “C1” based on the feature point candidate positions P0 to P8.
  • Specifically, the candidate field generation unit 51 identifies the feature point candidate positions P0 to P8 projected into the device coordinate system (indicated by the broken lines in FIG. 9), based on the shooting position (including the shooting orientation) in the device coordinate system specified from the position/orientation change amount Ap and on the parameters of the camera 15 (including the internal parameters). Then, the candidate field generation unit 51 refers to the structure data to specify the relative positional relationship (a model of the target field) representing the distances between the structural feature points of the target field, and determines the feature point candidate positions P0 to P8 and the candidate field in the device coordinate system so that the specified positional relationship is maintained.
  • For example, the candidate field generation unit 51 uses the method of least squares or the like to determine the positions of the structural feature points of the model of the target field so that the sum of the errors (squared errors) between the feature point candidate positions and the corresponding structural feature points of the model is minimized.
  • the above example is an example of a method of estimating the extrinsic parameters (shooting position and orientation) of a camera assuming that feature points are present on a horizontal plane perpendicular to the vertical direction.
  • However, the candidate field generation unit 51 is not limited to this example; for instance, it is not necessary to assume that the feature points lie on the same plane, and camera calibration techniques (for example, PnP techniques) that simultaneously optimize the internal parameters and the positions of the feature points on the image may be used.
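  • As a concrete but non-authoritative illustration of the camera-calibration route mentioned above, OpenCV's solvePnP can recover the shooting position and orientation from the registered 3D positions of the structural feature points and their detected 2D image positions; the candidate field then follows by expressing the registered feature points in the camera frame (all numeric values below are placeholders):

```python
import numpy as np
import cv2

# Registered positions of four structural feature points in the field coordinate
# system (metres) and their detected pixel positions (illustrative values only).
object_points = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                          [0.0, 0.0, 20.0], [10.0, 0.0, 20.0]], dtype=np.float64)
image_points = np.array([[100.0, 400.0], [540.0, 410.0],
                         [220.0, 120.0], [430.0, 125.0]], dtype=np.float64)
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)
R, _ = cv2.Rodrigues(rvec)

# Candidate field: the registered feature points expressed in the camera frame,
# which is tied to the device coordinate system via the position/orientation
# change amount Ap (that step is omitted here).
candidate_field = (R @ object_points.T).T + tvec.reshape(1, 3)
print(ok, candidate_field.shape)
```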
  • The field estimation unit 46 clusters the candidate fields and determines the estimated field from the candidate fields belonging to the main cluster, that is, the cluster to which the largest number of candidate fields belong.
  • FIG. 10 is a diagram showing the result of clustering the generated six candidate fields "C1" to "C6" in the device coordinate system.
  • the field estimation unit 46 classifies candidate fields into a first cluster CL1, a second cluster CL2, and a third cluster CL3 by using an arbitrary clustering method.
  • For example, the field estimation unit 46 performs the above-described clustering based on a vector consisting of, for each candidate field, the three-dimensional center-of-gravity coordinates in the device coordinate system (for example, the average coordinates of the structural feature points) and the Euler angles of yaw, pitch, and roll (or, if the device coordinate system and the field coordinate system have been aligned using the vertical direction information, the angle representing the orientation on the x-z plane).
  • Examples of such clustering methods include, but are not limited to, the single linkage method, the complete linkage method, the group average method, the Ward method, the centroid method, the weighted method, and the median method.
  • the first cluster CL1 is the cluster to which the largest number of candidate fields belong (also referred to as "main cluster"), and the four candidate fields C1, C3, C4, and C6, which are the largest, belong to it.
  • the second cluster CL2 consists of the candidate field C2
  • the third cluster CL3 consists of the candidate field C5.
  • In this case, the field estimation unit 46 determines that the candidate fields belonging to the first cluster CL1, which is the main cluster, are matching candidate fields, and that the candidate fields belonging to the second cluster CL2 and the third cluster CL3, which are not the main cluster, are non-matching candidate fields.
  • Note that the field estimation unit 46 may determine matching candidate fields and non-matching candidate fields based on threshold processing. For example, the field estimation unit 46 calculates the distance between candidate fields based on the vectors described above, and forms clusters of candidate fields whose mutual distance is less than a predetermined threshold. Then, the field estimation unit 46 determines the candidate fields belonging to the main cluster, which has the largest number of candidate fields among the generated clusters, to be matching candidate fields, and determines the candidate fields not belonging to the main cluster to be non-matching candidate fields.
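  • A simplified sketch of the threshold-based grouping described above: each candidate field is summarized as a vector (here, centroid coordinates plus a yaw angle, one possible choice), candidates closer than a distance threshold are grouped greedily, and the members of the largest group are treated as matching candidate fields. The threshold and vectors are placeholders:

```python
import numpy as np

def cluster_candidate_fields(candidate_vectors, distance_threshold=0.5):
    """Greedy threshold clustering of candidate-field summary vectors.

    candidate_vectors: (n, d) array, e.g. [cx, cy, cz, yaw] per candidate field.
    Returns: (indices of matching candidate fields, indices of non-matching ones).
    """
    n = len(candidate_vectors)
    cluster_ids = -np.ones(n, dtype=int)
    next_id = 0
    for i in range(n):
        if cluster_ids[i] >= 0:
            continue
        cluster_ids[i] = next_id
        for j in range(i + 1, n):
            if cluster_ids[j] < 0 and \
               np.linalg.norm(candidate_vectors[i] - candidate_vectors[j]) < distance_threshold:
                cluster_ids[j] = next_id
        next_id += 1
    main_id = np.bincount(cluster_ids).argmax()      # main cluster = most members
    matching = np.flatnonzero(cluster_ids == main_id)
    non_matching = np.flatnonzero(cluster_ids != main_id)
    return matching, non_matching

vecs = np.array([[0.0, 0, 0, 0.0], [0.1, 0, 0, 0.0], [0.05, 0, 0, 0.1],
                 [3.0, 0, 0, 1.0], [0.02, 0, 0, 0.0], [5.0, 0, 0, 2.0]])
print(cluster_candidate_fields(vecs))  # main cluster: indices 0, 1, 2, 4; non-matching: 3, 5
```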
  • For example, the field estimation unit 46 may determine whether or not the estimated field determination condition is satisfied in the following two steps. In the first step, the field estimation unit 46 determines whether or not the number of candidate fields "Nc1" belonging to the first cluster CL1, which has the largest number of candidate fields, is equal to or greater than a predetermined threshold value "Ncth1". In the example of FIG. 10, the number of candidate fields Nc1 is "4". If the number of candidate fields Nc1 is equal to or greater than the threshold value Ncth1, the field estimation unit 46 proceeds to the second step described below.
  • On the other hand, if the number of candidate fields Nc1 is less than the threshold value Ncth1, the field estimation unit 46 determines that the estimated field determination condition is not satisfied, and regenerates candidate fields based on the feature point information IF of a newly generated captured image Im and the converted captured images Ima generated from that captured image Im; alternatively, the field estimation unit 46 may determine that the processing related to estimation of the target field should end.
  • In the second step, the field estimation unit 46 determines whether or not the ratio "R12" of the total number of candidate fields belonging to the first cluster CL1 and the second cluster CL2 to the total number of candidate fields is equal to or greater than a threshold value "Rth12". If the ratio R12 is less than the threshold value Rth12, the field estimation unit 46 determines that the estimated field determination condition is not satisfied, and candidate fields are regenerated based on newly acquired feature point information IF.
  • In this way, the field estimation unit 46 can suitably determine whether or not to generate an estimated field, and whether or not to end the processing related to estimation of the target field, based on the result of determining the consistency of the n candidate fields generated by the candidate field generation unit 51 (here, the clustering result of the candidate fields).
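  • For illustration, the two-step estimated field determination condition described above could be checked as follows; the threshold values Ncth1 and Rth12 below are placeholders:

```python
from collections import Counter

def estimated_field_condition_ok(cluster_ids, ncth1=3, rth12=0.7):
    """Two-step check based on the clustering result of the candidate fields.

    Step 1: the largest cluster must contain at least `ncth1` candidate fields.
    Step 2: the first and second largest clusters together must account for at
            least the ratio `rth12` of all candidate fields.
    """
    counts = [c for _, c in Counter(cluster_ids).most_common(2)]
    nc1 = counts[0]
    if nc1 < ncth1:
        return False
    nc2 = counts[1] if len(counts) > 1 else 0
    return (nc1 + nc2) / len(cluster_ids) >= rth12

# Example matching FIG. 10: clusters of sizes 4, 1 and 1 over six candidate fields.
print(estimated_field_condition_ok([0, 1, 0, 0, 2, 0]))  # (4 + 1) / 6 = 0.83... -> True
```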
  • FIG. 11(A) is a diagram in which candidate fields based on feature point information IF are superimposed on a captured image Im on which feature extraction processing has been appropriately performed.
  • FIG. 11B is a diagram in which candidate fields based on the feature point information IF are superimposed on the captured image Im in which an error has occurred in the feature extraction processing.
  • In FIGS. 11(A) and 11(B), a set of the correct label (X) of each structural feature point and the recognition result (Y) based on the feature extraction processing and the candidate field generation processing is represented as "X→Y".
  • auxiliary lines that virtually represent candidate fields to be generated are indicated by dashed lines.
  • In the example of FIG. 11(A), among the structural feature points labeled 0 to 11, the structural feature points 0 to 8 within the imaging range of the captured image Im are correctly extracted, and a candidate field that correctly represents the target field has been generated; the structural feature points of labels 9 to 11, which were not captured in the captured image Im, are also correctly recognized.
  • the field estimator 46 performs a process of determining an estimated field from a plurality of candidate fields based on the captured image Im and the converted captured image Ima generated by data augmentation. Also preferably, the field estimator 46 does not generate an estimated field when the ratio of matching candidate fields to all candidate fields is less than a predetermined ratio. As a result, the field estimator 46 can suitably suppress the generation of estimation fields with low accuracy.
  • FIG. 12 is a diagram showing an overview of the process of determining estimated fields from matching candidate fields.
  • the structural feature points of the matching candidate fields are indicated by circles, and the estimated fields to be determined are indicated virtually by dashed lines.
  • point groups “PC0” to “PC11” of structure feature points of four match candidate fields are formed for structure feature points labeled 0 to 11 of the target field.
  • the target field is a rectangular area having a length of "L1" in the lateral direction and a length of "L2" in the longitudinal direction. The lengths L1 and L2 of the target field are recorded in the structure data.
  • In this case, the estimated field determination unit 53 regards the estimated field as a rectangular area with the length L1 in the lateral direction and the length L2 in the longitudinal direction, and determines the estimated field so that the structural feature points at the diagonal corners of the rectangular area (for example, the structural feature points of labels 0 and 11) fit the point clouds of structural feature points of the matching candidate fields on that diagonal (for example, the point clouds PC0 and PC11).
  • For example, the estimated field determination unit 53 uses the least squares method to calculate the error (squared error) between each vertex of the above-described rectangular area and the corresponding point cloud, and determines the estimated field so that the sum of these errors is minimized.
  • More generally, the estimated field determination unit 53 uses a model of the target field specified based on the structure data (for example, positional relationships such as the distances between structural feature points), and determines the estimated field so as to minimize the errors between the structural feature points of the model and the corresponding point clouds.
  • In this case, the estimated field determination unit 53 may use the positional relationships, such as the distances between the structural feature points specified by the structure data, as constraint conditions, use the sum over the labels of the errors between the structural feature points of the estimated field and the corresponding point clouds as the evaluation function, and determine the structural feature points of the estimated field by optimization so as to minimize the evaluation function.
  • the estimated field determination unit 53 may obtain the above-described optimization solution using an arbitrary optimization solver.
  • Note that the estimated field determination unit 53 may select, as the estimated field, a matching candidate field determined by a predetermined rule or at random, instead of generating an estimated field by integrating the matching candidate fields. In this case, for example, the estimated field determination unit 53 calculates the average of the vectors each representing the xyz coordinate values of the center of gravity and the attitude angles of a matching candidate field, and selects the matching candidate field corresponding to the vector closest to the average vector as the estimated field.
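  • A minimal sketch of the integration step: for each label, the point cloud formed by the matching candidate fields is averaged to give the estimated position. This simple averaging is one possible realization for illustration; the description instead fits the field model to the per-label point clouds by least squares:

```python
import numpy as np

def integrate_matching_candidate_fields(matching_candidate_fields):
    """Average, per label, the structural feature point positions of the matching candidate fields.

    matching_candidate_fields: list of dicts mapping label -> (x, y, z) in the
                               device coordinate system.
    Returns: dict mapping label -> averaged (x, y, z) as the estimated field.
    """
    labels = matching_candidate_fields[0].keys()
    estimated_field = {}
    for label in labels:
        cloud = np.array([cf[label] for cf in matching_candidate_fields])
        estimated_field[label] = cloud.mean(axis=0)
    return estimated_field

# Two matching candidate fields with only labels 0 and 11 shown for brevity.
cf_a = {0: (0.0, 0.0, 0.0), 11: (10.1, 0.0, 20.0)}
cf_b = {0: (0.1, 0.0, 0.1), 11: (9.9, 0.0, 19.9)}
print(integrate_matching_candidate_fields([cf_a, cf_b]))
```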
  • FIG. 13 is an example of a flow chart showing an overview of the processing related to the virtual object display processing executed by the control unit 17 in the first embodiment.
  • the control unit 17 detects activation of the display device 1 (step S11). In this case, the control unit 17 sets a device coordinate system based on the orientation and position of the display device 1 when the display device 1 is activated (step S12). After that, the control unit 17 obtains the captured image Im generated by the camera 15, and obtains the position/orientation change amount Ap based on the detection signal output by the position/orientation detection sensor 16 (step S13). The control unit 17 stores the combination of the captured image Im and the position/orientation change amount Ap acquired in step S13 in the sensor data storage unit 20 .
  • the control unit 17 determines whether or not there is a request to display the virtual object (step S14).
  • For example, the virtual object acquisition unit 41 determines that there is a request to display a virtual object when it receives distribution information instructing display of a virtual object from a server device (not shown) managed by the promoter. If there is no request to display a virtual object (step S14; No), the control unit 17 returns to step S13 and continues to acquire the captured image Im and the position/orientation change amount Ap.
  • If there is a request to display a virtual object (step S14; Yes), the control unit 17 executes the calibration processing (step S15). Details of the procedure of this calibration processing will be described later with reference to FIG. 14.
  • Next, based on the coordinate transformation information Ic obtained in the calibration processing in step S15, the reflection unit 44 of the control unit 17 generates a display signal Sd for displaying the virtual object corresponding to the virtual object and the display position specified in the display request (step S16).
  • In this case, as in various conventional AR display products, the control unit 17 recognizes, in the device coordinate system, the space actually visually recognized by the user, taking into consideration the user's line-of-sight direction, the position/orientation change amount Ap, and the like, and generates the display signal Sd so that the virtual object is displayed at the designated position in that space.
  • the light source control section 45 of the control section 17 performs emission control of the light source unit 10 based on the display signal Sd (step S17).
  • the processing procedure of the flowchart shown in FIG. 13 is an example, and various changes can be made to this processing procedure.
  • In the flowchart of FIG. 13, the control unit 17 executes the calibration processing in step S15 every time there is a virtual object display request, but the present invention is not limited to this.
  • For example, the control unit 17 may perform the calibration processing only when a predetermined time or longer has passed since the previous calibration processing. In this way, the control unit 17 may perform the calibration processing at least once after the display device 1 is activated.
  • In addition, in the example described above, the control unit 17 determines the device coordinate system based on the position and orientation of the display device 1 when the display device 1 is activated, but the present invention is not limited to this. Instead, for example, the control unit 17 may determine the device coordinate system based on the position and orientation of the display device 1 when a display request is first received after the display device 1 is activated (that is, when the calibration processing is performed for the first time). In another example, each time a display request is made, the control unit 17 may reset the device coordinate system based on the position and orientation of the display device 1 at the time of the display request (that is, at the time of executing the calibration processing). In this case, there is no need to use the position/orientation change amount Ap in the process of generating the coordinate transformation information Ic, which will be described later.
  • FIG. 14 is an example of a flowchart showing the detailed processing procedure of the calibration process in step S15 of FIG.
  • the image conversion unit 40 of the control unit 17 generates a converted captured image Ima by performing data augmentation based on the captured image Im acquired from the sensor data storage unit 20 or the like (step S21). Thereby, the image conversion unit 40 generates at least one converted captured image Ima for the captured image Im used for estimating the target field.
  • Next, the feature extraction unit 42 generates feature point information IF indicating the feature point candidate positions and candidate labels corresponding to each structural feature point of the target field, for each of the captured image Im used in step S21 and the converted captured images Ima generated in step S21 (step S22).
  • In this case, the feature extraction unit 42 configures a feature extractor based on the parameters acquired from the parameter storage unit 21 and inputs each image to the feature extractor. Then, the feature extraction unit 42 generates the feature point information IF for each image based on the information output by the feature extractor.
  • Next, the candidate field generation unit 51 generates a candidate field from the feature point information IF of each image (step S23). Then, the consistency determination unit 52 determines non-matching candidate fields by performing clustering or the like on the plurality of generated candidate fields (step S24). The consistency determination unit 52 then determines whether or not the estimated field determination condition is satisfied (step S25). When it is determined that the estimated field determination condition is satisfied (step S25; Yes), the consistency determination unit 52 determines the estimated field based on the matching candidate fields (step S26). In this case, the consistency determination unit 52 determines the estimated field by integration processing or selection processing of the matching candidate fields.
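The following is a minimal sketch of steps S24 to S26, assuming each candidate field is represented as an (N, 3) array of the field's structural feature point positions in the device coordinate system, with the same N for every candidate. The mean point-to-point distance metric, the simple majority-based clustering, the threshold values, and the function name are assumptions for illustration only; the embodiment merely specifies "clustering or the like" followed by integration or selection of the matching candidate fields.

```python
import numpy as np

def determine_estimated_field(candidate_fields, dist_thresh=0.5, min_match_ratio=0.6):
    """Determine non-matching candidate fields by clustering, check the estimated
    field determination condition, and integrate the matching candidate fields."""
    fields = [np.asarray(f, dtype=float) for f in candidate_fields]
    n = len(fields)

    # Distance between two candidate fields: mean point-to-point error.
    def field_distance(a, b):
        return float(np.mean(np.linalg.norm(a - b, axis=1)))

    # A candidate is 'matching' if it lies close to the majority of the others;
    # the remaining candidates are treated as non-matching candidate fields.
    close_counts = np.zeros(n, dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and field_distance(fields[i], fields[j]) < dist_thresh:
                close_counts[i] += 1
    matching = [f for f, c in zip(fields, close_counts) if c >= (n - 1) / 2]

    # Estimated field determination condition (step S25): enough matching candidates.
    if len(matching) < min_match_ratio * n:
        return None  # condition not satisfied; more candidates are needed

    # Integration processing (step S26): average the matching candidate fields.
    return np.mean(np.stack(matching), axis=0)
```

Selection processing could instead return, for example, the matching candidate field closest to all the others rather than their mean.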
  • the coordinate transformation information generation unit 43 generates coordinate transformation information Ic for transforming from the device coordinate system to the field coordinate system based on the estimated field determined by the field estimation unit 46 (step S27).
  • In this case, the coordinate transformation information generation unit 43 associates, for each label of the structural feature points, the detected position in the device coordinate system of each structural feature point indicated by the estimated field with the position of that structural feature point in the field coordinate system indicated by the registered position information, and calculates the coordinate transformation information Ic such that the associated positions for each label coincide with each other (that is, such that the positional error for each label is minimized).
  • In this way, in the calibration process, the display device 1 matches information obtained by extracting only pre-registered (that is, already labeled) structural feature points from the captured image Im against the information on the structural feature points registered in the structure data. As a result, the amount of calculation required for the matching process for calculating the coordinate transformation information Ic can be greatly reduced, and robust coordinate transformation information Ic that is not affected by noise (that is, feature points other than those of the target field) extracted from the captured image Im can be calculated. In addition, in the present embodiment, since the above-described matching is performed based on an estimated field estimated with high accuracy, the coordinate transformation information Ic can be calculated accurately.
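As an illustration of how the per-label matching in step S27 could minimize the positional error, the sketch below computes a rigid transform from the device coordinate system to the field coordinate system with the closed-form Kabsch/Umeyama least-squares solution. Treating Ic as a pure rotation and translation, and the dictionary-based interface keyed by structural feature point labels, are assumptions made for this sketch rather than the embodiment's stated implementation.

```python
import numpy as np

def compute_coordinate_transform(device_points, field_points):
    """Compute (R, t) such that field_point ≈ R @ device_point + t, minimizing the
    per-label positional error. Both arguments are dicts mapping a structural
    feature point label to a 3D position, so the correspondence needed for the
    matching is given directly by the shared labels."""
    labels = sorted(set(device_points) & set(field_points))
    src = np.array([device_points[k] for k in labels], dtype=float)  # device coords
    dst = np.array([field_points[k] for k in labels], dtype=float)   # field coords

    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Because the labels fix the correspondences in advance, no combinatorial matching over unlabeled point sets is needed, which is the source of the reduction in computation described above.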
  • In a modified example, instead of determining an estimated field based on one captured image Im and the converted captured images Ima generated from that captured image Im, the field estimation unit 46 may determine an estimated field based on m captured images Im ("m" is an integer of two or more) and the converted captured images Ima generated from those captured images Im.
  • FIG. 15 is a block configuration diagram of the field estimation unit 46 in the modified example.
  • In this case, the storage unit 14 further has a candidate field storage unit 23 that stores the candidate fields generated by the candidate field generation unit 51.
  • The candidate field generation unit 51 stores the candidate fields generated for each of the m captured images Im and the corresponding converted captured images Ima in the candidate field storage unit 23.
  • The consistency determination unit 52 then determines the non-matching candidate fields based on the candidate fields stored in the candidate field storage unit 23, and determines whether or not the estimated field determination condition is satisfied.
  • In this modified example, the control unit 17 does not have to have the image conversion unit 40.
  • the consistency determination unit 52 can suitably determine the estimated field based on m candidate fields generated based on the m captured images Im.
  • the field estimating unit 46 can suitably generate a plurality of candidate fields used for determining an estimated field by using a plurality of captured images Im.
  • FIG. 16 shows the configuration of the display system in the second embodiment.
  • the display system according to the second embodiment has a display device 1A and a server device 2.
  • The second embodiment differs from the first embodiment in that the server device 2 executes the calibration process and the like instead of the display device 1A.
  • Hereinafter, the same reference symbols are attached to the same components as in the first embodiment, and the description thereof will be omitted as appropriate.
  • the display device 1A transmits to the server device 2 an upload signal "S1", which is information necessary for the server device 2 to perform calibration processing and the like.
  • the upload signal S1 includes, for example, the captured image Im generated by the camera 15 and the position/orientation change amount Ap detected based on the output of the position/orientation detection sensor 16 .
  • Further, the display device 1A receives the distribution signal "S2" transmitted from the server device 2, and displays the virtual object by performing light emission control of the light source unit 10 based on the distribution signal S2.
  • the distribution signal S2 includes information corresponding to the display signal Sd of the first embodiment, and after receiving the distribution signal S2, the display device 1A performs the same processing as the light source control unit 45 of the first embodiment. By doing so, the light source unit 10 emits light for displaying the virtual object.
  • the server device 2 is, for example, a server device managed by a promoter, and based on the upload signal S1 received from the display device 1A, generates the distribution signal S2 and distributes the distribution signal S2 to the display device 1A.
  • FIG. 17 is a block diagram of the server device 2. As shown in FIG. 17, the server device 2 has an input unit 26, a control unit 27, a communication unit 28, and a storage unit 29.
  • the storage unit 29 is a non-volatile memory that stores various information necessary for the control unit 27 to control the server device 2 .
  • a program executed by the control unit 27 is stored in the storage unit 29 .
  • the storage unit 29 has a sensor data storage unit 20 , a parameter storage unit 21 and a structural data storage unit 22 .
  • The sensor data storage unit 20 stores the captured image Im and the position/orientation change amount Ap included in the upload signal S1 under the control of the control unit 27.
  • the storage unit 29 may be an external storage device such as a hard disk connected to or built into the server device 2, or may be a storage medium such as a flash memory.
  • the storage unit 29 may be a server device that performs data communication with the server device 2 (that is, a device that stores information so that other devices can refer to it). Further, in this case, the storage unit 29 may be composed of a plurality of server devices, and the sensor data storage unit 20, the parameter storage unit 21, and the structure data storage unit 22 may be distributed and stored.
  • the control unit 27 has, for example, a processor such as a CPU or GPU, a volatile memory that functions as a working memory, and the like, and performs overall control of the server device 2 .
  • The control unit 27 generates information on the virtual object to be displayed and its display position (that is, information corresponding to the specified display information Id in the first embodiment) based on user input to the input unit 26 or the like. Further, the control unit 27 refers to the sensor data storage unit 20, the parameter storage unit 21, and the structure data storage unit 22 to perform the calibration process and generate the distribution signal S2.
  • the control unit 27 includes functions corresponding to the image conversion unit 40, the virtual object acquisition unit 41, the feature extraction unit 42, the coordinate transformation information generation unit 43, and the reflection unit 44 shown in FIG.
  • FIG. 18 is an example of a flowchart showing a processing procedure executed by the control unit 27 of the server device 2 in the second embodiment.
  • the control unit 27 receives the upload signal S1 including the captured image Im and the position/posture change amount Ap from the display device 1A via the communication unit 28 (step S31). In this case, the control unit 27 updates the data stored in the sensor data storage unit 20 based on the upload signal S1. Then, the control unit 27 determines whether or not it is time to display the virtual object (step S32). If it is not the display timing (step S32; No), the controller 27 continues to receive the upload signal S1 from the display device 1A in step S31.
  • On the other hand, if it is the display timing (step S32; Yes), the control unit 27 executes the calibration process based on the latest upload signal S1 and the like received in step S31 (step S33).
  • In this case, the control unit 27 executes the processing of the flowchart shown in FIG. 14.
  • the control unit 27 generates a distribution signal S2 for displaying the virtual object on the display device 1A based on the coordinate conversion information Ic obtained by the calibration process (step S34).
  • the control unit 27 transmits the generated distribution signal S2 to the display device 1A through the communication unit 28 (step S35).
  • the display device 1A that has received the distribution signal S2 displays the virtual object by controlling the light source unit 10 based on the distribution signal S2.
  • the display system can accurately calculate the coordinate transformation information Ic necessary for displaying the virtual object on the display device 1A, and allow the user to preferably visually recognize the virtual object.
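Purely for illustration, the sketch below arranges the server-side steps of FIG. 18 (receiving the upload signal S1, checking the display timing, running the calibration process, and returning the distribution signal S2) as a simple loop. The transport channel, the message fields, and the helper functions are placeholders introduced for this sketch and are not part of the embodiment.

```python
def server_loop(channel, is_display_timing, run_calibration, build_distribution_signal):
    """Schematic flow of steps S31-S35 executed by the control unit 27."""
    sensor_data = {}
    while True:
        upload_s1 = channel.receive()                       # step S31: upload signal S1
        sensor_data["captured_image"] = upload_s1["captured_image"]
        sensor_data["pose_change"] = upload_s1["pose_change"]

        if not is_display_timing():                         # step S32
            continue                                        # keep receiving upload signals

        coord_transform_ic = run_calibration(sensor_data)   # step S33: FIG. 14 procedure
        distribution_s2 = build_distribution_signal(coord_transform_ic)  # step S34
        channel.send(distribution_s2)                       # step S35
```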
  • the display device 1A may perform the calibration processing and the like instead of the server device 2 performing the calibration processing.
  • the display device 1A executes the processing of the flowchart shown in FIG. 14 by appropriately receiving information necessary for the calibration processing from the server device 2.
  • the display system can favorably allow the user of the display device 1A to visually recognize the virtual object.
  • FIG. 19 shows a schematic configuration of an information processing device 1X according to the third embodiment.
  • the information processing apparatus 1X mainly includes feature point information acquisition means 42X, candidate field generation means 51X, and consistency determination means 52X.
  • the information processing device 1X is implemented by, for example, the display device 1 or the control unit 17 of the display device 1 in the first embodiment, or the control unit 27 of the server device 2 in the second embodiment. Note that the information processing device 1X may be composed of a plurality of devices.
  • the feature point information acquisition means 42X acquires feature point information indicating candidate positions and candidate labels of a plurality of feature points of the field, which are determined based on an image including at least part of the target field.
  • The feature point information acquisition means 42X may receive feature point information generated by a processing block other than the feature point information acquisition means 42X (including a device other than the information processing device 1X), or may generate the feature point information itself. In the latter case, the feature point information acquisition means 42X can be, for example, the feature extraction unit 42 in the first embodiment or the second embodiment.
  • the candidate field generating means 51X generates, based on the feature point information, candidate fields representing field candidates in the first coordinate system, which is the coordinate system based on which the display device having the camera for capturing the image is based.
  • the candidate field generator 51X can be the candidate field generator 51 in the first embodiment or the second embodiment, for example.
  • the consistency determination means 52X determines non-matching candidate fields, which are candidate fields that do not have consistency, based on a plurality of candidate fields corresponding to a plurality of pieces of feature point information of a plurality of images.
  • the consistency determination means 52X can be, for example, the consistency determination section 52 in the first embodiment or the second embodiment.
  • FIG. 20 is an example of a flowchart in the third embodiment.
  • the feature point information acquiring means 42X acquires feature point information indicating candidate positions and candidate labels of a plurality of feature points in the field based on an image including at least part of the target field (step S41).
  • Based on the feature point information, the candidate field generation means 51X generates a candidate field representing a candidate of the field in the first coordinate system, which is the coordinate system based on which the display device having the camera that captures the image is based (step S42).
  • the consistency determination unit 52X determines a non-matching candidate field, which is a candidate field having no consistency, based on a plurality of candidate fields corresponding to a plurality of pieces of feature point information of a plurality of images (step S43).
  • According to the third embodiment, the information processing device 1X can accurately determine a non-matching candidate field that has no consistency based on a plurality of candidate fields, and can suitably suppress deterioration in field estimation accuracy.
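To make the division of roles in the third embodiment concrete, the sketch below composes the three means (42X, 51X, 52X) into the pipeline of FIG. 20. The class names, attribute names, and callable-based interfaces are assumptions used only to show how steps S41 to S43 chain together.

```python
from dataclasses import dataclass

@dataclass
class FeaturePointInfo:
    candidate_positions: list  # candidate positions of the field's feature points
    candidate_labels: list     # candidate labels of those feature points

class InformationProcessingDevice1X:
    def __init__(self, acquire_feature_points, generate_candidate_field, determine_consistency):
        self.acquire_feature_points = acquire_feature_points      # means 42X: image -> FeaturePointInfo
        self.generate_candidate_field = generate_candidate_field  # means 51X: FeaturePointInfo -> candidate field
        self.determine_consistency = determine_consistency        # means 52X: candidate fields -> non-matching fields

    def process(self, images):
        infos = [self.acquire_feature_points(img) for img in images]     # step S41
        candidates = [self.generate_candidate_field(i) for i in infos]   # step S42
        return self.determine_consistency(candidates)                    # step S43
```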
  • Non-transitory computer readable media include various types of tangible storage media.
  • Examples of non-transitory computer-readable media include magnetic storage media (e.g., floppy disks, magnetic tapes, hard disk drives), magneto-optical storage media (e.g., magneto-optical discs), CD-ROM (read-only memory), CD-R, CD-R/W, and semiconductor memory (e.g., mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash ROM, and RAM (random access memory)).
  • the program may also be delivered to the computer on various types of transitory computer readable medium.
  • Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves.
  • Transitory computer-readable media can deliver the program to the computer via wired channels, such as wires and optical fibers, or wireless channels.
  • [Appendix 1] An information processing device comprising: feature point information acquisition means for acquiring feature point information regarding a plurality of feature points of a field determined based on an image including at least part of a field of interest; candidate field generation means for generating, based on the feature point information, a candidate field representing a candidate for the field in a first coordinate system, which is a coordinate system based on which a display device having a camera that captures the image is based; and consistency determination means for determining a non-matching candidate field, which is the candidate field having no consistency, based on the plurality of candidate fields corresponding to the plurality of feature point information of the plurality of images.
  • [Appendix 2] The information processing device according to appendix 1, further comprising estimated field determination means for determining an estimated field, which represents an estimation result of the field, based on the matching candidate fields that are the plurality of candidate fields other than the non-matching candidate field.
  • [Appendix 3] The information processing device according to appendix 2, wherein the estimated field determination means generates the estimated field by integrating the matching candidate fields, or selects one of the matching candidate fields as the estimated field.
  • [Appendix 4] The information processing device according to appendix 2 or 3, wherein the estimated field determination means determines whether or not to determine the estimated field based on a determination result by the consistency determination means.
  • [Appendix 5] The information processing device according to appendix 4, wherein the estimated field determination means performs the necessity determination based on a ratio of the non-matching candidate fields or the matching candidate fields to the plurality of candidate fields.
  • [Appendix 6] The information processing device according to any one of appendices 2 to 5, wherein the estimated field determination means determines completion of the process related to the estimated field based on a determination result by the consistency determination means.
  • [Appendix 7] The information processing device according to appendix 1, further comprising image transformation means for generating one or more second images by transforming a first image containing at least part of the field, wherein the plurality of images includes the first image and the second image.
  • [Appendix 8] The information processing device according to appendix 7, wherein the second image includes at least one of an inverted image and a cropped image of the first image.
  • [Appendix 9] The information processing device according to any one of appendices 1 to 8, further comprising coordinate transformation information generation means for generating coordinate transformation information between the first coordinate system and a second coordinate system, which is the coordinate system adopted in the structure data, based on the matching candidate fields, which are the plurality of candidate fields other than the non-matching candidate fields, and structure data regarding the structure of the field.
  • [Appendix 10] The information processing device according to any one of appendices 1 to 9, wherein the information processing device is the display device that displays a virtual object superimposed on a landscape, the information processing device further comprising: a light source unit that emits display light for displaying the virtual object; and an optical element that reflects at least part of the display light so that the virtual object is superimposed on the scenery and visually recognized by an observer.
  • [Appendix 11] A control method executed by a computer, the control method comprising: obtaining feature point information for a plurality of feature points of a field of interest determined based on an image comprising at least a portion of the field of interest; based on the feature point information, generating a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image; and determining a non-matching candidate field, which is the candidate field that does not have consistency, based on the plurality of candidate fields corresponding to the plurality of feature point information of the plurality of images.
  • [Appendix 12] A storage medium storing a program for causing a computer to execute processing of: obtaining feature point information for a plurality of feature points of a field of interest determined based on an image comprising at least a portion of the field of interest; based on the feature point information, generating a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image; and determining a non-matching candidate field, which is the candidate field having no consistency, based on the plurality of candidate fields corresponding to the plurality of feature point information of the plurality of images.
  • Reference Signs List: 1, 1A display device; 1X, 1Y information processing device; 2 server device; 10 light source unit; 11 optical element; 12 communication unit; 13 input unit; 14 storage unit; 15 camera; 16 position and orientation detection sensor; 20 sensor data storage unit; 21 parameter storage unit; 22 structure data storage unit; 23 candidate field storage unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Processing Or Creating Images (AREA)

Abstract

An information processing device (1X) comprises feature point information acquisition means (42X), candidate field generation means (51X), and consistency determination means (52X). The feature point information acquisition means (42X) acquires feature point information indicating candidate positions and candidate labels for a plurality of feature points of a field of interest, determined on the basis of an image including at least part of the field. Based on the feature point information, the candidate field generation means (51X) generates a candidate field representing a candidate of the field in a first coordinate system, which is a coordinate system based on a display device having a camera that captures the image. The consistency determination means (52X) determines a non-matching candidate field on the basis of a plurality of candidate fields corresponding to a plurality of pieces of feature point information for a plurality of images.
PCT/JP2021/025338 2021-07-05 2021-07-05 Dispositif de traitement d'informations, procédé de commande et support de stockage WO2023281593A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2021/025338 WO2023281593A1 (fr) 2021-07-05 2021-07-05 Dispositif de traitement d'informations, procédé de commande et support de stockage
JP2023532892A JPWO2023281593A5 (ja) 2021-07-05 情報処理装置、制御方法及びプログラム

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/025338 WO2023281593A1 (fr) 2021-07-05 2021-07-05 Dispositif de traitement d'informations, procédé de commande et support de stockage

Publications (1)

Publication Number Publication Date
WO2023281593A1 true WO2023281593A1 (fr) 2023-01-12

Family

ID=84801398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/025338 WO2023281593A1 (fr) 2021-07-05 2021-07-05 Dispositif de traitement d'informations, procédé de commande et support de stockage

Country Status (1)

Country Link
WO (1) WO2023281593A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012137933A (ja) * 2010-12-27 2012-07-19 Kokusai Kogyo Co Ltd 被写地物の位置特定方法とそのプログラム、及び表示地図、並びに撮影位置取得方法とそのプログラム、及び撮影位置取得装置
JP2019134428A (ja) * 2019-02-13 2019-08-08 キヤノン株式会社 制御装置、制御方法、及び、プログラム
JP2020042447A (ja) * 2018-09-07 2020-03-19 Kddi株式会社 不動物体情報から端末位置を推定する装置、プログラム及び方法


Also Published As

Publication number Publication date
JPWO2023281593A1 (fr) 2023-01-12

Similar Documents

Publication Publication Date Title
CN109561296B (zh) 图像处理装置、图像处理方法、图像处理系统和存储介质
US10972715B1 (en) Selective processing or readout of data from one or more imaging sensors included in a depth camera assembly
US20200267371A1 (en) Handheld portable optical scanner and method of using
KR101761751B1 (ko) 직접적인 기하학적 모델링이 행해지는 hmd 보정
US10636185B2 (en) Information processing apparatus and information processing method for guiding a user to a vicinity of a viewpoint
EP3553465B1 (fr) Dispositif et procédé de traitement d'informations
US11086395B2 (en) Image processing apparatus, image processing method, and storage medium
US8705868B2 (en) Computer-readable storage medium, image recognition apparatus, image recognition system, and image recognition method
WO2021130860A1 (fr) Dispositif de traitement d'informations, procédé de commande et support de stockage
US20130100140A1 (en) Human body and facial animation systems with 3d camera and method thereof
US11156843B2 (en) End-to-end artificial reality calibration testing
US20120069018A1 (en) Ar process apparatus, ar process method and storage medium
US8625898B2 (en) Computer-readable storage medium, image recognition apparatus, image recognition system, and image recognition method
US20120219177A1 (en) Computer-readable storage medium, image processing apparatus, image processing system, and image processing method
JP2002259976A (ja) 特定点検出方法及び装置
US10634918B2 (en) Internal edge verification
US8571266B2 (en) Computer-readable storage medium, image processing apparatus, image processing system, and image processing method
US8718325B2 (en) Computer-readable storage medium, image processing apparatus, image processing system, and image processing method
JP2014032623A (ja) 画像処理装置
JP7364052B2 (ja) 情報処理装置、制御方法及びプログラム
WO2023119412A1 (fr) Dispositif de traitement d'informations, procédé de commande et support de stockage
WO2023281593A1 (fr) Dispositif de traitement d'informations, procédé de commande et support de stockage
JP4896762B2 (ja) 画像処理装置および画像処理プログラム
WO2023281587A1 (fr) Dispositif de traitement d'informations, procédé de commande et support de stockage
WO2023281585A1 (fr) Dispositif de traitement d'informations, procédé de commande et support de stockage

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21949229

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023532892

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE