CA2860316A1 - System and method for space filling regions of an image - Google Patents

System and method for space filling regions of an image

Info

Publication number
CA2860316A1
Authority
CA
Canada
Prior art keywords
image capture
capture device
image
rendering unit
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA2860316A
Other languages
French (fr)
Inventor
Lev Faynshteyn
Ian Hall
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kisp Inc
Original Assignee
Kisp Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kisp Inc filed Critical Kisp Inc
Priority to CA2860316A priority Critical patent/CA2860316A1/en
Publication of CA2860316A1 publication Critical patent/CA2860316A1/en
Abandoned legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/80Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/20Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02Services making use of location information
    • H04W4/025Services making use of location information using location based information parameters
    • H04W4/027Services making use of location information using location based information parameters using movement velocity, acceleration information

Landscapes

  • Engineering & Computer Science (AREA)
  • Architecture (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A system and method for space filling regions of an image of a physical space are provided.
Various algorithms and transformations enable a rendering unit in communication with an image capture device to generate visual renderings of a physical space from which obstacles have been removed.

Description

TECHNICAL FIELD

[0001] The following relates generally to image processing and more specifically to space filling techniques to render a region of an image of a physical space using other regions of the image.

[0002] In design fields such as, for example, architecture, interior design, and interior decorating, renderings and other visualisation techniques assist interested parties, such as, for example, contractors, builders, vendors and clients, to plan and validate potential designs for physical spaces.
[0003] Designers commonly engage rendering artists in order to sketch and illustrate designs to customers and others. More recently, designers have adopted various digital rendering techniques to illustrate designs. Some digital rendering techniques are more realistic, intuitive and sophisticated than others.
[0004] When employing digital rendering techniques to visualise designs applied to existing spaces, the rendering techniques may encounter existing elements, such as, for example, furniture, topography and clutter, in those spaces.

[0005] In visualising a design to an existing physical space, it is desirable to allow a user to capture an image of the existing physical space and apply design changes and elements to the image. However, when the user removes an existing object shown in the image, a void is generated where the existing object stood. The void is unsightly and results in a less realistic rendering of the design.
[0006] In one aspect, a system is provided for space filling regions of an image of a physical space, the system comprising a rendering unit operable to generate a tileable representation of a sample region of the image and to replicate the tileable representation across a target region in the image.
[0007] In another aspect, a method is provided for space filling regions of an image of a physical space, the method comprising: (1) in a rendering unit, generating a tileable representation of a sample region of the image; and (2) replicating the tileable representation across a target region in the image.
[0008] In embodiments, a system is provided for assigning world coordinates to at least one point in an image of a physical space captured at a time of capture by an image capture device.
The system comprises a rendering unit configured to: (i) ascertain, for the time of capture, a focal length of the image capture device; (ii) determine, in world coordinates, for the time of capture, an orientation of the image capture device; (iii) determine, in world coordinates, for the time of capture, a distance between the image capture device and a reference point in the physical space; and (iv) generate a view transformation matrix comprising matrix elements determined by the focal length, the orientation and the distance to enable transformation between the coordinate system of the image and the world coordinates.
[0009] In further embodiments, the system for assigning world coordinates is configured to space fill regions of the image, the rendering unit being further configured to: (i) select, based on user input, a sample region in the image; (ii) map the sample region to a reference plane; (iii) generate a tileable representation of the sample region; (iv) select, based on user input, a target region in the reference plane; and (v) replicate the tileable representation of the sample region across the target region.
[0010] In still further embodiments, the rendering unit is configured to determine the distance between the image capture device and the reference point by: (i) causing a reticule to be overlaid on the image using a display unit; (ii) obtaining from a user by a user input device the known length and orientation in world coordinates of a line corresponding to a captured feature of the physical space; (iii) adjusting the location and size of the reticule with respect to the image in response to user input provided by the user input device; (iv) obtaining from the user by the user input device an indication that the reticule is aligned with the line; and (v) determining the distance from the image capture device to the reference point, based on the size and orientation of the reticule and the size and orientation of the line.
[0011] In embodiments, the rendering unit is configured to determine the distance from the image capture device to the reference point by: (i) determining that a user has placed the image capture device on a reference plane; (ii) determining the acceleration of the image capture device as the user moves the image capture device from the reference plane to an image capture position; (iii) deriving the distance of the image capture device from the reference plane from the acceleration; and (iv) determining the distance between the image capture device and the reference point, based on the focal length of the image capture device and the distance of the image capture device from the reference plane.
[0012] In further embodiments, the rendering unit is configured to determine the distance from the image capture device to the reference point by requesting user input of an estimated distance from the image capture device to a reference plane.
[0013] In yet further embodiments, the rendering unit determines the orientation in world coordinates of the image capture device by: (i) obtaining acceleration of the image capture device from an accelerometer of the image capture device; (ii) determining from the acceleration when the image capture device is at rest; and (iii) assigning the acceleration at rest as a proxy for the orientation in world coordinates of the image capture device.
[0014] In embodiments, the rendering unit generates the tileable representation of the sample region by using a Poisson gradient-guided blending technique. The tileable representation of the sample region may comprise four sides and the rendering unit enforces identical boundaries for all four sides of the tileable representation of the sample region.
[0015] In further embodiments, the rendering unit replicates the tileable representation of the sample region across the target area by applying rasterisation.
[0016] In still further embodiments, the rendering unit generates ambient occlusion for the target area.
[0017] In embodiments, a method is provided for assigning world coordinates to at least one point in an image of a physical space captured at a time of capture by an image capture device, the method comprising a rendering unit: (i) ascertaining, for the time of capture, a focal length of the image capture device; (ii) determining, in world coordinates, for the time of capture, an orientation of the image capture device; (iii) determining, in world coordinates, for the time of capture, a distance between the image capture device and a reference point in the physical space; and (iv) generating a view transformation matrix comprising matrix elements determined by the focal length, the orientation and the distance to enable transformation between the coordinate system of the image and the world coordinates.
[0018] In further embodiments, a method is provided for space filling regions of an image, comprising the method for assigning world coordinates to the at least one point in the image of the physical space and comprising the rendering unit further: (i) selecting, based on user input, a sample region; (ii) mapping the sample region to a reference plane; (iii) generating a tileable representation of the sample region; (iv) selecting, based on user input, a target region in the reference plane; and (v) replicating the tileable representation of the sample region across the target region.
[0019] In still further embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space determines the distance between the image capture device and the reference point by: (i) causing a reticule to be overlaid on the image using a display unit; (ii) obtaining from a user by a user input device the known length and orientation in world coordinates of a line corresponding to a captured feature of the physical space; (iii) adjusting the location and size of the reticule with respect to the image in response to user input on the user input device; (iv) obtaining from the user by the user input device an indication that the reticule is aligned with the line; and (v) determining the distance from the image capture device to the reference point, based on the size and orientation of the reticule and the size and orientation of the line.
[0020] In embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space determines the distance from the image capture device to the reference point by: (i) determining that a user has placed the image capture device on a reference plane; (ii) determining the acceleration of the image capture device as the user moves the image capture device from the reference plane to an image capture position; (iii) deriving the distance of the image capture device from the reference plane from the acceleration; and (iv) determining the distance between the image capture device and the reference point, based on the focal length of the image capture device and the distance of the image capture device from the reference plane.
[0021] In further embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space determines the distance from the image capture device to the reference point by requesting user input of an estimated distance from the image capture device to a reference plane.
[0022] In still further embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space determines the orientation in world coordinates of the image capture device by: (i) obtaining acceleration of the image capture device from an accelerometer of the image capture device; (ii) determining from the acceleration when the image capture device is at rest; and (iii) assigning the acceleration at rest as a proxy for the orientation in world coordinates of the image capture device.

[0023] In embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space generates the tileable representation of the sample region by using a Poisson gradient-guided blending technique. In further embodiments, the tileable representation of the sample region comprises four sides and the rendering unit enforces identical boundaries for all four sides of the tileable representation of the sample region.
[0024] In yet further embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space replicates the tileable representation of the sample region across the target area by applying rasterisation.
[0025] In embodiments, the rendering unit in the method for assigning world coordinates to at least one point in an image of the physical space further generates ambient occlusion for the target area.

[0026] A greater understanding of the embodiments will be had with reference to the Figures, in which:
[0027] Fig. 1 illustrates an example of a system for space filling regions of an image;
[0028] Fig. 2 illustrates an embodiment of the system for space filling regions of an image;
[0029] Fig. 3 is a flow diagram illustrating a process for calibrating a system for space filling regions of an image;
[0030] Figs. 4-6 illustrate embodiments of a user interface module for calibrating a system for space filling regions of an image;
[0031] Fig. 7 is a flow diagram illustrating a process for space filling regions of an image; and
[0032] Figs. 8-10 illustrate embodiments of a user interface module for space filling regions of an image.

[0033] Embodiments will now be described with reference to the figures. It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the embodiments described herein. Also, the description is not to be considered as limiting the scope of the embodiments described herein.
[0034] It will also be appreciated that any module, unit, component, server, computer, terminal or device exemplified herein that executes instructions may include or otherwise have access to computer readable media such as storage media, computer storage media, or data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by an application, module, or both. Any such computer storage media may be part of the device or accessible or connectable thereto. Any application or module herein described may be implemented using computer readable/executable instructions that may be stored or otherwise held by such computer readable media and executed by the one or more processors.
[0035] Referring now to Fig. 1, an exemplary embodiment of a system for space filling regions of an image of a physical space is depicted. In the depicted embodiment, the system is provided on a mobile tablet device 101. However, aspects of systems for space filling regions of an image may be provided on other types of devices, such as, for example, mobile telephones, laptop computers and desktop computers.
[0036] The mobile tablet device 101 comprises a touch screen 104. Where the mobile tablet device 101 comprises a touch screen 104, it will be appreciated that a display unit 103 and an input unit 105 are integral and provided by the touch screen 104. In alternate embodiments, however, the display unit and the input unit may be discrete units. In still further embodiments, the display unit and some elements of the user input unit may be integral while other input unit elements may be remote from the display unit. For example, the mobile tablet device 101 may comprise physical switches and buttons (not shown).

[0037] The mobile tablet device 101 further comprises: a rendering unit 107 employing a ray tracing engine 108; an image capture device 109, such as, for example, a camera or video camera; and an accelerometer 111. In embodiments, the mobile tablet device may comprise other suitable sensors (not shown).
[0038] The mobile tablet device may comprise a network unit 141 providing, for example, Wi-Fi, cellular, 3G, 4G, Bluetooth and/or LTE functionality, enabling network access to a network 151, such as, for example, the Internet or a local intranet. A server 161 may be connected to the network 151. The server 161 may be linked to a database 171 for storing data, such as models of furniture, finishes, floor coverings and colour swatches relevant to users of the mobile tablet device 101, users including, for example, architects, designers, technicians and draftspersons. In aspects, the actions described herein as being performed by the rendering unit may further or alternatively be performed outside the mobile tablet device by the server 161 on the network 151.
[0039] In aspects, one or more of the aforementioned components of the mobile tablet device 101 is in communication with, and remote from, the mobile tablet device 101.
[0040] Referring now to Fig. 2, an image capture device 201 is shown pointing generally toward an object 211 in a physical space. The image capture device 201 has its own coordinate system defined by X-, Y- and Z-axes, where the Z-axis is normal to the image capture device lens 203 and where the X-, Y- and Z-axes intersect at the centre of the image capture device lens 203. The image capture device 201 may capture an image which includes portions of at least some objects 211 having at least one known dimension, which fall within its field of view.
[0041] The field of view is defined by the view frustum, as shown. The view frustum is defined by: the focal length F along the image capture device Z-axis, and the lines emanating from the centre of the image capture device lens 203 at angles a to the Z-axis. On some image capture devices, including mobile tablet devices and mobile telephones, the focal length F and, by extension, the angles a, are fixed and known, or ascertainable. In certain image capture devices, the focal length F is variable but is ascertainable for a given time, including at the time of capture.
[0042] It will be appreciated that the rendering unit must reconcile multiple coordinate systems, as shown in Fig. 2, such as, for example, world coordinates, camera (or image capture device) coordinates, object coordinates and projection coordinates. The rendering unit is configured to model the image as a 3-dimensional (3D) space by assigning world coordinates to one or more points in the image of the physical space. The rendering unit assigns world coordinates to the one or more points in the image by generating a view transformation matrix that transforms points on the image to points having world coordinates, and vice versa. The rendering unit is operable to generate a view transformation matrix to model, for example, an image of a physical space appearing in 2D on the touchscreen of a mobile tablet device.
[0043] The rendering unit may apply the view transformation matrix to map user input gestures to the 3D model, and further to render design elements applied to the displayed image.
[0044] The view transformation matrix is expressed as the product of three matrices: VTM = N · T · R, where VTM is the view transformation matrix, N is a normalisation matrix, T is a translation matrix and R is a rotation matrix. In order to generate the view transformation matrix, the rendering unit must determine matrix elements through a calibration process, a preferred mode of which is shown in Fig. 3 and hereinafter described.
[0045] As shown in Fig. 3, at block 301, in a specific example, the user uses the image capture device to take a photograph, i.e., capture an image, of a physical space to which a design is to be applied. The application of a design may comprise, for example, removal and/or replacement of items of furniture, removal and/or replacement of floor coverings, revision of paint colours or other suitable design creations and modifications. The space may be generally empty or it may contain numerous items, such as, for example, furniture, people, columns and other obstructions, at the time the image is captured.
[0046] In Fig. 2, the image capture device is shown pointing generally downward in relation to world coordinates. In aspects, the rendering unit performs a preliminary query to the accelerometer while the image capture device is at rest in the capture position to determine whether the image capture device is angled generally upward or downward. If the test returns an upward angle, the rendering unit causes the display unit to display a prompt to the user to re-capture the photograph with the image capture device pointing generally downward.
[0047] At block 303, the rendering unit generates the normalisation matrix N, which normalises the view transformation matrix according to the focal length of the image capture device. As previously described, the focal length for a device having a fixed focal length is constant and known, or derivable from a constant and known half angle a. If the focal length of the image capture device is variable, the rendering unit will need to ascertain the focal length or angle of the image capture device for the time of capture. For a given half-angle a, the normalisation matrix N is defined as:

N = \begin{bmatrix} 1/\tan a & 0 & 0 & 0 \\ 0 & 1/\tan a & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

where a is the half angle of the image capture device's field of view.
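For illustration only, a minimal numpy sketch of building a normalisation matrix of this diagonal form follows; the function name, the degree-based argument convention and the example half-angle are assumptions added here and are not part of the described embodiments.

```python
import numpy as np

def normalisation_matrix(half_angle_deg: float) -> np.ndarray:
    """Build a 4x4 normalisation matrix N for a given half-angle a.

    The first two diagonal entries scale x and y by 1/tan(a), matching the
    diagonal form above; the remaining entries pass z and w through unchanged.
    """
    s = 1.0 / np.tan(np.radians(half_angle_deg))
    N = np.eye(4)
    N[0, 0] = s
    N[1, 1] = s
    return N

# Example: a device whose full field of view spans 2a = 60 degrees.
N = normalisation_matrix(30.0)
```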

[0048] At block 305, the rendering unit generates the rotation matrix R. The rotation matrix represents the orientation of the image capture device coordinates in relation to the world coordinates. The image capture device comprises an accelerometer configured for communication with the rendering unit, as previously described. The accelerometer provides an acceleration vector G, as shown in Fig. 2. When the image capture device is at rest, any acceleration which the accelerometer detects is solely due to gravity. The acceleration vector G corresponds to the degree of rotation of the image capture device coordinates with respect to the world coordinates. At rest, the acceleration vector G is parallel to the world space z-axis. The rendering unit can therefore assign the acceleration at rest as a proxy for the orientation in world coordinates of the image capture device.
[0049] The rendering unit derives unit vectors NX and NY from the acceleration vector G and the image capture device coordinate axes X and Y:

NX = \frac{Y \times G}{|Y \times G|}, \qquad NY = \frac{G \times NX}{|G \times NX|}

The resulting rotation matrix appears as follows:

R = \begin{bmatrix} NX.x & NY.x & G.x & 0 \\ NX.y & NY.y & G.y & 0 \\ NX.z & NY.z & G.z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
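The construction above can be sketched as follows; the choice of the device Y-axis vector, the normalisation of G, the sign conventions and the example gravity reading are assumptions made for illustration only, not details taken from the description.

```python
import numpy as np

def rotation_from_gravity(g: np.ndarray) -> np.ndarray:
    """Build a 4x4 rotation matrix R from the at-rest acceleration vector G.

    Follows the construction above: NX is the normalised cross product of the
    device Y axis with G, NY the normalised cross product of G with NX, and the
    columns of the upper-left 3x3 block are NX, NY and the normalised G.
    Assumes the device is not held with its Y axis exactly parallel to gravity.
    """
    g = g / np.linalg.norm(g)
    y_axis = np.array([0.0, 1.0, 0.0])   # image capture device Y axis (assumed)
    nx = np.cross(y_axis, g)
    nx /= np.linalg.norm(nx)
    ny = np.cross(g, nx)
    ny /= np.linalg.norm(ny)
    R = np.eye(4)
    R[:3, 0] = nx
    R[:3, 1] = ny
    R[:3, 2] = g
    return R

# Example: device pitched forward, gravity reading roughly 9.8 m/s^2 in magnitude.
R = rotation_from_gravity(np.array([0.0, -7.0, 6.9]))
```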
[0050] At block 307, the rendering unit generates the translation matrix T. The translation matrix accounts for the distance at which the image capture device coordinates are offset from the world coordinates. The rendering unit assigns an origin point O to the view space, as shown in Fig. 2. The assigned origin projects to the centre of the captured space, i.e., along the image capture device Z-axis toward the centre of the image captured by the image capture device at block 301. The rendering unit assumes that the origin is on the "floor" of the physical space, and assigns the origin a position in world coordinates of (0, 0, 0). The origin is assigned a world space coordinate on the floor of the space captured in the image so that the only possible translation is along the image capture device's Z-axis. The image capture device's translation direction is thereby defined along its Z-axis. The displacement along that axis is represented in the resulting translation matrix:

T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & D \\ 0 & 0 & 0 & 1 \end{bmatrix}

where D represents the distance, for the time of capture, between the image capture device and a reference point in the physical space.
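A hedged sketch of composing the view transformation matrix once all three factors are available is shown below; the distance D is assumed known here, although it is only determined by the calibration steps that follow, and the placement of D in the matrix mirrors the reconstruction above rather than a convention stated in the description.

```python
import numpy as np

def translation_matrix(d: float) -> np.ndarray:
    """4x4 translation matrix T with the offset D along the image capture device Z-axis."""
    T = np.eye(4)
    T[2, 3] = d
    return T

def view_transformation_matrix(N: np.ndarray, T: np.ndarray, R: np.ndarray) -> np.ndarray:
    """Compose VTM = N * T * R, as stated in paragraph [0044]."""
    return N @ T @ R

# Example (reusing the earlier sketches, with an assumed distance D of 1.5 metres):
# VTM = view_transformation_matrix(normalisation_matrix(30.0),
#                                  translation_matrix(1.5),
#                                  rotation_from_gravity(np.array([0.0, -7.0, 6.9])))
```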
[0051] The value for D is initially unknown and must be ascertained through calibration or other suitable technique. At block 309, the rendering unit causes a reticule 401 to be overlaid on the image using the display unit, as shown in Fig. 4. The rendering unit initially causes display of the reticule to correspond to a default orientation, size and location, as shown. For example, the default size may be 6 feet, as shown at prompt 403.
[0052] As shown in Figs. 3 and 4, at block 311 a prompt 403 is displayed on the display unit to receive from the user a reference dimension and orientation of a line corresponding to a visible feature within the physical space. For example, as illustrated in Figs. 3 and 4, the dimension of the line may correspond to the distance in world coordinates between the floor and the note paper 405 on the dividing wall 404. In aspects, a visible door, bookcase or desk sitting on the floor having a height known to the user could be used as visible features. The user selects the size of the reference line by scrolling through the dimensions listed in the selector 403; the rendering unit will then determine that the reticule corresponds to the size of the reference line.
[0053] As shown in Fig. 4, the reticule 401 is not necessarily initially displayed in alignment with the reference object. As shown in Fig. 3, at blocks 313 to 317, the rendering unit receives from the user a number of inputs described hereinafter to align the reticule 401. At block 313, the rendering unit receives a user input, such as, for example, a finger gesture or single finger drag gesture, or other suitable input technique to translate the reticule to a new position. At block 315, the rendering unit determines the direction that a ray would take if cast from the image capture device to the world space coordinates of the new position by applying to the user input gesture the inverse of the previously described view transformation matrix. The rendering unit then determines the x- and y-coordinates in world space where the ray would intersect the floor (z=0). The rendering unit applies those values to calculate the following translation matrix that would bring the reticule to the position selected by the user:

T_{reticule} = \begin{bmatrix} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

where x and y are the world space coordinates at which the ray intersects the floor.
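The ray cast of block 315 might be sketched as follows, assuming the touch point has already been converted to normalised device coordinates and that the view transformation matrix admits a projective inverse; both assumptions, and all names, are illustrative rather than drawn from the description.

```python
import numpy as np

def touch_to_floor(vtm: np.ndarray, ndc_x: float, ndc_y: float) -> np.ndarray:
    """Map a touch point (in normalised device coordinates) to the floor plane z = 0.

    The inverse view transformation matrix turns two points on the touched ray
    (near and far) back into world coordinates; the ray between them is then
    intersected with the floor plane.
    """
    inv = np.linalg.inv(vtm)
    near = inv @ np.array([ndc_x, ndc_y, 0.0, 1.0])
    far = inv @ np.array([ndc_x, ndc_y, 1.0, 1.0])
    near, far = near[:3] / near[3], far[:3] / far[3]
    direction = far - near
    t = -near[2] / direction[2]          # ray parameter where z reaches 0
    return near + t * direction

def reticule_translation(floor_point: np.ndarray) -> np.ndarray:
    """Translation matrix that moves the reticule to the selected floor point."""
    T = np.eye(4)
    T[0, 3], T[1, 3] = floor_point[0], floor_point[1]
    return T
```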
[0054] At block 317, the rendering unit rotates the reticule in response to a user input, such as, for example, a two-finger rotation gesture or other suitable input. The rendering unit rotates the reticule about the world-space z-axis by angle θ by applying the following local reticule rotation matrix:

R_{reticule} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

[0055] The purpose of this rotation is to align the reticule along the base of the reference object, as shown in Fig. 5, so that the orientation of the reticule is aligned with the orientation of the reference line. When the user has aligned the reticule 501 with the reference object 511, its horizontal fibre 503 is aligned with the intersection between the reference object 511 and the floor 521, its vertical fibre 507 extends vertically from the floor 521 toward, and intersecting with, the reference point 513, and its normal fibre 505 extends perpendicularly from the reference object 511 along the floor 521.
[0056] Recalling that the reticule was initially displayed as a default size to which the user assigned a reference dimension, as previously described, it will be appreciated that the initial height of the marker 509 may not necessarily correspond to the world coordinate height of the reference point 513 above the floor 521. Referring again to Fig. 3, at block 319, the rendering unit responds to further user input by increasing or decreasing the height of the marker 509. In aspects, the user adjusts the height of the vertical fibre 507 to align the marker 509 with the reference point 513 by, for example, providing a suitable touch gesture to slide a slider 531, as shown, so that the vertical fibre 507 increases or decreases in height as the user moves the slider bead 533 up or down, respectively; however, other input methods, such as arrow keys on a fixed keyboard, or mouse inputs could be implemented to effect the adjustment. Once the user has aligned the marker 509 of the reticule 501 with the reference point 513, the rendering unit can use the known size, location and orientation of each of the reticule and the line to solve the view transformation matrix for the element D. A fully calibrated space is shown in Fig. 6.
[0057] Once the rendering unit has determined the view transformation matrix, the rendering unit may begin receiving design instructions from the user and applying those changes to the space.
[0058] In further aspects, other calibration techniques may be performed instead of, or in addition to, the calibration techniques described above. In at least one aspect, the rendering unit first determines that a user has placed the image capture device on the floor of the physical space. Once the image capture device is at rest on the floor, the user lifts it into position to capture the desired image of the physical space. As the user moves the device into the capture position, the rendering unit determines the distance from the image capture device to the floor based on the acceleration of the image capture device. For example, the rendering unit calculates the double integral of the acceleration vector over the elapsed time between the floor position and the capture position to return the displacement of the image capture device from the floor to the capture position. The accelerometer also provides the image capture device angle with respect to the world coordinates to the rendering unit once the image capture device is at rest in the capture position, as previously described. With the height, focal length, and image capture device angle with respect to world coordinates known, the rendering unit has sufficient data to generate the view transformation matrix.
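A simplified numerical sketch of the double integration is shown below; it assumes vertical acceleration samples with gravity already removed and a fixed sampling interval, neither of which the description specifies, and the names and example values are illustrative.

```python
import numpy as np

def height_from_acceleration(samples: np.ndarray, dt: float) -> float:
    """Estimate the lift height of the device from accelerometer samples.

    'samples' holds vertical acceleration (gravity already subtracted) taken at
    a fixed interval of 'dt' seconds, from the moment the device leaves the
    floor until it comes to rest in the capture position. Integrating once
    gives velocity; integrating again gives displacement, i.e. the camera height.
    """
    velocity = np.cumsum(samples) * dt          # first integral: m/s
    displacement = np.cumsum(velocity) * dt     # second integral: metres
    return float(displacement[-1])

# Example: roughly one second of lift sampled at 100 Hz (synthetic values).
accel = np.concatenate([np.full(50, 2.0), np.full(50, -2.0)])  # accelerate, then decelerate
print(height_from_acceleration(accel, dt=0.01))                # approximately 0.5 m
```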
[0059] In still further aspects, the height of the image capture device in the capture position is determined by querying from the user the user's height. The rendering unit assumes that the image capture device is located a distance below the user's height, such as, for example, 4 inches, and uses that location as the height of the image capture device in the capture position.
[0060] Alternatively, the rendering unit queries from the user an estimate of the height of the image capture device from the floor.
[0061] It will be appreciated that the rendering unit may also default to an average height off the ground, such as, for example, 5 feet, if the user does not wish to assist in any of the aforementioned calibration techniques.

[0062] In aspects, the user may wish to apply a new flooring design to the image of the space, as shown in Fig. 8; however, the captured image of the space may comprise obstacles, such as, for example, the chair 811 and table 813 shown. If the user would like to view a rendering of the captured space without the obstacles, the rendering unit needs to space fill the regions on the floor where the obstacles formerly stood.
[0063] In Fig. 7, a flowchart illustrates a method for space filling regions of the captured image. At block 701, the user selects a sample region to replicate across a desired region in the captured image. As shown in Fig. 8, the rendering unit causes the display unit to display a square selector 801 mapped to the floor 803 of the captured space. The square selector 801 identifies the sample region of floor 803. In aspects, the square selector 801 is semi-transparent to simultaneously illustrate both its bounding area and the selected pattern, as shown. In alternate embodiments, however, the square selector 801 may be displayed as a transparent region with a defined border (not shown). In further aspects, a viewing window 805 is provided to display the selected area to the user at a location on the display unit, as shown. The rendering unit translates and flattens the pixels of the sample region bounded by the square selector 801 into the viewing window 805, and, in aspects, updates the display in real-time according to the user's repositioning of the selector.
[0064] The user may: move the selector 801 by dragging a finger or cursor over the display unit; rotate the selector 801 using two-finger twisting input gestures or other suitable input; and/or scale the selector 801 by using, for example, a two-finger pinch. As shown in Fig. 7 at block 703, the rendering unit applies the following local scaling matrix to scale the selector:
S = \begin{bmatrix} s & 0 & 0 & 0 \\ 0 & s & 0 & 0 \\ 0 & 0 & s & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

[0065] Once the user has selected a sample region to replicate, the user defines a target region in the image of the captured space to which to apply the pattern of the sample region, at block 705 shown in Fig. 7. The rendering unit causes a closed, non-self-intersecting vector-based polygon 901 to be displayed on the display unit, as shown in Fig. 9. In order to ensure that the polygon 901 always defines an area, rather than a line, the polygon 901 comprises at least three control points 903. The user may edit the polygon 901 by adding, removing and moving the control points 903 using touch gestures or other suitable input methods, as described herein. In aspects, the control points 903 can be moved individually or in groups. In still further aspects, the control points 903 may be snapped to the edges and corners of the captured image, providing greater convenience to the user.
[0066] After the user has finished configuring the polygon, the rendering unit applies the pattern of the selected region to the selected target region, as shown in Fig. 7 at blocks 707 and 709. At block 707, the rendering unit generates a tileable representation of the pattern in the sample region, using suitable techniques, such as, for example, the Poisson gradient-guided blending technique described in Patrick Perez, Michel Gangnet, and Andrew Blake. 2003. Poisson image editing. ACM SIGGRAPH 2003 Papers (SIGGRAPH '03). ACM, New York, NY, USA, 313-318. Given a rectangular sample area, such as the area bounded by the selector 801 shown in Fig. 8, the rendering unit generates a tileable, i.e., repeatable, representation of the sample region by setting periodic boundary values on its borders. In aspects, the rendering unit enforces identical boundaries for all four sides of the square sample region. When the tileable representation is replicated, as described below, the replicated tiles will thereby share identical boundaries with adjacent tiles, reducing the apparent distinction between tiles.
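The full Poisson gradient-guided solve referenced above is beyond a short sketch; the following simplified stand-in only illustrates the idea of enforcing identical boundary values on all four sides of the square sample and should not be read as the blending technique actually employed.

```python
import numpy as np

def make_tileable(sample: np.ndarray) -> np.ndarray:
    """Force identical boundary values on all four sides of a square sample.

    A simplified stand-in for the Poisson gradient-guided blend: each pixel is
    blended toward a single shared boundary value, weighted by its proximity to
    the border, so that every tile edge matches every other edge when replicated.
    """
    tile = sample.astype(np.float64).copy()
    h, w = tile.shape[:2]
    # One shared boundary value: the mean of all border pixels.
    border = np.concatenate([tile[0], tile[-1], tile[:, 0], tile[:, -1]])
    target = border.mean(axis=0)
    # Blend each pixel toward the target, weighted by proximity to the border.
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    dist = np.minimum.reduce([yy, h - 1 - yy, xx, w - 1 - xx]) / max(h, w)
    weight = np.clip(1.0 - dist * 4.0, 0.0, 1.0)   # 1 at the borders, 0 well inside
    if tile.ndim == 3:
        weight = weight[..., None]
    return tile * (1.0 - weight) + target * weight
```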
[0067] At block 709, the rendering unit replicates the tile across the target area by applying rasterisation, such as, for example, the OpenGL rasteriser, and applying the existing view transformation matrix to the vector-based polygon 901, shown in Fig. 9, using a tiled texture map consisting of repeated instances of the tileable representation described above.
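A hedged sketch of the replication step follows; it stands in for the OpenGL rasteriser with a plain modulo texture lookup and assumes a per-pixel map of reference-plane coordinates (here called floor_uv) has already been produced from the view transformation matrix. The names and array layouts are assumptions for illustration.

```python
import numpy as np

def tile_region(image: np.ndarray, tile: np.ndarray, mask: np.ndarray,
                floor_uv: np.ndarray) -> np.ndarray:
    """Replicate 'tile' across every pixel of 'image' selected by 'mask'.

    'floor_uv' gives, per pixel, the reference-plane (floor) coordinates; taking
    them modulo 1 mimics a repeating texture lookup such as GL_REPEAT in an
    OpenGL rasteriser.  Shapes assumed: image (H, W, 3), tile (th, tw, 3),
    mask (H, W) boolean, floor_uv (H, W, 2).
    """
    out = image.copy()
    th, tw = tile.shape[:2]
    ys, xs = np.nonzero(mask)
    u = np.mod(floor_uv[ys, xs, 0], 1.0)
    v = np.mod(floor_uv[ys, xs, 1], 1.0)
    out[ys, xs] = tile[(v * (th - 1)).astype(int), (u * (tw - 1)).astype(int)]
    return out
```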
[0068] In aspects, the rendering unit enhances the visual accuracy of the modified image by generating ambient occlusion for the features depicted therein, as shown in Fig. 7 at blocks 711 to 719. The rendering unit generates the ambient occlusion in cooperation with a ray tracing engine. At block 719, the rendering unit receives from the ray tracing engine ambient occlusion values, which it blends with the rasterised floor surface. In aspects, the rendering unit further enhances visual appeal and realism by blending the intersections of the floor and the walls from the generated image with those shown in the originally captured image.
[0069] At block 711, the rendering unit infers that the polygon 901, as shown in Fig. 9, represents the floor of the captured space, and that objects bordering the target region are wall surfaces. The rendering unit applies the view transformation matrix to determine the world space coordinates corresponding to the display coordinates of the polygon 901 and generates a 3D representation of the space by extruding virtual walls perpendicularly from the floor along the edges of the polygon 901. As shown in Fig. 7 at block 713, the rendering unit provides the resulting virtual geometries to a ray tracing engine. The rendering unit only generates virtual walls that meet the following conditions: the virtual walls must face toward the inside of the polygon 901, and the virtual walls must face the user, i.e., towards the image capture device. These conditions are necessary to ensure that the rendering unit does not extrude virtual walls that would obscure the rendering, as will be appreciated below.
[0070] The rendering unit determines the world coordinates of the bottom edge of a given virtual wall by projecting two rays from the image capture device to the corresponding edge of the target area. The rays provide the world space x and y coordinates for the virtual wall where it meets the floor, i.e., at z=0. The rendering unit determines the height for the given virtual wall by projecting a ray through a point on the upper border of the display unit directly above the display coordinate of one of the end points of the corresponding edge. The ray is projected along a plane that is perpendicular to the display unit and that intersects the world coordinate of the end point of the corresponding edge. The rendering unit calculates the height of the virtual wall as the distance between the world coordinate of the end point and the world coordinate of the ray directly above the end point.
[0071] In cooperation with the rendering unit, the ray tracing engine generates an ambient occlusion value for the floor surface. At block 713, the rendering unit transmits the virtual geometry generated at block 711 to the ray tracing engine. At block 715, the ray tracing engine casts shadow rays from a plurality of points on the floor surface toward a vertical hemisphere. For a given point, any ray emanating therefrom which hits one of the virtual walls represents ambient lighting that would be unavailable to that point. The proportion of shadow rays from the given point that would hit a wall to the shadow rays that would not hit a wall is a proxy for the level of ambient light at the given point on the floor surface.
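The shadow-ray test of block 715 might look like the following sketch, in which the extruded virtual walls are represented, purely for illustration, as bounded planes given by a point, a normal and a top height; this is an assumed simplification, not the ray tracing engine of the described embodiments.

```python
import numpy as np

def ambient_occlusion(point: np.ndarray, walls, n_rays: int = 64, rng=None) -> float:
    """Fraction of hemisphere shadow rays from 'point' that reach open sky.

    'walls' is a list of (origin, normal, height) triples of numpy arrays and
    floats describing the extruded virtual wall planes.  Rays are sampled over
    the upward hemisphere; a ray that would strike a wall below its top edge
    counts as blocked ambient light.
    """
    rng = rng or np.random.default_rng(0)
    hits = 0
    for _ in range(n_rays):
        d = rng.normal(size=3)
        d[2] = abs(d[2])                  # keep the ray in the upper hemisphere
        d /= np.linalg.norm(d)
        for origin, normal, height in walls:
            denom = d @ normal
            if abs(denom) < 1e-9:
                continue
            t = ((origin - point) @ normal) / denom
            if t > 0 and (point + t * d)[2] <= height:
                hits += 1
                break
    return 1.0 - hits / n_rays            # 1.0 = fully lit, lower = more occluded
```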
[0072] Because the polygon may only extend to the borders of the display unit, any virtual walls extruded from the edges of the polygon will similarly only extend to the borders of the display unit. However, this could result in unrealistic brightening during ray tracing, since the shadow rays cast from points on the floor space toward the sides of the display unit will not encounter virtual walls past the borders. Therefore, in aspects, the rendering unit extends the virtual walls beyond the borders of the display unit in order to reduce the unrealistic brightening.
[0073] In aspects, the ray tracing engine further enhances the realism of the rendered design by accounting for colour bleeding, at block 717. The ray tracing engine samples the colour of the extruded virtual walls at the points of intersection of the extruded virtual walls with all the shadow rays emanating from each point on the floor. For a given point on the floor, the ray tracing engine calculates the average of the colour of all the points of intersection for that point on the floor. The average provides a colour of virtual light at that point on the floor.
[0074] In further aspects, the ray tracing engine favours generating the ambient occlusion for the floor surface, not the extruded virtual geometries. Therefore, the ray tracing engine casts primary rays without testing against the extruded geometry; however, the ray tracing engine tests the shadow rays against the virtual walls of the extruded geometry. This simulates the shadow areas of low illumination typically encountered where the virtual walls meet the floor surface.
[0075] It will be appreciated that ray tracing incurs significant computational expense. In aspects, the ray tracing engine reduces this expense by calculating the ambient occlusion at a low resolution, such as, for instance, at 5 times lower resolution than the captured image. The rendering unit then scales the ambient occlusion obtained at the lower resolution up to the original resolution. In areas where the ambient occlusion is highly variable from one sub-region to the next, the rendering unit applies a bilateral blurring kernel to prevent averaging across dissimilar sub-regions.
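A sketch of the scale-up and edge-aware blur follows; nearest-neighbour upscaling, the wrap-around borders of np.roll and the particular kernel sizes are simplifications assumed here rather than details taken from the description.

```python
import numpy as np

def upscale_ao(ao_low: np.ndarray, scale: int) -> np.ndarray:
    """Nearest-neighbour upscaling of a low-resolution ambient occlusion map."""
    return np.kron(ao_low, np.ones((scale, scale)))

def bilateral_blur(ao: np.ndarray, guide: np.ndarray,
                   radius: int = 2, sigma_s: float = 2.0, sigma_r: float = 0.1) -> np.ndarray:
    """Edge-aware blur of 'ao' guided by a greyscale image 'guide' of the same shape.

    Spatial weights smooth nearby pixels; range weights (differences in the
    guide image) stop the blur from averaging across dissimilar sub-regions.
    Borders wrap around via np.roll, which is acceptable for a sketch.
    """
    out = np.zeros_like(ao)
    norm = np.zeros_like(ao)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted_ao = np.roll(ao, (dy, dx), axis=(0, 1))
            shifted_g = np.roll(guide, (dy, dx), axis=(0, 1))
            ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
            wr = np.exp(-((guide - shifted_g) ** 2) / (2 * sigma_r ** 2))
            out += ws * wr * shifted_ao
            norm += ws * wr
    return out / norm
```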
[0076] As shown in Fig. 10, the systems and methods described herein for space filling regions of the captured image provide an approximation of the captured space with the obstacles removed.
[0077] Although the invention has been described with reference to certain specific embodiments, various modifications thereof will be apparent to those skilled in the art. The scope of the claims should not be limited by the preferred embodiments, but should be given the broadest interpretation consistent with the description as a whole.

Claims (20)

What is claimed is:
1. A system for assigning world coordinates to at least one point in an image of a physical space captured at a time of capture by an image capture device, the system comprising a rendering unit configured to:
ascertain, for the time of capture, a focal length of the image capture device;
determine, in world coordinates, for the time of capture, an orientation of the image capture device;
determine, in world coordinates, for the time of capture, a distance between the image capture device and a reference point in the physical space; and generate a view transformation matrix comprising matrix elements determined by the focal length, the orientation and the distance to enable transformation between the coordinate system of the image and the world coordinates.
2. The system of claim 1, wherein the system is configured to space fill regions of the image, the rendering unit being further configured to:
select, based on user input, a sample region in the image;
map the sample region to a reference plane;
generate a tileable representation of the sample region;
select, based on user input, a target region in the reference plane; and replicate the tileable representation of the sample region across the target region.
3. The system of claim 1, wherein the rendering unit is configured to determine the distance between the image capture device and the reference point by:
causing a reticule to be overlaid on the image using a display unit;
obtaining from a user by a user input device the known length and orientation in world coordinates of a line corresponding to a captured feature of the physical space;
adjusting the location and size of the reticule with respect to the image in response to user input provided by the user input device;

obtaining from the user by the user input device an indication that the reticule is aligned with the line; and determining the distance from the image capture device to the reference point, based on the size and orientation of the reticule and the size and orientation of the line.
4. The system of claim 1, wherein the rendering unit is configured to determine the distance from the image capture device to the reference point by:
determining that a user has placed the image capture device on a reference plane;
determining the acceleration of the image capture device as the user moves the image capture device from the reference plane to an image capture position;
deriving the distance of the image capture device from the reference plane from the acceleration; and determining the distance between the image capture device and the reference point, based on the focal length of the image capture device and the distance of the image capture device from the reference plane.
5. The system of claim 1, wherein the rendering unit is configured to determine the distance from the image capture device to the reference point by requesting user input of an estimated distance from the image capture device to a reference plane.
6. The system of claim 1, wherein the rendering unit determines the orientation in world coordinates of the image capture device by:
obtaining acceleration of the image capture device from an accelerometer of the image capture device;
determining from the acceleration when the image capture device is at rest;
and assigning the acceleration at rest as a proxy for the orientation in world coordinates of the image capture device.
7. The system of claim 2, wherein the rendering unit generates the tileable representation of the sample region by using a Poisson gradient-guided blending technique.
8. The system of claim 7, wherein the tileable representation of the sample region comprises four sides and the rendering unit enforces identical boundaries for all four sides of the tileable representation of the sample region.
9. The system of claim 2, wherein the rendering unit replicates the tileable representation of the sample region across the target area by applying rasterisation.
10. The system of claim 2, wherein the rendering unit generates ambient occlusion for the target area.
11. A method for assigning world coordinates to at least one point in an image of a physical space captured at a time of capture by an image capture device, the method comprising:
a rendering unit:
ascertaining, for the time of capture, a focal length of the image capture device;
determining, in world coordinates, for the time of capture, an orientation of the image capture device;
determining, in world coordinates, for the time of capture, a distance between the image capture device and a reference point in the physical space; and generating a view transformation matrix comprising matrix elements determined by the focal length, the orientation and the distance to enable transformation between the coordinate system of the image and the world coordinates.
12. The method of claim 11 for space filling regions of the image, the method comprising:
the rendering unit further:
selecting, based on user input, a sample region;
mapping the sample region to a reference plane;
generating a tileable representation of the sample region;
selecting, based on user input, a target region in the reference plane; and replicating the tileable representation of the sample region across the target region.
13. The method of claim 11, wherein the rendering unit determines the distance between the image capture device and the reference point by:
causing a reticule to be overlaid on the image using a display unit;
obtaining from a user by a user input device the known length and orientation in world coordinates of a line corresponding to a captured feature of the physical space;

adjusting the location and size of the reticule with respect to the image in response to user input on the user input device;
obtaining from the user by the user input device an indication that the reticule is aligned with the line; and determining the distance from the image capture device to the reference point, based on the size and orientation of the reticule and the size and orientation of the line.
14. The method of claim 11, wherein the rendering unit determines the distance from the image capture device to the reference point by:
determining that a user has placed the image capture device on a reference plane;
determining the acceleration of the image capture device as the user moves the image capture device from the reference plane to an image capture position;
deriving the distance of the image capture device from the reference plane from the acceleration; and determining the distance between the image capture device and the reference point, based on the focal length of the image capture device and the distance of the image capture device from the reference plane.
15. The method of claim 11, wherein the rendering unit is configured to determine the distance from the image capture device to the reference point by requesting user input of an estimated distance from the image capture device to a reference plane.
16. The method of claim 11, wherein the rendering unit determines the orientation in world coordinates of the image capture device by:
obtaining acceleration of the image capture device from an accelerometer of the image capture device;
determining from the acceleration when the image capture device is at rest;
and assigning the acceleration at rest as a proxy for the orientation in world coordinates of the image capture device.
17. The method of claim 12, wherein the rendering unit generates the tileable representation of the sample region by using a Poisson gradient-guided blending technique.
18. The method of claim 17, wherein the tileable representation of the sample region comprises four sides and the rendering unit enforces identical boundaries for all four sides of the tileable representation of the sample region.
19. The method of claim 12, wherein the rendering unit replicates the tileable representation of the sample region across the target area by applying rasterisation.
20. The method of claim 12, further comprising the rendering unit generating ambient occlusion for the target area.
CA2860316A 2014-08-21 2014-08-21 System and method for space filling regions of an image Abandoned CA2860316A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA2860316A CA2860316A1 (en) 2014-08-21 2014-08-21 System and method for space filling regions of an image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA2860316A CA2860316A1 (en) 2014-08-21 2014-08-21 System and method for space filling regions of an image

Publications (1)

Publication Number Publication Date
CA2860316A1 true CA2860316A1 (en) 2016-02-21

Family

ID=55362008

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2860316A Abandoned CA2860316A1 (en) 2014-08-21 2014-08-21 System and method for space filling regions of an image

Country Status (1)

Country Link
CA (1) CA2860316A1 (en)

Similar Documents

Publication Publication Date Title
US10325399B2 (en) Optimal texture memory allocation
US20210141965A1 (en) Semantic understanding of 3d data
AU2020101113A4 (en) A floorplan visualisation system
DK2828831T3 (en) Point and click lighting for image-based lighting surfaces
CN105678837B (en) Dynamic graphics interface shade
KR20150076119A (en) Diminished reality
US6975334B1 (en) Method and apparatus for simulating the appearance of paving stone on an existing driveway
US20240062345A1 (en) Method, apparatus, and computer-readable medium for foreground object deletion and inpainting
US20160055641A1 (en) System and method for space filling regions of an image
CA2860316A1 (en) System and method for space filling regions of an image
KR20230074591A (en) Floorplan visualization system
AU2021107245A4 (en) A Room and Area Visualisation System
KR20120050236A (en) Construct model and method for decorating the same
WO2023023778A1 (en) A room or area visualisation system
JP2024112624A (en) Information processing device, method, and program
JP2023039885A (en) Installation simulation program, installation simulation method and installation simulation device
Blatner TangiPaint: Interactive tangible media
Troncoso et al. Three dimensional world construction with LEGO blocks

Legal Events

Date Code Title Description
FZDE Dead

Effective date: 20180821