WO2018039936A1 - Fast uv atlas generation and texture mapping - Google Patents

Fast uv atlas generation and texture mapping

Info

Publication number
WO2018039936A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
color information
images
texture
points
Prior art date
Application number
PCT/CN2016/097395
Other languages
French (fr)
Inventor
Guojun Chen
Xin Tong
Yue DONG
Original Assignee
Microsoft Technology Licensing, LLC
Priority date
Filing date
Publication date
Application filed by Microsoft Technology Licensing, LLC
Priority to PCT/CN2016/097395
Publication of WO2018039936A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping

Definitions

  • This disclosure relates generally to image processing, and more specifically to texture mapping three dimensional images or models.
  • texture may be abstracted from two dimensional (2D) or 3D image data and mapped to 2D space, for example in a known process called UV texture mapping.
  • This process may take, for example, multiple images of the same object (e.g., captured by a camera or image sensor from various relative locations around the object) and combine them into a single unified 2D image or a 3D model.
  • UV texture mapping may include converting image data, such as in the form of color information associated with points of the image or pixels, in one coordinate system (e.g., Cartesian coordinates x and y) , to a different coordinate system or space, such as UV space, to create a UV texture atlas.
  • a method for applying texture to a model based on a plurality of images may include: generating texture coordinates from a plurality of images, wherein the texture coordinates comprise a plurality of points corresponding to pixels in the plurality of images; for at least one point of the plurality of points, obtaining first color information from a corresponding pixel of a first image of the multiple images, obtaining second color information from the corresponding pixel of a second image of the multiple images, and combining the first color information and the second color information in the first image and the second image based on a number of neighboring pixels associated with the first image and the second image; and applying the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a model generated from the multiple images.
  • FIG. 1 depicts an example diagram of multiple camera positions positioned relative to a three dimensional (3D) object.
  • FIG. 2 depicts an example diagram of a process for texture mapping a 3D model.
  • FIG. 3 depicts an example diagram of a texture map generated from overlapping images.
  • FIGS. 4A and 4B depict example processes for generating a texture map from a 3D model or 3D scanned image data.
  • FIG. 5 depicts an example process for generating texture coordinates.
  • FIG. 6 depicts an example process for combining color information from multiple images.
  • FIG. 7 depicts an example process for texture atlas packing or compression.
  • FIGS. 8A and 8B depict an example process for selecting a best frame from which to obtain pixel information for combining color information to produce a texture atlas.
  • FIG. 9 depicts an example general purpose computing environment in which the techniques described herein may be embodied.
  • the described techniques may generate color texture for 3D scanned objects faster and/or more efficiently than known techniques. Unlike traditional UV atlas generation that needs to compute vertex neighborhood information, the described techniques may take advantage of an image sequence that includes color information, such as a scanned RGB image sequence, and perform all or a majority of the neighborhood computation in the image domain, significantly speeding up the UV atlas generation and texture mapping process.
  • artifacts caused by the inaccuracy of registration between the reconstructed mesh and the input frames may be removed by blending on the texture boundaries.
  • the resulting texture map may be packed or compressed to reduce the amount of memory needed to store the texture map, resources to communicate the texture map, etc.
  • the described techniques may include one or more of the following three main components.
  • the first component may include texture coordinate generation, which includes creating texture coordinates for every vertex of one or more images by considering the normal directions. Given the model and camera poses, each triangle of a mesh representing a 3D object, for example, may be projected to the camera direction that maximizes the projected area (i.e., has the largest Normal dot View-direction value) , and the projection position is taken as the texture coordinate.
  • the second component may include color texture mapping, which includes computing the color texture for each pixel of the texture map. With the determined texture coordinate, each surface point can be mapped to one pixel on the texture map. Computing the color for each pixel of the surface map may then result in assigning color for each surface point.
  • the 3D scan process may produce a rough registration between the reconstructed mesh and the RGB image sequences. For every surface point, RGB color information (or other color information format) from each frame may be obtained. However, due to the inaccurate registration, the RGB color values, such as for neighboring pixels, may not be consistent with each other. Neither blending all the views nor selecting only one frame may produce artifact-free results. In one aspect, the RGB color from the best view frame may be selected. Blending may be performed around the edges where neighboring pixels get their RGB color from different frames (e.g., at frame or image boundaries) . Instead of performing the neighborhood blend on the UV space (e.g., after conversion to UV space) , such blending may be performed on each measured frame. The blended color information may then be projected back to the UV space. By performing blending this way, complicated computation that handles UV atlas chart boundaries may be avoided, and the process for color texture mapping may be performed more quickly.
  • RGB color information or other color information format
  • the third component may include reducing the texture image size, for example, by slicing the texture image into smaller patches and removing those patches not used for any surface point.
  • FIG. 1 depicts an example diagram 100 of multiple camera positions positioned relative to a three dimensional (3D) object.
  • four cameras 110, 120, 130, and 140 are positioned around 3D object 105, which is a globe.
  • Each camera or camera position 110, 120, 130, and 140 is associated with a pose that includes a position and orientation.
  • pose may be represented by or include a camera or image sensor view direction 115, 125, 135, or 145. Images captured from the image sensors 110, 120, 130, and 140 may be combined to produce a 3D model of object 105, according to various techniques.
  • images or frames captured by cameras 110, 120, 130, and 140 may be combined to produce texture coordinates for the 3D object 105.
  • the texture coordinates may be used to map color information from the images to the 3D model to simulate or apply color texture to the 3D model.
  • the multiple images may be stitched together to generate a mesh structure that represents the 3D object 105 in 3D space.
  • the mesh structure may include a number of connected triangles.
  • the texture coordinates may be determined for every vertex of the mesh generated from the images by considering the normal directions from each vertex.
  • Each triangle of the combined image/3D model may be projected to the camera direction 115, 125, 135, or 145 that maximizes the projected area (i.e., has the largest Normal dot View-direction value) .
  • the projection position may then be used as the texture coordinate for each of the triangles.
  • any number of cameras or image sensors 110, 120, 130, or 140 may be used to capture any number of images or frames to be combined to form a 3D model of 3D object 105.
  • This disclosure assumes that all images are captured with similar exposure settings; however, it should be appreciated that with some pre-processing, the described techniques may be applied to images having different exposure settings. It should also be appreciated that only one camera or image may be used to capture multiple images of 3D object 105, for example, by moving the camera to different positions relative to the 3D object 105.
  • FIG. 2 depicts an example diagram 200 of a process for texture mapping a 3D model.
  • a 3D model 205 for example of globe 105, may be represented in Cartesian coordinates, such as having x, y, and z dimensions.
  • the 3D model 205 may also be represented in texture space, such as defined by UV coordinates, or other two dimensional space, for example depicted as UV image or map 210.
  • UV coordinates of image 210 may define the surface of 3D object 205.
  • the surface of 3D model 215 may be UV unwrapped into 2D image 220, which may provide a contiguous, spatially coherent image/coordinates for texture mapping, at operation 230.
  • Texture information which may include color information in the form of RGB values, may be obtained, for example, via the techniques described herein, and may be represented by texture image 225.
  • the texture image 225 which may be obtained from multiple images captured relative to 3D object 215, such as depicted in FIG. 1, may be mapped to UV space (e.g., onto image/coordinates 220) , at operation 235.
  • the texture/UV map 220 may then be applied to the surface of 3D model 215 to generate a 3D model 215 having texture information, at operation 240.
  • FIG. 3 depicts an example diagram 300 of a texture map generated from different images.
  • Diagram 300 includes an image or texture map 305, for example of North America, taken from 3D object 105, 215.
  • Texture map 305 represents the result of combining multiple frames or images 310, 315, 320, and 325, for example, according to the described techniques. As illustrated, certain images or frames 310, 315, 320, and 325 may overlap. In this scenario, combining these frames to produce texture map 305 may be difficult to accomplish: it may require significant computing resources via known UV atlas generation techniques, and/or it may produce unwanted artifacts in the output texture map 305.
  • color information taken from pixels in the overlap regions may be first combined in the image space, and then may be mapped to UV space. This may eliminate the need for complex UV atlas combination techniques.
  • the combining may include weighting different pixels in and/or proximate to the overlap regions based on which image the neighboring pixels are associated with. For example, color information of pixels associated with area 330 may be associated with multiple images or frames 310, 315, and 320. In order to blend the color information sourced from frames 310, 315, and 320, more weight may be given to color information taken from an image with more neighboring pixels also coming from that image.
  • pixels containing color information that are located in the top right hand corner of area 330 may be more consistent with other color information taken from pixels also sourced from the same image.
  • color information of pixels in the approximate center of area 330 may be equally proximate to pixels sourced from images 310 and 315, and to a lesser degree, image 320.
  • FIG. 4A depicts an example process 400a for generating a texture map from a 3D model or 3D scanned image data.
  • Process 400 may begin at operation 405, where texture coordinates may be generated from a plurality of images.
  • Operation 405 may include creating texture coordinates for every vertex of one or more images by considering the normal directions.
  • Given model and camera pose information (e.g., obtained from a system 100) , each triangle of a mesh representing a 3D object, for example, may be projected to the camera direction, for example, to maximize the projected area (e.g., has the largest Normal dot View-direction value) .
  • the projection positions may then be used as the texture coordinates.
  • Operation 410 may include computing the color texture for each pixel of the texture map. With the determined texture coordinate, each surface point can be mapped to one pixel on the texture map. Computing the color for each pixel of the surface map may then result in assigning color for each surface point.
  • the 3D scan process may produce a rough registration between the reconstructed mesh and the RGB image sequences. For every surface point, RGB color information (or other color information format) from each frame may be obtained. However, due to the inaccurate registration, the RGB color values, such as for neighboring pixels, may not be consistent with each other.
  • the RGB color from the best view frame may be selected. Blending may be performed around the edges where neighboring pixels get their RGB color from different frames (e.g., at frame or image boundaries) . Instead of performing the neighborhood blend on the UV space (e.g., after conversion to UV space) , such blending may be performed on each measured frame (e.g., in x, y, z coordinates) . The blended color information may then be projected back to the UV space. By performing blending this way, complicated computation that handles UV atlas chart boundaries may be avoided, and the process for color texture mapping may be performed more quickly.
  • process 400 may further include operation 415, in which the texture image size may be reduced, for example, by slicing the texture image into smaller patches and removing those patches not used for any surface point.
  • FIG. 4B depicts another example process 400b for generating a texture map from a 3D model or 3D scanned image data.
  • Process 400b may begin at operation 420, in which texture coordinates may be generated from a plurality of images.
  • at least one point may be selected from the plurality of images.
  • the point may include a point of a 3D object, such as object 105, 215, that is shared between multiple images (e.g., in an overlap area as depicted in FIG. 3) .
  • first color information (e.g., RGB information) of a pixel corresponding to the point may be obtained from a first image of the plurality of images.
  • second color information from a corresponding pixel of a second image for the same point may also be obtained.
  • the first image and/or the second image, from which color information is obtained, may be selected based on a visibility metric.
  • the visibility metric may include a value indicating which image is normal or most normal to the point, such as a dot product of the camera view direction and the surface normal at the point.
  • a silhouette region of at least one of the first image or the second image may be determined. The visibility metric of pixels or points within the silhouette region may then be reduced or altered, for example, to reduce the input of those pixels to the final texture map. In one example, the silhouette region may be dilated such that the visibility metric is reduced for a larger number of pixels.
  • the first color information may be combined with the second color information based on a number of neighboring pixels associated with the first image and the second image.
  • operation 440 may include associating a first weight to the first color information based on a first number of neighboring pixels in the first image and include associating a second weight to the second color information based on a second number of neighboring pixels in the second image.
  • Operation 440 may additionally include combining the first color information and the second color information according to the first weight and the second weight.
  • operation 440 may be performed for a plurality of points prior to applying the combined color information to the corresponding points of the texture coordinates (e.g., in image or x, y, z space, rather than texture or UV space) .
  • the number of neighboring pixels for which to determine weighting may be selected based on an area of pixels surrounding the corresponding pixel.
  • Process 400b may end with operation 445, in which the combined color information may be applied to the corresponding point of the texture coordinates/map to generate a texture map or atlas.
  • the texture map or atlas may then be used to wrap or cover the surface (s) of a 3D object to convey texture/color information on the surface (s) of the object.
  • operations 425 through 445 may be repeated for any number of points on the 3D model.
  • operation 425 may include selecting each point that is located within an overlap region of two or more images (as described above in reference to FIG. 3) .
  • a number of surrounding points may also be selected, for example based on a threshold distance (e.g., radius) from an image boundary, and the like, to further reduce any artifacts in the resulting texture map and/or produce a more contiguous color profile for the texture map.
  • each of the plurality of images comprises a plurality of triangles and is associated with a camera view.
  • generating the texture coordinates from the plurality of images may further include projecting the plurality of triangles of each of the multiple images to the associated camera view to generate the texture coordinates, wherein the plurality of points corresponding to the pixels in the plurality of images include a number of vertices of the plurality of triangles.
  • process 400 may additionally include compressing the texture atlas by removing empty regions of the texture atlas, as will be described in greater detail below.
  • FIG. 5 depicts an example process 500 for generating texture coordinates.
  • Process 500 may be a more detailed example of operations 405 and/or operations 420 of processes 400a and 400b, as described above.
  • each triangle of the mesh representing the 3D object may be projected to the camera direction which maximizes the projected area (e.g., has biggest Normal dot View-direction value) .
  • Each projection position may be set as the texture coordinate.
  • FIG. 6 depicts an example process 600 for combining color information from multiple images or frames to generate a texture map or atlas.
  • process 600 fetches colors from the frames to generate the texture atlas.
  • the projection position (x and y coordinate) can be used to fetch the color from the corresponding frame.
  • the color may then be written or associated with the texture coordinate to form the texture atlas.
  • Another option is to pick a “best” color from the candidate images.
  • This type of solution will generally produce a sharp, clear result, but may suffer from seam artifacts. For example, suppose surface point A fetches color from frame i, but its neighboring point B fetches color from a different frame j. A seam may occur between points A and B, for example, in the case that the geometry registration is not perfect, or the shading effect of the two frames is not similar or the same.
  • sophisticated cutting algorithms may be implemented to carefully place the selection boundaries, in the texture or UV space. However, these cutting algorithms can be complex and quite time consuming. Instead of this option, the described techniques blur the selection boundaries in image space (x and y coordinates) to produce more aesthetic results.
  • Boundary blurring may then be implemented via the following process. Given a surface point p, its neighborhood in a specified range may be examined, e.g., if x neighboring points select color from frame i, y points select color from frame j, z points select color from frame k, etc. P may be projected to all these frames, to obtain all the valid colors (Ci (p) , Cj (p) , Ck (p) , ... ) . These color values may then be blended via the following equation (note that we fetch colors from the projection position of p, but not its neighboring points) :
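The blending equation itself appears only as an embedded image in the published application and is not reproduced in this text. A plausible form, consistent with the neighbor-count weighting described here (an assumption, not the verbatim formula), is C (p) = (x · Ci (p) + y · Cj (p) + z · Ck (p) ) / (x + y + z) , where x, y, and z are the numbers of neighboring points that select frames i, j, and k, respectively.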
  • Blurring or blending may instead be applied in each of the frames, (e.g., in x, y space) before the color is obtained to fill the atlas. More specifically: for surface pixel p in frame i (under view direction Vi) , its R*R neighborhood may be examined, and for each neighboring pixel, its best-frame-selection result may be queried. As described above, for example, x points may select frame i, y points select frame j, z points select frame k, etc. P may then be projected into frame i, j, k using its geometry information, the color information may be obtained, and the blending via the above equation may be applied.
  • color may then naively be picked or selected from the best frame for each texel or texture coordinate in the atlas, using the same best-frame-selection strategy; since the selection boundaries on each of the frames have already been blurred, a seamless image atlas results.
  • the visible frame which has the largest N dot V value may be selected as the best frame.
  • This selection technique has the added benefit of increasing the likelihood that the boundary blur operator is successful, in that it is also desirable to have the projected positions be far away from the silhouette region.
  • the silhouette region may be masked in each frame, and if the projected position falls into the silhouette region, a punishment or disincentive for selection may be associated with the silhouette region, for example by subtracting from the N dot V value some specified value (e.g., 1.0) , so as to reduce its chance of being the best frame selection result.
  • the geometry information may be rendered into the framebuffer using the corresponding camera pose (e.g., the per-frame G-buffer) , and the silhouette may be detected by comparing the depth difference between each pixel and its neighbors.
  • the mask may be dilated by R/2 pixels, so that for each unmasked pixel, its R*R neighborhood may not contain silhouettes.
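A minimal sketch of the best-frame scoring described above, assuming hypothetical camera helpers (is_visible, project, view_dir_toward_camera) and per-frame dilated silhouette masks; the penalty value of 1.0 mirrors the example given above and is not prescribed by the application.

```python
import numpy as np

def select_best_frame(point, normal, cameras, silhouette_masks, penalty=1.0):
    """Pick the best frame for a surface point: the visible frame with the
    largest N dot V score, minus a penalty when the point projects into that
    frame's (dilated) silhouette mask."""
    best_idx, best_score = -1, -np.inf
    for i, cam in enumerate(cameras):
        if not cam.is_visible(point):            # occluded or outside the frame
            continue
        u, v = cam.project(point)                # projection position in frame i
        score = float(np.dot(normal, cam.view_dir_toward_camera(point)))
        if silhouette_masks[i][v, u]:
            score -= penalty                     # discourage silhouette picks
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx
```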
  • Best-frame index selection may be performed for each frame and for the atlas, so that the resulting texture contains the best frame index for each pixel. Note that the above process, which is performed in frame space, is used to blur the selection boundaries, while the following process, which is performed in texture space, is used to fetch the final color:
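The final-color fetch referenced above likewise appears only as an embedded pseudocode image. The following sketch shows one way the texture-space pass could look; the atlas layout, the project helper, and the per-texel bookkeeping are assumptions for illustration rather than the application's implementation.

```python
import numpy as np

def fill_texture_atlas(atlas_shape, texel_surface_points, texel_best_frame,
                       blurred_frames, project):
    """Texture-space pass: for each texel, fetch the final color from its best
    frame at the projection position of the texel's surface point. The frames
    are assumed to have been boundary-blurred already by the frame-space pass,
    so no further blending is needed here."""
    atlas = np.zeros(atlas_shape, dtype=np.uint8)
    for (ty, tx), point in texel_surface_points.items():
        frame_idx = texel_best_frame[(ty, tx)]
        if frame_idx < 0:                        # texel not covered by any frame
            continue
        u, v = project(point, frame_idx)         # x, y projection position
        atlas[ty, tx] = blurred_frames[frame_idx][v, u]
    return atlas
```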
  • FIG. 7 depicts an example process 700 for texture atlas packing or compression.
  • the texture atlas generated from the above processes may contain a large percentage of empty regions.
  • a packing or compression process may be applied to the texture atlas to reduce the size of the final texture atlas or map.
  • the atlas may be divided into M *N uniform grids.
  • the empty grids may then be discarded.
  • the remaining grids may then be re-packed into a single texture.
  • An example process for texture compression is provided below:
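The packing pseudocode is present in the published application only as an embedded image. The sketch below illustrates the described grid-based approach under stated assumptions: an occupancy mask marks which texels are used, the atlas dimensions divide evenly into the M x N grid, and the kept cells are re-packed row by row; all names are hypothetical.

```python
import numpy as np

def pack_texture_atlas(atlas, occupied_mask, m, n):
    """Divide the atlas into m*n uniform grid cells, discard cells containing
    no used texels, and re-pack the remaining cells into a smaller texture.
    Returns the packed texture and an old-cell to new-cell map so texture
    coordinates can be remapped."""
    h, w = occupied_mask.shape
    ch, cw = h // m, w // n                      # cell size (assumes divisibility)

    kept = [(gy, gx) for gy in range(m) for gx in range(n)
            if occupied_mask[gy * ch:(gy + 1) * ch, gx * cw:(gx + 1) * cw].any()]

    cols = int(np.ceil(np.sqrt(len(kept)))) or 1
    rows = int(np.ceil(len(kept) / cols))
    packed = np.zeros((rows * ch, cols * cw, atlas.shape[2]), dtype=atlas.dtype)

    cell_map = {}
    for k, (gy, gx) in enumerate(kept):
        ny, nx = divmod(k, cols)
        src = atlas[gy * ch:(gy + 1) * ch, gx * cw:(gx + 1) * cw]
        packed[ny * ch:(ny + 1) * ch, nx * cw:(nx + 1) * cw] = src
        cell_map[(gy, gx)] = (ny, nx)            # used to update texture coords
    return packed, cell_map
```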
  • FIGS. 8A and 8B depict an example process for selecting a best frame from which to obtain pixel information for combining color information to produce a texture atlas.
  • best frame selection may also include a refinement step to address the potential problem of fragmented selection results.
  • a selection-mask, which indicates the selected pixels of a given frame for texture generation, may be obtained.
  • a best-frame-index buffer for each frame i may be obtained, where the mask may be obtained by discarding the pixels whose index is not equal to i. Because the basic best-frame-selection depends on the surface normal, such a mask can be quite fragmented, as illustrated in frame 805. To generate a more continuous selection result, the following process may be applied to the initial selection result.
  • the frames may be sorted by their respective camera poses such that the camera position of the i-th frame is most distant from the camera positions of the first i-1 frames. Subsequently, the frames may be processed one by one. Given a selection-mask of the i-th frame, the mask may be dilated by a user-specified K pixels, as illustrated in frame 810, then shrunk or reduced in size by K pixels, as illustrated in frame 815, in order to fill holes and connect the fragmented pieces or regions. To avoid multiple-selection, mask pixels that have already been selected by previous frames may be discarded. In addition, mask pixels that are close to geometry boundaries may also be discarded, to produce the final mask, as illustrated in frame 820. After this process, a more continuous mask may be obtained for each frame. Using these masks, new best-frame-index buffers for each frame and the atlas may be obtained.
  • Another example is shown in FIG. 8B.
  • with different colors represented by different shading patterns, the frame selection result on bumpy regions is quite fragmented, while with the implementation of the frame selection refinement step, as illustrated in frame 830, the texture map is more continuous.
  • An example frame-selection refinement process or algorithm is provided below. Note that the following process may be applied after the best-frame-selection step, and may generate refined best-frame-index buffers for frames and the texture atlas. The following process may be optionally implemented with the best-frame selection process described above, for example, via one or more user selections via a modeling application and/or user interface. In some aspects, the following process may add additional processing time to the 3D modeling (e.g., 20%) .
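The refinement pseudocode is also an embedded image in the published application. The sketch below shows the dilate-then-erode mask cleanup illustrated in FIG. 8A, under two simplifying assumptions that are not from the application: all selection masks live in a common (e.g., atlas) domain, and the mask list is already sorted by the camera-distance criterion described above.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def refine_selection_masks(masks, boundary_masks, k):
    """Refine per-frame selection masks: dilate then erode by k pixels to fill
    holes and connect fragments, then discard pixels already claimed by earlier
    frames or lying close to geometry boundaries."""
    taken = np.zeros_like(masks[0], dtype=bool)
    refined = []
    for mask, boundary in zip(masks, boundary_masks):
        closed = binary_erosion(binary_dilation(mask, iterations=k), iterations=k)
        final = closed & ~taken & ~boundary      # avoid multiple selection / edges
        taken |= final
        refined.append(final)
    return refined
```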
  • FIG. 9 depicts an example general purpose computing environment, in which some of the techniques described herein may be embodied.
  • the computing system environment 902 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the presently disclosed subject matter. Neither should the computing environment 902 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the example operating environment 902.
  • the various depicted computing elements may include circuitry configured to instantiate specific aspects of the present disclosure.
  • the term circuitry used in the disclosure can include specialized hardware components configured to perform function (s) by firmware or switches.
  • circuitry can include a general purpose processing unit, memory, etc., configured by software instructions that embody logic operable to perform function (s) .
  • an implementer may write source code embodying logic and the source code can be compiled into machine readable code that can be processed by the general purpose processing unit. Since one skilled in the art can appreciate that the state of the art has evolved to a point where there is little difference between hardware, software, or a combination of hardware/software, the selection of hardware versus software to effectuate specific functions is a design choice left to an implementer.
  • Computer 902 which may include any of a mobile device or smart phone, tablet, laptop, desktop computer, or collection of networked devices, cloud computing resources, etc., typically includes a variety of computer-readable media.
  • Computer-readable media can be any available media that can be accessed by computer 902 and includes both volatile and nonvolatile media, removable and non-removable media.
  • the system memory 922 includes computer-readable storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 923 and random access memory (RAM) 960.
  • ROM read only memory
  • RAM random access memory
  • a basic input/output system 924 (BIOS) containing the basic routines that help to transfer information between elements within computer 902, such as during start-up, is typically stored in ROM 923.
  • RAM 960 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 959.
  • FIG. 9 illustrates operating system 925, application programs 926, other program modules 927 including a texture mapping application 965, and program data 928.
  • the computer 902 may also include other removable/non-removable, volatile/nonvolatile computer storage media.
  • FIG. 9 illustrates a hard disk drive 938 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 939 that reads from or writes to a removable, nonvolatile magnetic disk 954, and an optical disk drive 904 that reads from or writes to a removable, nonvolatile optical disk 953 such as a CD ROM or other optical media.
  • removable/non-removable, volatile/nonvolatile computer storage media that can be used in the example operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like.
  • the hard disk drive 938 is typically connected to the system bus 921 through a non-removable memory interface such as interface 934, and magnetic disk drive 939 and optical disk drive 904 are typically connected to the system bus 921 by a removable memory interface, such as interface 935 or 936.
  • the drives and their associated computer storage media discussed above and illustrated in FIG. 9, provide storage of computer-readable instructions, data structures, program modules and other data for the computer 902.
  • hard disk drive 938 is illustrated as storing operating system 958, application programs 957, other program modules 956, and program data 955.
  • operating system 958, application programs 957, other program modules 956, and program data 955 are given different numbers here to illustrate that, at a minimum, they are different copies.
  • a user may enter commands and information into the computer 902 through input devices such as a keyboard 951 and pointing device 952, commonly referred to as a mouse, trackball or touch pad.
  • Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, retinal scanner, or the like. These and other input devices are often connected to the processing unit 959 through a user input interface 936 that is coupled to the system bus 921, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB) .
  • a monitor 942 or other type of display device is also connected to the system bus 921 via an interface, such as a video interface 932.
  • computers may also include one or more output devices such as speakers 944 and printer 943, which may be connected through an output peripheral interface 933.
  • the computer 902 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 946.
  • the remote computer 946 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 902, although only a memory storage device 947 has been illustrated in FIG. 9.
  • the logical connections depicted in FIG. 9 include a local area network (LAN) 945 and a wide area network (WAN) 949, but may also include other networks.
  • LAN local area network
  • WAN wide area network
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, the Internet, and cloud computing resources.
  • When used in a LAN networking environment, the computer 902 is connected to the LAN 945 through a network interface or adapter 937. When used in a WAN networking environment, the computer 902 typically includes a modem 905 or other means for establishing communications over the WAN 949, such as the Internet.
  • the modem 905, which may be internal or external, may be connected to the system bus 921 via the user input interface 936, or other appropriate mechanism.
  • program modules depicted relative to the computer 902, or portions thereof may be stored in the remote memory storage device.
  • FIG. 9 illustrates remote application programs 948 as residing on memory device 947. It will be appreciated that the network connections shown are example and other means of establishing a communications link between the computers may be used.
  • other programs 927 may include a texture mapping application 965 that includes the functionality as described above.
  • texture mapping application 965 may execute some or all operations of processes 400, 500, 600 and/or 700, for example, in conjunction with image sensor (s) 970, which may be communicatively coupled to computer 902 via output peripheral interface 933.
  • Each of the processes, methods and algorithms described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computers or computer processors.
  • the code modules may be stored on any type of non-transitory computer-readable medium or computer storage device, such as hard drives, solid state memory, optical disc and/or the like.
  • the processes and algorithms may be implemented partially or wholly in application-specific circuitry.
  • the results of the disclosed processes and process steps may be stored, persistently or otherwise, in any type of non-transitory computer storage such as, e.g., volatile or non-volatile storage.
  • the various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and subcombinations are intended to fall within the scope of this disclosure.
  • some or all of the systems and/or modules may be implemented or provided in other ways, such as at least partially in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs) , standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers) , field-programmable gate arrays (FPGAs) , complex programmable logic devices (CPLDs) , etc.
  • ASICs application-specific integrated circuits
  • controllers e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers
  • FPGAs field-programmable gate arrays
  • CPLDs complex programmable logic devices
  • modules, systems and data structures may also be stored (e.g., as software instructions or structured data) on a computer-readable medium, such as a hard disk, a memory, a network or a portable media article to be read by an appropriate drive or via an appropriate connection.
  • a computer-readable medium such as a hard disk, a memory, a network or a portable media article to be read by an appropriate drive or via an appropriate connection.
  • the phrase “computer-readable storage medium” and variations thereof does not include waves, signals, and/or other transitory and/or intangible communication media.
  • the systems, modules and data structures may also be transmitted as generated data signals (e.g., as part of a carrier wave or other analog or digital propagated signal) on a variety of computer-readable transmission media, including wireless-based and wired/cable-based media, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames) .
  • Such computer program products may also take other forms in other embodiments. Accordingly, the present disclosure may be practiced with other computer system configurations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)
  • Image Processing (AREA)

Abstract

Methods, systems, and devices are described herein for texture mapping. In one aspect, a method for applying texture to a model based on a plurality of images may include generating texture coordinates from a plurality of images, wherein the texture coordinates include a plurality of points corresponding to pixels in the plurality of images. The method may include, for at least one of the plurality of points, obtaining first color information from a corresponding pixel of a first image, obtaining second color information from the corresponding pixel of a second image, and combining the first and second color information based on a number of neighboring pixels associated with each of the first image and the second image. The method may further include applying the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a model generated from the multiple images.

Description

FAST UV ATLAS GENERATION AND TEXTURE MAPPING
TECHNICAL FIELD
This disclosure relates generally to image processing, and more specifically to texture mapping three dimensional images or models.
BACKGROUND
In the process of representing images or models in three dimensional (3D) form, texture may be abstracted from two dimensional (2D) or 3D image data and mapped to 2D space, for example in a known process called UV texture mapping. This process may take, for example, multiple images of the same object (e.g., captured by a camera or image sensor from various relative locations around the object) and combine them into a single unified 2D image or a 3D model. UV texture mapping may include converting image data, such as in the form of color information associated with points of the image or pixels, in one coordinate system (e.g., Cartesian coordinates x and y) , to a different coordinate system or space, such as UV space, to create a UV texture atlas. Traditional UV atlas generation requires computing vertex neighborhood information for the conversion and mapping of color information to UV space. This process may be computationally intensive, and may produce artifacts when the two spaces do not exactly match up, negatively affecting the resulting image or model. Accordingly, improvements can be made to improve current texture mapping techniques.
SUMMARY
Illustrative examples of the disclosure include, without limitation, methods, systems, and various devices. In one aspect, a method for applying texture to a model based on a plurality of images may include: generating texture coordinates from a plurality of images, wherein the texture coordinates comprise a plurality of points corresponding to pixels in the plurality of images; for at least one point of the plurality of points, obtaining first color information from a corresponding pixel of a first image of the multiple images, obtaining second color information from the corresponding pixel of a second image of the multiple images, and combining the first color information and the second color information in the first image and the second image based on a number of neighboring pixels associated with the first image and the second image; and applying the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a model generated from the multiple images.
Other features of the systems and methods are described below. The features, functions, and advantages can be achieved independently in various examples or may be  combined in yet other examples, further details of which can be seen with reference to the following description and drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the present disclosure will be described more fully hereinafter with reference to the accompanying drawings, in which:
FIG. 1 depicts an example diagram of multiple camera positions positioned relative to a three dimensional (3D) object.
FIG. 2 depicts an example diagram of a process for texture mapping a 3D model.
FIG. 3 depicts an example diagram of a texture map generated from overlapping images.
FIGS. 4A and 4B depict example processes for generating a texture map from a 3D model or 3D scanned image data.
FIG. 5 depicts an example process for generating texture coordinates.
FIG. 6 depicts an example process for combining color information from multiple images.
FIG. 7 depicts an example process for texture atlas packing or compression.
FIGS. 8A and 8B depict an example process for selecting a best frame from which to obtain pixel information for combining color information to produce a texture atlas.
FIG. 9 depicts an example general purpose computing environment in which the techniques described herein may be embodied.
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
Systems and techniques are described herein for performing texture mapping for 3D objects, such as 3D objects that are scanned or generated from multiple images. In one aspect, the described techniques may generate color texture for 3D scanned objects faster and/or more efficiently than known techniques. Unlike traditional UV atlas generation that needs to compute vertex neighborhood information, the described techniques may take advantage of an image sequence that includes color information, such as a scanned RGB image sequence, and perform all or a majority of the neighborhood computation in the image domain, significantly speeding up the UV atlas generation and texture mapping process. In some aspects, artifacts caused by the inaccuracy of registration between the reconstructed mesh and the input frames may be removed by blending on the texture boundaries. In yet some aspects, the resulting texture map may be packed or compressed to reduce the amount of memory needed to store the texture map, resources to communicate the texture map, etc.
In one example, the described techniques may include one or more of the following three main components. The first component may include texture coordinate generation, which includes creating texture coordinates for every vertex of one or more images by considering the normal directions. Given the model and camera poses, each triangle of a mesh representing a 3D object, for example, may be projected to the camera direction that maximizes the projected area (i.e., has the largest Normal dot View-direction value) , and the projection position is taken as the texture coordinate.
The second component may include color texture mapping, which includes computing the color texture for each pixel of the texture map. With the determined texture coordinate, each surface point can be mapped to one pixel on the texture map. Computing the color for each pixel of the surface map may then result in assigning color for each surface point.
In some cases, the 3D scan process may produce a rough registration between the reconstructed mesh and the RGB image sequences. For every surface point, RGB color information (or other color information format) from each frame may be obtained. However, due to the inaccurate registration, the RGB color values, such as for neighboring pixels, may not be consistent with each other. Neither blending all the views nor selecting only one frame may produce artifact-free results. In one aspect, the RGB color from the best view frame may be selected. Blending may be performed around the edges where neighboring pixels get their RGB color from different frames (e.g., at frame or image boundaries) . Instead of performing the neighborhood blend on the UV space (e.g., after conversion to UV space) , such blending may be performed on each measured frame. The blended color information may then be projected back to the UV space. By performing blending this way, complicated computation that handles UV atlas chart boundaries may be avoided, and the process for color texture mapping may be performed more quickly.
The third component may include reducing the texture image size, for example, by slicing the texture image into smaller patches and removing those patches not used for any surface point.
FIG. 1 depicts an example diagram 100 of multiple camera positions positioned relative to a three dimensional (3D) object. As illustrated, four  cameras  110, 120, 130, and 140 are positioned around 3D object 105, which is a globe. Each camera or  camera position  110, 120, 130, and 140 is associated with a pose that includes a position and orientation. In one aspect, pose may be represented by or include a camera or image  sensor view direction  115, 125, 135, or 145. Images captured from the  image sensors  110, 120, 130, and 140 may be combined to produce a 3D model of object 105, according to various techniques.
According to the described techniques, using the pose information of each camera or image sensor 110, 120, 130, and 140, along with the pose information of the 3D object 105, images or frames captured by cameras 110, 120, 130, and 140 may be combined to produce texture coordinates for the 3D object 105. The texture coordinates may be used to map color information from the images to the 3D model to simulate or apply color texture to the 3D model. The multiple images may be stitched together to generate a mesh structure that represents the 3D object 105 in 3D space. The mesh structure may include a number of connected triangles. The texture coordinates may be determined for every vertex of the mesh generated from the images by considering the normal directions from each vertex. Each triangle of the combined image/3D model may be projected to the camera direction 115, 125, 135, or 145 that maximizes the projected area (i.e., has the largest Normal dot View-direction value) . The projection position may then be used as the texture coordinate for each of the triangles.
It should be appreciated, that any number of cameras or  image sensors  110, 120, 130, or 140 may be used to capture any number of images or frames to be combined to form a 3D model of 3D object 105. This disclosure assumes that all images are captured with similar exposure settings; however, it should be appreciated that with some pre-processing, the described techniques may be applied to images having different exposure settings. It should also be appreciated that only one camera or image may be used to capture multiple images of 3D object 105, for example, by moving the camera to different positions relative to the 3D object 105.
FIG. 2 depicts an example diagram 200 of a process for texture mapping a 3D model. A 3D model 205, for example of globe 105, may be represented in Cartesian coordinates, such as having x, y, and z dimensions. The 3D model 205 may also be represented in texture space, such as defined by UV coordinates, or other two dimensional space, for example depicted as UV image or map 210. UV coordinates of image 210 may define the surface of 3D object 205. In one example, the surface of 3D model 215 may be UV unwrapped into 2D image 220, which may provide a contiguous, spatially coherent image/coordinates for texture mapping, at operation 230. Texture information, which may include color information in the form of RGB values, may be obtained, for example, via the techniques described herein, and may be represented by texture image 225. The texture image 225, which may be obtained from multiple images captured relative to 3D object 215, such as depicted in FIG. 1, may be mapped to UV space (e.g., onto image/coordinates 220) , at operation 235. The texture/UV map 220 may then be applied to the surface of 3D model 215 to generate a 3D model 215 having texture information, at operation 240.
FIG. 3 depicts an example diagram 300 of a texture map generated from different images. Diagram 300 includes an image or texture map 305, for example of North America, taken from 3D object 105, 215. Texture map 305 represents the result of combining multiple frames or images 310, 315, 320, and 325, for example, according to the described techniques. As illustrated, certain images or frames 310, 315, 320, and 325 may overlap. In this scenario, combining these frames to produce texture map 305 may be difficult to accomplish: it may require significant computing resources via known UV atlas generation techniques, and/or it may produce unwanted artifacts in the output texture map 305. This is primarily due to the fact that color information of pixels in the overlapping regions of the frames 310, 315, 320, and 325 may be different, such that when they are combined, neighboring pixels/color information may be inconsistent, creating discontinuities in the texture map 305.
According to the described techniques, color information taken from pixels in the overlap regions may be first combined in the image space, and then may be mapped to UV space. This may eliminate the need for complex UV atlas combination techniques. The combining may include weighting different pixels in and/or proximate to the overlap regions based on which image the neighboring pixels are associated with. For example, color information of pixels associated with area 330 may be associated with multiple images or frames 310, 315, and 320. In order to blend the color information sourced from frames 310, 315, and 320, more weight may be given to color information taken from an image with more neighboring pixels also coming from that image. For instance, pixels containing color information that are located in the top right hand corner of area 330 may be more consistent with other color information taken from pixels also sourced from the same image. Likewise, color information of pixels in the approximate center of area 330 may be equally proximate to pixels sourced from images 310 and 315, and to a lesser degree, image 320. By weighting color information of a pixel by which image or frame neighboring color information comes from, a more consistent and contiguous color representation for the texture map 305 may be generated.
FIG. 4A depicts an example process 400a for generating a texture map from a 3D model or 3D scanned image data.
Process 400 may begin at operation 405, where texture coordinates may be generated from a plurality of images. Operation 405 may include creating texture coordinates for every vertex of one or more images by considering the normal directions. Given model and camera pose information (e.g., obtained from a system 100) , each triangle of a mesh representing a 3D object, for example, may be projected to the camera direction, for example, to maximize the projected area (e.g., has the largest Normal dot View-direction value) . The projection positions may then be used as the texture coordinates.
Next, at operation 410, color information from the plurality of images may be applied to the texture map/coordinates. Operation 410 may include computing the color texture for each pixel of the texture map. With the determined texture coordinate, each surface point can be mapped to one pixel on the texture map. Computing the color for each pixel of the surface map may then result in assigning color for each surface point.
In some cases, the 3D scan process may produce a rough registration between the reconstructed mesh and the RGB image sequences. For every surface point, RGB color information (or other color information format) from each frame may be obtained. However, due to the inaccurate registration, the RGB color values, such as for neighboring pixels, may not be consistent with each other. In one aspect, the RGB color from the best view frame may be selected. Blending may be performed around the edges where neighboring pixels get their RGB color from different frames (e.g., at frame or image boundaries) . Instead of performing the neighborhood blend on the UV space (e.g., after conversion to UV space) , such blending may be performed on each measured frame (e.g., in x, y, z coordinates) . The blended color information may then be projected back to the UV space. By performing blending this way, complicated computation that handles UV atlas chart boundaries may be avoided, and the process for color texture mapping may be performed more quickly.
In some aspects (but not all aspects) , process 400 may further include operation 415, in which the texture image size may be reduced, for example, by slicing the texture image into smaller patches and removing those patches not used for any surface point.
FIG. 4B depicts another example process 400b for generating a texture map from a 3D model or 3D scanned image data.
Process 400b may begin at operation 420, in which texture coordinates may be generated from a plurality of images. Next, at operation 425, at least one point may be selected from the plurality of images. The point may include a point of a 3D object, such as object 105, 215, that is shared between multiple images (e.g., in an overlap area as depicted in FIG. 3) . Next, at operation 430, first color information (e.g., RGB information) of a pixel corresponding to the point may be obtained from a first image of the plurality of images. Next, at operation 435, second color information from a corresponding pixel of a second image for the same point may also be obtained. In some aspects, the first image and/or the second image, from which color information is obtained, may be selected based on a visibility metric. The visibility metric may include a value indicating which image is normal or most normal to the point, such as a dot product of the camera view direction and the surface normal at the point. In some aspects, a silhouette region of at least one of the first image or the second image may be determined. The visibility metric of pixels or points within the silhouette region may then be reduced or altered, for example, to reduce the input of those pixels to the final texture map. In one example, the silhouette region may be dilated such that the visibility metric is reduced for a larger number of pixels.
Next, at operation 440, the first color information may be combined with the second color information based on a number of neighboring pixels associated with the first image and the second image. In some aspects, operation 440 may include associating a first weight to the first color information based on a first number of neighboring pixels in the first image and include associating a second weight to the second color information based on a second number of neighboring pixels in the second image. Operation 440 may additionally include combining the first color information and the second color information according to the first weight and the second weight. In yet some aspects, operation 440 may be performed for a plurality of points prior to applying the combined color information to the corresponding points of the texture coordinates (e.g., in image or x, y, z space, rather than texture or UV space) .
In some examples, the number of neighboring pixels for which to determine weighting may be selected based on an area of pixels surrounding the corresponding pixel.
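As a small illustration of the weighting in operation 440, the snippet below combines two candidate colors in proportion to how many neighboring pixels come from each source image; the function name and the example numbers are hypothetical.

```python
def combine_two_candidates(color1, n1, color2, n2):
    """Weight each candidate color by the number of neighboring pixels sourced
    from its image, then combine."""
    w1 = n1 / (n1 + n2)
    w2 = n2 / (n1 + n2)
    return tuple(w1 * c1 + w2 * c2 for c1, c2 in zip(color1, color2))

# Example: 14 of the neighboring pixels come from the first image and 6 from
# the second, so the first image's color dominates the blend.
blended = combine_two_candidates((200, 30, 10), 14, (190, 40, 20), 6)
```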
Process 400b may end with operation 445, in which the combined color information may be applied to the corresponding point of the texture coordinates/map to generate a texture map or atlas. The texture map or atlas may then be used to wrap or cover the surface (s) of a 3D object to convey texture/color information on the surface (s) of the object.
In some aspects, operations 425 through 445 may be repeated for any number of points on the 3D model. In some aspects, operation 425 may include selecting each point that is located within an overlap region of two or more images (as described above in reference to FIG. 3) . In some aspects, a number of surrounding points may also be selected, for example based on a threshold distance (e.g., radius) from an image boundary, and the like, to further reduce any artifacts in the resulting texture map and/or produce a more contiguous color profile for the texture map.
In some aspects, each of the plurality of images comprises a plurality of triangles and is associated with a camera view. In this case, generating the texture coordinates from the plurality of images may further include projecting the plurality of triangles of each of the multiple images to the associated camera view to generate the texture coordinates, wherein the plurality of points corresponding to the pixels in the plurality of images include a number of vertices of the plurality of triangles.
In some cases, process 400 may additionally include compressing the texture atlas by removing empty regions of the texture atlas, as will be described in greater detail below.
FIG. 5 depicts an example process 500 for generating texture coordinates. Process 500 may be a more detailed example of operation 405 and/or operation 420 of processes 400a and 400b, as described above. Given the model and camera poses (e.g., see FIG. 1 above), each triangle of the mesh representing the 3D object may be projected to the camera direction that maximizes the projected area (e.g., has the largest normal dot view-direction value). Each projection position may be set as the texture coordinate.
An example process is provided below:
Figure PCTCN2016097395-appb-000001
Figure PCTCN2016097395-appb-000002
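Because the referenced listing is provided as an image, the following is a minimal Python sketch of the texture-coordinate generation described above. It assumes a triangle mesh given as vertex and face arrays, and camera objects exposing a view_dir vector and a project() method (both assumed names); visibility and occlusion handling are omitted.

```python
import numpy as np

def generate_texture_coords(vertices, faces, cameras):
    """For each mesh triangle, pick the camera with the largest N dot V and use
    the projected vertex positions as texture coordinates (a sketch of process 500).

    vertices: (V, 3) float array; faces: (F, 3) int array;
    cameras: sequence of objects with .view_dir (unit vector assumed to point
    from the camera into the scene) and .project(points) -> (N, 2) coordinates.
    """
    uv = np.zeros((len(faces), 3, 2), dtype=np.float32)      # per-face texture coordinates
    for f, (i0, i1, i2) in enumerate(faces):
        v0, v1, v2 = vertices[i0], vertices[i1], vertices[i2]
        n = np.cross(v1 - v0, v2 - v0)
        n /= (np.linalg.norm(n) + 1e-12)                      # triangle normal
        # Largest N dot V (toward the camera) maximizes the projected area.
        scores = [float(np.dot(n, -cam.view_dir)) for cam in cameras]
        best = int(np.argmax(scores))
        uv[f] = cameras[best].project(np.stack([v0, v1, v2])) # projection positions become UVs
    return uv
```

In this sketch the projected positions are stored per face; adjacent triangles that select the same camera naturally receive consistent coordinates, and the packing step described later removes the empty regions that result.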
FIG. 6 depicts an example process 600 for combining color information from multiple images or frames to generate a texture map or atlas.
Given a model with texture coordinates/attributes and N frames captured from N different camera poses, process 600 fetches colors from the frames to generate the texture atlas.
For a surface point with a valid texture coordinate, if the point can be seen in a specific camera view, then its projection position (x and y coordinates) can be used to fetch the color from the corresponding frame. The color may then be written to, or associated with, the texture coordinate to form the texture atlas.
In practice, most surface points are visible in multiple frames, which means that multiple color candidates may be available to fill each texture coordinate or texel. In order to generate the best quality texture atlas in a limited time period, one of the following procedures may be implemented. The most straightforward solution is to average all the color candidates fetched from visible frames. However, this solution requires perfect geometry registration to avoid blurry artifacts. Because perfect geometry registration or alignment is generally not feasible, another solution is needed.
Another option is to pick a “best” color from the candidate images. This type of solution will generally produce a sharp, clear result, but may suffer from seam artifacts. For example, suppose surface point A fetches color from frame i, but its neighboring point B fetches color from a different frame j. A seam may occur between points A and B, for example, when the geometry registration is not perfect, or when the shading of the two frames is not similar or the same. To minimize such artifacts, sophisticated cutting algorithms may be implemented to carefully place the selection boundaries in the texture or UV space. However, these cutting algorithms can be complex and quite time consuming. Instead, the described techniques blur the selection boundaries in image space (x and y coordinates) to produce more aesthetic results.
For example, suppose that the best-frame-selection strategy is already defined, which means that, given a surface point, it can be uniquely decided which frame is the best frame from which to obtain color information. Boundary blurring may then be implemented via the following process. Given a surface point p, its neighborhood in a specified range may be examined: e.g., x neighboring points select color from frame i, y points select color from frame j, z points select color from frame k, etc. The point p may be projected to all these frames to obtain all the valid colors (Ci (p) , Cj (p) , Ck (p) , ... ). These color values may then be blended via the following equation (note that colors are fetched from the projection position of p itself, not from its neighboring points):
Cfinal (p) = (x*Ci (p) + y*Cj (p) + z*Ck (p) + ... ) / (x + y + z + ... )
Note that this equation can be safely applied to non-boundary regions as well: because all the neighboring points there select colors from the same frame, a sharp, clear image or texture map results in either case.
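A minimal sketch of this blending follows, assuming the per-point projections and the neighbors' best-frame choices are already available; all helper and parameter names are illustrative rather than taken from the disclosure.

```python
import numpy as np
from collections import Counter

def blend_boundary_color(p_proj, frames, neighbor_best_frames):
    """Compute Cfinal(p) = (x*Ci(p) + y*Cj(p) + ...) / (x + y + ...).

    p_proj: dict mapping frame index -> (x, y) integer projection of p in that frame.
    frames: dict mapping frame index -> H x W x 3 color array.
    neighbor_best_frames: list of best-frame indices chosen by p's neighbors
    within the specified range; their multiplicities are the weights x, y, z, ...
    """
    counts = Counter(neighbor_best_frames)
    accum, total = np.zeros(3, dtype=np.float32), 0.0
    for frame_idx, count in counts.items():
        x, y = p_proj[frame_idx]                 # always sample at p's own projection,
        color = frames[frame_idx][y, x]          # never at the neighbors' positions
        accum += count * color.astype(np.float32)
        total += count
    return accum / max(total, 1e-8)
```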
Applying the Boundary Blur Operation
Directly applying the blurring operations to the geometry surface may be difficult and time consuming, since the neighborhood may be irregularly defined. Blurring or blending may instead be applied in each of the frames (e.g., in x, y space) before the color is obtained to fill the atlas. More specifically: for surface pixel p in frame i (under view direction Vi), its R*R neighborhood may be examined, and for each neighboring pixel, its best-frame-selection result may be queried. As described above, for example, x points may select frame i, y points may select frame j, z points may select frame k, etc. The point p may then be projected into frames i, j, k using its geometry information, the color information may be obtained, and the blending via the above equation may be applied.
After the frames have been blurred, color may naively be picked or selected from the best frame for each texel or texture coordinate in the atlas, using the same best-frame-selection strategy. Because the selection boundaries in each of the frames have already been blurred, a seamless texture atlas results.
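The frame-space blur pass might look like the following sketch, which assumes a per-frame best-frame-index buffer, a per-frame geometry (G-buffer) image, and an assumed project_point helper that returns integer pixel coordinates; it is written for clarity rather than speed.

```python
import numpy as np
from collections import Counter

def blur_frame(i, color_frames, best_frame_index, geometry, project_point, R):
    """Blur the selection boundaries inside frame i (frame-space pass).

    color_frames: sequence of H x W x 3 arrays, one per frame.
    best_frame_index: H x W int array of best-frame choices for frame i's pixels
    (-1 marks background). geometry: H x W x 3 array of surface positions.
    project_point(frame_idx, xyz) -> (x, y) integer pixel coordinates (assumed helper).
    """
    h, w = best_frame_index.shape
    out = color_frames[i].astype(np.float32)
    r = R // 2
    for py in range(h):
        for px in range(w):
            if best_frame_index[py, px] < 0:
                continue                                   # skip background pixels
            window = best_frame_index[max(py - r, 0):py + r + 1,
                                      max(px - r, 0):px + r + 1]
            counts = Counter(int(v) for v in window.ravel() if v >= 0)
            accum, total = np.zeros(3, dtype=np.float32), 0.0
            for frame_idx, count in counts.items():
                qx, qy = project_point(frame_idx, geometry[py, px])  # p's own projection in frame_idx
                accum += count * color_frames[frame_idx][qy, qx].astype(np.float32)
                total += count
            out[py, px] = accum / total
    return out
```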
Note that the above process only works on continuous regions. For silhouette regions, which contain geometry discontinuities, the above-described blending operations may not be effective, due to the fact that some of the neighboring pixels (in frame space) may not be true neighbors in surface space (e.g., surface points in different depth layers may project to similar image-space positions). Such cases may raise the constraint that the best-frame-selection strategy should make a best effort to avoid selecting pixels from such silhouette regions.
Best-Frame-Selection Strategy
As similarly described in the UV coordinate generation process above, for each surface point, the visible frame that has the largest N dot V value may be selected as the best frame. This selection technique has the added benefit of increasing the likelihood that the boundary blur operator is successful, in that it is also desirable for the projected positions to be far away from the silhouette region. To accomplish this, the silhouette region may be masked in each frame, and if the projected position falls into the silhouette region, a punishment or disincentive for selection may be applied, for example by subtracting some specified value (e.g., 1.0) from the N dot V value, so as to reduce its chance of being the best-frame selection result.
To generate the silhouette mask for a specific frame, the geometry information may be rendered into the framebuffer using the corresponding camera pose (e.g., the per-frame G-buffer), and the silhouette may be detected by comparing the depth difference between each pixel and its neighbors. After the initial silhouette mask is generated, the mask may be dilated by R/2 pixels, so that, for each unmasked pixel, its R*R neighborhood does not contain silhouettes. The details of an example frame-to-atlas process can be summarized as follows:
Figure PCTCN2016097395-appb-000003
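Since the listing above is supplied as an image, the following sketch illustrates the two ingredients described in the text: building a dilated silhouette mask from a per-frame depth buffer, and penalizing the N dot V score when a projection lands inside that mask. The depth_threshold value and the SciPy-based dilation are assumptions for illustration.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def silhouette_mask(depth, depth_threshold, R):
    """Mark silhouette pixels of one frame by comparing each depth sample with its
    4-neighbors, then dilate by R//2 so that an unmasked pixel's R x R neighborhood
    stays silhouette-free. depth_threshold is an assumed tuning value."""
    d = depth.astype(np.float32)
    mask = np.zeros(d.shape, dtype=bool)
    horiz = np.abs(d[:, 1:] - d[:, :-1]) > depth_threshold   # left/right depth jumps
    vert = np.abs(d[1:, :] - d[:-1, :]) > depth_threshold    # up/down depth jumps
    mask[:, 1:] |= horiz
    mask[:, :-1] |= horiz
    mask[1:, :] |= vert
    mask[:-1, :] |= vert
    return binary_dilation(mask, iterations=max(R // 2, 1))

def frame_score(normal, view_dir, in_silhouette, penalty=1.0):
    """Best-frame score for one surface point in one frame: N dot V, minus a
    penalty (e.g., 1.0) if the projection falls inside the silhouette mask."""
    score = float(np.dot(normal, view_dir))
    return score - penalty if in_silhouette else score
```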
Best-frame index selection may be performed for each frame, and for the atlas, so that the resulting texture contains the best frame index for each pixel. Note that the above process, which is performed in frame space, is used to blur the selection boundaries, while the following process, which is performed in texture space, is used to fetch the final color:
Figure PCTCN2016097395-appb-000004
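The texture-space pass summarized above can be sketched as a simple per-texel fetch, assuming the best-frame index has already been resolved per texel and project_point is the same assumed helper as before.

```python
import numpy as np

def fill_atlas(atlas_size, texel_points, best_frame_of_texel, blurred_frames, project_point):
    """Texture-space pass: for every valid texel, fetch the final color from its
    best frame, whose selection boundaries were already blurred in frame space.

    texel_points: dict mapping (u, v) texel -> surface position.
    best_frame_of_texel: dict mapping (u, v) -> best frame index.
    blurred_frames: sequence of blurred H x W x 3 frame arrays.
    """
    atlas = np.zeros((atlas_size, atlas_size, 3), dtype=np.float32)
    for (u, v), position in texel_points.items():
        frame_idx = best_frame_of_texel[(u, v)]
        x, y = project_point(frame_idx, position)      # projection of the texel's surface point
        atlas[v, u] = blurred_frames[frame_idx][y, x]  # single fetch; no blending needed here
    return atlas
```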
Note that since the boundaries are blurred in each individual frame, the blur kernels for the same surface point in different frames may not be exactly the same. Due to this fact, seam artifacts may not be completely eliminated in the final result. One simple solution for handling this issue is to run the blur algorithm or process multiple times, where for each iteration the result produced by the previous iteration may be used as the input. Experimentation shows that two iterations generally provide a substantially seamless result.
Figure PCTCN2016097395-appb-000005
Figure PCTCN2016097395-appb-000006
FIG. 7 depicts an example process 700 for texture atlas packing or compression. The texture atlas generated from the above processes may contain a large percentage of empty regions. In some aspects, a packing or compression process may be applied to the texture atlas to reduce the size of the final texture atlas or map.
In some aspects, the atlas may be divided into M *N uniform grids. The empty grids may then be discarded. The remaining grids may then be re-packed into a single texture. An example process for texture compression is provided below:
Figure PCTCN2016097395-appb-000007
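The listing being an image, a rough Python sketch of the grid-based packing follows. The roughly square re-packing layout and the returned cell remap (needed to update texture coordinates into the packed layout) are illustrative choices, not the disclosed procedure.

```python
import numpy as np

def pack_atlas(atlas, occupancy, grid):
    """Compress the atlas by dropping empty grid cells and re-packing the rest
    into a single texture (a sketch of process 700).

    occupancy: H x W bool array marking texels that received color.
    grid: cell size in texels (the atlas is divided into uniform grid cells).
    """
    h, w, _ = atlas.shape
    cells = [(cy, cx)
             for cy in range(0, h, grid)
             for cx in range(0, w, grid)
             if occupancy[cy:cy + grid, cx:cx + grid].any()]    # keep non-empty cells only
    cols = max(int(np.ceil(np.sqrt(len(cells)))), 1)
    rows = int(np.ceil(len(cells) / cols))
    packed = np.zeros((rows * grid, cols * grid, 3), dtype=atlas.dtype)
    remap = {}                                                  # old cell origin -> new cell origin
    for k, (cy, cx) in enumerate(cells):
        ny, nx = (k // cols) * grid, (k % cols) * grid
        packed[ny:ny + grid, nx:nx + grid] = atlas[cy:cy + grid, cx:cx + grid]
        remap[(cy, cx)] = (ny, nx)
    return packed, remap
```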
FIGS. 8A and 8B depict an example process for selecting a best frame from which to obtain pixel information for combining color information to produce a texture atlas.
One drawback of a normal-based best-frame-selection strategy is that it may generate fragmented selection results, particularly on bumpy surface regions. Optionally, best frame selection may also include a refinement step to address this potential problem.
After the best-frame-selection algorithm has been applied to each frame, a selection mask, which indicates the selected pixels of that frame for texture generation, may be obtained. In one example, a best-frame-index buffer for each frame i may be obtained, where the mask may be obtained by discarding the pixels whose index is not equal to i. Because the basic best-frame-selection depends on the surface normal, such a mask can be quite fragmented, as illustrated in frame 805. To generate a more continuous selection result, the following process may be applied to the initial selection result.
First, the frames may be sorted by their respective camera poses in an order such that the camera position of the i-th frame is most distant from the camera positions of the first i-1 frames. Subsequently, the frames may be processed one by one. Given a selection mask of the i-th frame, the mask may be dilated by a user-specified K pixels, as illustrated in frame 810, then shrunk or reduced in size by K pixels, as illustrated in frame 815, in order to fill holes and connect the fragmented pieces or regions. To avoid multiple selection, mask pixels that have already been selected by previous frames may be discarded. In addition, mask pixels that are close to geometry boundaries may also be discarded to produce the final mask, as illustrated in frame 820. After this process, a more continuous mask may be obtained for each frame. Using these masks, new best-frame-index buffers for each frame and the atlas may be obtained.
Another example is shown in FIG. 8B. As illustrated in frame 825, different colors (represented by different shading patterns) indicate different frame selections for the surface points. With the original algorithm, the frame selection result in bumpy regions is quite fragmented, while with the frame selection refinement step, as illustrated in frame 830, the texture map is more continuous.
An example frame-selection refinement process or algorithm is provided below. Note that the following process may be applied after the best-frame-selection step, and may generate refined best-frame-index buffers for the frames and the texture atlas. The following process may optionally be implemented with the best-frame selection process described above, for example, via one or more user selections via a modeling application and/or user interface. In some aspects, the following process may add additional processing time to the 3D modeling (e.g., 20%).
Figure PCTCN2016097395-appb-000008
Figure PCTCN2016097395-appb-000009
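As with the other listings, the refinement algorithm is supplied as an image; the sketch below captures the morphological close (dilate by K, then erode by K), the multiple-selection check against earlier frames, and the geometry-boundary rejection described above. The per-frame boundary masks and the SciPy morphology calls are assumptions for illustration, and K is assumed to be at least 1.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def refine_selection_masks(selection_masks, boundary_masks, K):
    """Refine per-frame selection masks (a sketch of the refinement step).

    selection_masks: list of H x W bool arrays (pixels selected by each frame),
    assumed pre-sorted by the camera-distance rule described above.
    boundary_masks: list of H x W bool arrays marking pixels near geometry boundaries.
    """
    claimed = np.zeros_like(selection_masks[0], dtype=bool)
    refined = []
    for mask, boundary in zip(selection_masks, boundary_masks):
        # Dilate then erode by K pixels to fill holes and connect fragments.
        closed = binary_erosion(binary_dilation(mask, iterations=K), iterations=K)
        # Drop pixels already claimed by earlier frames and pixels near boundaries.
        final = closed & ~claimed & ~boundary
        refined.append(final)
        claimed |= final
    return refined
```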
The techniques described above may be implemented on one or more computing devices or environments, as described below. FIG. 9 depicts an example general purpose computing environment, in which some of the techniques described herein may be embodied. The computing system environment 902 is only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the presently disclosed subject matter. Neither should the computing environment 902 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated in the example operating environment 902. In some embodiments the various depicted computing elements may include circuitry configured to instantiate specific aspects of the present disclosure. For example, the term circuitry used in the disclosure can include specialized hardware components configured to perform function (s) by firmware or  switches. In other example embodiments, the term circuitry can include a general purpose processing unit, memory, etc., configured by software instructions that embody logic operable to perform function (s) . In example embodiments where circuitry includes a combination of hardware and software, an implementer may write source code embodying logic and the source code can be compiled into machine readable code that can be processed by the general purpose processing unit. Since one skilled in the art can appreciate that the state of the art has evolved to a point where there is little difference between hardware, software, or a combination of hardware/software, the selection of hardware versus software to effectuate specific functions is a design choice left to an implementer. More specifically, one of skill in the art can appreciate that a software process can be transformed into an equivalent hardware structure, and a hardware structure can itself be transformed into an equivalent software process. Thus, the selection of a hardware implementation versus a software implementation is one of design choice and left to the implementer.
Computer 902, which may include any of a mobile device or smart phone, tablet, laptop, desktop computer, or collection of networked devices, cloud computing resources, etc., typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computer 902 and includes both volatile and nonvolatile media, removable and non-removable media. The system memory 922 includes computer-readable storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 923 and random access memory (RAM) 960. A basic input/output system 924 (BIOS) , containing the basic routines that help to transfer information between elements within computer 902, such as during start-up, is typically stored in ROM 923. RAM 960 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 959. By way of example, and not limitation, FIG. 9 illustrates operating system 925, application programs 926, other program modules 927 including a texture mapping application 965, and program data 928.
The computer 902 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 9 illustrates a hard disk drive 938 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 939 that reads from or writes to a removable, nonvolatile magnetic disk 954, and an optical disk drive 904 that reads from or writes to a removable, nonvolatile optical disk 953 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the example operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital  versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 938 is typically connected to the system bus 921 through a non-removable memory interface such as interface 934, and magnetic disk drive 939 and optical disk drive 904 are typically connected to the system bus 921 by a removable memory interface, such as interface 935 or 936.
The drives and their associated computer storage media discussed above and illustrated in FIG. 9, provide storage of computer-readable instructions, data structures, program modules and other data for the computer 902. In FIG. 9, for example, hard disk drive 938 is illustrated as storing operating system 958, application programs 957, other program modules 956, and program data 955. Note that these components can either be the same as or different from operating system 925, application programs 926, other program modules 927, and program data 928. Operating system 958, application programs 957, other program modules 956, and program data 955 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the computer 902 through input devices such as a keyboard 951 and pointing device 952, commonly referred to as a mouse, trackball or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, retinal scanner, or the like. These and other input devices are often connected to the processing unit 959 through a user input interface 936 that is coupled to the system bus 921, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB) . A monitor 942 or other type of display device is also connected to the system bus 921 via an interface, such as a video interface 932. In addition to the monitor, computers may also include one or more output devices such as speakers 944 and printer 943, which may be connected through an output peripheral interface 933.
The computer 902 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 946. The remote computer 946 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 902, although only a memory storage device 947 has been illustrated in FIG. 9. The logical connections depicted in FIG. 9 include a local area network (LAN) 945 and a wide area network (WAN) 949, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, the Internet, and cloud computing resources.
When used in a LAN networking environment, the computer 902 is connected to the LAN 945 through a network interface or adapter 937. When used in a WAN networking  environment, the computer 902 typically includes a modem 905 or other means for establishing communications over the WAN 949, such as the Internet. The modem 905, which may be internal or external, may be connected to the system bus 921 via the user input interface 936, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 902, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 9 illustrates remote application programs 948 as residing on memory device 947. It will be appreciated that the network connections shown are example and other means of establishing a communications link between the computers may be used.
In some aspects, other programs 927 may include a texture mapping application 965 that includes the functionality as described above. In some cases, texture mapping application 965, may execute some or all operations of  processes  400, 500, 600 and/or 700, for example, in conjunction with image sensor (s) 970, which may be communicatively coupled to computer 902 via output peripheral interface 933.
Each of the processes, methods and algorithms described in the preceding sections may be embodied in, and fully or partially automated by, code modules executed by one or more computers or computer processors. The code modules may be stored on any type of non-transitory computer-readable medium or computer storage device, such as hard drives, solid state memory, optical disc and/or the like. The processes and algorithms may be implemented partially or wholly in application-specific circuitry. The results of the disclosed processes and process steps may be stored, persistently or otherwise, in any type of non-transitory computer storage such as, e.g., volatile or non-volatile storage. The various features and processes described above may be used independently of one another, or may be combined in various ways. All possible combinations and subcombinations are intended to fall within the scope of this disclosure. In addition, certain methods or process blocks may be omitted in some implementations. The methods and processes described herein are also not limited to any particular sequence, and the blocks or states relating thereto can be performed in other sequences that are appropriate. For example, described blocks or states may be performed in an order other than that specifically disclosed, or multiple blocks or states may be combined in a single block or state. The example blocks or states may be performed in serial, in parallel or in some other manner. Blocks or states may be added to or removed from the disclosed example embodiments. The example systems and components described herein may be configured differently than described. For example, elements may be added to, removed from or rearranged compared to the disclosed example embodiments.
It will also be appreciated that various items are illustrated as being stored in memory or on storage while being used, and that these items or portions thereof may be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software modules and/or systems may execute in memory on another device and communicate with the illustrated computing systems via inter-computer communication. Furthermore, in some embodiments, some or all of the systems and/or modules may be implemented or provided in other ways, such as at least partially in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs) , standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers) , field-programmable gate arrays (FPGAs) , complex programmable logic devices (CPLDs) , etc. Some or all of the modules, systems and data structures may also be stored (e.g., as software instructions or structured data) on a computer-readable medium, such as a hard disk, a memory, a network or a portable media article to be read by an appropriate drive or via an appropriate connection. For purposes of this specification and the claims, the phrase “computer-readable storage medium” and variations thereof, does not include waves, signals, and/or other transitory and/or intangible communication media. The systems, modules and data structures may also be transmitted as generated data signals (e.g., as part of a carrier wave or other analog or digital propagated signal) on a variety of computer-readable transmission media, including wireless-based and wired/cable-based media, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames) . Such computer program products may also take other forms in other embodiments. Accordingly, the present disclosure may be practiced with other computer system configurations.
Conditional language used herein, such as, among others, “can, ” “could, ” “might, ” “may, ” “e.g. ” and the like, unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements, and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more embodiments or that one or more embodiments necessarily include logic for deciding, with or without author input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment. The terms “comprising, ” “including, ” “having” and the like are synonymous and are used inclusively, in an open-ended fashion, and do not exclude additional elements, features, acts, operations and so forth. Also, the term “or” is used in its inclusive sense (and not in its exclusive sense) so that  when used, for example, to connect a list of elements, the term “or” means one, some or all of the elements in the list.
While certain example embodiments have been described, these embodiments have been presented by way of example only and are not intended to limit the scope of the inventions disclosed herein. Thus, nothing in the foregoing description is intended to imply that any particular feature, characteristic, step, module or block is necessary or indispensable. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions disclosed herein. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of certain of the inventions disclosed herein.

Claims (15)

  1. A method for applying texture by a computing device to a model based on a plurality of images, the method comprising:
    generating, by a processor of the computing device, texture coordinates from a plurality of images, wherein the texture coordinates comprise a plurality of points corresponding to pixels in the plurality of images;
    for at least one point of the plurality of points, obtaining first color information from a corresponding pixel of a first image of the plurality of images, obtaining second color information from the corresponding pixel of a second image of the plurality of images, and combining the first color information and the second color information in the first image and the second image based on a number of neighboring pixels associated with the first image and the second image; and
    applying the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a model generated from the plurality of images.
  2. The method of claim 1, wherein combining the first color information and the second color information in the first image and the second image based on a number of neighboring points associated with the first image and the second image further comprises:
    associating a first weight to the first color information based on the number of neighboring pixels in the first image;
    associating a second weight to the second color information based on the number of neighboring pixels in the second image; and
    combining the first color information and the second color information according to the first weight and the second weight.
  3. The method of claim 1, wherein the first image is selected for obtaining the first color information from the corresponding pixel based on a visibility metric.
  4. The method of claim 1, wherein obtaining the first color information, obtaining the second color information, and combining the first color information and the second color information in the first image and the second image is performed for the plurality of points prior to applying the combined color information to the corresponding points of the texture coordinates.
  5. The method of claim 1, wherein each of the plurality of images comprises a plurality of triangles and is associated with a camera view, and wherein generating the texture coordinates from the plurality of images further comprises:
    projecting the plurality of triangles of each of the multiple images to the associated camera view to generate the texture coordinates, wherein the plurality of points corresponding to the pixels in the plurality of images comprise a number of vertices of the plurality of triangles.
  6. The method of claim 1, wherein applying the combined color information to the corresponding point of the texture coordinates to generate the texture atlas further comprises:
    selecting the first image or the second image from which to obtain the combined color information based on a visibility metric.
  7. The method of claim 1, wherein the number of neighboring pixels is selected based on an area of pixels surrounding the corresponding pixel.
  8. The method of claim 1, wherein the first image is selected for obtaining the first color information from the corresponding pixel based on a visibility metric, and wherein the method further comprises:
    determining a silhouette region of at least one of the first image or the second image; and
    reducing the visibility metric of pixels within the silhouette region.
  9. The method of claim 8, further comprising dilating the silhouette region such that the visibility metric is reduced for a larger number of pixels.
  10. The method of claim 1, further comprising compressing the texture atlas by removing empty regions of the texture atlas.
  11. A system for generating a texture map for a three-dimensional (3D) model based on a plurality of images, the system comprising:
    at least one image sensor configured to capture a plurality of images of a 3D object;
    a processor communicatively coupled to the at least one image sensor; and
    memory communicatively coupled to the at least one image sensor, wherein the system is programmed to perform the following operations:
    generate texture coordinates from the plurality of images, wherein the texture coordinates comprise a plurality of points corresponding to pixels in the plurality of images;
    for at least one point of the plurality of points, obtain first color information from a corresponding pixel of a first image of the plurality of images, obtain second color information from the corresponding pixel of a second image of the plurality of images, and combine the first color information and the second color information in the first image and the second image based on a number of neighboring pixels associated with the first image and the second image; and
    apply the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a 3D model of the 3D object generated from the plurality of images.
  12. The system of claim 11, wherein combining the first color information and the second color information in the first image and the second image based on a number of neighboring points associated with the first image and the second image further comprises:
    associating a first weight to the first color information based on the number of neighboring pixels in the first image;
    associating a second weight to the second color information based on the number of neighboring pixels in the second image; and
    combining the first color information and the second color information according to the first weight and the second weight.
  13. The system of claim 11, wherein obtaining the first color information, obtaining the second color information, and combining the first color information and the second color information in the first image and the second image is performed for the plurality of points prior to applying the combined color information to the corresponding points of the texture coordinates.
  14. The system of claim 11, wherein applying the combined color information to the corresponding point of the texture coordinates to generate the texture atlas further comprises:
    selecting the first image or the second image from which to obtain the combined color information based on a visibility metric.
  15. A computer readable storage medium having stored thereon instructions that, upon execution by at least one processor of a computing device, cause the computing device to perform operations for generating a texture map for a three dimensional (3D) model, the operations comprising:
    generating texture coordinates from a plurality of images, wherein the texture coordinates comprise a plurality of points corresponding to pixels in the plurality of images;
    for at least one point of the plurality of points, obtaining first color information from a corresponding pixel of a first image of the plurality of images, obtaining second color information from the corresponding pixel of a second image of the plurality of images, and combining the first color information and the second color information in the first image and the second image based on a number of neighboring pixels associated with the first image and the second image; and
    applying the combined color information to the corresponding point of the texture coordinates to generate a texture atlas for a 3D model generated from the plurality of images.
PCT/CN2016/097395 2016-08-30 2016-08-30 Fast uv atlas generation and texture mapping WO2018039936A1 (en)

Priority Applications (1)

Application Number: PCT/CN2016/097395 (WO2018039936A1, en); Priority Date: 2016-08-30; Filing Date: 2016-08-30; Title: Fast uv atlas generation and texture mapping


Publications (1)

Publication Number: WO2018039936A1; Publication Date: 2018-03-08

Family ID: 61299757

Family Applications (1)

Application Number: PCT/CN2016/097395 (WO2018039936A1, en); Priority Date: 2016-08-30; Filing Date: 2016-08-30; Title: Fast uv atlas generation and texture mapping

Country Status (1): WO WO2018039936A1 (en)





Legal Events

Code 121: EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 16914503; Country of ref document: EP; Kind code of ref document: A1.

Code NENP: Non-entry into the national phase. Ref country code: DE.

Code 122: EP: PCT application non-entry in European phase. Ref document number: 16914503; Country of ref document: EP; Kind code of ref document: A1.