CN117152338A - Modeling method and electronic equipment


Info

Publication number: CN117152338A
Application number: CN202210556215.6A
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Prior art keywords: model, target, target image, image, deformable
Inventors: 孙文超, 李江伟, 郑波
Current Assignee: Huawei Technologies Co Ltd
Original Assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority to CN202210556215.6A
Publication of CN117152338A

Classifications

    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 15/04: Texture mapping (3D image rendering)
    • G06T 15/10: Geometric effects (3D image rendering)
    • G06T 19/006: Mixed reality (manipulating 3D models or images for computer graphics)
    • G06V 10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G06T 2200/08: Indexing scheme involving all processing steps from image acquisition to 3D model generation
    • Y02T 10/40: Engine management systems (internal combustion engine [ICE] based vehicles)


Abstract

The application provides a modeling method and an electronic device. In the method, the electronic device acquires a target image of a target object to be modeled and, according to the target image, determines a corresponding deformable model from a plurality of preset deformable models. The electronic device performs deformation adjustment on the deformable model according to the target image to generate an initial model. The electronic device then extracts image features from the target image, maps these features onto the initial model, and generates model parameters of a target model corresponding to the target image. Finally, the electronic device generates the target model from the model parameters and the initial model. With this scheme, the electronic device can generate a target model of the target object from a preset deformable model, without having to capture multiple images of the target object at the same time, which simplifies user operation.

Description

Modeling method and electronic equipment
Technical Field
The present application relates to the field of terminal technologies, and in particular, to a modeling method and an electronic device.
Background
Three-dimensional modeling technology constructs a model with three-dimensional data in a virtual three-dimensional space. The model can be displayed in the display interface of an electronic device, or in a virtual space presented by an AR or VR device, providing a more realistic and vivid display effect. Three-dimensional modeling can be applied to an entity in the real world to obtain a corresponding virtual three-dimensional model, and displaying this virtual model in the display interface of an electronic device gives the user a more realistic interaction experience.
Existing three-dimensional modeling techniques require multiple photographs of the modeled object taken from different viewing angles as input. The camera pose of each photograph is computed, point clouds and meshes are generated using multi-view stereo geometry, and textures and texture maps are computed to obtain the three-dimensional model. Although such a method can generate a three-dimensional model that resembles the modeled object, a large number of multi-view photographs must be captured, which makes user operation cumbersome and degrades the user experience; moreover, the modeling process is complex, requires a large amount of cloud-side processing time, and has low modeling efficiency. In addition, current three-dimensional modeling techniques model static rigid objects; for a dynamic object such as an animal, it is difficult to obtain the multi-view photographs required for modeling, so three-dimensional modeling cannot be completed.
Disclosure of Invention
The application provides a modeling method and an electronic device, which provide a three-dimensional modeling approach that is convenient to operate and produces high-fidelity results.
In a first aspect, the present application provides a modeling method that may be applied to an electronic device. The method includes the following steps: the electronic device acquires a target image corresponding to a target object to be modeled and, according to the target image, determines a deformable model corresponding to the target image from a plurality of preset deformable models. The electronic device performs deformation adjustment on the deformable model corresponding to the target image according to the target image, and generates an initial model corresponding to the target image. The electronic device acquires image features of the target image, maps the image features onto the initial model, and generates model parameters of a target model corresponding to the target image, where the model parameters of the target model indicate the geometric features and texture features of the target model. The electronic device generates the target model according to the model parameters of the target model and the initial model.
Based on this method, the electronic device can generate the target model corresponding to the target object from the target image and a preset deformable model, so multiple images of the target object do not need to be acquired at the same time, which simplifies user operation and improves the user experience. At the same time, the electronic device determines the model parameters of the target model from the target image and the initial model obtained after deformation adjustment, so the target model is closer to the target object and the simulated effect is more realistic.
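For illustration only, the flow of the first aspect can be summarized in the following Python sketch. Every callable passed into the function (classify_type, deform_to_image, extract_features, map_features_to_model, generate_model) is a hypothetical placeholder assumed for the sketch; the application does not define such an API.
```python
def build_target_model(target_image, preset_models, classify_type,
                       deform_to_image, extract_features,
                       map_features_to_model, generate_model):
    """Minimal sketch of the flow in the first aspect; all callables are
    hypothetical placeholders, not part of the application."""
    # 1. Pick the preset deformable model matching the recognized type.
    target_type = classify_type(target_image)          # e.g. "cat"
    deformable = preset_models[target_type]
    # 2. Deform it (coefficients and pose) to fit the target image.
    initial_model = deform_to_image(deformable, target_image)
    # 3. Map image features onto the initial model to obtain the target
    #    model's parameters (geometric and texture features).
    features = extract_features(target_image)
    params = map_features_to_model(features, initial_model)
    # 4. Generate the final target model from the parameters and the
    #    initial model.
    return generate_model(params, initial_model)
```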
In one possible design, determining, according to the target image, the deformable model corresponding to the target image from the plurality of preset deformable models includes: determining a target type corresponding to the target image based on a trained type recognition network; and determining the deformable model corresponding to the target image from the preset deformable models according to the target type corresponding to the target image, where the preset deformable models correspond one-to-one with a plurality of types.
With this design, the application provides a plurality of preset deformable models, each corresponding to one type. The electronic device can determine the deformable model corresponding to the target image according to the target type of the target image, so modeling does not have to start from scratch with multiple images taken at different angles; instead, deformation adjustment is performed on the basis of the deformable model corresponding to the target image to complete modeling. This ensures a realistic modeling effect while reducing the complexity of the modeling operation.
In one possible design, the method further includes: in response to a first operation triggered by a user, photographing the target object and acquiring a plurality of preview images corresponding to the target object, where the first operation is used to start a photographing function of the electronic device, and the plurality of preview images are images of the target object acquired by the electronic device before a user-triggered photographing operation is detected; determining, based on the trained type recognition network, the type corresponding to each of the plurality of preview images; and when a set number of consecutive preview images correspond to the same type, reminding the user to trigger the photographing operation.
Acquiring the target image corresponding to the target object to be modeled then includes: acquiring the target image in response to the photographing operation triggered by the user.
With this design, the electronic device reminds the user to take a photograph only after determining that the types of several consecutively acquired preview images of the target object are consistent, which ensures that the type of the target object can be accurately identified from the acquired target image and further improves modeling accuracy.
In one possible design, performing deformation adjustment on the deformable model corresponding to the target image according to the target image and generating the initial model corresponding to the target image includes: determining a binary image corresponding to the target image, where the binary image represents the contour of the target object in the target image; and performing multiple rounds of coefficient adjustment and pose adjustment on the deformable model corresponding to the target image until the adjusted deformable model meets a first condition, and using the adjusted deformable model as the initial model.
The first condition includes at least one of the following: the contour similarity between the adjusted deformable model and the binary image is greater than a first set threshold; and the error value between the joint points in the adjusted deformable model and the joint points in the target image is smaller than a second set threshold.
With this design, the electronic device can adjust the deformable model according to the target image and its corresponding binary image, so that the shape of the deformable model approaches the shape of the target object in the target image, achieving a realistic modeling effect.
In one possible design, acquiring the image features of the target image includes: performing semantic segmentation on the target image based on a trained semantic segmentation network to obtain a semantic segmentation result corresponding to the target image, where the semantic segmentation result includes a plurality of regions in the target image and a label corresponding to each of the regions; and determining the image features of the target image according to the target image and the semantic segmentation result.
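Purely as an illustrative sketch, one way such features could be assembled is shown below. The segmentation network interface and the choice of concatenating one-hot region labels with the image channels are assumptions made for the sketch; the application only requires that the features be determined from the target image and the semantic segmentation result.
```python
import numpy as np

def extract_image_features(target_image, segmentation_net):
    """Illustrative sketch: combine the image with per-pixel region labels.

    `segmentation_net` is assumed to return an integer label map the same
    size as the image (e.g. head, torso, legs, tail, background).
    """
    label_map = segmentation_net(target_image)            # HxW integer labels
    one_hot = np.eye(label_map.max() + 1)[label_map]       # HxWxK label planes
    # Concatenate RGB channels with the one-hot region planes so that
    # downstream networks see both appearance and part information.
    features = np.concatenate([target_image / 255.0, one_hot], axis=-1)
    return features
```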
In one possible design, mapping the image features of the target image onto the initial model to generate the model parameters of the target model corresponding to the target image includes: determining a volume space corresponding to the initial model, where the volume space represents the position information of each three-dimensional point in the initial model; and generating the model parameters of the target model from the volume space and the image features of the target image based on a trained texture optimization network.
In one possible design, the model parameters of the target model include geometric parameters and texture parameters; the geometric parameters include either or both of a density and a signed distance function (SDF), and the texture parameters include the transparency and color of each three-dimensional point in the target model.
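A rough sketch of what such a volume-space query might look like is given below. The network interface, the parameter layout, and the camera projection step are all assumptions made for illustration; the application itself only specifies that a trained texture optimization network produces geometric parameters (density or SDF) and texture parameters (per-point transparency and color).
```python
import numpy as np

def predict_model_params(volume_points, image_features, texture_net, project):
    """Illustrative sketch of mapping image features to model parameters.

    volume_points: (N, 3) positions sampled from the initial model's volume space.
    project:       assumed camera projection from 3D points to image pixels.
    texture_net:   assumed trained network returning, per point,
                   [sdf_or_density, alpha, r, g, b].
    """
    uv = project(volume_points).astype(int)               # (N, 2) pixel coords
    per_point_feat = image_features[uv[:, 1], uv[:, 0]]   # sample 2D features
    out = texture_net(volume_points, per_point_feat)      # (N, 5) array
    return {
        "geometry": out[:, 0],    # density or signed distance per point
        "alpha":    out[:, 1],    # transparency per point
        "color":    out[:, 2:5],  # RGB per point
    }
```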
In one possible design, generating the target model from the model parameters of the target model and the initial model includes: generating a texture map corresponding to the target model according to the model parameters of the target model; and mapping the texture map onto the initial model to generate the target model.
In one possible design, generating the texture map corresponding to the target model according to the model parameters of the target model includes: performing rasterization according to the model parameters of the target model to generate the texture map; or generating the texture map from the model parameters of the target model based on a trained convolutional neural network.
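The two options named in this design can be pictured with the small sketch below; both callables are hypothetical placeholders supplied by the caller, since the application does not fix either interface.
```python
def generate_texture_map(model_params, rasterize=None, cnn=None):
    """Illustrative sketch of the two texture-map options named above."""
    if rasterize is not None:
        # Option 1: rasterize the per-point color/alpha parameters into the
        # model's texture (UV) space.
        return rasterize(model_params)
    # Option 2: let a trained convolutional neural network synthesize the
    # texture map directly from the model parameters.
    return cnn(model_params)
```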
In one possible design, the target type to which the target image corresponds belongs to any of an animal, a plant, and a human.
In a second aspect, the present application provides an electronic device including a plurality of functional modules; the plurality of functional modules interact to implement the method performed by the electronic device in the first aspect and the embodiments thereof. The plurality of functional modules may be implemented based on software, hardware, or a combination of software and hardware, and the plurality of functional modules may be arbitrarily combined or divided based on the specific implementation.
In a third aspect, the present application provides an electronic device comprising at least one processor and at least one memory, the at least one memory storing computer program instructions that, when executed by the electronic device, perform the method performed by the electronic device in the first aspect and embodiments thereof.
In a fourth aspect, the application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method performed by the electronic device of the first aspect and embodiments thereof described above.
In a fifth aspect, the present application also provides a computer-readable storage medium having stored therein a computer program which, when executed by a computer, causes the computer to perform the method performed by the electronic device in the first aspect and embodiments thereof.
In a sixth aspect, the present application further provides a chip for reading a computer program stored in a memory, and executing the method executed by the electronic device in the first aspect and embodiments thereof.
In a seventh aspect, the present application further provides a chip system, where the chip system includes a processor, and the processor is configured to enable a computer device to implement a method performed by an electronic device in the first aspect and embodiments thereof. In one possible design, the chip system further includes a memory for storing programs and data necessary for the computer device. The chip system may be formed of a chip or may include a chip and other discrete devices.
Drawings
FIG. 1 is a schematic view of a scene to which embodiments of the present application are applicable;
fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 3 is a software structural block diagram of an electronic device according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of a modeling method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a target image according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a binary image corresponding to a target image according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a preset plurality of deformable models according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a deformable model corresponding to a target image according to an embodiment of the present application;
FIG. 9 is a schematic diagram of a deformation adjustment of a deformable model according to an embodiment of the present application;
FIG. 10 is a schematic diagram of an initial model corresponding to a target image according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a semantic segmentation result corresponding to a target image according to an embodiment of the present application;
FIG. 12 is a schematic diagram of an implicit volume expression scheme according to an embodiment of the present application;
FIG. 13 is a schematic diagram of a texture map corresponding to a target model according to an embodiment of the present application;
FIG. 14 is a schematic diagram of a target model according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 16 is a flowchart of a modeling method according to an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. In the description of the embodiments of the application, the terms "first", "second", and the like are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include one or more such features.
It should be understood that in the embodiments of the present application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of the following items" or similar expressions refer to any combination of these items, including a single item or any combination of multiple items. For example, at least one of a, b, or c may represent: a; b; c; a and b; a and c; b and c; or a, b, and c, where a, b, and c may each be singular or plural.
Three-dimensional modeling technology constructs a model with three-dimensional data in a virtual three-dimensional space. The model can be displayed in the display interface of an electronic device, or in a virtual space presented by an augmented reality (AR) device or a virtual reality (VR) device, providing a more realistic and vivid display effect. Three-dimensional modeling can be applied to an entity in the real world to obtain a corresponding virtual three-dimensional model, and displaying this virtual model in the display interface of an electronic device gives the user a more realistic interaction experience.
For example, AR technology integrates real-world information and virtual-world information for display. VR technology is a computer simulation technology that can create and let users experience a virtual world: a computer generates a simulated environment into which the user is immersed. AR and VR technology can simulate entity information that is difficult to experience in the real world to obtain virtual information, and display that virtual information on a picture or in a space for the user to perceive, providing a sensory experience that goes beyond reality. In an AR or VR scene, a virtual model can be obtained by three-dimensionally modeling a real object and displayed in the display interface, providing a more realistic interactive experience for the user.
Existing three-dimensional modeling techniques require multiple photographs of the modeled object taken from different viewing angles as input. The camera pose of each photograph is computed, point clouds and meshes are generated using multi-view stereo geometry, and textures and texture maps are computed to obtain the three-dimensional model corresponding to the modeled object. Although such a method can generate a three-dimensional model that resembles the modeled object, a large number of multi-view photographs must be captured, which makes user operation cumbersome and degrades the user experience; moreover, the modeling process is complex, requires a large amount of cloud-side processing time, and has low modeling efficiency. In addition, current three-dimensional modeling techniques model static rigid objects; for a dynamic object such as an animal, it is difficult to obtain the multi-view photographs required for modeling, so three-dimensional modeling cannot be completed.
For example, in a current three-dimensional modeling scheme for animals, the animal must be taken into specialized large-scale equipment that is provided with imaging devices at multiple angles. These imaging devices capture images of the animal simultaneously, so animal images from multiple angles can be acquired, and three-dimensional modeling is then performed on the acquired images to obtain a three-dimensional model of the animal. In this scheme, the cost of acquiring the multiple animal images required for modeling is high, the processing needed to perform three-dimensional modeling from multiple animal images is complex, and the modeling efficiency is low.
To address these problems, the application provides a modeling method that is convenient to operate and produces high-fidelity three-dimensional models. Fig. 1 is a schematic diagram of a scenario to which the modeling method provided by the embodiments of the present application is applicable. Referring to fig. 1, the scene includes an electronic device, a user, and a target object to be modeled. In fig. 1, the target object to be modeled is, for example, a kitten.
In the embodiment of the application, the electronic device can acquire a target image corresponding to the target object to be modeled. When the electronic device is provided with a camera, the user can photograph the target object with the electronic device to obtain the target image; alternatively, the user can photograph the target object with another image capture device, which then sends the target image to the electronic device. The electronic device determines, according to the target image, the deformable model corresponding to the target image from a plurality of preset deformable models. The electronic device performs deformation adjustment on the deformable model corresponding to the target image according to the target image to obtain an initial model corresponding to the target image. The electronic device acquires image features of the target image, maps them onto the initial model corresponding to the target image to obtain model parameters of the target model, and generates the target model according to the model parameters of the target model and the initial model corresponding to the target image. Optionally, the electronic device may bind a skeleton to the target model, render the target model in an AR application, and drive the target model to produce an animated effect, providing a realistic interactive experience for the user.
With this scheme, the electronic device can generate the target model corresponding to the target object from at least one image acquired of the target object and a preset deformable model, so multiple images of the target object do not need to be acquired at the same time, which simplifies user operation and improves the user experience. At the same time, the electronic device determines the model parameters of the target model from the target image and the initial model obtained after deformation adjustment, so the target model is closer to the target object and the simulated effect is more realistic.
It should be noted that the system to which the modeling method provided by the embodiments of the present application is applicable may further include a server. After the electronic device obtains the target image, it may send the target image to the server, and the server determines, according to the target image, the deformable model corresponding to the target image from the plurality of preset deformable models. The server performs deformation adjustment on the deformable model corresponding to the target image according to the target image to obtain the initial model corresponding to the target image. The server acquires the image features of the target image, maps them onto the initial model to obtain the model parameters of the target model, generates the target model according to the model parameters and the initial model, and sends the target model to the electronic device. That is, some or all of the steps of the modeling method provided by the application can be executed by the server, which reduces the computing load on the electronic device and further improves modeling efficiency. For specific implementations, refer to the description of the modeling method performed by the electronic device in the embodiments of the present application; repeated details are not described again.
The following describes embodiments of an electronic device and of using such an electronic device. The electronic device in the embodiments of the present application may be, for example, a tablet computer, a mobile phone, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a wearable device, or an internet of things (IoT) device; the embodiments of the present application do not limit the specific type of the electronic device.
Fig. 2 is a schematic structural diagram of an electronic device 100 according to an embodiment of the present application. As shown in fig. 2, the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, a user identification module (subscriber identification module, SIM) card interface 195, and the like.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors. The controller may be a neural hub and a command center of the electronic device 100, among others. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution. A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, and may also be used to transfer data between the electronic device 100 and a peripheral device. The charge management module 140 is configured to receive a charge input from a charger. The power management module 141 is used for connecting the battery 142, and the charge management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like. The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution for wireless communication including 2G/3G/4G/5G, etc., applied to the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA), etc. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processes such as filtering, amplifying, and the like on the received electromagnetic waves, and transmit the processed electromagnetic waves to the modem processor for demodulation. The mobile communication module 150 can amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110.
The wireless communication module 160 may provide solutions for wireless communication including wireless local area network (wireless local area networks, WLAN) (e.g., wireless fidelity (wireless fidelity, wi-Fi) network), bluetooth (BT), global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field wireless communication technology (near field communication, NFC), infrared technology (IR), etc., as applied to the electronic device 100. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the electromagnetic wave signals, filters the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, frequency modulate it, amplify it, and convert it to electromagnetic waves for radiation via the antenna 2.
In some embodiments, antenna 1 and mobile communication module 150 of electronic device 100 are coupled, and antenna 2 and wireless communication module 160 are coupled, such that electronic device 100 may communicate with a network and other devices through wireless communication techniques. The wireless communication techniques may include the Global System for Mobile communications (global system for mobile communications, GSM), general packet radio service (general packet radio service, GPRS), code division multiple access (code division multiple access, CDMA), wideband code division multiple access (wideband code division multiple access, WCDMA), time division code division multiple access (time-division code division multiple access, TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR techniques, among others. The GNSS may include a global satellite positioning system (global positioning system, GPS), a global navigation satellite system (global navigation satellite system, GLONASS), a beidou satellite navigation system (beidou navigation satellite system, BDS), a quasi zenith satellite system (quasi-zenith satellite system, QZSS) and/or a satellite based augmentation system (satellite based augmentation systems, SBAS).
The display 194 is used to display a display interface of an application, such as a display page of an application installed on the electronic device 100. The display 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, electronic device 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The internal memory 121 may be used to store computer executable program code including instructions. The processor 110 executes various functional applications of the electronic device 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a storage program area and a storage data area. The storage program area may store an operating system, software code of at least one application program, and the like. The storage data area may store data (e.g., captured images, recorded video, etc.) generated during use of the electronic device 100, and so forth. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the electronic device. The external memory card communicates with the processor 110 through an external memory interface 120 to implement data storage functions. For example, files such as pictures and videos are stored in an external memory card.
The electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playing, recording, etc.
The sensor module 180 may include a pressure sensor 180A, an acceleration sensor 180B, a touch sensor 180C, and the like, among others.
The pressure sensor 180A is used to sense a pressure signal, and may convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194.
The touch sensor 180C, also referred to as a "touch panel". The touch sensor 180C may be disposed on the display 194, and the touch sensor 180C and the display 194 form a touch screen, which is also referred to as a "touch screen". The touch sensor 180C is used to detect a touch operation acting thereon or thereabout. The touch sensor may communicate the detected touch operation to the application processor to determine the touch event type. Visual output related to touch operations may be provided through the display 194. In other embodiments, the touch sensor 180C may also be disposed on the surface of the electronic device 100 at a different location than the display 194.
The keys 190 include a power key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys. The electronic device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 100. The motor 191 may generate a vibration prompt. The motor 191 may be used for incoming-call vibration alerts as well as for touch vibration feedback. For example, touch operations acting on different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The touch vibration feedback effect may also be customized. The indicator 192 may be an indicator light and may be used to indicate the charging state, a change in battery level, a message, a missed call, a notification, and the like. The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the electronic device 100 by inserting it into or removing it from the SIM card interface 195.
It is to be understood that the components shown in fig. 2 are not to be construed as a particular limitation of the electronic device 100, and the electronic device may include more or less components than illustrated, or may combine certain components, or may split certain components, or may have a different arrangement of components. Furthermore, the combination/connection relationships between the components in fig. 2 may also be modified.
Fig. 3 is a software structure block diagram of an electronic device according to an embodiment of the present application. As shown in fig. 3, the software structure of the electronic device may be a hierarchical architecture, for example, the software may be divided into several layers, each layer having a distinct role and division of work. The layers communicate with each other through a software interface. In some embodiments, the operating system is divided into four layers, from top to bottom, an application layer, an application framework layer (FWK), a runtime (run time) and a system library, and a kernel layer, respectively.
The application layer may include a series of application packages. As shown in fig. 3, the application layer may include a camera, settings, a skin module, user interfaces (UIs), third-party applications, and the like. The third-party applications may include a gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message, and the like. In the embodiment of the application, the application layer may include a target installation package of a target application that the electronic device requests to download from a server, where the function files and layout files in the target installation package are adapted to the electronic device.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer may include some predefined functions. As shown in FIG. 3, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, and a notification manager.
The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take screenshots, and the like. The content provider is used to store and retrieve data and make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, and the like.
The view system includes visual controls, such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, a display interface including a text message notification icon may include a view displaying text and a view displaying a picture.
The telephony manager is for providing communication functions of the electronic device. Such as the management of call status (including on, hung-up, etc.).
The resource manager provides various resources for the application program, such as localization strings, icons, pictures, layout files, video files, and the like.
The notification manager allows an application to display notification information in the status bar; it can be used to convey notification-type messages and can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify that a download is complete, to provide message reminders, and so on. The notification manager may also present notifications in the system top status bar in the form of a chart or scroll-bar text, such as notifications for applications running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, an alert tone is emitted, the electronic device vibrates, or an indicator light blinks.
The runtime includes a core library and a virtual machine. The runtime is responsible for the scheduling and management of the operating system.
The core library consists of two parts: one part is a function which needs to be called by java language, and the other part is a core library of an operating system. The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media library (media library), three-dimensional graphics processing library (e.g., openGL ES), two-dimensional graphics engine (e.g., SGL), image processing library, etc.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files, and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
The hardware layer may include various sensors such as acceleration sensors, gyroscopic sensors, touch sensors, and the like.
It should be noted that the structures shown in fig. 2 and fig. 3 are merely examples of the electronic device provided in the embodiments of the present application and do not constitute any limitation on it; in a specific implementation, the electronic device may have more or fewer devices or modules than shown in fig. 2 or fig. 3.
The modeling method provided by the embodiment of the application is described below.
Example 1
Fig. 4 is a schematic flow chart of a modeling method according to an embodiment of the present application. Referring to fig. 4, the modeling method provided by the embodiment of the present application includes steps such as obtaining an image, loading a deformable model, generating an initial model, performing semantic segmentation, determining image features, determining model parameters of the target model, and generating the target model. Each step of the modeling method provided by the embodiment of the present application is further described below with reference to fig. 4:
Step a: the electronic device acquires a target image corresponding to the target object to be modeled.
In the embodiment of the present application, the target object may be an animal, such as a cat, a dog, or a horse; that is, a dynamic object can be modeled in three dimensions in the embodiment of the application. The target object may also be a static object, such as a plant or furniture, or a person. The following description takes an animal as the target object as an example.
In an optional implementation, the electronic device may photograph the target object to obtain the target image, and the target image may be at least one image.
Optionally, while the electronic device is photographing the target object, the electronic device may acquire preview images corresponding to the target object, where a preview image is an image displayed in the display interface before the electronic device detects a user-triggered photographing operation. The electronic device may perform animal type recognition on each preview image to determine the animal type corresponding to it. In a specific implementation, the electronic device may determine the animal type corresponding to a preview image based on a trained type recognition network. When the electronic device determines that the animal types corresponding to a set number of consecutive preview images are the same, it can remind the user to take the photograph, for example by displaying a message or playing audio that reminds the user to photograph. This design ensures that the acquired target image represents the characteristics of the target object and improves modeling accuracy.
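As a hedged illustration of this preview-consistency check (the frame source, the count of five consecutive frames, and the reminder callback are assumptions made for the sketch, not values fixed by the application):
```python
from collections import deque

CONSECUTIVE_FRAMES = 5   # assumed value of the "set number"

def monitor_preview(preview_frames, type_recognition_net, remind_user):
    """Remind the user to shoot once several consecutive preview frames
    are recognized as the same animal type (illustrative sketch)."""
    recent_types = deque(maxlen=CONSECUTIVE_FRAMES)
    for frame in preview_frames:
        recent_types.append(type_recognition_net(frame))   # e.g. "cat"
        if (len(recent_types) == CONSECUTIVE_FRAMES
                and len(set(recent_types)) == 1):
            remind_user("The subject is steady - tap to take the photo")
            break
```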
In another optional implementation, the electronic device may receive the target image sent by another image capture device, where the target image may be at least one image obtained by that device photographing the target object.
Step b: the electronic device determines the deformable model corresponding to the target image from a plurality of preset deformable models.
In an optional implementation, the electronic device may perform animal type recognition on the target image based on the trained type recognition network and determine the target type corresponding to the target image. It can be understood that in the embodiment of the present application the target object may be an animal, a plant, a person, and so on, and the target type corresponding to the target image is accordingly an animal, plant, or person type. In this first embodiment, the target object is an animal, and the target type corresponding to the target image is an animal type.
In practice, the trained type recognition network may extract a feature vector of the target image and recognize the animal type based on the extracted feature vector. For example, when the type recognition network is a convolutional neural network, it may perform convolution processing on the target image to extract the feature vector and then determine the animal type corresponding to the target image based on that feature vector. The animal type determined by the electronic device for the target image may include at least one candidate animal type together with a probability value for each. For example, assume that fig. 5 is the target image acquired by the electronic device. The electronic device uses the target image shown in fig. 5 as the input of the type recognition network and obtains its output, which may be the animal types corresponding to the target image: Ragdoll, British Shorthair, and Siamese, with probabilities of 90%, 10%, and 6%, respectively.
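A minimal sketch of such a top-k type prediction is shown below, assuming a generic PyTorch-style classifier. The network, the class list, and the use of independent sigmoid scores (so the per-type probabilities need not sum to 1, as in the 90%/10%/6% example above) are all assumptions for illustration.
```python
import torch

ANIMAL_TYPES = ["Ragdoll", "British Shorthair", "Siamese", "Husky", "Persian"]

def recognize_animal_type(type_recognition_net, image_tensor, top_k=3):
    """Illustrative sketch: return the top-k animal types with scores.

    `type_recognition_net` is assumed to be a trained CNN that outputs one
    logit per animal type for a single input image tensor (C, H, W).
    """
    with torch.no_grad():
        logits = type_recognition_net(image_tensor.unsqueeze(0))[0]
    scores = torch.sigmoid(logits)                 # independent per-type scores
    values, indices = scores.topk(top_k)
    return [(ANIMAL_TYPES[i], float(v)) for i, v in zip(indices, values)]
```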
It should be noted that the trained type recognition network is used to determine the animal type corresponding to an image input to it, and may be obtained by training on sample images labeled with animal types.
Optionally, the electronic device may further determine the binary image corresponding to the target image; this binary image is used when calculating the deformation coefficients in step c, as described there. In a specific implementation, the electronic device may determine the binary image corresponding to the target image based on the type recognition network, in which case the type recognition network is also used to determine the binary image corresponding to the input image. It can be understood that if the trained type recognition network is to be used to determine binary images, the binary images corresponding to the sample images also need to be used when training the network. For example, for the target image shown in fig. 5, the binary image determined by the electronic device may be as shown in fig. 6.
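For illustration only, a binary silhouette of the kind shown in fig. 6 could be derived from a per-pixel foreground probability map; the segmentation head and the 0.5 threshold below are assumptions, since the application only states that the suitably trained type recognition network can also output the binary image.
```python
import numpy as np

def binary_silhouette(foreground_prob, threshold=0.5):
    """Illustrative sketch: threshold a per-pixel foreground probability map
    (HxW, values in [0, 1]) into the binary image used in step c.

    Foreground pixels (the animal) become 1, background pixels become 0.
    """
    return (foreground_prob >= threshold).astype(np.uint8)
```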
After determining the animal type corresponding to the target image, the electronic device can determine the deformable model corresponding to the target image from the plurality of preset deformable models. The deformable models correspond one-to-one with animal types, and the deformable model corresponding to the target image is the one corresponding to the animal type of the target image. For example, fig. 7 is a schematic diagram of a plurality of preset deformable models according to an embodiment of the present application; the deformable model determined by the electronic device according to the animal type of the target image shown in fig. 5 may be as shown in fig. 8.
And c, the electronic equipment carries out deformation adjustment on the deformable model to obtain an initial model corresponding to the target image.
In the embodiment of the application, in order to obtain the three-dimensional model which is more attached to the target object in the target image, after the electronic equipment determines the deformable model corresponding to the target image, the deformation coefficient and the pose can be calculated so as to perform deformation adjustment on the deformable model corresponding to the target image, and an initial model with the shape and the angle similar to those of the target object in the target image is obtained.
In an alternative embodiment, the electronic device may determine the deformation coefficient and the pose according to the target image, the binary image corresponding to the target image determined in step b, and the deformable model corresponding to the target image. For example, the electronic device may perform multiple coefficient adjustments and pose adjustments on the deformable model corresponding to the target image; when the profile similarity between the adjusted deformable model and the binary image is greater than a first set threshold, and/or the error value between the joint points in the adjusted deformable model and the joint points in the target image is smaller than a second set threshold, the deformation adjustment of the deformable model can be stopped. At that point the adjusted deformable model can be considered to reproduce the form of the target object, and the electronic device can use the adjusted deformable model as the initial model corresponding to the target image. In a specific implementation, after each coefficient adjustment or pose adjustment of the deformable model, the electronic device can calculate the profile similarity between the adjusted deformable model and the binary image, extract the joint points of the adjusted deformable model, and calculate the error value between those joint points and the joint points in the target image.
It should be noted that the first set threshold and the second set threshold may be values related to the animal type. Because different animal types have different profile features and joint-point features, the electronic device in this embodiment of the application may store a first set threshold and a second set threshold for each animal type, which improves the accuracy of the deformation adjustment and makes the initial model obtained after adjusting the deformable model fit the target object more closely.
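A minimal sketch of this stop-condition loop is given below, assuming a deformable-model object that exposes step(), render_mask(), and joints() methods. The threshold values, the use of intersection-over-union as the profile similarity, and the mean joint distance as the error value are illustrative assumptions rather than the disclosed method.

```python
# Hedged sketch of the deformation-adjustment loop (hypothetical model interface).
import numpy as np

# Per-animal-type thresholds (placeholder values for illustration).
FIRST_SET_THRESHOLD = {"ragdoll": 0.92}    # profile similarity
SECOND_SET_THRESHOLD = {"ragdoll": 4.0}    # joint-point error, in pixels

def profile_similarity(model_mask: np.ndarray, binary_image: np.ndarray) -> float:
    """Intersection-over-union between the model silhouette and the binary image."""
    inter = np.logical_and(model_mask, binary_image).sum()
    union = np.logical_or(model_mask, binary_image).sum()
    return inter / max(union, 1)

def joint_error(model_joints: np.ndarray, image_joints: np.ndarray) -> float:
    """Mean distance between corresponding joint points, in pixels."""
    return float(np.linalg.norm(model_joints - image_joints, axis=1).mean())

def adjust_deformable_model(model, binary_image, image_joints, animal_type,
                            max_iters=200):
    """Adjust coefficients/pose until either stop condition of step c is met."""
    for _ in range(max_iters):
        model.step()                                   # one coefficient or pose update
        sim = profile_similarity(model.render_mask(), binary_image)
        err = joint_error(model.joints(), image_joints)
        if (sim > FIRST_SET_THRESHOLD[animal_type]
                or err < SECOND_SET_THRESHOLD[animal_type]):
            break                                      # adjusted model fits the target
    return model                                       # used as the initial model
```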
For example, fig. 9 is a schematic diagram of a process of performing deformation adjustment on a deformable model by an electronic device according to an embodiment of the present application. Referring to fig. 9, the electronic device may adjust the coefficients and pose of the deformable model and, after each adjustment, compare the adjusted deformable model with the binary image corresponding to the target image. If, as in fig. 9 (a), the difference between the adjusted deformable model and the binary image is large, the adjustment process shown in fig. 9 (b), (c), and (d) continues until the adjusted deformable model is similar to the binary image; when the similarity between the deformable model shown in fig. 9 (d) and the binary image is greater than the first set threshold, the deformable model shown in fig. 9 (d) is used as the initial model corresponding to the target image.
For another example, after performing deformation adjustment on the deformable model shown in fig. 8 based on the target image shown in fig. 5 and the binary image shown in fig. 6, the electronic device may obtain the initial model shown in fig. 10. Referring to fig. 10, the deformation-adjusted initial model conforms more closely to the shape of the target object in the target image.
And d, carrying out semantic segmentation processing on the target image by the electronic equipment to obtain a semantic segmentation result corresponding to the target image.
Optionally, the semantic segmentation result of the target image may include a plurality of regions obtained by dividing the target image and a label corresponding to each region, where the label corresponding to each region may represent the name of the region; e.g., the label may be a nose, an ear, etc.
In an optional implementation manner, the electronic device may perform semantic segmentation processing on the target image based on a trained semantic segmentation network, according to the target image and the binary image corresponding to the target image, to obtain a label corresponding to each pixel in the target image. Optionally, the trained semantic segmentation network may include a deep learning network and a classifier network, where the deep learning network may extract each channel vector of the target image, and the classifier network may perform classification based on each channel vector, so as to determine the label corresponding to each pixel in the target image. The label corresponding to any pixel may be background, nose, ear, eye, chest, leg, claw, tail, head, back, etc. A region formed by a plurality of pixels with the same label can be regarded as the region corresponding to that label in the target image.
The semantic segmentation network may be trained based on labeled sample images. For example, the semantic segmentation network may include a fully convolutional neural network and a classifier network.
For example, fig. 11 is a schematic diagram of a semantic segmentation result corresponding to a target image according to an embodiment of the present application. Referring to fig. 11, after the electronic device performs semantic segmentation processing on the target image shown in fig. 5, the semantic segmentation result shown in fig. 11 may be obtained, where different areas in fig. 11 correspond to different labels; fig. 11 contains areas corresponding to 9 labels: nose, ear, eye, chest, leg, claw, tail, head, and back.
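The sketch below illustrates per-pixel segmentation of this kind; the network layout (a small fully convolutional backbone plus a per-pixel classifier that takes the binary mask as an extra input channel) and the label set are assumptions made for illustration only.

```python
# Minimal sketch of per-pixel semantic segmentation (assumed layout and labels).
import torch
import torch.nn as nn

LABELS = ["background", "nose", "ear", "eye", "chest",
          "leg", "claw", "tail", "head", "back"]

class SegmentationNet(nn.Module):
    """Fully convolutional backbone followed by a per-pixel classifier."""
    def __init__(self, num_labels: int):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),   # 3 RGB channels + binary mask
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Conv2d(16, num_labels, 1)   # per-pixel label scores

    def forward(self, image, binary_mask):
        x = torch.cat([image, binary_mask], dim=1)       # image + silhouette as input
        return self.classifier(self.backbone(x))

net = SegmentationNet(len(LABELS))
image = torch.rand(1, 3, 256, 256)
mask = torch.rand(1, 1, 256, 256)
label_map = net(image, mask).argmax(dim=1)               # label index per pixel
print(label_map.shape)                                   # torch.Size([1, 256, 256])
```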
And e, the electronic equipment determines the image characteristics of the target image.
In an alternative embodiment, the electronic device may determine the image features of the target image according to the target image and the semantic segmentation result corresponding to the target image. For example, the electronic device may determine the image features of the target image based on a trained convolutional neural network. In a specific implementation, the electronic device may use the target image and the semantic segmentation result of the target image as inputs of the trained convolutional neural network and obtain two sets of feature vectors output by the network, corresponding respectively to the target image and to the semantic segmentation result of the target image. The electronic device can splice (concatenate) the two sets of feature vectors to obtain the image features of the target image. The image features of the target image can therefore be used to indicate the color and label of each pixel in the target image.
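One possible reading of this splicing step is sketched below, with a stand-in feature extractor and with the segmentation result rendered as a three-channel image; both are assumptions rather than the disclosed network.

```python
# Sketch of forming the image features: extract features from the target image
# and from its segmentation result, then concatenate ("splice") them.
import torch
import torch.nn as nn

extractor = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())  # stand-in

image = torch.rand(1, 3, 256, 256)          # target image
segmentation = torch.rand(1, 3, 256, 256)   # segmentation result rendered as an image

image_feat = extractor(image)               # features of the target image
seg_feat = extractor(segmentation)          # features of the segmentation result
image_features = torch.cat([image_feat, seg_feat], dim=1)  # spliced feature set
print(image_features.shape)                 # carries per-pixel color + label information
```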
And f, mapping the image features of the target image onto an initial model corresponding to the target image by the electronic equipment, and determining model parameters of the target model corresponding to the target image.
In this embodiment of the application, the deformation-adjusted initial model obtained in step c can represent the geometric shape of the target model corresponding to the target image; the electronic device still needs to determine the color texture of the target model corresponding to the target image.
In an alternative embodiment, the deformation-adjusted initial model obtained in step c is a geometric mesh model, which is an explicit expression of geometric relationships. When the electronic device maps the image features of the target image onto the initial model corresponding to the target image, the complete three-dimensional feature information of the target model, such as its geometric features and texture features, can be obtained through an implicit volume space expression.
Fig. 12 is a schematic diagram of an implicit volume space expression according to an embodiment of the present application. Referring to fig. 12, the electronic device may convert the initial model into a volume space expression; the volume space of the initial model includes position information of each three-dimensional point in the initial model, for example the three-dimensional coordinates of each three-dimensional point. The electronic device takes the volume space of the initial model and the image features of the target image obtained in step e as inputs of a trained texture optimization network, and the trained texture optimization network outputs the model parameters of the target model. The model parameters of the target model may include geometric parameters and texture parameters of the target model. Optionally, the geometric parameters may include position information of each three-dimensional point in the target model, for example a density or a signed distance function (SDF); the texture parameters may include the transparency and color of each three-dimensional point in the target model. The texture optimization network may be obtained by training on image features of sample images, the volume spaces of the initial models corresponding to the sample images, and the model parameters of the models corresponding to the sample images.
Referring to fig. 12, mapping the image features of the two-dimensional target image (e.g., the color and label of each pixel) onto the initial model in three-dimensional volume space through a neural network is the implicit volume space expression. In a specific implementation, the texture optimization network may process each pixel of the two-dimensional target image; under parallel projection, each pixel of the target image corresponds to a set of three-dimensional points in the initial model, and the set of three-dimensional points corresponding to any one pixel shares the same X-axis and Y-axis coordinates. The texture optimization network can map the color and label of each pixel onto the set of three-dimensional points corresponding to that pixel, so that the texture optimization network obtains the density, color, and transparency of each three-dimensional point.
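The following sketch illustrates this parallel-projection correspondence: each pixel's color and label are written to the whole column of three-dimensional points that shares the pixel's X and Y coordinates. The volume resolution and the placeholder inputs are assumptions; in the embodiment the densities, colors, and transparencies are produced by the texture optimization network rather than copied directly.

```python
# Illustrative pixel-to-voxel-column mapping under parallel projection.
import numpy as np

H, W, D = 64, 64, 64                          # volume space resolution (assumed)
colors = np.zeros((H, W, D, 3))               # per-3D-point color
labels = np.zeros((H, W, D), dtype=np.int64)  # per-3D-point label

image = np.random.rand(H, W, 3)               # target image (color per pixel)
label_map = np.random.randint(0, 10, (H, W))  # per-pixel labels from segmentation

for y in range(H):
    for x in range(W):
        # The whole column of 3D points with the same X/Y gets this pixel's features.
        colors[y, x, :, :] = image[y, x]
        labels[y, x, :] = label_map[y, x]

print(colors.shape, labels.shape)
```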
With this design, one pixel of the target image corresponds to one set of three-dimensional points in the initial model, so the electronic device can fill in the texture features of that set of three-dimensional points according to the features of the pixel, and thus obtain a three-dimensional model with complete texture features. Therefore, the modeling scheme provided by this embodiment of the application does not require capturing multiple images of the target object from all directions; modeling of the target object can be achieved from a single captured target image, which is convenient for the user. While modeling accuracy is guaranteed, the computational complexity of the electronic device is reduced and the modeling efficiency is improved.
It should be noted that, in some embodiments of the present application, the electronic device may not perform step e, and may instead use the target image, the semantic segmentation result of the target image obtained in step d, and the volume space of the initial model as inputs of the trained texture optimization network, and obtain the model parameters of the target model output by the trained texture optimization network. That is, in this case the image features of the target image are the target image together with its semantic segmentation result, and the texture optimization network may be trained based on sample images, the semantic segmentation results of the sample images, the volume spaces of the initial models corresponding to the sample images, and the model parameters of the models corresponding to the sample images.
And g, the electronic equipment can generate a target model according to the model parameters of the target model and the initial model, wherein the target model is a three-dimensional model corresponding to the target image.
In the embodiment of the application, the electronic equipment can generate the texture map corresponding to the target model according to the geometric parameters and the texture parameters of the target model.
In an alternative implementation manner, the electronic device may perform mesh rasterization processing according to the geometric parameters and texture parameters of the target model, mapping the three-dimensional information onto a two-dimensional plane to obtain the texture map.
In another alternative embodiment, the electronic device may generate a texture map based on the trained convolutional neural network. The electronic device may take the model parameters of the target model as inputs to the trained convolutional neural network, and obtain a texture map of the target model output by the trained convolutional neural network.
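Purely to illustrate the first of these two alternatives (flattening three-dimensional color information onto a plane), the sketch below projects colored three-dimensional points onto a texture image. Real mesh rasterization and UV unwrapping are considerably more involved; the simple projection used here is an assumption for illustration.

```python
# Illustrative projection of colored 3D points onto a 2D texture map.
import numpy as np

def texture_map_from_points(points_xyz: np.ndarray, colors: np.ndarray,
                            size: int = 256) -> np.ndarray:
    """points_xyz: (N, 3) coordinates in [0, 1]; colors: (N, 3). Returns an HxWx3 map."""
    tex = np.zeros((size, size, 3))
    u = np.clip((points_xyz[:, 0] * (size - 1)).astype(int), 0, size - 1)
    v = np.clip((points_xyz[:, 1] * (size - 1)).astype(int), 0, size - 1)
    tex[v, u] = colors                      # drop Z: map each 3D point to a texel
    return tex

pts = np.random.rand(10000, 3)              # stand-in geometric parameters
cols = np.random.rand(10000, 3)             # stand-in texture parameters (colors)
print(texture_map_from_points(pts, cols).shape)
```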
For example, fig. 13 is a schematic diagram of a texture map corresponding to a target model according to an embodiment of the present application. Referring to fig. 13, after determining model parameters of the target model for the target image shown in fig. 5, the electronic device may generate a texture map corresponding to the target model according to geometric parameters and texture parameters among the model parameters.
After the electronic equipment obtains the texture map of the target model, mapping processing can be carried out on the texture map corresponding to the target model and the initial model to generate the target model. For example, fig. 14 is a schematic diagram of a target model according to an embodiment of the present application. Referring to fig. 14, the target model is a three-dimensional model simulating the target object in the target image shown in fig. 5.
In addition, it should be noted that several convolutional neural networks may be involved in this embodiment of the application, such as the type recognition network in step b, the semantic segmentation network in step d, the convolutional neural network for determining the image features of the target image in step e, and the convolutional neural network for generating the texture map in step g. These convolutional neural networks may be trained with the same training method but different sample data, different convolutional neural networks may be given different hyperparameters, and the trained convolutional neural networks have different functions.
Embodiment Two
In another embodiment of the present application, the modeling method shown in fig. 4 may also perform modeling according to a plurality of target images corresponding to the target object. The modeling method provided in this embodiment is further described below, taking modeling based on two target images of a target object as an example.
In this embodiment, the two target images acquired by the electronic device may be images captured of the target object from different orientations; for example, the two target images may be a front image and a back image of the target object, respectively. The manner in which the electronic device acquires the two target images can be implemented with reference to step a in the first embodiment. For example, when the user photographs the target object using the electronic device, the user may be reminded to photograph the target object: when the electronic device displays the shooting interface, the user is reminded to photograph the target object to acquire the first target image; the electronic device may then display a message reminding the user to move while holding the electronic device, and when the user has moved to another orientation of the target object, the user is reminded to photograph the target object again to acquire the second target image.
After the electronic device acquires the two target images, it may execute steps b-e for each target image. Denote the two target images as image A and image B. For image A, the electronic device executes steps b-c to determine the initial model corresponding to image A, and then executes steps d-e to determine the image features of image A. Similarly, for image B, the electronic device executes steps b-c to determine the initial model corresponding to image B, and then executes steps d-e to determine the image features of image B.
In this embodiment, after determining the initial model corresponding to image A and the initial model corresponding to image B, the electronic device may execute step f. Unlike step f in the first embodiment, there are two initial models and the image features of two target images in this embodiment, so the electronic device may perform the following operations:
Because the animal type corresponding to image A is consistent with the animal type corresponding to image B, the initial model corresponding to image A and the initial model corresponding to image B are obtained by deformation adjustment of the same deformable model. Therefore, the electronic device may express both initial models as volume spaces and, after obtaining volume space A of the initial model corresponding to image A and volume space B of the initial model corresponding to image B, calculate the geometric transformation relationship between volume space A and volume space B, where volume space A and volume space B satisfy the following formula:
V_B = τ(V_A)
where V_A denotes volume space A, V_B denotes volume space B, and τ is the geometric transformation relationship between volume space A and volume space B.
The electronic device may generate a target volume space of the target initial model according to volume space A, volume space B, and the geometric transformation relationship. The target initial model is the initial model generated for image A and image B, and the target volume space is the volume space expression of the target initial model.
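A hedged sketch of this step is shown below, assuming that τ can be represented as a rigid transform (rotation R and translation t) and that the target volume space is formed by bringing the point sets of the two volume spaces into a common frame; both assumptions are illustrative rather than the disclosed procedure.

```python
# Sketch of relating the two volume spaces and merging them into a target volume space.
import numpy as np

def apply_tau(points_a: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """V_B = tau(V_A): transform 3D point coordinates of volume space A."""
    return points_a @ R.T + t

R = np.eye(3)                        # placeholder rotation between the two views
t = np.array([0.0, 0.0, 0.1])        # placeholder translation
volume_a = np.random.rand(500, 3)    # 3D point positions of volume space A
volume_b = np.random.rand(500, 3)    # 3D point positions of volume space B

# Target volume space: points of A expressed in B's frame together with B's points.
target_volume = np.concatenate([apply_tau(volume_a, R, t), volume_b], axis=0)
print(target_volume.shape)
```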
In this embodiment, after determining the target volume space, the electronic device may map the image features of image A and the image features of image B onto the target initial model and determine the model parameters of the target model corresponding to the target object. Because two target images are used for modeling in this embodiment of the application, when the electronic device acquires the complete three-dimensional feature information of the target model through the implicit volume space expression, the target volume space and the image features of image A and image B can be used as inputs of the trained texture optimization network, and the model parameters of the target model output by the texture optimization network can be obtained. The model parameters of the target model may include geometric parameters and texture parameters of the target model. Optionally, the geometric parameters may include position information of each three-dimensional point in the target model, for example a density or an SDF; the texture parameters may include the transparency and color of each three-dimensional point in the target model.
In a specific implementation, when the electronic device determines the model parameters of the target model based on the texture optimization network, a set of three-dimensional points in the target initial model may correspond to one pixel in image A, to one pixel in image B, or to one pixel in image A and one pixel in image B. When a set of three-dimensional points corresponds to one pixel in image A or to one pixel in image B, the way the texture optimization network generates the transparency and color of each three-dimensional point is similar to the first embodiment; see the first embodiment for the specific implementation. When a set of three-dimensional points corresponds to one pixel in image A and one pixel in image B, the texture optimization network, when determining the transparency and color of each three-dimensional point, can calculate the distance between the three-dimensional points and the pixel in image A and the distance between the three-dimensional points and the pixel in image B according to the camera poses of image A and image B, select the pixel with the shorter distance, and determine the transparency and color of the three-dimensional point according to the color and label of the selected pixel.
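The nearer-pixel rule can be sketched as follows; representing the camera poses of image A and image B by their camera centers is an assumption made only for this illustration.

```python
# Sketch of choosing the source pixel from the nearer camera for a 3D point.
import numpy as np

def pick_source_pixel(point_xyz, pixel_a, pixel_b, cam_center_a, cam_center_b):
    """Return the features of the pixel captured from the nearer camera."""
    dist_a = np.linalg.norm(point_xyz - cam_center_a)
    dist_b = np.linalg.norm(point_xyz - cam_center_b)
    return pixel_a if dist_a <= dist_b else pixel_b

point = np.array([0.2, 0.5, 0.3])
pixel_a = {"color": (200, 180, 160), "label": "head"}   # from image A
pixel_b = {"color": (190, 175, 150), "label": "head"}   # from image B
print(pick_source_pixel(point, pixel_a, pixel_b,
                        cam_center_a=np.array([0.0, 0.0, 2.0]),
                        cam_center_b=np.array([0.0, 0.0, -2.0])))
```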
In this manner, the electronic device obtains the model parameters of the target model after determining the position information, color, and transparency of each three-dimensional point in the target initial model. The electronic device may then determine the target model according to the model parameters of the target model and the target initial model; for the specific implementation, refer to step g in the first embodiment, and repeated details are not described again.
According to the modeling method provided by the second embodiment, the electronic equipment can also model according to a plurality of target images, so that the target objects can be modeled according to more image features, the modeling accuracy is further improved, and after the electronic equipment displays the target model to a user, a realistic sensory experience can be provided for the user.
Based on the above embodiments, the present application further provides an electronic device, and fig. 15 is a schematic structural diagram of the electronic device according to the embodiment of the present application. Referring to fig. 15, the electronic device includes an input module 1501, a type recognition module 1502, a deformation adjustment module 1503, a semantic segmentation module 1504, an implicit body space expression module 1505, and a mapping module 1506. The function of each module is further described below:
The input module 1501 is configured to acquire a target image corresponding to the object to be modeled. Optionally, the input module 1501 may acquire one or more target images. For example, the input module 1501 may be used to perform step a in the first embodiment or the second embodiment.
The type recognition module 1502 is configured to determine the target type corresponding to the target image based on the trained type recognition network, and to determine the deformable model corresponding to the target image from a plurality of preset deformable models according to the determined target type. For example, the type recognition module 1502 may be used to perform step b in the first embodiment or the second embodiment.
The deformation adjustment module 1503 is configured to calculate the deformation coefficient and pose corresponding to the deformable model determined by the type recognition module 1502, so as to perform deformation adjustment on the deformable model and obtain the initial model corresponding to the target image. For example, the deformation adjustment module 1503 may be used to perform step c in the first embodiment or the second embodiment.
The semantic segmentation module 1504 is configured to perform semantic segmentation processing on the target image and obtain the semantic segmentation result corresponding to the target image. For example, the semantic segmentation module 1504 may be used to perform step d in the first embodiment or the second embodiment.
The implicit body space expression module 1505 is configured to convert the initial model corresponding to the target image into a volume space expression and to generate the image features of the target image according to the target image and the semantic segmentation result of the target image. The implicit body space expression module 1505 is further configured to map the image features of the target image onto the initial model corresponding to the target image by means of the implicit volume space expression and to determine the model parameters of the target model corresponding to the target image. For example, the implicit body space expression module 1505 may be used to perform steps e-f in the first embodiment or the second embodiment.
The mapping module 1506 is configured to generate the texture map corresponding to the target model according to the model parameters determined by the implicit body space expression module 1505, and to perform mapping processing on the texture map corresponding to the target model and the initial model to generate the target model. For example, the mapping module 1506 may be used to perform step g in the first embodiment or the second embodiment.
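As a rough orientation only, the skeleton below shows one way the modules of fig. 15 could be chained end to end; the class and method names are hypothetical and are not defined by the embodiment at this level of detail.

```python
# Hypothetical skeleton chaining the modules of fig. 15 (names are stand-ins).
class ModelingPipeline:
    def __init__(self, input_module, type_recognition, deformation_adjustment,
                 semantic_segmentation, implicit_volume_expression, mapping):
        self.input_module = input_module
        self.type_recognition = type_recognition
        self.deformation_adjustment = deformation_adjustment
        self.semantic_segmentation = semantic_segmentation
        self.implicit_volume_expression = implicit_volume_expression
        self.mapping = mapping

    def run(self):
        image = self.input_module.acquire_target_image()                  # step a
        deformable = self.type_recognition.select_model(image)            # step b
        initial_model = self.deformation_adjustment.fit(deformable, image)  # step c
        segmentation = self.semantic_segmentation.segment(image)          # step d
        params = self.implicit_volume_expression.map_features(            # steps e-f
            image, segmentation, initial_model)
        return self.mapping.generate_target_model(params, initial_model)  # step g
```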
It should be noted that, when the modeling method provided by the embodiment of the present application is executed by the electronic device shown in fig. 15, the method may be implemented with reference to the above embodiments, and the repetition is not repeated.
Based on the above embodiments, the present application also provides a modeling method, which can be performed by an electronic device. Fig. 16 is a flowchart of a modeling method according to an embodiment of the present application. Referring to fig. 16, the method includes the steps of:
S1601: And the electronic equipment acquires a target image corresponding to the target object to be modeled.
S1602: and the electronic equipment determines a deformable model corresponding to the target image from a plurality of preset deformable models according to the target image.
S1603: and the electronic equipment performs deformation adjustment on the deformable model corresponding to the target image according to the target image, and generates an initial model corresponding to the target image.
S1604: the electronic equipment acquires the image characteristics of the target image, maps the image characteristics of the target image to the initial model, and generates model parameters of the target model corresponding to the target image.
The model parameters of the target model are used to indicate geometric features and texture features of the target model.
S1605: And the electronic equipment generates a target model according to the model parameters of the target model and the initial model.
It should be noted that, in the embodiment of the modeling method shown in fig. 16, reference may be made to the above embodiments of the present application, and the repetition is not repeated.
Based on the above embodiments, the present application further provides an electronic device, where the electronic device includes at least one processor and at least one memory, where the at least one memory stores computer program instructions, and when the electronic device is running, the at least one processor executes functions executed by the electronic device in the methods described in the embodiments of the present application.
Based on the above embodiments, the present application also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the methods described in the embodiments of the present application.
Based on the above embodiments, the present application also provides a computer-readable storage medium having stored therein a computer program which, when executed by a computer, causes the computer to execute the methods described in the embodiments of the present application.
Based on the above embodiment, the present application further provides a chip, where the chip is configured to read a computer program stored in a memory, and implement the methods described in the embodiments of the present application.
Based on the above embodiments, the present application provides a chip system, which includes a processor for supporting a computer device to implement the methods described in the embodiments of the present application. In one possible design, the chip system further includes a memory for storing programs and data necessary for the computer device. The chip system can be composed of chips, and can also comprise chips and other discrete devices.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (14)

1. A modeling method, characterized by being applied to an electronic device, the method comprising:
acquiring a target image corresponding to a target object to be modeled;
determining a deformable model corresponding to the target image from a plurality of preset deformable models according to the target image;
performing deformation adjustment on a deformable model corresponding to the target image according to the target image to generate an initial model corresponding to the target image;
acquiring image features of the target image, mapping the image features of the target image to the initial model, and generating model parameters of a target model corresponding to the target image; model parameters of the target model are used for indicating geometric features and texture features of the target model;
and generating the target model according to the model parameters of the target model and the initial model.
2. The method according to claim 1, wherein the determining, according to the target image, a deformable model corresponding to the target image from a plurality of preset deformable models comprises:
determining a target type corresponding to the target image based on the trained type recognition network;
determining a deformable model corresponding to the target image from the preset deformable models according to the target type corresponding to the target image; the preset deformable models are in one-to-one correspondence with various types.
3. The method of claim 1 or 2, wherein the method further comprises:
shooting the target object in response to a first operation triggered by a user, and acquiring a plurality of preview images corresponding to the target object; the first operation is used for starting a photographing function of the electronic device, and the plurality of preview images are images of the target object acquired by the electronic device before the photographing operation triggered by the user is detected;
Respectively determining the type corresponding to each preview image in the plurality of preview images based on the trained type recognition network;
reminding the user to trigger the photographing operation when the types corresponding to a set number of consecutive preview images are the same;
the obtaining the target image corresponding to the target object to be modeled includes:
in response to the photographing operation triggered by the user, acquiring the target image.
4. A method according to any one of claims 1-3, wherein said performing deformation adjustment on the deformable model corresponding to the target image according to the target image, generating an initial model corresponding to the target image, comprises:
determining a binary image corresponding to the target image, wherein the binary image corresponding to the target image is used for representing the outline of the target object in the target image;
performing multiple coefficient adjustments and pose adjustments on the deformable model corresponding to the target image until the adjusted deformable model meets a first condition, and taking the adjusted deformable model as the initial model;
wherein the first condition includes at least one of:
the profile similarity between the adjusted deformable model and the binary image is greater than a first set threshold; or
the error value between the joint point in the adjusted deformable model and the joint point in the target image is smaller than a second set threshold value.
5. The method of any of claims 1-4, wherein the acquiring image features of the target image comprises:
performing semantic segmentation processing on the target image based on the trained semantic segmentation network to obtain a semantic segmentation result corresponding to the target image; the semantic segmentation result corresponding to the target image comprises a plurality of areas in the target image and labels corresponding to each area in the plurality of areas;
and determining the image characteristics of the target image according to the target image and the semantic segmentation result corresponding to the target image.
6. The method according to any one of claims 1-5, wherein mapping the image features of the target image onto the initial model, generating model parameters of a target model corresponding to the target image, comprises:
determining a body space corresponding to the initial model; the body space is used for representing the position information of each three-dimensional point in the initial model;
generating model parameters of the target model based on the trained texture optimization network according to the body space and the image features of the target image.
7. The method of claim 6, wherein the model parameters of the target model include geometric parameters and texture parameters; the geometric parameters include any one or more of a density or a signed distance function (SDF), and the texture parameters include the transparency and color of each three-dimensional point in the target model.
8. The method of any of claims 1-7, wherein the generating the target model from model parameters of the target model and the initial model comprises:
generating a texture map corresponding to the target model according to model parameters of the target model;
and mapping the texture map corresponding to the target model and the initial model to generate the target model.
9. The method of claim 8, wherein the generating a texture map corresponding to the target model according to model parameters of the target model comprises:
performing rasterization processing according to the model parameters of the target model to generate the texture map; or
generating the texture map based on the trained convolutional neural network according to the model parameters of the target model.
10. The method of claim 2, wherein the target type corresponding to the target image is any one of an animal, a plant, or a human.
11. An electronic device comprising at least one processor coupled to at least one memory, the at least one processor configured to read a computer program stored by the at least one memory to perform the method of any of claims 1-10.
12. An electronic device comprising a plurality of functional modules; the plurality of functional modules interact to implement the method of any of claims 1-10.
13. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any of claims 1-10.
14. A computer readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform the method of any of claims 1-10.
CN202210556215.6A 2022-05-19 2022-05-19 Modeling method and electronic equipment Pending CN117152338A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210556215.6A CN117152338A (en) 2022-05-19 2022-05-19 Modeling method and electronic equipment

Publications (1)

Publication Number Publication Date
CN117152338A true CN117152338A (en) 2023-12-01

Family

ID=88882972

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210556215.6A Pending CN117152338A (en) 2022-05-19 2022-05-19 Modeling method and electronic equipment

Country Status (1)

Country Link
CN (1) CN117152338A (en)

Legal Events

Date Code Title Description
PB01 Publication