WO2024060474A1 - Video generation method, information display method and computing device - Google Patents

Video generation method, information display method and computing device

Info

Publication number
WO2024060474A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
image
information
video
images
Prior art date
Application number
PCT/CN2023/071967
Other languages
English (en)
French (fr)
Inventor
詹鹏鑫
刘奎龙
梅波
Original Assignee
Alibaba (China) Co., Ltd. (阿里巴巴(中国)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba (China) Co., Ltd.
Publication of WO2024060474A1 publication Critical patent/WO2024060474A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/20 Linear translation of whole images or parts thereof, e.g. panning
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting

Definitions

  • Embodiments of the present application relate to the field of computer application technology, and in particular, to a video generation method, an information display method, and a computing device.
  • When a video is generated by filming objects, the shooting cost is high.
  • Alternatively, multiple object images can be spliced together to generate a video.
  • However, videos generated by this method have poor visual effects.
  • Embodiments of the present application provide a video generation method, an information display method, and a computing device to solve the technical problem of poor visual effects of videos in the prior art.
  • embodiments of the present application provide a video generation method, including:
  • a target video is generated.
  • an embodiment of the present application provides a video generation method, including:
  • a target video is generated based on the material information and the multiple target images.
  • embodiments of the present application provide an information display method, including:
  • An image processing request is sent to the server; the image processing request is used by the server to determine at least one original image containing the target object and to construct a three-dimensional model corresponding to the at least one original image.
  • Embodiments of the present application provide a computing device, including a processing component and a storage component; the storage component stores one or more computer instructions, which are called and executed by the processing component to implement the video generation method described in the first aspect, the video generation method described in the second aspect, or the information display method described in the third aspect.
  • embodiments of the present application provide a computer storage medium that stores a computer program.
  • When the computer program is executed by a computer, it implements the video generation method described in the first aspect or the video generation method described in the second aspect.
  • At least one original image containing the target object is acquired, a three-dimensional model corresponding to the at least one original image is constructed, multiple transformation parameters are used to transform the at least one three-dimensional model into multiple target images, and material information corresponding to the target object is determined.
  • a target video is generated.
  • The embodiment of the present application obtains multiple target images by reconstructing the original image in three dimensions and adjusting the three-dimensional model using transformation parameters, so that the target object in the video synthesized from the target images has a dynamic effect. The final target video is generated based on material information matched to the target object, which improves the visual effect of the video and allows the target object to be better expressed.
  • Figure 1 shows a schematic structural diagram of an embodiment of an information processing system provided by this application;
  • Figure 2 shows a flow chart of an embodiment of the video generation method provided by this application;
  • Figure 3 shows a flow chart of another embodiment of the video generation method provided by this application;
  • Figure 4 shows a schematic diagram of image synthesis and display in a practical application according to the embodiment of the present application;
  • Figure 5 shows a flow chart of another embodiment of the video generation method provided by this application;
  • Figure 6 shows a flow chart of an embodiment of the information display method provided by this application;
  • Figure 7 shows a schematic structural diagram of an embodiment of a video generation device provided by this application;
  • Figure 8 shows a schematic structural diagram of an embodiment of an information display device provided by this application;
  • Figure 9 shows a schematic structural diagram of an embodiment of a computing device provided by this application.
  • The technical solutions of the embodiments of this application can be applied to scenarios in which pictures provided by merchants, enterprise users, individual users or design solution providers are processed to generate videos. Since videos are more vivid than pictures, they have better visual effects and can serve to publicize, promote or beautify the object.
  • the objects involved in the embodiments of this application may refer to people, animals, objects, etc.
  • the objects may also be virtual products provided in the online system for user interaction, such as purchasing, browsing, etc.
  • The virtual products may correspond to offline physical items; when the online system is an online transaction system, the objects may specifically refer to commodities.
  • images or videos are often used to describe the object to better promote the object to users.
  • the inventor proposed the technical solution of the embodiment of the present application.
  • the original image can be modeled, and multiple target images can be generated through multiple transformation parameters.
  • the multiple target images can express the target object from different visual angles, and adding designed materials can significantly improve the visual effect and fully express the characteristics of the object.
  • the technical solution of the embodiment of the present application can be applied to the information processing system shown in Figure 1.
  • The information processing system can be a system with image processing functions. In practical applications, it can be, for example, an online system that provides interactive processing operations for objects, such as an online trading system that provides commodity trading, or another processing system connected to the online trading system, so that the generated video can be published on the online trading system.
  • the information processing system may include a client 101 and a server 102.
  • A connection is established between the client 101 and the server 102 through a network, which provides the medium for the communication link between them.
  • the network may include various connection types, such as wired, wireless communication links or optical fiber cables, etc.
  • the server can be connected to the user terminal through a mobile network.
  • The network standard of the mobile network can be any one of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), 5G, WiMax, etc.
  • the user terminal can also establish a communication connection with the server through Bluetooth, WiFi, infrared, etc.
  • The client 101 can be a browser, an APP (Application), a web application such as an H5 (HyperText Markup Language 5) application, a light application (also known as a mini program or lightweight application), or a cloud application. The client 101 can be deployed in an electronic device and may rely on the device, or on certain apps in the device, to run.
  • the electronic device can, for example, have a display and support information browsing, etc.
  • It can be a personal mobile terminal such as a mobile phone or a tablet, or a personal computer.
  • the client is mainly represented by a device image in Figure 1 .
  • Various other types of applications can also be configured in electronic devices, such as search, instant messaging, etc.
  • the server 102 may include one or more servers that provide various services, that is, it may be implemented as a distributed server cluster consisting of multiple servers, or as a single server. In addition, it may also be a server of a distributed system, or a server combined with a blockchain, or a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology.
  • the user can interact with the server 102 through the client 101 to receive or send messages, etc.
  • The server 102 can obtain the image processing request from the client 101, construct a corresponding three-dimensional model for at least one original image containing the target object in the request, use multiple transformation parameters to transform the at least one three-dimensional model into multiple target images, determine the material information corresponding to the target object, generate a target video based on the multiple target images and the material information, and send the target video to the client 101 for output.
  • the video generation method provided in the embodiment of the present application is generally executed by the server 102, and the corresponding video generation device is generally provided in the server 102.
  • The information display method provided in the embodiment of the present application is generally executed by the client 101, and the corresponding information display device is generally provided in the client 101.
  • the client 101 may also have similar functions to the server 102, thereby executing the video generation method provided by the embodiments of the present application.
  • the video generation method provided by the embodiment of the present application can also be jointly executed by the user terminal 101 and the server terminal 102.
  • Figure 2 is a flow chart of an embodiment of a video generation method provided by the embodiment of this application.
  • the technical solution of this embodiment can be executed by the server.
  • the method can include the following steps:
  • the target object may be a commodity provided by the online trading system, and a commodity description page corresponding to each commodity is provided in the online trading system.
  • the commodity description page usually introduces the commodity in detail in the form of pictures and texts, that is, the commodity description page includes a commodity picture. Therefore, as another optional method, it is possible to receive an image processing request from a user, determine the index information corresponding to the target object included in the image processing request, and then identify the original image containing the target object from the object description page corresponding to the target object based on the index information.
  • the index information can be linked to the object description page so that the original image containing the target object can be obtained from the object description page.
  • Receiving the user's image processing request may involve first sending image processing prompt information to the user terminal so that the user terminal displays it in the display interface; the image processing request is then sent in response to an image processing operation that the user triggers on the prompt information.
  • the original image may be an image obtained by photographing and collecting the target object, or an image obtained by performing corresponding processing on the image obtained by photographing and collecting the target object.
  • the target object is the main object in the original image, that is, the main content in the image.
  • Three-dimensional modeling can be performed on at least one original image through three-dimensional reconstruction, so that a three-dimensional model corresponding to each original image can be obtained.
  • constructing a three-dimensional model corresponding to each original image may be based on the pixel depth value of the original image and using a three-dimensional reconstruction model to construct a corresponding three-dimensional model.
  • the three-dimensional reconstruction model may be obtained by training based on the pixel depth value of the sample image and the three-dimensional model corresponding to the sample image.
  • the above-mentioned three-dimensional reconstruction of at least one original image can obtain a three-dimensional model of each original image, so at least one three-dimensional model can be obtained.
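  • The patent does not spell out the reconstruction procedure; as a minimal sketch of the depth-based idea, assuming a simple pinhole camera with hypothetical intrinsics fx, fy, cx and cy, each pixel's depth value can be back-projected into a camera-space 3D point:

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project a per-pixel depth map into a camera-space point cloud.
    fx, fy, cx, cy are hypothetical pinhole intrinsics; depth is (H, W)."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    x = (us - cx) / fx * depth
    y = (vs - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy 2x2 depth map with unit depth everywhere.
pts = backproject_depth(np.ones((2, 2)), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

A trained three-dimensional reconstruction model, as described above, would predict or refine such geometry rather than take the depth map as given.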
  • Each three-dimensional model may be transformed according to all of the plurality of transformation parameters, thereby obtaining multiple target images corresponding to each three-dimensional model; alternatively, each three-dimensional model may be transformed according to its own corresponding subset of the transformation parameters to obtain multiple target images.
  • The embodiment of the present application obtains multiple target images by reconstructing the original image through three-dimensional reconstruction and using transformation parameters to adjust the three-dimensional model, so that the target object in the video synthesized from the multiple target images has a dynamic effect, which improves the visual effect of the video and makes it possible to better express the target object.
  • generating a target video may be generating the target video by splicing the multiple target images.
  • the plurality of target images can be spliced and generated according to a certain arrangement order.
  • the arrangement order of the plurality of target images can be determined based on the arrangement order of at least one original image and the arrangement order of a plurality of transformation parameters.
  • Generating a target video based on multiple target images may include: determining material information matching the target object; and generating a target video based on the multiple target images and the material information. Therefore, as another embodiment, as shown in the flow chart of the video generation method in Figure 3, the method may include the following steps:
  • The embodiment of the present application obtains multiple target images by reconstructing the original image through three-dimensional reconstruction and using transformation parameters to adjust the three-dimensional model, so that the target object in the video synthesized from the multiple target images has a dynamic effect; combined with the material information matched to the target object, the final target video is generated, which improves the visual effect of the video and better expresses the target object.
  • the method of generating the target video based on the multiple target images and the material information may be to synthesize the multiple target images with the material information, and generate the target video based on the multiple target images after the synthesis process.
  • the material information may include at least one material image
  • The synthesis process of at least one material image and multiple target images may be to determine the correspondence between the at least one material image and at least one target image, and to synthesize each material image into its corresponding target image.
  • determining the material information matching the target object may be determining at least one material image according to the object category to which the target object belongs.
  • The object category may be, for example, food, clothing, personal care, etc.
  • Determining at least one material image may be done by presetting a correspondence between object categories and material images, and searching that correspondence for the material images matching the object category of the target object.
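  • As an illustration of such a preset correspondence, with hypothetical category and file names (not taken from the patent), the lookup can be as simple as:

```python
# Hypothetical preset correspondence between object categories and
# material images; the names are illustrative only.
MATERIAL_BY_CATEGORY = {
    "hot food": ["steam_01.png", "steam_02.png"],
    "clothing": ["sparkle_01.png"],
}

def material_images_for(category):
    # Unknown categories simply yield no material images.
    return MATERIAL_BY_CATEGORY.get(category, [])
```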
  • one target image can correspond to at least one material image.
  • at least one material image and at least one target image can correspond to one-to-one, that is, one material image only corresponds to one target image.
  • Multiple material images corresponding to the object category and matching the number of images can be determined by combining the object category and the number of images of the multiple target images.
  • The plurality of material images may take the form of a material video whose corresponding object category is preset; the plurality of material images are then the image frames contained in the material video.
  • Matching the number of images between the multiple material images and the multiple target images may mean that the numbers are the same, or that the difference is within a specified range, etc.
  • the multiple material images have a one-to-one correspondence with the multiple target images.
  • The one-to-one correspondence may be determined by arrangement order; for example, a material image corresponds to the target image at the same arrangement position.
  • If the number of material images is smaller than the number of target images, at least one target image can first be screened from the multiple target images according to the number of material images, for example selected in sequence starting from the first image in arrangement order; if it is greater, multiple material images can be screened from the at least one material image according to the number of target images in the same way.
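  • The screening rule above ("select in sequence starting from the first image") amounts to truncating the longer list and pairing by position; a sketch with placeholder image names:

```python
def align_by_count(material_images, target_images):
    """Pair material and target images one-to-one by arrangement order.

    When the counts differ, the longer list is truncated from the front,
    mirroring the select-in-sequence-from-the-first-image rule."""
    n = min(len(material_images), len(target_images))
    return list(zip(material_images[:n], target_images[:n]))

pairs = align_by_count(["m1", "m2", "m3"], ["t1", "t2"])
```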
  • The material images corresponding to different object identifiers may be preset, so that at least one corresponding material image may be determined based on the object identifier; alternatively, the object features of the target object may be identified, and at least one material image that meets the matching requirement may be determined based on the matching degree between the image features of the material images and the object features; this matching degree can be calculated using a pre-trained matching model.
  • synthesizing any material image into its corresponding target image can be: according to any material image and its corresponding target image, based on the object position of the target object in the target image, determining the synthesis area; according to the synthesis area, adjusting the image size and synthesis direction of the material image; according to the synthesis direction, synthesizing the adjusted material image into the synthesis area.
  • The compositing methods for compositing any material image into its corresponding target image may include one or more of: filter (screen), overlay, soft light, hard light, bright light (vivid light), solid color mixing, opacity, multiply, color burn, and color dodge; the compositing method corresponding to each material image can be preset.
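  • As an illustration, here are per-channel formulas for three of the listed blend modes on values normalized to [0, 1]. The mode rendered as "filter" in the translation most likely corresponds to screen, and the soft-light formula shown is one of several common variants:

```python
def blend_multiply(base, overlay):
    # Multiply: darkens; inputs and output are in [0, 1].
    return base * overlay

def blend_screen(base, overlay):
    # Screen (likely the mode translated as "filter"): lightens.
    return 1.0 - (1.0 - base) * (1.0 - overlay)

def blend_soft_light(base, overlay):
    # One common soft-light variant; other definitions exist.
    return (1.0 - 2.0 * overlay) * base * base + 2.0 * overlay * base
```

Applied per channel over the synthesis area; note that with overlay = 0.5, soft light leaves the base color unchanged.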
  • the material information may include copy information
  • the synthesis process of the copy information and multiple target images may be to synthesize the copy information into at least one target image among the multiple target images.
  • the copy information may be generated based on the object-related information of the target object.
  • the object-related information of the target object may include, for example, one or more of object description information, object evaluation information, object category information, and object images.
  • the object image may refer to, for example, the above-mentioned original image or other images containing the target object.
  • the object description information may refer to relevant information in the object description page, such as relevant information including object name, object price, object origin, etc.
  • the object evaluation information may be comments made by the user on the target object, etc.
  • the copy information can be determined based on the object-related information of the target object using a copy generation model.
  • the copywriting generation model can be trained based on the object-related information of the sample object and its corresponding sample copywriting information.
  • the copy information may be synthesized into at least one target image among the plurality of target images.
  • the copy information may be superimposed on the target image.
  • The at least one target image to be synthesized with the copy information may be selected by taking a certain number of target images starting from the first image in the arrangement order of the multiple target images, and synthesizing the copy information into the selected images.
  • The material information may include both at least one material image and copy information. When generating the target video based on the multiple target images and the material information, the synthesis processing can composite the at least one material image first and then the copy information, or it can composite the copy information first; there is no limit to this.
  • the material information may include audio data matching the target object.
  • generating a target video may include: splicing multiple target images to generate a candidate video, and fusing the audio data with the candidate video to obtain the target video.
  • the material information can also include at least one material image, copy information and audio data.
  • The at least one material image and the copy information can first be synthesized with the multiple target images; the synthesized target images are then spliced to generate a candidate video, and the audio data is fused with the candidate video to obtain the final target video.
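  • One way to implement the splice-then-mux pipeline is with ffmpeg; the sketch below only builds the two command lines (file names and frame rate are placeholders, not values from the patent):

```python
def ffmpeg_commands(frame_pattern, audio_path, out_path, fps=25):
    """Build two ffmpeg invocations: splice the synthesized frames into a
    candidate video, then mux the matched audio track into it.
    The flags are standard ffmpeg; the file names are placeholders."""
    candidate = "candidate.mp4"
    splice = ["ffmpeg", "-framerate", str(fps), "-i", frame_pattern, candidate]
    mux = ["ffmpeg", "-i", candidate, "-i", audio_path,
           "-c:v", "copy", "-c:a", "aac", "-shortest", out_path]
    return splice, mux

splice, mux = ffmpeg_commands("frame_%03d.png", "bgm.mp3", "target.mp4")
```

`-shortest` trims the output to the shorter of the two streams, which keeps the audio from running past the spliced frames.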
  • FIG. 4 shows a schematic display diagram of synthesizing material images and copy information into at least one target image.
  • As shown in Figure 4, for the original image 401, after constructing its corresponding three-dimensional model, multiple transformation parameters are used to transform the three-dimensional model into multiple target images.
  • two target images are taken as an example. They are target images 402 and 403.
  • the target object is a bowl of porridge, the object category it belongs to is hot food, and the material image corresponding to this category is a steaming material image.
  • Two material images are taken as an example: material image 404 and material image 405, together with the copy information 406, "steaming".
  • the target image 402 is synthesized with the material image 404 and the copy information "Steaming" to obtain the target image 407.
  • the target image 403 is synthesized with the material image 405 and the copy information "Steaming" to obtain the target image 408.
  • the target image 407 and the target image 408 can be spliced in a certain sorting order to generate a target video. The specific method of determining the order of the multiple target images can be found in the foregoing description, and will not be repeated here.
  • Using multiple transformation parameters to transform at least one three-dimensional model into multiple target images may involve determining multiple transformation parameters corresponding to at least one camera movement effect, and using those transformation parameters to transform the at least one three-dimensional model into multiple target images.
  • the multiple transformation parameters may be used to perform model adjustment and projection transformation on at least one three-dimensional model to generate multiple target images.
  • Each transformation parameter can be composed of a transformation matrix and a projection matrix.
  • The transformation matrix can include a rotation matrix, and may also include one or more of a translation matrix and a scaling matrix; the projection matrix is used to project the three-dimensional model into a two-dimensional picture to obtain the target image.
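  • A minimal sketch of one such transformation parameter, assuming a rotation about the y axis as the transformation matrix, a translation, and a bare pinhole projection as the projection step (all values illustrative):

```python
import numpy as np

def rotation_y(theta):
    """Rotation matrix about the y axis (one possible transformation matrix)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def transform_and_project(points, R, t, f=1.0):
    """Apply one transformation parameter to model points: model adjustment
    (rotation R, translation t) followed by a pinhole projection with
    focal length f."""
    cam = points @ R.T + t               # transformation matrix step
    return f * cam[:, :2] / cam[:, 2:3]  # projection matrix step

model = np.array([[1.0, 0.0, 0.0]])
frame = transform_and_project(model, rotation_y(0.0), np.array([0.0, 0.0, 2.0]))
```

Varying theta (or t) across the plurality of transformation parameters yields one target image per parameter.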
  • Each camera movement effect can correspond to multiple transformation parameters, and the display effect of the multiple target images generated by those transformation parameters expresses the camera movement effect.
  • The at least one camera movement effect may include, for example: translation or rotation in any direction (forward, backward, left, right, up or down), Hitchcock zoom (dolly zoom), etc.
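  • The Hitchcock zoom (dolly zoom) can be parameterized per frame by pulling the camera back while increasing the focal length so that the subject's projected size, proportional to f / d, stays constant; a sketch with hypothetical starting values:

```python
def dolly_zoom_params(d0, f0, steps, step_back):
    """Per-frame (camera distance, focal length) pairs for a dolly zoom:
    the camera pulls back while the focal length grows, keeping the
    subject's projected size (proportional to f / d) constant."""
    params = []
    for i in range(steps):
        d = d0 + i * step_back
        params.append((d, f0 * d / d0))
    return params

params = dolly_zoom_params(d0=2.0, f0=1.0, steps=3, step_back=1.0)
```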
  • Using multiple transformation parameters to transform at least one three-dimensional model into multiple target images may involve determining a virtual camera corresponding to each of the at least one three-dimensional model, and using these virtual cameras to project the at least one three-dimensional model into multiple target images based on the multiple transformation parameters.
  • each three-dimensional model can be projected into multiple target images.
  • the three-dimensional model can be projected into multiple target images by transforming the internal and external parameters of the virtual camera and adjusting the camera's position, angle, focal length, aperture, etc.
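  • A sketch of the virtual-camera idea: with fixed intrinsics K and varying extrinsics (here, translating the camera back along its optical axis between frames), the same model point projects to a different image position in each target image. The numbers are illustrative only:

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X with a virtual camera: extrinsics (R, t) place
    the model in camera space, intrinsics K map camera space to pixels."""
    cam = (R @ X.T).T + t
    uvw = (K @ cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]

K = np.array([[100.0, 0.0, 64.0], [0.0, 100.0, 64.0], [0.0, 0.0, 1.0]])
point = np.array([[0.1, 0.0, 1.0]])
# Pull the camera back between frames; the point drifts toward the center.
frames = [project(K, np.eye(3), np.array([0.0, 0.0, z]), point)
          for z in (1.0, 2.0, 3.0)]
```

Adjusting focal length or aperture would correspondingly modify K or the rendering model, as the text describes.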
  • the layers are like films containing elements such as text or graphics. They are stacked together in order to form the final effect of the image.
  • During transformation, the occlusion relationship between adjacent layers may change, exposing regions that lack color information.
  • Using multiple transformation parameters to transform the at least one three-dimensional model into multiple target images may therefore also include: determining the multiple layers contained in the original image; for each pair of adjacent layers, determining the junction area corresponding to one layer on the other layer; using the junction areas corresponding to the original image as the junction areas corresponding to each of the multiple target images; and filling any junction area in any target image with a target color.
  • The junction area may refer to the position area on one layer that is occluded by another layer. During the transformation of the three-dimensional model, the junction area may become exposed and no longer occluded, so it needs to be filled with color to ensure the display effect.
  • The junction area corresponding to each layer is the junction area corresponding to the original image. Since the layers do not change in the multiple target images generated from the original image, the junction areas corresponding to the original image are also the junction areas corresponding to each target image.
  • The entire junction area may be filled with the target color, and the target color may be a preset color.
  • The target color may also be determined based on the colors of the layer where the junction area is located; for example, it may be the fused value of that layer's pixel colors, or the color accounting for the largest proportion of those pixel colors. It can also be determined based on the pixel colors of the area surrounding the junction area.
  • Filling any junction area in any target image with the target color can involve determining, based on the transformation parameters corresponding to the target image, the target area within the junction area that meets the filling requirement, determining the target color based on the pixel colors of the area surrounding the target area, and filling the target area with the target color.
  • the filling requirement may refer to an unoccluded area, and may be determined based on the transformation parameters corresponding to the target object, etc.
  • the target color can be determined based on the pixel color of the surrounding area corresponding to the target area using a color filling model.
  • the color filling model can be trained in advance based on the pixel color of the sample area and its corresponding sample target color.
  • the surrounding area may also refer to an area within a certain distance from the target area, or an area composed of several pixels around the target area, etc.
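  • One simple heuristic consistent with this description is to take the fill color as the mean of the pixel colors in a ring around the target area; a trained color-filling model, as mentioned above, could replace it. The sparse-grid representation below is purely illustrative:

```python
def fill_color_from_surroundings(image, region, radius=1):
    """Pick a fill color for an exposed target area as the mean of the
    pixel colors within `radius` of it, excluding the area itself.

    image: dict mapping (x, y) -> grayscale value; region: set of (x, y).
    """
    ring = set()
    for (x, y) in region:
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                p = (x + dx, y + dy)
                if p not in region and p in image:
                    ring.add(p)
    vals = [image[p] for p in ring]
    return sum(vals) / len(vals) if vals else 0

patch = {(0, 0): 10, (1, 0): 20, (0, 1): 30, (1, 1): 40}
fill = fill_color_from_surroundings(patch, {(0, 0)})
```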
  • Every junction area in every target image can be filled with color; alternatively, color filling can be performed only on the target areas within the junction areas that actually require filling.
  • generating a target video based on multiple target images and material information may include: synthesizing the material information and multiple target images; and generating a target video based on the multiple target images after the synthesis process.
  • Generating the target video may involve determining the arrangement order of the multiple target images according to the arrangement order of the at least one original image and the arrangement order of the multiple transformation parameters, and then splicing the multiple target images in that order to generate the target video.
  • The arrangement order of the plurality of transformation parameters may be determined based on the arrangement order of the at least one camera movement effect.
  • the arrangement order of at least one original image and the arrangement order of at least one lens movement effect can be preset or determined according to user needs.
  • the image processing request may include the arrangement order of the at least one original image, the arrangement order of the at least one lens movement effect, etc.
  • for example, suppose original image A yields target images A1 and A2 under effect A and target images A3 and A4 under effect B (ordered by generation time), and original image B yields target image B1 under effect A and target image B2 under effect B; the final order of the multiple target images may then be: target image A1, target image A2, target image A3, target image A4, target image B1, target image B2.
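The ordering rule in this example (original-image order first, then lens-movement-effect order, then generation time) can be sketched as follows; the function name and the dictionary layout are illustrative assumptions:

```python
def order_target_images(originals, effects, frames):
    """frames[(orig, effect)] holds the target images generated for one
    original image under one lens-movement effect, in generation order.
    Returns all target images in final splicing order."""
    ordered = []
    for orig in originals:       # original-image arrangement order first
        for eff in effects:      # then lens-movement-effect order
            ordered.extend(frames.get((orig, eff), []))
    return ordered
```

Splicing the returned list in order then yields the target video's frame sequence.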
  • the method can also include:
  • the target video is sent to the user terminal for the user terminal to output the target video.
  • the download prompt information may be sent to the user terminal so that the user terminal outputs the target video and the download prompt information at the same time.
  • the user terminal may also save the target video to a local file corresponding to the user terminal.
  • the update prompt information may also be sent to the user terminal, so that the user terminal outputs the update prompt information while outputting the target video.
  • the client can also send an update request to the server, and the server can update the object description page based on the update request and using the target video.
  • the publishing prompt information can also be sent to the client, so that the client can output the publishing prompt information while playing the target video.
  • the client can also send a publishing request to the server.
  • the server may publish the target video to the object promotion page based on the publishing request, where a link relationship may be established between the target video and the object description page; when a trigger operation on the target video is detected on the object promotion page, the page jumps to the object description page based on the link relationship, so that the user can perform interactive operations on the target object, etc.
  • the method can also include:
  • after the server generates the target video, it may directly use the target video to update the object description page; alternatively, it may send update prompt information to the client, and after the user confirms the update, the client sends an update request and the server then updates the object description page with the target video. Updating the object description page with the target video may include replacing an existing video in the object description page with the target video, or adding the target video to the object description page, etc.
  • the method may also include: establishing a link relationship between the target video and the object description page; and publishing the target video to the object promotion page, so that when a trigger operation on the target video is detected on the object promotion page, the page jumps to the object description page based on the link relationship.
  • the server may also first send publishing prompt information to the user, and after receiving the publishing request, establish the link relationship between the target video and the object description page and publish the target video to the object promotion page, etc.
  • the embodiment of the present application can be applied in an online transaction scenario.
  • the target object may refer to the target commodity provided by the online transaction system.
  • the following takes the target object as the target commodity as an example.
  • the technical solution of this application is explained with this example; Figure 5 is a flow chart of another embodiment of the video generation method provided by the embodiments of the present application.
  • the technical solution of this embodiment can be executed by the server.
  • the method can include the following steps:
  • the user's image processing request may be received and the product image included in the image processing request may be obtained.
  • alternatively, the image processing request may be received from the user, the index information corresponding to the target product contained in the request may be determined, and the product image may then be identified from the product description page corresponding to the target product based on the index information.
  • receiving the user's image processing request may include sending image processing prompt information to the user terminal so that the user terminal displays it in the display interface; the image processing request is sent in response to an image processing operation triggered by the user against the image processing prompt information.
  • Constructing a three-dimensional model corresponding to the product image may be based on the pixel depth value of the product image and using a three-dimensional reconstruction model to construct a corresponding three-dimensional model.
  • the three-dimensional reconstruction model is obtained by training based on the pixel depth value of the sample image and the three-dimensional model corresponding to the sample image.
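As a hedged illustration of turning per-pixel depth values into geometry (the patent uses a trained three-dimensional reconstruction model; a pinhole back-projection is shown here only as a simple stand-in, with all names assumed):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a per-pixel depth map into a 3-D point cloud using
    a pinhole camera with focal lengths (fx, fy) and principal point
    (cx, cy)."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]        # pixel row/column coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)
```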
  • in the embodiment of the present application, the product image is reconstructed in three dimensions and the three-dimensional model is adjusted using transformation parameters to obtain multiple target images, so that the target object in the video synthesized from the multiple target images has a dynamic effect; combining material information matching the target object to generate the final target video improves the visual effect of the video and allows the target object to be better expressed.
  • the method may further include:
  • the download prompt information can also be sent to the client, so that the client can output the download prompt information while playing the target video.
  • the client may also save the target video to a local file corresponding to the client;
  • the update prompt information can also be sent to the client, so that the client can output the update prompt information while playing the target video.
  • the user terminal can also send an update request to the server.
  • the server can update the object description page based on the update request and use the target video.
  • the publishing prompt information can also be sent to the client, so that the client can output the publishing prompt information while playing the target video.
  • the client can also send a publishing request to the server.
  • the server may publish the target video to the product promotion page based on the publishing request, where a link relationship may be established between the target video and the product details page; when a trigger operation on the target video is detected on the product promotion page, the page jumps to the product details page based on the link relationship, making it convenient for users to purchase the product, etc.
  • the product promotion page may be a page that aggregates multiple products and is used for product introduction and promotion; good video quality can attract users to click on the target video, thereby helping to increase the product purchase rate and conversion rate.
  • the method may also include: updating the product details page with the target video.
  • the server may update the product details page with the target video directly, or may first send an update prompt message to the user.
  • the method may also include: establishing a link relationship between the target video and the product details page; and publishing the target video to the product promotion page, so that when a trigger operation on the target video is detected on the product promotion page, the page jumps to the product details page based on the link relationship.
  • the server can also first send publishing prompt information to the user, and then after receiving the publishing request, establish a link relationship between the target video and the product details page, and publish the target video to the product promotion page.
  • Figure 6 is a flow chart of an embodiment of an information display method provided by an embodiment of the present application.
  • the technical solution of this embodiment can be executed by the user end.
  • the method can include the following steps:
  • 602 Display image processing prompt information on the display interface.
  • an image processing request is sent to the server, where the image processing request is used by the server to determine at least one original image containing the target object, construct a three-dimensional model corresponding to the at least one original image, transform the at least one three-dimensional model into multiple target images using multiple transformation parameters, determine the material information matching the target object, and generate a target video based on the multiple target images and the material information.
  • the server determines at least one original image containing the target object according to the image processing request, and then generates the target video.
  • the video generation method is described in detail in the embodiment shown in Figure 3; the same or corresponding steps will not be repeated here.
  • the target video generated by the server is sent to the user end, so that the user end can play the video image of the target video in the display interface, and when the target video contains audio data, the audio data can be played in combination with the audio playback component.
  • the method may further include: saving the target video to a local file corresponding to the user in response to the user's download operation.
  • the download prompt information sent by the server can be obtained, and the download prompt information can be displayed on the display interface.
  • the download operation can be triggered in response to the download prompt information.
  • after playing the target video on the display interface, the method can also include:
  • an update request is sent to the server, and the server can update the object description page based on the update request using the target video.
  • the update prompt information sent by the server may be obtained and displayed on the display interface.
  • the update operation may be initiated in response to the update prompt information.
  • after playing the target video on the display interface, the method can also include:
  • a publishing request is sent to the server, and the server can use the target video to publish to the object promotion page based on the publishing request.
  • the publishing prompt information sent by the server can be obtained, and the publishing prompt information can be displayed on the display interface.
  • the publishing operation can be initiated based on the publishing prompt information.
  • Figure 7 is a schematic structural diagram of an embodiment of a video generation device provided by an embodiment of the present application.
  • the device may include:
  • the first acquisition module 701 is used to acquire at least one original image containing a target object
  • the first three-dimensional construction module 702 used to construct a three-dimensional model corresponding to at least one original image
  • First projection module 703 used to transform at least one three-dimensional model into multiple target images using multiple transformation parameters
  • the first material confirmation module 704 used to determine material information matching the target object
  • the first video generation module 705 is used to generate a target video based on multiple target images and material information.
  • the first acquisition module acquires at least one original image containing the target object, which may be by receiving an image processing request from a user and acquiring at least one original image included in the image processing request.
  • alternatively, the user's image processing request may be received, the index information corresponding to the target object contained in the request may be determined, and the original image containing the target object may then be identified from the object description page corresponding to the target object based on the index information.
  • the first projection module transforms at least one three-dimensional model into multiple target images by determining the multiple transformation parameters corresponding to at least one lens movement effect and using those transformation parameters to transform the at least one three-dimensional model into the multiple target images.
  • a virtual camera corresponding to at least one three-dimensional model may be determined, and the virtual camera is used to project the at least one three-dimensional model into multiple target images according to multiple transformation parameters corresponding to the virtual camera.
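A minimal sketch of projecting a model through a virtual camera under multiple transformation parameters, here taken to be extrinsic poses; the pinhole model, default intrinsics, and all names are assumptions rather than the patent's implementation:

```python
import numpy as np

def render_views(points, poses, fx=500.0, fy=500.0, cx=160.0, cy=120.0):
    """Project one 3-D point set through a virtual pinhole camera under a
    series of (R, t) poses; each pose is one transformation-parameter set
    and yields the pixel coordinates for one target image."""
    views = []
    for R, t in poses:
        cam = points @ R.T + t                 # world -> camera frame
        z = cam[:, 2:3]
        uv = np.hstack([fx * cam[:, 0:1] / z + cx,
                        fy * cam[:, 1:2] / z + cy])
        views.append(uv)
    return views
```

Varying the poses smoothly over the series is what produces the lens-movement (camera-move) effect across the resulting target images.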
  • the first video generation module may generate a target video based on multiple target images and material information by synthesizing multiple target images and the material information, and generating a target video based on the multiple target images after synthesis.
  • the material information may include at least one material image, and synthesizing the at least one material image with the multiple target images may involve determining a one-to-one correspondence between the at least one material image and at least one target image, and synthesizing each material image into its corresponding target image.
  • the first material confirmation module may determine the at least one material image based on the object category to which the target object in the at least one target image belongs.
  • synthesizing any material image into its corresponding target image may include: determining a synthesis area based on the object position of the target object in the target image; adjusting the image size and synthesis direction of the material image according to the synthesis area; and synthesizing the adjusted material image into the synthesis area according to the synthesis direction.
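Pasting a material image into a synthesis area of a target image can be sketched as below; the `(top, left)` region convention and the `alpha` parameter are illustrative assumptions, and the size/direction adjustment the text describes is assumed to have happened already:

```python
import numpy as np

def composite(target, material, region, alpha=1.0):
    """Blend a material image into the synthesis area of a target image.
    `region` is (top, left); the material is assumed already resized to
    the synthesis area."""
    out = target.astype(float).copy()
    top, left = region
    h, w = material.shape[:2]
    patch = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = alpha * material + (1 - alpha) * patch
    return out
```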
  • the material information may further include text information, and synthesizing the text information with the multiple target images may be synthesizing the text information into at least one target image among the multiple target images.
  • the first material confirmation module determines the copy information corresponding to the target object.
  • the copy information may be generated based on the object-related information of the target object, such as object description information, evaluation information, category information, etc.
  • the first video generation module may generate the target video by determining the arrangement order of the multiple target images according to the arrangement order of the at least one original image and the order of the transformation parameters corresponding to the at least one three-dimensional model, and splicing the multiple target images in that order to generate the target video.
  • after the target video is generated based on the multiple target images, it may also be sent to the user terminal for the user terminal to output the target video.
  • the video generation device shown in Figure 7 can be applied to an e-commerce scenario.
  • the first acquisition module can specifically acquire the product image of the target product
  • the first three-dimensional building module can specifically construct a three-dimensional model corresponding to the product image.
  • the video generation device shown in Figure 7 can execute the video generation method described in the embodiment shown in Figure 3, and its implementation principles and technical effects will not be described again.
  • the specific manner in which each module and unit of the information processing device in the above embodiment performs operations has been described in detail in the embodiment of the method, and will not be described in detail here.
  • Figure 8 is a schematic structural diagram of an embodiment of an information display device provided by an embodiment of the present application.
  • the device may include:
  • the first output module 801 is used to provide a display interface
  • the first display module 802 is used to display image processing prompt information on the display interface
  • the first request processing module 803 is used to send an image processing request to the server in response to the image processing operation triggered by the image processing prompt information.
  • the image processing request is used by the server to determine the original image containing the target object;
  • the second display module 804 is used to play the target video on the display interface.
  • the information display device shown in Figure 8 can be implemented as the user terminal described above and can execute the information display method of the embodiment shown in Figure 6; its implementation principles and technical effects will not be described in detail. The specific manner in which each module and unit of the information display device performs operations has been described in detail in the method embodiments and will not be repeated here.
  • An embodiment of the present application also provides a computing device, as shown in Figure 9.
  • the computing device may include a storage component 901 and a processing component 902;
  • the storage component 901 stores one or more computer instructions, which are called by the processing component 902 to implement the video generation method described in the embodiment shown in Figure 2, Figure 3, or Figure 5.
  • the device may also include other components, such as input/output interfaces, display components, communication components, etc.
  • the computing device may also include a display component to perform corresponding display operations.
  • the input/output interface provides an interface between the processing component and the peripheral interface module.
  • the above-mentioned peripheral interface module can be an output device, an input device, etc.
  • the communication component is configured to facilitate wired or wireless communication, etc., between the computing device and other devices.
  • the processing component 902 may include one or more processors to execute computer instructions to complete all or part of the steps in the above method.
  • the processing component can also be implemented as one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for executing the above method.
  • the storage component 901 is configured to store various types of data to support operations at the terminal.
  • the storage component can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
  • the display component may be an electroluminescent (EL) element, a liquid crystal display or a microdisplay with a similar structure, or a retinal direct display or similar laser scanning display.
  • when the above computing device implements the video generation method shown in Figure 2, Figure 3, or Figure 5, it can be a physical device or an elastic computing host provided by a cloud computing platform; it can be implemented as a distributed cluster composed of multiple servers or terminal devices, or as a single server or terminal device.
  • when the above computing device implements the information display method shown in Figure 6, it can be specifically implemented as an electronic device.
  • the electronic device refers to a device used by a user that has the computing, Internet access, and communication functions the user needs; for example, it can be a mobile phone, tablet, personal computer, or wearable device.
  • Embodiments of the present application also provide a computer-readable storage medium that stores a computer program.
  • when the computer program is executed by a computer, the video generation method of the embodiment shown in Figure 2, the video generation method of the embodiment shown in Figure 3, the video generation method of the embodiment shown in Figure 5, or the information display method of the embodiment shown in Figure 6 can be implemented.
  • the computer-readable medium may be included in the electronic device described in the above embodiments; it may also exist separately without being assembled into the electronic device.
  • Embodiments of the present application also provide a computer program product, which includes a computer program carried on a computer-readable storage medium.
  • the computer program can implement the video generation method of the embodiment shown in Figure 2 or Figure 3
  • the computer program may be downloaded and installed from the network, and/or installed from removable media.
  • the computer program is executed by the processor, various functions defined in the system of the present application are performed.
  • the processing components mentioned in the above corresponding embodiments may include, for example, one or more processors to execute computer instructions to complete all or part of the steps in the above method.
  • the processing component can also be implemented as one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for executing the above method.
  • the storage component is configured to store various types of data to support operations on the terminal.
  • the storage component can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
  • the display component may be an electroluminescent (EL) element, a liquid crystal display or a microdisplay with a similar structure, or a retinal direct display or similar laser scanning display.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard drive, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. As used herein, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the device embodiments described above are only illustrative.
  • the units described as separate components may or may not be physically separated.
  • the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment. Persons of ordinary skill in the art can understand and implement the method without creative effort.


Abstract

Embodiments of the present application provide a video generation method, an information display method, and a computing device, in which: at least one original image containing a target object is acquired; a three-dimensional model corresponding to each of the at least one original image is constructed; at least one three-dimensional model is transformed into multiple target images using multiple transformation parameters; material information matching the target object is determined; and a target video is generated based on the multiple target images and the material information. The technical solution provided by the embodiments of the present application improves the visual effect and the quality of the video.

Description

Video Generation Method, Information Display Method, and Computing Device

This application claims priority to Chinese patent application No. 202211160552.X, filed with the Chinese Patent Office on September 22, 2022 and entitled "Video generation method, information display method and computing device", the entire contents of which are incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of computer application technology, and in particular to a video generation method, an information display method, and a computing device.

Background

Compared with still images, video is more vivid and lifelike and offers better visual effects, so it has gradually become one of the main means of promoting, publicizing, or beautifying an object.

Because generating a video by filming an object is costly, in the prior art a video can also be generated, when object images are available, by stitching multiple object images together; however, videos generated in this way have poor visual effects.
Summary

The embodiments of the present application provide a video generation method, an information display method, and a computing device, to solve the technical problem of poor visual effects of videos in the prior art.

In a first aspect, an embodiment of the present application provides a video generation method, including:

acquiring at least one original image containing a target object;

constructing a three-dimensional model corresponding to each of the at least one original image;

transforming at least one three-dimensional model into multiple target images using multiple transformation parameters;

determining material information matching the target object;

generating a target video based on the multiple target images and the material information.
In a second aspect, an embodiment of the present application provides a video generation method, including:

acquiring a product image of a target product;

constructing a three-dimensional model corresponding to the product image;

transforming the three-dimensional model into multiple target images using multiple transformation parameters;

determining material information matching the target product;

generating a target video based on the material information and the multiple target images.
In a third aspect, an embodiment of the present application provides an information display method, including:

providing a display interface;

displaying image processing prompt information on the display interface;

in response to an image processing operation triggered by the image processing prompt information, sending an image processing request to a server, where the image processing request is used by the server to determine at least one original image containing a target object, construct a three-dimensional model corresponding to each of the at least one original image, transform at least one three-dimensional model into multiple target images using multiple transformation parameters, determine material information matching the target object, and generate a target video based on the multiple target images and the material information;

playing the target video on the display interface.
In a fourth aspect, an embodiment of the present application provides a computing device including a processing component and a storage component; the storage component stores one or more computer instructions, which are invoked and executed by the processing component to implement the video generation method of the first aspect, the video generation method of the second aspect, or the information display method of the third aspect.

In a fifth aspect, an embodiment of the present application provides a computer storage medium storing a computer program; when the computer program is executed by a computer, the video generation method of the first aspect, the video generation method of the second aspect, or the information display method of the third aspect is implemented.
In the embodiments of the present application, at least one original image containing a target object is acquired, a three-dimensional model corresponding to each of the at least one original image is constructed, at least one three-dimensional model is transformed into multiple target images using multiple transformation parameters, material information matching the target object is determined, and a target video is generated based on the multiple target images and the material information. By reconstructing the original images in three dimensions and adjusting the three-dimensional models with transformation parameters to obtain multiple target images, the target object in the video synthesized from the multiple target images gains a dynamic effect, and combining the material information matching the target object to generate the final target video improves the visual effect of the video, so that the target object can be better expressed.
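The end-to-end flow summarized in this paragraph can be sketched as a pipeline skeleton. Every callable name below is a placeholder injected as an argument (the application does not define such an API); this only shows the ordering of the stages:

```python
def generate_target_video(originals, params, material,
                          reconstruct, project, composite, encode):
    """Pipeline skeleton: reconstruct a 3-D model per original image,
    project each model under every transformation parameter to get the
    target images, composite matching material into each, then encode."""
    models = [reconstruct(img) for img in originals]
    targets = [project(m, p) for m in models for p in params]
    return encode([composite(t, material) for t in targets])
```

With real implementations substituted for the placeholders, `encode` would write the composited frames out as the target video.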
These and other aspects of the present application will be more readily understood from the description of the embodiments below.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present application or the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

Figure 1 is a schematic structural diagram of an embodiment of an information processing system provided by the present application;

Figure 2 is a flow chart of an embodiment of the video generation method provided by the present application;

Figure 3 is a flow chart of another embodiment of the video generation method provided by the present application;

Figure 4 is a schematic diagram of image synthesis and display in a practical application of an embodiment of the present application;

Figure 5 is a flow chart of yet another embodiment of the video generation method provided by the present application;

Figure 6 is a flow chart of an embodiment of an information display method provided by the present application;

Figure 7 is a schematic structural diagram of an embodiment of a video generation device provided by the present application;

Figure 8 is a schematic structural diagram of an embodiment of an information display device provided by the present application;

Figure 9 is a schematic structural diagram of an embodiment of a computing device provided by the present application.
Detailed Description

To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings.

Some of the flows described in the specification, claims, and drawings of the present application contain multiple operations that appear in a specific order, but it should be clearly understood that these operations may be executed out of the order in which they appear herein or in parallel. Operation numbers such as 101 and 102 are only used to distinguish different operations; the numbers themselves do not represent any execution order. In addition, these flows may include more or fewer operations, and the operations may be executed in order or in parallel. It should be noted that descriptions such as "first" and "second" herein are used to distinguish different messages, devices, modules, etc.; they do not represent a sequence, nor do they limit "first" and "second" to different types.

The technical solutions of the embodiments of the present application can be applied to scenarios in which images provided by merchants, enterprise users, individual users, or design providers are processed to generate videos. Because video is more vivid and lifelike than images and has better visual effects, it can serve purposes such as publicizing, promoting, or beautifying an object. The object involved in the embodiments of the present application may be a person, an animal, a physical item, etc. The object may also be a virtual product provided in an online system for users to interact with, for example to purchase or browse; the virtual product may correspond to an offline physical item. When the online system is an online transaction system, the object may specifically be a commodity; online systems often use images or videos to describe objects so as to promote them better to users.

To reduce the production cost of generating videos by filming, a video is currently usually generated by stitching together multiple original images containing the target object, but videos generated in this way are not vivid and natural enough, have poor visual effects, and cannot effectively express the characteristics of the target object. To improve visual effects, obtain high-quality videos, and reduce video production costs, the inventors, after a series of studies, proposed the technical solutions of the embodiments of the present application: the original images can be modeled, and multiple target images can be generated through multiple transformation parameters so as to express the target object from different visual angles; adding designed material can then significantly improve the visual effect and fully express the characteristics of the object.

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The technical solutions of the embodiments of the present application can be applied to the information processing system shown in Figure 1, which may be a system with image processing functionality. In practice, the information processing system may be, for example, an online system providing object interaction operations, such as an online transaction system providing commodity transactions, or another processing system connected to an online transaction system so that the generated video can be published on the online transaction system. The information processing system may include a user end 101 and a server 102.

The user end 101 and the server 102 establish a connection over a network, which provides the medium of the communication link between them. The network may include various connection types, such as wired or wireless communication links or optical fiber cables. Optionally, the server may communicate with the user end over a mobile network, whose standard may be any of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), 5G, WiMax, etc. Optionally, the user end may also establish a communication connection with the server via Bluetooth, WiFi, infrared, etc.

The user end 101 may be a browser, an APP (Application), a web application such as an H5 (HyperText Markup Language 5) application, a light application (also called a mini program, a lightweight application), or a cloud application. The user end 101 may be deployed in an electronic device and may need to rely on the device, or on certain apps in the device, to run. The electronic device may, for example, have a display screen and support information browsing, such as a personal mobile terminal like a mobile phone, a tablet, or a personal computer. For ease of understanding, the user end is mainly represented by a device in Figure 1. Various other types of applications, such as search and instant messaging applications, may also be configured in the electronic device.

The server 102 may include one or more servers providing various services; it may be implemented as a distributed server cluster composed of multiple servers or as a single server, and may also be a server of a distributed system, a server combined with a blockchain, a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology.

A user can interact with the server 102 through the user end 101 to receive or send messages. In the application scenarios of the embodiments of the present application, for example, original images containing the target object can be acquired, and corresponding processing requests can be sent in response to the user's operations so that the original images are turned into a target video.

For example, in the embodiments of the present application, the server 102 may obtain an image processing request from the user end 101, construct three-dimensional models for the at least one original image containing the target object in the request, transform at least one three-dimensional model into multiple target images using multiple transformation parameters, determine the material information matching the target object, generate a target video based on the multiple target images and the material information, and send the target video to the user end 101 for the user end 101 to output.

It should be noted that the video generation method provided in the embodiments of the present application is generally executed by the server 102, and the corresponding video generation device is generally disposed in the server 102; the information display method provided in the embodiments of the present application is generally executed by the user end 101, and the corresponding information display device is generally disposed in the user end 101. However, in other embodiments of the present application, the user end 101 may also have functions similar to the server 102 and thus execute the video generation method provided by the embodiments of the present application. In other embodiments, the video generation method may also be executed jointly by the user end 101 and the server 102.

It should be understood that the numbers of user ends and servers in Figure 1 are only illustrative; there may be any number of user ends and servers as required.
The implementation details of the technical solutions of the embodiments of the present application are elaborated below.

Figure 2 is a flow chart of an embodiment of a video generation method provided by an embodiment of the present application. The technical solution of this embodiment can be executed by the server, and the method may include the following steps:

201: Acquire at least one original image containing a target object.

As one option, an image processing request from a user may be received, and the at least one original image included in the request may be acquired. In addition, in an online system such as an online transaction system, the target object may be a commodity provided by the system; the system has a commodity description page for each commodity, which usually introduces the commodity in detail with text and images, meaning the page includes commodity images. Therefore, as another option, the image processing request from the user may be received, the index information corresponding to the target object contained in the request may be determined, and the original image containing the target object may then be identified from the object description page corresponding to the target object based on the index information. The index information may link to the object description page so that the original image containing the target object can be obtained from it.

In the above two options, receiving the user's image processing request may involve sending image processing prompt information to the user end so that the user end displays it in the display interface; the image processing request is sent in response to an image processing operation triggered by the user against the image processing prompt information.

The original image may be an image obtained by photographing the target object, or an image obtained after correspondingly processing such a photograph. The target object is the main subject of the original image, that is, the main content of the image.
202: Construct a three-dimensional model corresponding to each of the at least one original image.

Each of the at least one original image can be modeled three-dimensionally by way of three-dimensional reconstruction, so that a three-dimensional model corresponding to each original image can be obtained.

Optionally, constructing the three-dimensional model corresponding to each original image may specifically use a three-dimensional reconstruction model, based on the pixel depth values of the original image. The three-dimensional reconstruction model may be trained on the pixel depth values of sample images and the three-dimensional models corresponding to those sample images.
203: Transform at least one three-dimensional model into multiple target images using multiple transformation parameters.

Three-dimensional reconstruction is performed on each of the at least one original image as described above, so a three-dimensional model of each original image, and thus at least one three-dimensional model, can be obtained.

Each three-dimensional model may be transformed according to all of the multiple transformation parameters to obtain the multiple target images corresponding to that model; alternatively, each three-dimensional model may be transformed according to its own corresponding subset of the multiple transformation parameters to obtain its corresponding target images.
204: Generate a target video based on the multiple target images.

In the embodiments of the present application, the original images are reconstructed in three dimensions and the three-dimensional models are adjusted using transformation parameters to obtain multiple target images, so that the target object in the video synthesized from the multiple target images has a dynamic effect, which improves the visual effect of the video and allows the target object to be better expressed.

As one option, generating the target video based on the multiple target images may be splicing the multiple target images into the target video.

The multiple target images may be spliced in a certain arrangement order, which may be determined according to the arrangement order of the at least one original image and the arrangement order of the multiple transformation parameters.

As another option, to further improve the visual effect and the video quality, generating the target video based on the multiple target images may include: determining material information matching the target object; and generating the target video based on the multiple target images and the material information. Therefore, as yet another embodiment, as shown in the flow chart of the video generation method in Figure 3, the method may include the following steps:
301:获取包含目标对象的至少一个原始图像。
302:构建至少一个原始图像分别对应的三维模型。
303:利用多个变换参数,将至少一个三维模型变换为多个目标图像。
304:确定目标对象匹配的素材信息。
305:基于多个目标图像以及素材信息,生成目标视频。
本申请实施例通过将原始图像经由三维重建,并利用变换参数对三维模型进行调整之后获得多个目标图像,使得由该多个目标图像合成的视频中目标对象具备了动态效果,且结合目标对象匹配的素材信息生成最终目标视频,提高了视频的视觉效果,使得可以更好地表达目标对象。
需要说明的是,本实施例与图2所示实施例不同之处在于,得到多个目标图像后,确定该目标对象匹配的素材信息,基于多个目标图像以及素材信息,生成目标视频。
其中,基于多个目标图像以及素材信息生成目标视频的方式可以是将多个目标图像与该素材信息进行合成处理,基于合成处理之后的多个目标图像,生成目标视频。
作为一种可选方式,素材信息可以包括至少一种素材图像,将至少一种素材图像与多个目标图像进行合成处理可以是确定至少一个素材图像与至少一个目标图像的对应关系,将任一个素材图像合成至与其对应的目标图像中。
其中,确定目标对象匹配的素材信息可以是根据目标对象所属的对象类目,确定至少一个素材图像。
该对象类目例如可以是食品类、服饰类、洗护类等。
根据目标对象所属的对象类目,确定至少一个素材图像可以是通过预先设置对象类目和素材图像的对应关系,依据对应关系,输入对象类目,查找到与对象类目对应的素材图像。
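上述"预先设置对象类目和素材图像的对应关系,输入对象类目查找素材图像"的过程,可用一个简单的映射表示意如下(Python;类目名与素材文件名均为假设的示例值,实际实施中对应关系可存于配置或数据库):

```python
# 假设性的"对象类目 -> 素材图像"对应表
CATEGORY_MATERIALS = {
    "热食": ["steam_01.png", "steam_02.png"],   # 冒热气素材
    "服饰": ["sparkle_01.png"],
    "洗护": ["bubble_01.png", "bubble_02.png"],
}

def find_materials(category):
    """依据对象类目查找对应的素材图像列表, 未命中时返回空列表。"""
    return CATEGORY_MATERIALS.get(category, [])

materials = find_materials("热食")
```

查找结果即为后续与目标图像进行合成处理的至少一个素材图像。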
其中,一个目标图像可以对应至少一个素材图像,此外,为了方便处理,至少一个素材图像与至少一个目标图像可以一一对应,也即一个素材图像也仅对应一个目标图像。可以结合对象类目以及该多个目标图像的图像数量,确定对象类目所对应的,且与图像数量匹配的多个素材图像。该多个素材图像可以以素材视频形式,对应对象类目预先设定等,该多个素材图像即为素材视频所包含的图像帧。
其中,多个素材图像与多个目标图像的图像数量匹配可以是与多个目标图像的图像数量相同,或者数量差值在指定范围内等。多个素材图像的图像数量与多个目标图像的图像数量相同的情况下,该多个素材图像即与多个目标图像具有一一对应关系,可以根据多个素材图像的排列顺序以及多个目标图像的排列顺序确定一一对应关系,例如排列位置相同的素材图像与目标图像具有一一对应关系等。
在至少一个素材图像的图像数量小于或大于多个目标图像的图像数量的情况下,若小于的情况,可以从多个目标图像中首先按照素材图像的图像数量,筛选至少一个目标图像,例如可以按照排列顺序从首个图像开始依次选择等;若大于的情况下,可以按照多个目标图像的图像数量,从至少一个素材图像中,筛选多个素材图像,例如可以按照排列顺序从首个图像开始依次选择等。
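上述按图像数量对齐素材图像与目标图像的筛选逻辑(数量不等时按排列顺序从首个图像开始依次选择)可示意如下(Python;仅为一种假设性实现):

```python
def align_by_count(materials, targets):
    """将素材图像与目标图像按数量对齐, 返回一一对应的 (素材, 目标) 列表。
    数量不相等时, 较多的一方按排列顺序从首个图像开始截取。"""
    n = min(len(materials), len(targets))
    return list(zip(materials[:n], targets[:n]))

# 素材图像少于目标图像: 按素材数量从目标图像首个开始筛选
pairs = align_by_count(["m1", "m2"], ["t1", "t2", "t3"])
```

对齐后的每一对 (素材图像, 目标图像) 即具备前文所述的一一对应关系。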
当然,也可以预先设定不同对象标识所对应的素材图像,从而可以基于对象标识确定对应的至少一个素材图像;或者,也可以识别目标对象的对象特征,根据素材图像的图像特征与对象特征的匹配程度,确定满足匹配要求的至少一个素材图像;其中,图像特征与对象特征的匹配程度可以基于预先训练生成的匹配模型计算获得等。
其中,将任一个素材图像合成至与其对应的目标图像可以是:根据任一素材图像及其对应的目标图像,基于该目标图像中目标对象的对象位置,确定合成区域;根据该合成区域,调整该素材图像的图像尺寸及合成方向;按照合成方向,将调整之后的素材图像合成至合成区域中。
此外,将任一个素材图像合成至与其对应的目标图像的合成方式可以包括:滤色、叠加、柔光、强光、亮光、实色混合、不透明度、正片叠底、颜色加深以及颜色减淡等中的一种或多种,可以预先设定不同素材图像对应的合成方式等。
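其中"正片叠底""滤色"等合成方式在单个像素通道上的计算,可按图像处理领域的通用定义示意如下(Python,像素值取 0~255;公式为通用混合模式定义,并非本申请特有实现):

```python
def multiply(a, b):
    """正片叠底 (multiply): 结果 = a * b / 255, 合成后整体变暗。"""
    return a * b // 255

def screen(a, b):
    """滤色 (screen): 结果 = 255 - (255-a) * (255-b) / 255, 合成后整体变亮。"""
    return 255 - (255 - a) * (255 - b) // 255

dark = multiply(128, 128)   # 中灰与中灰正片叠底, 变暗
light = screen(128, 128)    # 中灰与中灰滤色, 变亮
```

对目标图像与素材图像的每个像素通道套用所选公式,即完成相应合成方式的处理。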
作为另一种可选方式,素材信息可以包括文案信息,将文案信息与多个目标图像进行合成处理可以是将该文案信息合成至多个目标图像中的至少一个目标图像中。
其中,该文案信息可以是根据目标对象的对象相关信息生成,目标对象的对象相关信息例如可以包括对象描述信息、对象评价信息、对象类目信息、以及对象图像等中一个或多个。其中,对象图像例如可以是指上述原始图像或者包含目标对象的其它图像等。对象描述信息可以是指对象描述页面中的相关信息,例如包括对象名称、对象价格、对象产地等的相关信息。对象评价信息可以是用户针对目标对象所发表的言论等。
该文案信息可以利用文案生成模型基于目标对象的对象相关信息而确定。该文案生成模型可以基于样本对象的对象相关信息及其对应的样本文案信息而训练获得。
将该文案信息合成至多个目标图像中的至少一个目标图像中的合成方式有多种,例如可以具体是将文案信息叠加至目标图像。
其中,与文案信息进行合成处理的多个目标图像中的至少一个目标图像可以是,按照多个目标图像的排列顺序从首个图像开始选择一定数量的目标图像,将文案信息合成至上述所选择的至少一个目标图像中。
作为又一种可选方式,素材信息中可以包括至少一个素材图像以及文案信息。将素材信息与多个目标图像进行合成处理时,可以优先将至少一个素材图像合成,进而合成文案信息;当然也可以优先将文案信息与多个目标图像进行合成处理,对此不进行限定。
此外,作为又一种可选方式,素材信息中可以包括与目标对象匹配的音频数据,基于多个目标图像以及素材信息,生成目标视频可以包括:将多个目标图像拼接生成候选视频,并将音频数据与候选视频进行融合,获得目标视频。
当然,素材信息也可以包括至少一个素材图像、文案信息以及音频数据,可以将至少一个素材图像以及文案信息与多个目标图像进行合成处理,再将合成处理之后的多个目标图像拼接生成候选视频,之后将音频数据与候选视频进行融合,获得最终的目标视频。
为了便于理解,图4示出了将素材图像、文案信息合成至至少一个目标图像的显示示意图。图4中,对原始图像401,构建其对应的三维模型后,利用多个变换参数,将该三维模型按照不同的变换参数,变换为多个目标图像,图中以两个目标图像为例,分别是目标图像402、目标图像403;目标对象为一碗粥,所属的对象类目为热食,与该类目对应的素材图像为冒热气素材图像,图4中以两个素材图像为例,分别是素材图像404、素材图像405,并以文案信息406为"热气腾腾"为例进行说明。将目标图像402与素材图像404以及文案信息"热气腾腾"进行合成处理,获得目标图像407,将目标图像403与素材图像405以及文案信息"热气腾腾"进行合成处理,获得目标图像408,目标图像407以及目标图像408可以按照一定的排列顺序拼接生成目标视频,其中,多个目标图像的排列顺序的具体确定方式可以详见前文所述,此处不再重复赘述。
一些实施例中,利用多个变换参数,将至少一个三维模型变换为多个目标图像可以是通过确定至少一种运镜效果对应的多个变换参数,利用多个变换参数,将至少一个三维模型变换为多个目标图像。
该多个变换参数可以用于对至少一个三维模型进行模型调整以及投影变换,以生成多个目标图像。每个变换参数可以由变换矩阵以及投影矩阵构成,该变换矩阵可以包括旋转矩阵,此外还可以包括平移矩阵以及缩放矩阵中的一个或多个等,投影矩阵用以将三维模型投影为二维画面,以获得目标图像。
每一种运镜效果均可以对应多个变换参数,通过每个运镜效果对应的多个变换参数生成的多个目标图像的显示效果即可以表达该运镜效果。
其中,该至少一种运镜效果例如可以包括:前后左右上下任意方向的平移、旋转,以及希区柯克变焦(Dolly Zoom)等。
由于相机是一个将三维物体投影为二维图像的设备,一些实施例中,利用多个变换参数,将至少一个三维模型变换为多个目标图像可以是确定至少一个三维模型分别对应的虚拟相机;利用至少一个三维模型分别对应的虚拟相机,基于该多个变换参数,将至少一个三维模型投影为多个目标图像。
也即可以通过设定虚拟相机,利用每个三维模型对应的虚拟相机,按照每个三维模型对应的变换参数,将每个三维模型投影为多个目标图像。
例如可以基于变换参数,通过变换虚拟相机的内参、外参,调整相机的位置、角度、焦距、光圈等,可以将三维模型投影为多个目标图像。
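上述"通过变换虚拟相机的内参、外参将三维模型投影为二维图像"的几何过程,可用针孔相机模型示意如下(Python;旋转矩阵 R 与平移向量 t 对应外参即变换矩阵,焦距与主点对应内参即投影矩阵,数值均为假设):

```python
import numpy as np

def project(points, R, t, fx, fy, cx, cy):
    """用外参 (R, t) 将三维点变换到相机坐标系, 再以针孔模型投影到像素平面。
    points: (N, 3) 的三维点; 返回 (N, 2) 的像素坐标。"""
    cam = points @ R.T + t               # 变换矩阵: 世界坐标 -> 相机坐标
    u = fx * cam[:, 0] / cam[:, 2] + cx  # 投影矩阵: 透视除法 + 内参
    v = fy * cam[:, 1] / cam[:, 2] + cy
    return np.stack([u, v], axis=1)

pts = np.array([[0.0, 0.0, 2.0]])        # 位于相机光轴正前方的一个模型点
uv = project(pts, np.eye(3), np.zeros(3), 500.0, 500.0, 320.0, 240.0)
# 光轴上的点恰好投影到主点 (320, 240)
```

令 R、t 按某种运镜效果连续变化(如绕竖直轴小角度旋转、沿光轴平移),每组参数投影一次即得到一个目标图像,连续投影即得到多个目标图像。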
由于原始图像由多个图层构成,其中,图层就像是含有文字或图形等元素的胶片,一张张按顺序叠放在一起,组合起来形成图像最终效果。对三维模型进行变换处理过程中,相邻图层之间的遮挡关系可能发生变化,从而可能出现颜色破损情况,为了提高视频质量,进一步保证视觉效果,一些实施例中,利用多个变换参数,将至少一个三维模型变换为多个目标图像还可以包括:确定原始图像包含的多个图层;确定相邻两个图层中,一个图层在另一个图层上对应的交界区域;将所述原始图像对应的交界区域作为所述多个目标图像分别对应的交界区域;对任一个目标图像中的任一个交界区域填充目标颜色。
该交界区域可以是指一个图层在另一个图层上所对应的位置区域,以在另一个图层上产生遮挡,由于对三维模型进行变换处理,该交界区域可能会被暴露出来而不再被遮挡,因此需要对其进行颜色填充,以保证显示效果。每个图层上对应的交界区域即作为原始图像对应的交界区域,而基于原始图像生成的多个目标图像,图层不会发生变化,因此原始图像对应的交界区域也即为每个目标图像对应的交界区域。
其中,可以是将该交界区域全部填充该目标颜色,该目标颜色可以为预先设定颜色,此外,为了提高视觉效果,该目标颜色可以是基于交界区域所在图层的颜色而确定,例如可以是交界区域所在图层的各个像素颜色的融合值或者各个像素颜色中的占比最大颜色等,此外,也可以是基于交界区域对应的周边区域的像素颜色而确定,此外,作为另一种可选方式,对任一个目标图像中的任一个交界区域填充目标颜色可以是基于任一个目标图像对应的变换参数,确定目标图像中的任一个交界区域中满足填充需求的目标区域,基于目标区域对应的周边区域的像素颜色,确定目标颜色,对目标区域填充该目标颜色。
该填充需求可以是指未被遮挡区域,可以根据该目标对象所对应的变换参数而确定等。
该目标颜色可以利用颜色填充模型基于该目标区域对应的周边区域的像素颜色而确定。颜色填充模型可以预先基于样本区域的像素颜色及其对应的样本目标颜色而训练获得。
该周边区域也可以是指距离目标区域一定距离范围内的区域,或者目标区域周围若干像素点所构成的区域等。
其中,可以对每个目标图像中的每个交界区域均进行颜色填充,当然,也可以是针对存在填充需求的目标区域的交界区域及目标图像进行颜色填充。
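其中"基于周边区域的像素颜色确定目标颜色"的一种最简做法,是取周边像素颜色的均值作为填充色,示意如下(Python;实际实施例中目标颜色可由训练获得的颜色填充模型确定,此处仅为假设性草图):

```python
def neighborhood_mean_color(pixels):
    """pixels: 周边区域像素的 (r, g, b) 颜色列表;
    返回各通道均值颜色, 作为交界区域的填充用目标颜色。"""
    n = len(pixels)
    r = sum(p[0] for p in pixels) // n
    g = sum(p[1] for p in pixels) // n
    b = sum(p[2] for p in pixels) // n
    return (r, g, b)

# 周边为一深一浅两个灰色像素时, 填充色取二者的中间灰
color = neighborhood_mean_color([(100, 100, 100), (200, 200, 200)])
```

对目标区域内的每个像素写入该目标颜色,即完成交界区域的颜色填充。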
一些实施例中,基于多个目标图像以及素材信息,生成目标视频可以包括:将素材信息与多个目标图像进行合成处理;基于合成处理之后的多个目标图像,生成目标视频。
其中,基于合成处理之后的所述多个目标图像,生成目标视频可以是按照至少一个原始图像的排列顺序及多个变换参数的排列顺序,确定多个目标图像的排列顺序;按照多个目标图像的排列顺序,将多个目标图像拼接生成目标视频。
其中,该多个变换参数基于至少一种运镜效果确定的情况下,该多个变换参数的排列顺序可以根据至少一种运镜效果的排列顺序而确定等。至少一个原始图像的排列顺序以及至少一种运镜效果的排列顺序可以预先设定或者根据用户需求而确定等,例如图像处理请求中可以包括至少一个原始图像的排列顺序以及至少一种运镜效果的排列顺序等。
为了便于理解,假设有2个原始图像,2个原始图像的排列顺序为:原始图像A、原始图像B;2个运镜效果的排列顺序为:效果A、效果B。原始图像A按照效果A假设获得2个目标图像,2个目标图像按照生成时间确定的排列顺序为:目标图像A1、目标图像A2;原始图像A按照效果B假设获得2个目标图像,2个目标图像按照生成时间确定的排列顺序为:目标图像A3、目标图像A4;原始图像B按照效果A假设获得目标图像B1,按照效果B假设获得目标图像B2,则最终得到的多个目标图像的排列顺序可以为:目标图像A1、目标图像A2、目标图像A3、目标图像A4、目标图像B1、目标图像B2。当然,上述仅是举例说明的一种可能实现方式,本申请并不限定于此。
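上述先按原始图像顺序、再按运镜效果顺序、最后按帧生成顺序确定多个目标图像排列顺序的逻辑,可示意如下(Python;图像与效果名称均为上文示例中的假设值):

```python
def order_frames(frames_by_image_effect, image_order, effect_order):
    """frames_by_image_effect: {(原始图像, 运镜效果): [目标图像...]} 的字典。
    按 原始图像顺序 -> 运镜效果顺序 -> 帧生成顺序 展平为最终的帧序列。"""
    ordered = []
    for img in image_order:
        for effect in effect_order:
            ordered.extend(frames_by_image_effect.get((img, effect), []))
    return ordered

frames = order_frames(
    {("A", "效果A"): ["A1", "A2"], ("A", "效果B"): ["A3", "A4"],
     ("B", "效果A"): ["B1"], ("B", "效果B"): ["B2"]},
    image_order=["A", "B"], effect_order=["效果A", "效果B"],
)
# 结果顺序与上文示例一致: A1、A2、A3、A4、B1、B2
```

按该顺序将多个目标图像拼接,即可生成运镜效果连贯的目标视频。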
作为一种可选方式,生成目标视频之后,该方法还可以包括:
将该目标视频发送至用户端,以供用户端输出该目标视频。
一些实施例中,还可以将下载提示信息发送至用户端,以供用户端输出目标视频的同时,输出该下载提示信息。用户端响应于针对该下载提示信息触发的下载操作,还可以将目标视频保存至用户端对应的本地文件。
此外,还可以将更新提示信息发送至用户端,以供用户端输出目标视频的同时,输出该更新提示信息。用户端响应于针对该更新提示信息触发的更新操作,还可以向服务端发送更新请求,服务端可以基于该更新请求,利用该目标视频,更新对象描述页面。
此外,还可以将发布提示信息发送至用户端,以供用户端播放目标视频的同时,输出发布提示信息。用户端响应于针对该发布提示信息触发的发布操作,还可以向服务端发送发布请求,服务端可以基于该发布请求,将目标视频发布至对象推广页面,其中,目标视频与对象描述页面可以建立链接关系,以在对象推广页面检测到针对目标视频的触发操作,基于链接关系跳转至所述对象描述页面,以方便用户针对目标对象执行交互操作等。
作为另一种可选方式,生成目标视频之后,该方法还可以包括:
利用该目标视频,更新对象描述页面。
也即服务端生成目标视频之后,可以直接利用该目标视频更新对象描述页面,当然也可以向用户端发送更新提示信息,用户确认更新之后通过用户端发送更新请求,服务端再利用该目标视频,更新对象描述页面。其中,利用目标视频更新对象描述页面,可以是利用目标视频替换对象描述页面中的已有视频或者将目标视频添加至对象描述页面中等。
作为又一种可选方式,生成目标视频之后,该方法还可以包括:建立目标视频与对象描述页面的链接关系;将目标视频发布至对象推广页面,以在对象推广页面检测到针对目标视频的触发操作,基于链接关系跳转至对象描述页面。
此外,服务端也可以首先向用户端发送发布提示信息,再接收到发布请求之后,再执行建立目标视频与对象描述页面的链接关系,并将目标视频发布至对象推广页面等。
本申请实施例在一个实际应用中,可以应用于网上交易场景中。在网上交易场景中,目标对象可以是指网上交易系统所提供的目标商品,下面以目标对象为目标商品为例,对本申请技术方案进行说明。如图5所示,为本申请实施例提供的一种视频生成方法又一个实施例的流程图,本实施例的技术方案可以由服务端执行,该方法可以包括以下几个步骤:
501:获取目标商品的商品图像。
作为一种可选方式,可以是通过接收用户的图像处理请求,获取该图像处理请求中包括的商品图像。
作为另一种可选方式,可以是通过接收用户的图像处理请求,确定该图像处理请求中包含的目标商品对应的索引信息,然后基于索引信息,从目标商品对应的商品描述页面中识别包含目标商品的商品图像。
上述两种可选方式中,接收用户的图像处理请求可以是向用户端发送图像处理提示信息,以供用户端在显示界面中展示该图像处理提示信息,图像处理请求是响应于用户针对该图像处理提示信息触发的图像处理操作而发送的。
502:构建商品图像对应的三维模型。
构建商品图像对应的三维模型可以是基于该商品图像的像素深度值,利用三维重建模型构建对应的三维模型。其中,该三维重建模型是基于样本图像的像素深度值以及样本图像对应的三维模型训练获得。
503:利用多个变换参数,将三维模型变换为多个目标图像。
504:确定目标商品匹配的素材信息。
505:基于素材信息及多个目标图像,生成目标视频。
本申请实施例通过将商品图像经由三维重建,并利用变换参数对三维模型进行调整之后获得多个目标图像,使得由该多个目标图像合成的视频中目标对象具备了动态效果,且结合目标对象匹配的素材信息,生成最终目标视频,提高了视频的视觉效果,使得可以更好地表达目标对象。
需要说明的是,图5所示实施例与图3所示实施例不同之处在于,目标对象具体为目标商品,其它相同或相应步骤可以详见前文图3所示实施例中所述,在此将不再赘述。
作为一种可选方式,生成目标视频之后,该方法还可以包括:
发送目标视频至用户端,以供用户端进行播放。
此外,还可以将下载提示信息发送至用户端,以供用户端播放目标视频的同时,输出该下载提示信息。用户端响应于针对该下载提示信息触发的下载操作,还可以将该目标视频保存至用户端对应的本地文件。
此外,还可以将更新提示信息发送至用户端,以供用户端播放目标视频的同时,输出更新提示信息。用户端响应于针对该更新提示信息触发的更新操作,还可以向服务端发送更新请求,服务端可以基于该更新请求,利用该目标视频,更新对象描述页面。
此外,还可以将发布提示信息发送至用户端,以供用户端播放目标视频的同时,输出发布提示信息。用户端响应于针对该发布提示信息触发的发布操作,还可以向服务端发送发布请求,服务端可以基于该发布请求,将目标视频发布至商品推广页面,其中,目标视频与商品详情页面可以建立链接关系,以在商品推广页面检测到针对目标视频的触发操作,基于链接关系跳转至商品详情页面,以方便用户进行商品购买等。该商品推广页面也即多种商品聚合页面,用以进行商品介绍和推广等,良好的视频质量可以吸引用户对目标视频的点击,从而有助于提升商品购买率,提升商品转化率。
作为另一种可选方式,生成目标视频之后,该方法还可以包括:利用目标视频更新商品详情页面。
也即服务端生成目标视频之后,利用目标视频更新商品详情页面,当然也可以向用户端发送更新提示信息,接收到更新请求之后,再利用目标视频更新商品详情页面。
作为又一种可选方式,生成目标视频之后,该方法还可以包括:建立目标视频与商品详情页面的链接关系;将目标视频发布至商品推广页面,以在商品推广页面检测到针对目标视频的触发操作,基于链接关系跳转至商品详情页面。
此外,服务端也可以首先向用户端发送发布提示信息,再接收到发布请求之后,再执行建立目标视频与商品详情页面的链接关系,将目标视频发布至商品推广页面。
图6为本申请实施例提供的一种信息显示方法一个实施例的流程图,本实施例的技术方案可以由用户端执行,该方法可以包括以下几个步骤:
601:提供显示界面。
602:在显示界面显示图像处理提示信息。
603:响应于针对图像处理提示信息触发的图像处理操作,向服务端发送图像处理请求,图像处理请求用于服务端确定包含目标对象的至少一个原始图像,构建至少一个原始图像分别对应的三维模型,利用多个变换参数,将至少一个三维模型变换为多个目标图像,确定目标对象匹配的素材信息,基于多个目标图像以及素材信息,生成目标视频。
其中,服务端根据图像处理请求,确定包含目标对象的至少一个原始图像,继而生成目标视频的视频生成方法可以详见前文图3所示实施例中所述,相同或相应步骤在此不再赘述。
604:在显示界面播放目标视频。
服务端生成的目标视频发送至用户端,从而用户端可以在显示界面中播放目标视频的视频画面,并在目标视频包含音频数据的情况,可以结合音频播放组件播放音频数据。
作为一种可选方式,在显示界面播放目标视频之后,该方法还可以包括:响应于用户下载操作,将该目标视频保存至用户端对应的本地文件。
其中,可以获取服务端发送的下载提示信息,并在显示界面显示该下载提示信息,该下载操作可以是针对该下载提示信息而触发。
作为另一种可选方式,在显示界面播放目标视频之后,该方法还可以包括:
响应于用户更新操作,向服务端发送更新请求,服务端可以基于更新请求,利用该目标视频,更新对象描述页面。
可选地,可以获取服务端发送的更新提示信息,并在显示界面显示该更新提示信息,该更新操作可以是针对该更新提示信息而触发。
作为又一种可选方式,在显示界面播放目标视频之后,该方法还可以包括:
响应于用户发布操作,向服务端发送发布请求,服务端可以基于发布请求,将该目标视频发布至对象推广页面。
可选地,可以获取服务端发送的发布提示信息,并在显示界面显示该发布提示信息,该发布操作可以是针对该发布提示信息而触发。
其中,服务端的具体执行操作可以详见前文相应实施例中所述,此处不再赘述。
图7为本申请实施例提供的一种视频生成装置一个实施例的结构示意图,该装置可以包括:
第一获取模块701:用于获取包含目标对象的至少一个原始图像;
第一三维构建模块702:用于构建至少一个原始图像分别对应的三维模型;
第一投影模块703:用于利用多个变换参数,将至少一个三维模型变换为多个目标图像;
第一素材确认模块704:用于确定目标对象匹配的素材信息;
第一视频生成模块705:用于基于多个目标图像以及素材信息,生成目标视频。
第一获取模块获取包含目标对象的至少一个原始图像,可以是通过接收用户的图像处理请求,获取该图像处理请求中包括的至少一个原始图像。
作为另一种可选方式,可以是通过接收用户的图像处理请求,确定该图像处理请求中包含的目标对象对应的索引信息,然后基于索引信息,从目标对象对应的对象描述页面中识别包含目标对象的原始图像。
第一投影模块利用多个变换参数,将至少一个三维模型变换为多个目标图像可以是通过确定至少一种运镜效果对应的多个变换参数,利用多个变换参数,将至少一个三维模型变换为多个目标图像。
作为另一种可选方式,可以是确定至少一个三维模型对应的虚拟相机,利用该虚拟相机,按照该虚拟相机对应的多个变换参数将至少一个三维模型投影为多个目标图像。
第一视频生成模块基于多个目标图像以及素材信息生成目标视频的方式可以是将多个目标图像与该素材信息进行合成处理,基于合成处理之后的多个目标图像,生成目标视频。
素材信息可以包括至少一种素材图像,将至少一种素材图像与多个目标图像进行合成处理可以是确定至少一个素材图像与至少一个目标图像的一一对应关系,将任一个素材图像合成至与其对应的目标图像中。
其中,由第一素材确认模块确定至少一个素材图像与至少一个目标图像的一一对应关系,可以是通过至少一个目标图像中的目标对象所属的对象类目,确定至少一个素材图像。
其中,将任一个素材图像合成至与其对应的目标图像可以是,根据任一素材图像及其对应的目标图像,基于该目标图像中目标对象的对象位置,确定合成区域,根据该合成区域,调整该素材图像的图像尺寸及合成方向,按照合成方向,将调整之后的素材图像合成至合成区域中。
素材信息还可以包括文案信息,将文案信息与多个目标图像进行合成处理可以是将该文案信息合成至多个目标图像中的至少一个目标图像中。
其中,第一素材确认模块确定目标对象对应的文案信息,该文案信息可以是根据目标对象的对象相关信息生成,目标对象的对象相关信息例如对象描述信息、评价信息、类目信息等。
第一视频生成模块用于基于多个目标图像以及素材信息,生成目标视频,可以是按照至少一个原始图像的排列顺序及至少一个三维模型分别对应的变换参数的排列顺序,确定多个目标图像的排列顺序,按照多个目标图像的排列顺序,将多个目标图像拼接生成目标视频。
在某些实施例中,基于多个目标图像,生成目标视频后,还可以将该目标视频发送至用户端,以供用户端输出该目标视频。
图7所示的视频生成装置可以应用于电子商务场景,在电子商务场景下,第一获取模块可以具体是获取目标商品的商品图像,第一三维构建模块可以具体构建商品图像对应的三维模型。
图7所述的视频生成装置可以执行图3所示实施例所述的视频生成方法,其实现原理和技术效果不再赘述。对于上述实施例中的信息处理装置其中各个模块、单元执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
图8为本申请实施例提供的一种信息显示装置一个实施例的结构示意图,该装置可以包括:
第一输出模块801:提供显示界面;
第一显示模块802:在显示界面显示图像处理提示信息;
第一请求处理模块803:响应于针对图像处理提示信息触发的图像处理操作,向服务端发送图像处理请求,图像处理请求用于服务端确定包含目标对象的原始图像;
第二显示模块804:在显示界面播放目标视频。
在一个可能的设计中,图8所述的信息显示装置可以实现为前文所述的用户端,图8所示的信息显示装置可以执行图6所示实施例的信息显示方法,其实现原理和技术效果不再赘述。对于上述实施例中的信息显示装置其中各个模块、单元执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
本申请实施例还提供了一种计算设备,如图9所示,该计算设备中可以包括存储组件901以及处理组件902;
所述存储组件901存储一条或多条计算机指令,其中,所述一条或多条计算机指令供所述处理组件调用,以实现如图2或图3或图5所示实施例所述的视频生成方法。
当然,设备必然还可以包括其他部件,例如输入/输出接口,显示组件、通信组件等。
在该计算设备中的处理组件用以实现如图6所示的信息显示方法的情况下,该计算设备还可以包括显示组件,以执行对应的显示操作。
输入/输出接口为处理组件和外围接口模块之间提供接口,上述外围接口模块可以是输出设备、输入设备等。通信组件被配置为便于计算设备和其他设备之间有线或无线方式的通信等。
其中,处理组件902可以包括一个或多个处理器来执行计算机指令,以完成上述的方法中的全部或部分步骤。当然处理组件也可以为一个或多个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。
存储组件901被配置为存储各种类型的数据以支持在终端的操作。存储组件可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
显示组件可以为电致发光(EL)元件、液晶显示器或具有类似结构的微型显示器、或者视网膜可直接显示或类似的激光扫描式显示器。
需要说明的是,上述计算设备实现图2或图3或图5所示的视频生成方法的情况下,其可以为物理设备或者云计算平台提供的弹性计算主机等。其可以实现成多个服务器或终端设备组成的分布式集群,也可以实现成单个服务器或单个终端设备。上述计算设备实现图6所示信息显示方法的情况下,其可以具体实现为电子设备,电子设备可以是指用户使用的,具有用户所需计算、上网、通信等功能的设备,例如可以是手机、平板电脑、个人电脑、穿戴设备等。
本申请实施例还提供了一种计算机可读存储介质,存储有计算机程序,该计算机程序被计算机执行时可以实现上述图2所示实施例的视频生成方法或图3所示实施例的视频生成方法或者图5所示实施例的视频生成方法或者图6所示实施例的信息显示方法。该计算机可读介质可以是上述实施例中描述的电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。
本申请实施例还提供了一种计算机程序产品,其包括承载在计算机可读存储介质上的计算机程序,该计算机程序被计算机执行时可以实现上述图2所示实施例的视频生成方法或图3所示实施例的视频生成方法或者图5所示实施例的视频生成方法或者图6所示实施例的信息显示方法。
在这样的实施例中,计算机程序可以是从网络上被下载和安装,和/或从可拆卸介质被安装。在该计算机程序被处理器执行时,执行本申请的系统中限定的各种功能。
前文相应实施例中所涉及的处理组件例如可以包括一个或多个处理器来执行计算机指令,以完成上述的方法中的全部或部分步骤。当然处理组件也可以为一个或多个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。
存储组件被配置为存储各种类型的数据以支持在终端的操作。存储组件可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。
显示组件可以为电致发光(EL)元件、液晶显示器或具有类似结构的微型显示器、或者视网膜可直接显示或类似的激光扫描式显示器。
计算机可读存储介质例如可以是但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)、闪存、光纤、便携式紧凑磁盘只读存储器(Compact Disc Read-Only Memory,CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本申请中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。
最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。

Claims (14)

  1. 一种视频生成方法,其特征在于,包括:
    获取包含目标对象的至少一个原始图像;
    构建所述至少一个原始图像分别对应的三维模型;
    利用多个变换参数,将至少一个三维模型变换为多个目标图像;
    确定所述目标对象匹配的素材信息;
    基于所述多个目标图像以及所述素材信息,生成目标视频。
  2. 根据权利要求1所述的方法,其特征在于,基于所述多个目标图像以及所述素材信息,生成目标视频包括:
    将所述素材信息与所述多个目标图像进行合成处理;
    基于合成处理之后的多个目标图像,生成目标视频。
  3. 根据权利要求2所述的方法,其特征在于,所述素材信息包括至少一个素材图像;所述将所述素材信息与所述多个目标图像进行合成处理包括:
    确定所述至少一个素材图像与至少一个目标图像的对应关系;
    将任一个素材图像合成至与其对应的目标图像中。
  4. 根据权利要求2所述的方法,其特征在于,所述素材信息包括文案信息;所述将所述素材信息与所述多个目标图像进行合成处理包括:
    将所述文案信息合成至所述多个目标图像中的至少一个目标图像中。
  5. 根据权利要求3所述的方法,其特征在于,所述确定所述目标对象匹配的素材信息包括:
    根据所述目标对象所属的对象类目,确定至少一个素材图像。
  6. 根据权利要求4所述的方法,其特征在于,所述确定所述目标对象匹配的素材信息包括:
    根据所述目标对象的对象相关信息,生成所述目标对象的文案信息;所述对象相关信息包括对象描述信息、对象评价信息、对象类目信息以及对象图像中的一个或多个。
  7. 根据权利要求1所述的方法,其特征在于,所述利用多个变换参数,将至少一个三维模型变换为多个目标图像包括:
    确定至少一种运镜效果对应的多个变换参数;
    利用所述多个变换参数,将所述至少一个三维模型变换为多个目标图像。
  8. 根据权利要求1所述的方法,其特征在于,还包括:
    确定所述原始图像包含的多个图层;
    确定相邻两个图层中,一个图层在另一图层上对应的交界区域;
    将所述原始图像对应的交界区域作为所述多个目标图像分别对应的交界区域;
    对任一个目标图像中的任一个交界区域填充目标颜色。
  9. 一种视频生成方法,其特征在于,包括:
    获取目标商品的商品图像;
    构建所述商品图像对应的三维模型;
    利用多个变换参数,将所述三维模型变换为多个目标图像;
    确定所述目标商品匹配的素材信息;
    基于所述素材信息及所述多个目标图像,生成目标视频。
  10. 根据权利要求9所述的方法,其特征在于,所述获取目标商品的商品图像包括:
    接收用户的图像处理请求,所述图像处理请求包括所述商品图像;
    或者,
    接收用户的图像处理请求,确定所述图像处理请求包含的目标商品对应的索引信息,基于所述索引信息,从所述目标商品的商品详情页面中识别将所述目标商品作为图像主体的商品图像。
  11. 根据权利要求9所述的方法,其特征在于,所述生成目标视频包括:
    接收用户发布请求,利用所述目标视频更新商品详情页面;
    或者,
    建立所述目标视频与商品详情页面的链接关系,将目标视频发布至商品推广页面,以在所述商品推广页面检测到针对所述目标视频的触发操作,基于所述链接关系跳转至所述商品详情页面。
  12. 一种信息显示方法,其特征在于,包括:
    提供显示界面;
    在所述显示界面显示图像处理提示信息;
    响应于针对所述图像处理提示信息触发的图像处理操作,向服务端发送图像处理请求;所述图像处理请求用于所述服务端确定包含目标对象的至少一个原始图像,构建所述至少一个原始图像分别对应的三维模型;利用多个变换参数,将至少一个三维模型变换为多个 目标图像;确定所述目标对象匹配的素材信息;基于所述多个目标图像以及所述素材信息,生成目标视频;
    在所述显示界面播放所述目标视频。
  13. 一种计算设备,其特征在于,包括处理组件、存储组件;所述存储组件存储一个或多个计算机指令;所述一个或多个计算机指令用于被所述处理组件调用并执行,以实现如权利要求1~8任一项所述的视频生成方法或者如权利要求9~11任一项所述的视频生成方法或者如权利要求12所述的信息显示方法。
  14. 一种计算机存储介质,其特征在于,存储有计算机程序,所述计算机程序被计算机执行时,实现如权利要求1~8任一项所述的视频生成方法或者如权利要求9~11任一项所述的视频生成方法或者如权利要求12所述的信息显示方法。
PCT/CN2023/071967 2022-09-22 2023-01-12 视频生成方法、信息显示方法及计算设备 WO2024060474A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211160552.X 2022-09-22
CN202211160552.XA CN115908694A (zh) 2022-09-22 2022-09-22 视频生成方法、信息显示方法及计算设备

Publications (1)

Publication Number Publication Date
WO2024060474A1 true WO2024060474A1 (zh) 2024-03-28

Family

ID=86490142

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/071967 WO2024060474A1 (zh) 2022-09-22 2023-01-12 视频生成方法、信息显示方法及计算设备

Country Status (2)

Country Link
CN (1) CN115908694A (zh)
WO (1) WO2024060474A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110064388A1 (en) * 2006-07-11 2011-03-17 Pandoodle Corp. User Customized Animated Video and Method For Making the Same
CN113132815A (zh) * 2021-04-22 2021-07-16 北京房江湖科技有限公司 视频生成方法和装置、计算机可读存储介质、电子设备
CN113240781A (zh) * 2021-05-20 2021-08-10 东营友帮建安有限公司 基于语音驱动及图像识别的影视动画制作方法、系统
CN113891079A (zh) * 2021-11-11 2022-01-04 深圳市木愚科技有限公司 自动化教学视频生成方法、装置、计算机设备及存储介质
CN115022674A (zh) * 2022-05-26 2022-09-06 阿里巴巴(中国)有限公司 生成虚拟人物播报视频的方法、系统及可读存储介质

Also Published As

Publication number Publication date
CN115908694A (zh) 2023-04-04


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23866780

Country of ref document: EP

Kind code of ref document: A1