WO2021218547A1 - Method for superimposing live image of person onto real scene, and electronic device - Google Patents

Method for superimposing live image of person onto real scene, and electronic device

Info

Publication number
WO2021218547A1
Authority
WO
WIPO (PCT)
Prior art keywords
live
image
person
posture
display medium
Application number
PCT/CN2021/084372
Other languages
French (fr)
Chinese (zh)
Inventor
李江亮
周硙
方俊
Original Assignee
北京外号信息技术有限公司
Application filed by 北京外号信息技术有限公司
Publication of WO2021218547A1 publication Critical patent/WO2021218547A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0281Customer communication at a business location, e.g. providing product or service information, consulting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed

Definitions

  • the present invention relates to the field of augmented reality technology, and in particular to a method and an electronic device for superimposing a live image of a person onto a real scene observed through a device.
  • in face-to-face communication, information conveyed non-verbally (through facial expressions, appearance, posture, gestures, and so on) accounts for more than 50% of the total information exchanged, and facial expressions and appearance are especially important parts. When masks and other protective equipment are worn, most of the information conveyed through facial expressions and appearance is blocked and cannot be transmitted, which degrades the effectiveness of face-to-face communication.
  • the present application provides a method and electronic device for superimposing a live image of a person in a real scene observed by the device.
  • One aspect of the present invention relates to a method for superimposing a live image of a person onto a real scene, including: determining the position and posture of a device in space, wherein the device has an image acquisition device and a display medium; obtaining a spatial position set for the live person image; determining, based on the position and posture of the device and the spatial position of the live person image, a presentation position of the live person image on the display medium of the device; presenting, on the display medium of the device, the real scene collected by the image acquisition device of the device; and receiving the live person image and superimposing it at the presentation position on the display medium.
  • the live person image received by the device is a live person image with a transparent background or a live person image without a background; alternatively, the device processes the received live person image to generate a live person image with a transparent background or without a background.
  • the method further includes: determining a live image of a person to be presented to the device.
  • the position of the device in the space is used to determine the live image of the person to be presented to the device.
  • the live character image to be presented to the device is determined by the position and posture of the device in space.
  • the method further includes: obtaining the posture in the space set for the live broadcast image of the person.
  • the method further includes: determining the presentation posture of the live person image on the display medium of the device based on the position and posture of the device and the posture of the live person image.
  • the front of the live person image is always facing the device.
  • the method further includes: collecting video, audio, or text input of the user of the device; and sending the video, audio, or text input to a live broadcaster who provides the live image of the person.
  • the method further includes: after superimposing the live person image on the display medium of the device, determining the live person image according to the new position and posture of the device and the spatial position of the live person image The new presentation position of the live character image on the display medium of the device.
  • the method further includes: after superimposing the live character image on the display medium of the device, the presentation position of the live character image on the display medium remains unchanged.
  • the method further includes: after superimposing the live character image on the display medium of the device, making the live character image appear on the display medium according to the instruction of the user of the device The location remains the same.
  • determining the position and posture of the device in space includes: scanning, by the device, an optical communication apparatus deployed in the real scene to determine the initial position and posture of the device in space, and continuously tracking changes in the device's position and posture in space.
  • the method further includes: the device obtains the identification information of the optical communication device, and determines the live person image to be presented for the device through the identification information.
  • At least two live person images are superimposed on the display medium of the device.
  • the live image of a person is a two-dimensional image of a person or a three-dimensional image of a person.
  • the method further includes: before receiving the live person image, instructing a live broadcaster associated with the live person image to provide the live person image.
  • Another aspect of the present invention relates to a storage medium in which a computer program is stored, and when the computer program is executed by a processor, it can be used to implement the above-mentioned method.
  • Another aspect of the present invention relates to an electronic device, which includes a processor and a memory, and a computer program is stored in the memory.
  • when the computer program is executed by the processor, it can be used to implement the above-mentioned method.
  • a live interaction method bound to positions in the real scene is realized, so that device users can experience a contact-free in-scene service similar to being served by a person on site, without close face-to-face verbal communication between service personnel and users; this can greatly reduce the risk of cross-infection during an epidemic and help related industries resume work and production smoothly.
  • the same service personnel can serve users in different locations, which can break geographic limitations, save labor costs, and improve service efficiency.
  • Fig. 1 shows a method for superimposing a live image of a person onto a real scene observed through a device, according to an embodiment
  • Figure 2 shows a schematic diagram of a user watching live images of people in a real scene
  • Fig. 3 shows a live broadcaster and camera equipment used to provide live images of people in the real scene shown in Fig. 2;
  • Figure 4 shows a schematic image presented on the display medium of the user's device
  • Fig. 5 is an example real image for showing the actual effect of the present invention.
  • Figure 6 shows an exemplary optical label
  • Figure 7 shows an exemplary optical label network.
  • Fig. 1 shows a method for superimposing a live image of a person in a real scene observed by a device according to an embodiment.
  • the device may be, for example, a device carried or controlled by a user (for example, a mobile phone, a tablet computer, smart glasses, AR/VR glasses, an AR/VR helmet, a smart watch, etc.), and has an image acquisition device (for example, a camera) and a display medium ( For example screen).
  • the method may include the following steps:
  • Step 1001: Determine the position and posture of the device in space.
  • Various feasible methods can be used to determine the position and posture of the device in space. For example: visual markers can be arranged in the space and the device's position and posture determined by analyzing the marker images the device collects; a three-dimensional model or point cloud of the real scene can be built and the device's position and posture determined by analyzing the scene images it collects; a high-precision gyroscope can be used; radio beacons can be arranged in the space and the device's position and posture determined by analyzing the radio signals it receives; satellite positioning signals can determine the device's position while a gyroscope determines its posture; or any combination of the above can be used.
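  • As a concrete illustration of the visual-marker approach, the sketch below estimates a device's pose from one captured marker image using OpenCV's solvePnP. The marker size, camera intrinsics, and detected pixel corners are illustrative stand-ins, not values from the patent.

```python
import cv2
import numpy as np

# Known 3D corner positions of a 10 cm square marker, in the marker's own frame.
MARKER_SIZE = 0.10  # metres
object_points = np.array([
    [-MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2,  MARKER_SIZE / 2, 0.0],
    [ MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
    [-MARKER_SIZE / 2, -MARKER_SIZE / 2, 0.0],
])

# Pixel coordinates of the same corners as detected in the camera image
# (stand-in values; a real system would run a marker detector here).
image_points = np.array([
    [320.0, 180.0], [420.0, 185.0], [415.0, 290.0], [318.0, 284.0],
])

# Assumed pinhole intrinsics (fx, fy, cx, cy) of the device camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
R, _ = cv2.Rodrigues(rvec)               # rotation: marker frame -> camera frame
device_position = (-R.T @ tvec).ravel()  # camera (device) position in marker frame
print("device position relative to marker:", device_position)
```

  • From this pose relative to the marker and the marker's known placement in the scene, the device's absolute position and posture follow by composing rigid transforms.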
  • Step 1002: Obtain the spatial position set for the live person image.
  • For example, service personnel in government affairs halls, bank branches, exhibition halls, scenic spots, shopping malls, supermarkets, airports, stations, and the like (referred to in this document as "live broadcasters") can provide live person images in real time, used to give content explanations to device users, answer their inquiries, and so on. Using live person images, service personnel can explain to users remotely and in real time and answer their questions without close face-to-face contact, and without being confined to a fixed location.
  • the spatial position of the live person image (that is, the position at which the image is presented in space) can be represented or defined by the spatial position of a single point on the image, the spatial positions of multiple points (for example, points on the image's outline), or the spatial position of the entire image region. For example, if the live person image is rectangular, its spatial position can be defined by the position coordinates of the rectangle's center point, by the coordinates of one of its corners (upper left, lower left, upper right, or lower right), by the coordinates of two opposite corners (for example, upper left and lower right, or lower left and upper right), and so on.
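  • One way to hold such a placement in code is a small record of a centre point, physical size, and optional orientation, from which the corner positions follow. This is only a sketch; the type and field names are invented for illustration.

```python
import numpy as np
from dataclasses import dataclass, field

@dataclass
class LiveImagePlacement:
    center: np.ndarray          # (x, y, z) centre point in scene coordinates, metres
    width: float                # physical width of the rectangular image
    height: float               # physical height of the rectangular image
    rotation: np.ndarray = field(default_factory=lambda: np.eye(3))  # optional pose

    def corners(self) -> np.ndarray:
        """Four corner points (upper-left first, clockwise) in scene coordinates."""
        w, h = self.width / 2.0, self.height / 2.0
        local = np.array([[-w, h, 0.0], [w, h, 0.0], [w, -h, 0.0], [-w, -h, 0.0]])
        return local @ self.rotation.T + self.center

# A person-sized rectangle in front of a shelf (illustrative numbers).
placement = LiveImagePlacement(center=np.array([2.0, 0.9, 5.0]), width=0.8, height=1.8)
print(placement.corners())
```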
  • the live person image that can be presented to the device can be determined from the device's location in space and, optionally, its posture.
  • for example, the device can scan a visual marker installed in an exhibition hall to determine its location (and optionally its posture) in the hall; a query based on that location and posture then determines the live person image currently available for presentation on the device (for example, a live person image introducing a particular exhibit).
  • other information may also be used; for example, the identification information of the visual marker obtained by the device may be used in a query to determine the live person image that can currently be presented on the device.
  • multiple live person images may be available for presentation to the device, and the device user can select which one to present. For example, a user currently in a government affairs hall can be notified that live person images covering a variety of services are available, and can choose the one of interest according to their needs (for example, according to the business they want to handle).
  • the live person images can be filtered based on information about the device or the device user (for example, the user's age, gender, or occupation), so that images matching the user's likely preferences are presented.
  • the device may send a corresponding instruction or message to the live broadcaster who provides the live person image, so that the broadcaster can start broadcasting and send the live person image to the device.
  • one live broadcaster may be associated with multiple live person images.
  • one live broadcaster may be responsible for multiple live person images corresponding to multiple exhibits in the exhibition hall.
  • the instruction or message sent to the live broadcaster can identify the corresponding live person image (for example, by including its identification information), so that the broadcaster knows, for instance, which exhibit the live person image should currently be provided for.
  • a live person image may be associated with multiple live broadcasters, and any idle broadcaster among the multiple live broadcasters may provide the live broadcast person image.
  • the device user may select his favorite live broadcaster, or the live broadcaster who responds to the device user's request first may provide the live broadcast character image.
  • the posture in the space set for the live character image to be presented can also be obtained, which can be used, for example, to define the orientation of the live character image in the space.
  • Step 1003: Based on the position and posture of the device and the spatial position of the live person image, determine the presentation position of the live person image on the display medium of the device.
  • once the position and posture of the device in space are determined, the current field of view of its image acquisition device is effectively known. Based on the spatial position of the live person image, it can then be determined whether, and where, the image falls within that field of view, which yields the image's presentation position on the display medium of the device.
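  • A minimal sketch of this projection step, assuming the device pose from step 1001 is available as a world-to-camera rotation and translation and the camera intrinsics are known (all values below are illustrative):

```python
import numpy as np

def project_to_screen(points_world, R_wc, t_wc, K):
    """Project scene-space points into the device camera.

    points_world: (N, 3) points in scene coordinates.
    R_wc, t_wc:   device pose from step 1001, mapping world -> camera
                  (camera convention: +z forward).
    K:            3x3 pinhole intrinsics of the device camera.
    """
    cam = points_world @ R_wc.T + t_wc      # into the camera frame
    in_front = cam[:, 2] > 0                # only points ahead of the camera
    pix = cam @ K.T
    pix = pix[:, :2] / pix[:, 2:3]          # perspective divide -> pixel coords
    return pix, in_front

# Illustrative values: device at the scene origin, image 5 m ahead.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R_wc, t_wc = np.eye(3), np.zeros(3)
corners_world = np.array([[-0.4, 1.8, 5.0], [0.4, 1.8, 5.0],
                          [0.4, 0.0, 5.0], [-0.4, 0.0, 5.0]])
pixels, visible = project_to_screen(corners_world, R_wc, t_wc, K)
print(pixels, visible)   # where to draw the live person image on the display
```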
  • the posture of the live character image presented on the display medium of the device may be further determined based on the position and posture of the device and the posture of the live character image.
  • a certain direction of the live character image can always face the device of the user who observes the live character image.
  • for example, for a two-dimensional live person image, the front of the image can always face the user's device, so that even if the device user stands in a different position or moves around, the person in the image always appears to be facing them while explaining.
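  • This "always face the device" behaviour is essentially billboarding. A sketch of one way to compute such an orientation (the function name and world-up convention are assumptions):

```python
import numpy as np

def billboard_rotation(image_pos, device_pos, world_up=np.array([0.0, 1.0, 0.0])):
    """Orientation whose forward axis points from the live image to the device."""
    forward = device_pos - image_pos
    forward = forward / np.linalg.norm(forward)
    right = np.cross(world_up, forward)       # degenerate if forward == world_up
    right = right / np.linalg.norm(right)
    up = np.cross(forward, right)             # re-orthogonalised up vector
    return np.column_stack([right, up, forward])  # columns: x, y, z axes

# Illustrative positions: image in front of a shelf, device held by the user.
R = billboard_rotation(np.array([2.0, 0.9, 5.0]), np.array([0.0, 1.6, 0.0]))
print(R)
```

  • Recomputed every frame from the tracked device pose, this keeps the image front pointed at the user however they move.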
  • Step 1004: Present the real scene collected by the image acquisition device of the device on the display medium of the device.
  • the device can collect the real scene in real time through its image acquisition device, and present the image of the real scene on the display medium of the device.
  • Step 1005: Receive the live person image and superimpose it at the presentation position on the display medium of the device.
  • in this way, the live person image is effectively superimposed at an appropriate position in the real scene observed through the device, providing the device user with a live person image closely integrated with the scene, for example to give explanations, respond to inquiries, and so on.
  • the live character image received by the device may be a live character image with a transparent background (for example, a live character image with an alpha transparent channel) or a live character image without a background.
  • the live person image may be processed after collecting the live person image or during the process of transmitting the live person image to generate a live person image with a transparent background, and send it to the device.
  • the device may receive a live character image with an opaque background and process the live character image to generate a live character image with a transparent background or a live character image without a background.
  • to make it easier to produce a live person image with a transparent background or without a background, a monochrome backdrop (such as a green screen) can be placed behind the person during shooting. The live person image superimposed in the real scene then appears to contain only the person, without the original background from the shoot. When the user observes the live person image through the display medium of the device, only the person is visible, as if the person were actually standing in the real scene, giving a better user experience.
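  • A toy version of turning a green-screen frame into a transparent-background image, one plausible way to produce the alpha channel mentioned above (the thresholds are illustrative; production chroma keyers are considerably more sophisticated):

```python
import numpy as np

def chroma_key_alpha(frame_rgb: np.ndarray) -> np.ndarray:
    """frame_rgb: (H, W, 3) uint8. Returns (H, W, 4) RGBA with green keyed out."""
    r = frame_rgb[..., 0].astype(np.int16)
    g = frame_rgb[..., 1].astype(np.int16)
    b = frame_rgb[..., 2].astype(np.int16)
    # Treat a pixel as background when green clearly dominates red and blue.
    is_green = (g > 100) & (g > r + 40) & (g > b + 40)
    alpha = np.where(is_green, 0, 255).astype(np.uint8)
    return np.dstack([frame_rgb, alpha])

# Tiny self-check: a fully green frame becomes fully transparent.
frame = np.zeros((4, 4, 3), dtype=np.uint8)
frame[..., 1] = 200
assert (chroma_key_alpha(frame)[..., 3] == 0).all()
```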
  • At least one of video, audio, or text input from the device user may be collected through the device and sent to the live broadcaster so that both parties can interact in real time.
  • Figure 2 shows a schematic diagram of a user watching a live image of a person in a real scene.
  • the real scene includes a shelf 202, and the user 201 holds the device 203 and watches the live image of a person who is arranged or embedded in the real scene through the display medium of the device 203.
  • the deployment position of the live person image in the real scene is shown, for example, by the dashed box 204. The position of the entire dashed box 204 in space can be defined by the spatial position of one or more points on it.
  • the dashed box 204 may also have a preset or default posture; for example, by default it is perpendicular to the ground.
  • FIG. 3 shows a live broadcaster 302 for providing live person images in the real scene shown in FIG. 2, and a camera device 301 for collecting images of the live broadcaster 302 to generate live broadcast person images.
  • Figure 4 shows a schematic image presented on the display medium of the device 203 of the user 201, in which an image of the real scene (including the shelf 202) is captured by the image acquisition device of the device 203 and presented on its display medium.
  • the device 203 also receives the live person image provided by the camera device 301 filming the live broadcaster 302 and, based on the location and posture of the device 203 and the spatial position set for the image, superimposes the transparent-background live person image containing the broadcaster 302 at the corresponding position on its display medium, thereby integrating the live broadcaster 302 seamlessly into the real scene.
  • Fig. 5 is an example real image for showing the actual effect of the present invention.
  • the real scene shown by the real image includes a shelf.
  • a transparent-background live person image containing a presenter can be superimposed on the real scene shown on the screen of the user's mobile phone, so the user feels as if a real presenter were standing in front of the shelf introducing the products.
  • two or more characters may be included in the live character image, and the two or more characters may interact verbally or physically to provide the user with a more detailed explanation.
  • At least two live person images may be arranged for the real scene and superimposed on the display medium of the device, presented either simultaneously or sequentially.
  • the live image of a person may be a two-dimensional image of a person.
  • the live character image may be a three-dimensional character image.
  • multiple camera devices located around the people can be used to shoot from multiple different angles to provide three-dimensional images of people.
  • the size of the live character image can also be set or adjusted, for example, adjusted so that the character therein has a size similar to that of a real person.
  • changes in the device's position and posture can be tracked, and the new presentation position of the live person image on the display medium determined in real time from the device's new position and posture and the image's spatial position.
  • the new presentation posture of the live character image on the display medium of the device can be determined in real time. This method can achieve a good augmented reality effect, making the device user feel that the live broadcaster is actually in the real scene.
  • after the live person image is superimposed on the display medium of the device, the image can instead be given a fixed presentation position and/or presentation posture on that medium.
  • the live character image may have a fixed presentation position and/or presentation posture on the display medium according to the instruction of the device user.
  • the live image of the person can be watched in a desired presentation position and/or presentation posture through the display medium of the device.
  • the device user can change the position and/or posture of the device in space, so that the live character image superimposed on the device display medium has the presentation position desired by the device user And/or presentation posture.
  • the device user can send an instruction (for example, by tapping a button presented on the display medium) so that the current presentation position and/or posture of the live person image remains unchanged thereafter, even if the device subsequently changes its position or posture in space.
  • the position and posture of the device in the space can be determined by the optical communication device arranged in the space.
  • Optical communication devices are also called optical tags, and these two terms can be used interchangeably in this article.
  • Optical tags can transmit information through different light-emitting methods, which have the advantages of long recognition distance and relaxed requirements for visible light conditions, and the information transmitted by optical tags can change over time, which can provide large information capacity and flexible configuration capabilities.
  • An optical tag usually includes a controller and at least one light source, and the controller can drive the light source through different driving modes to transmit different information to the outside.
  • Fig. 6 shows an exemplary optical label 100, which includes three light sources (respectively a first light source 101, a second light source 102, and a third light source 103).
  • the optical tag 100 also includes a controller (not shown in FIG. 6), which is used to select a corresponding driving mode for each light source according to the information to be transmitted. For example, in different driving modes, the controller can use different driving signals to control the light emitting mode of the light source, so that when a device with imaging function is used to photograph the light label 100, the imaging of the light source therein can show a different appearance.
  • the optical label shown in FIG. 6 is only used as an example, and the optical label may have a different shape from the example shown in FIG. 6 and may have a different number and/or different shape of light sources from the example shown in FIG. 6.
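  • To illustrate the principle only, the toy sketch below conveys a tag ID through the light source's driving mode at one bit per camera frame (light on = 1, off = 0). Real optical tags would use far more robust modulation; nothing here is prescribed by the patent.

```python
def encode_tag_id(tag_id: int, n_bits: int = 16) -> list:
    """Driving sequence for the controller: light state (1=on, 0=off) per frame."""
    return [(tag_id >> i) & 1 for i in reversed(range(n_bits))]

def decode_tag_id(light_states: list) -> int:
    """Recover the ID from the light source's imaged on/off appearance per frame."""
    value = 0
    for bit in light_states:
        value = (value << 1) | bit
    return value

states = encode_tag_id(0x2A5F)          # what the controller drives
assert decode_tag_id(states) == 0x2A5F  # what the observing device recovers
```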
  • each optical tag can be assigned an identification information (ID), which is used to uniquely identify or identify the optical tag by the manufacturer, manager, or user of the optical tag .
  • the light source can be driven by the controller in the optical tag to transmit the identification information outward; the user can use a device to capture an image of the optical tag, obtain the transmitted identification information, and then access services associated with it, for example opening a webpage associated with the identification information or retrieving other associated information (such as the location of the optical tag corresponding to the identification information).
  • the device can obtain an image containing the optical tag by aiming its image acquisition device at the tag, and can identify the information conveyed by analyzing the imaging of the optical tag (or of each light source in it) in that image.
  • the information related to each optical tag can be stored in the server.
  • a large number of optical labels can also be constructed into an optical label network.
  • Fig. 7 shows an exemplary optical label network, which includes a plurality of optical labels and at least one server.
  • the identification information (ID) and other information of each optical tag can be stored on the server, for example service information related to the tag and descriptive information or attributes, such as location, model, physical size, physical shape, and posture or orientation information.
  • the optical label may also have unified or default physical size information and physical shape information.
  • the device can use the identified identification information of the optical tag to query the server to obtain other information related to the optical tag.
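  • A minimal sketch of such a lookup, with the server's per-tag records reduced to an in-memory dictionary; the record fields mirror the attributes listed above, and all names and values are invented for illustration.

```python
from typing import Optional

# Server-side records keyed by tag identification information.
TAG_REGISTRY = {
    "tag-0x2A5F": {
        "location": {"lat": 39.9042, "lon": 116.4074, "alt": 44.0},
        "model": "three-source-rect-v1",
        "physical_size_m": (0.30, 0.10),
        "physical_shape": "rectangle",
        "pose": {"yaw_deg": 90.0, "pitch_deg": 0.0, "roll_deg": 0.0},
        "live_person_image_id": "shelf-intro-001",  # image bound to this tag
    },
}

def query_tag(tag_id: str) -> Optional[dict]:
    """What a device would ask the server after decoding a tag's ID."""
    return TAG_REGISTRY.get(tag_id)

record = query_tag("tag-0x2A5F")
print(record["location"], record["live_person_image_id"])
```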
  • the location information of the optical tag may refer to the actual location of the optical tag in the physical world, which may be indicated by geographic coordinate information.
  • the server may be a software program running on a computing device, a computing device, or a cluster composed of multiple computing devices.
  • the optical tag may be offline, that is, the optical tag does not need to communicate with the server.
  • online optical tags that can communicate with the server are also feasible.
  • the device can determine its position relative to the optical label by collecting an image including the optical label and analyzing the image (for example, analyzing the imaging size of the optical label in the image, perspective distortion, etc.), and the relative position This can include the distance and orientation of the device relative to the optical tag.
  • the device can also determine its posture relative to the optical label by collecting an image including the optical label and analyzing the image. For example, when the imaging position or imaging area of the optical tag is located in the center of the imaging field of the device, it can be considered that the device is currently facing the optical tag.
  • the device can identify the identification information transmitted by the optical tag by scanning the optical tag, and can obtain (for example, by querying) the position and posture information of the optical tag in the real scene coordinate system through the identification information.
  • the real scene coordinate system may be, for example, a certain place coordinate system (for example, a coordinate system established for a certain room, building, park, etc.) or a world coordinate system.
  • combining the device's position or posture relative to the optical tag with the tag's position and posture in the real scene coordinate system yields the device's position or posture in that coordinate system. The determined position or posture of the device in space may therefore be expressed relative to the optical tag, or equally in the real scene coordinate system.
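  • Expressed as rigid transforms: if T_wt places the tag in the scene coordinate system and T_tc is the device's pose relative to the tag, then T_wc = T_wt · T_tc is the device's pose in the scene. A small sketch with illustrative values:

```python
import numpy as np

def make_transform(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """4x4 homogeneous rigid transform from a rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

R_wt, t_wt = np.eye(3), np.array([5.0, 2.0, 0.0])  # tag pose in the scene
R_tc, t_tc = np.eye(3), np.array([0.0, 0.0, 3.0])  # device pose relative to tag
T_wc = make_transform(R_wt, t_wt) @ make_transform(R_tc, t_tc)
print(T_wc[:3, 3])   # device position in scene coordinates -> [5. 2. 3.]
```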
  • the device can identify the identification information transmitted by the optical tag by scanning the optical tag, and use the identification information to determine the scene information of the real scene where the optical tag is located.
  • the scene information may be, for example, a three-dimensional model of the real scene, point cloud information of the scene, information about auxiliary markers around the optical tag, and so on. Based on the determined scene information and the image of the real scene collected by the device, the position and/or posture of the device in the real scene can then be determined through visual positioning.
  • after scanning the optical tag to determine its position and/or posture in space, the device may be translated and/or rotated. Various sensors built into the device (for example, acceleration sensors, magnetic sensors, orientation sensors, gravity sensors, gyroscopes, and cameras) can track these changes in position and posture. When the optical tag re-enters the field of view of its camera, the device can rescan the tag to correct or re-determine its position or posture information.
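  • A schematic of this scan-then-track loop, with the sensor-derived motion deltas as stand-ins; the class and its values are invented for illustration.

```python
import numpy as np

class PoseTracker:
    """Absolute fixes come from tag scans; sensors propagate pose in between."""

    def __init__(self, position: np.ndarray):
        self.position = position                 # fix from the initial tag scan

    def propagate(self, delta: np.ndarray):
        """Integrate relative motion reported by the built-in sensors."""
        self.position = self.position + delta

    def correct(self, rescan_position: np.ndarray):
        """Replace the drifted estimate when the tag is scanned again."""
        self.position = rescan_position

tracker = PoseTracker(np.array([5.0, 2.0, 3.0]))   # from the first scan
tracker.propagate(np.array([0.10, 0.00, -0.05]))   # movement between frames
tracker.correct(np.array([5.09, 2.00, 2.96]))      # rescan removes drift
```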
  • while scanning the optical tag, the device can also obtain the tag's identification information, then query through it to determine the live person image to be presented and obtain the spatial position set for that image. For example, the device can scan an optical tag installed on a supermarket shelf, identify its identification information, and through it determine that the live person image currently to be presented introduces the products on that shelf, as well as obtain the image's spatial position.
  • in the embodiments above, service personnel are described as the live broadcasters, but it should be understood that the application is not limited to this: the live broadcaster can be anyone who wants to provide a live person image to others, for example speakers, presenters, video conference participants, teachers, or broadcasters using various live-streaming apps.
  • Live images of people can also be images synthesized or generated by a computer.
  • a planar image or three-dimensional model of person A may be stored in advance, and person A's live video then synthesized or generated from A's real-time motion features, voice features, and the like, together with the stored image or model. In this way, only person A's real-time motion or voice features need to be transmitted, rather than real-time video of A, reducing the system's demand for transmission bandwidth and improving efficiency.
  • the character in the live character image may not be a real character, but a virtual character, such as an animated character.
  • the present invention can be implemented in the form of a computer program.
  • the computer program can be stored in various storage media (for example, a hard disk, an optical disk, a flash memory, etc.), and when the computer program is executed by a processor, it can be used to implement the method of the present invention.
  • the present invention may be implemented in the form of an electronic device.
  • the electronic device includes a processor and a memory, and a computer program is stored in the memory.
  • when the computer program is executed by the processor, it can be used to implement the method of the present invention.
  • references herein to "each embodiment", "some embodiments", "one embodiment", or "an embodiment" mean that specific features, structures, or properties described in connection with the embodiment(s) are included in at least one embodiment. The appearances of the phrases "in various embodiments", "in some embodiments", "in one embodiment", or "in an embodiment" in various places throughout this document therefore do not necessarily refer to the same embodiment.
  • moreover, specific features, structures, or properties can be combined in any suitable manner in one or more embodiments. A specific feature, structure, or property shown or described in connection with one embodiment can therefore be combined, in whole or in part, with the features, structures, or properties of one or more other embodiments without limitation, as long as the combination is not logically inconsistent or non-functional.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Studio Circuits (AREA)

Abstract

Provided are a method for superimposing a live image of a person onto a real scene, and an electronic device. The method comprises: determining the position and attitude of a device in a space, wherein the device has an image collector and a display medium; obtaining a spatial position configured for a live image of a person; on the basis of the position and attitude of the device and the spatial position of the live image of a person, determining a presentation position of the live image of a person on the display medium of the device; presenting, on the display medium of the device, a real scene collected by the image collector of the device; and receiving the live image of a person and superimposing the live image of a person at the presentation position onto the display medium.

Description

Method and electronic device for superimposing a live image of a person onto a real scene

Technical Field

The present invention relates to the field of augmented reality technology, and in particular to a method and an electronic device for superimposing a live image of a person onto a real scene observed through a device.
Background Art

The statements in this section merely provide background information related to the technical solution of the present application to aid understanding; they do not necessarily constitute prior art with respect to that solution.

In government affairs halls, bank branches, exhibition halls, scenic spots, shopping malls, supermarkets, airports, stations, and similar places, it is often necessary to arrange service personnel at specific locations to provide users with on-site explanations or consulting services: for example, policy consulting in a government affairs hall, introductions to wealth management products at a bank branch, or product introductions beside supermarket shelves.

However, this traditional way of serving requires close face-to-face verbal communication (usually within about one metre) between service personnel and users, which greatly increases the risk of cross-infection during an epidemic and hinders the smooth resumption of work and production in many industries, especially those that depend on on-site presenters or service staff. Requiring service personnel and users to wear masks and other protective equipment can reduce this risk, but it impairs the smoothness of communication and incurs additional protection costs. Moreover, psychological research shows that in face-to-face conversation, information is conveyed simultaneously at the verbal and non-verbal levels; non-verbal cues (facial expressions, appearance, posture, gestures, and so on) typically account for more than 50% of the total information conveyed, with facial expressions and appearance being particularly important. When masks and similar protective equipment are worn, most of the information conveyed through facial expressions and appearance is blocked and cannot be transmitted, which degrades the effectiveness of face-to-face communication.

In addition, with the traditional service model the same service person can usually serve users at only one location. Taking bank branches as an example, even if staff member a at branch A is currently idle while staff member b at branch B is busy, a cannot serve the users waiting at branch B. The traditional model is therefore inefficient and costly, and with the rapid ageing of society and continually rising labour costs its drawbacks will become more and more obvious.

To solve at least one of the above problems, the present application provides a method and an electronic device for superimposing a live image of a person onto a real scene observed through a device.
Summary of the Invention

One aspect of the present invention relates to a method for superimposing a live image of a person onto a real scene, including: determining the position and posture of a device in space, wherein the device has an image acquisition device and a display medium; obtaining a spatial position set for the live person image; determining, based on the position and posture of the device and the spatial position of the live person image, a presentation position of the live person image on the display medium of the device; presenting, on the display medium of the device, the real scene collected by the image acquisition device of the device; and receiving the live person image and superimposing it at the presentation position on the display medium.

In one embodiment, the live person image received by the device is a live person image with a transparent background or a live person image without a background; alternatively, the device processes the received live person image to generate a live person image with a transparent background or without a background.

In one embodiment, the method further includes: determining the live person image to be presented on the device.

In one embodiment, the live person image to be presented on the device is determined from the device's position in space.

In one embodiment, the live person image to be presented on the device is determined from the device's position and posture in space.

In one embodiment, the method further includes: obtaining a posture in space set for the live person image.

In one embodiment, the method further includes: determining, based on the position and posture of the device and the posture of the live person image, a presentation posture of the live person image on the display medium of the device.

In one embodiment, the front of the live person image is made to always face the device.

In one embodiment, the method further includes: collecting video, audio, or text input from the user of the device; and sending that input to the live broadcaster who provides the live person image.

In one embodiment, the method further includes: after superimposing the live person image on the display medium of the device, determining a new presentation position of the live person image on the display medium according to the new position and posture of the device and the spatial position of the live person image.

In one embodiment, the method further includes: after superimposing the live person image on the display medium of the device, keeping the presentation position of the live person image on the display medium unchanged.

In one embodiment, the method further includes: after superimposing the live person image on the display medium of the device, keeping the presentation position of the live person image on the display medium unchanged in response to an instruction from the user of the device.

In one embodiment, determining the position and posture of the device in space includes: scanning, by the device, an optical communication apparatus deployed in the real scene to determine the device's initial position and posture in space, and continuously tracking changes in the device's position and posture.

In one embodiment, the method further includes: the device obtaining identification information of the optical communication apparatus and determining, through that identification information, the live person image to be presented on the device.

In one embodiment, at least two live person images are superimposed on the display medium of the device.

In one embodiment, the live person image is a two-dimensional or a three-dimensional person image.

In one embodiment, the method further includes: before receiving the live person image, instructing the live broadcaster associated with the live person image to provide it.

Another aspect of the present invention relates to a storage medium storing a computer program which, when executed by a processor, can be used to implement the above method.

A further aspect of the present invention relates to an electronic device including a processor and a memory, the memory storing a computer program which, when executed by the processor, can be used to implement the above method.

Through the solution of the present invention, a live interaction method bound to positions in the real scene is realized, so that device users can experience a contact-free in-scene service similar to being served by a person on site, without close face-to-face verbal communication between service personnel and users. This greatly reduces the risk of cross-infection during an epidemic and helps related industries resume work and production smoothly. Moreover, the same service person can serve users in different locations, breaking geographic limitations, saving labour costs, and improving service efficiency.
Brief Description of the Drawings

The embodiments of the present invention are further described below with reference to the accompanying drawings, in which:

Fig. 1 shows a method for superimposing a live image of a person onto a real scene observed through a device, according to an embodiment;

Fig. 2 is a schematic diagram of a user watching a live person image in a real scene;

Fig. 3 shows the live broadcaster and the camera equipment that provide the live person image in the real scene of Fig. 2;

Fig. 4 shows a schematic image presented on the display medium of the user's device;

Fig. 5 is an example real image showing the actual effect of the present invention;

Fig. 6 shows an exemplary optical tag;

Fig. 7 shows an exemplary optical tag network.
Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below through specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.

Fig. 1 shows a method for superimposing a live image of a person onto a real scene observed through a device, according to an embodiment. The device may be, for example, one carried or controlled by a user (a mobile phone, a tablet computer, smart glasses, AR/VR glasses, an AR/VR headset, a smart watch, and so on) and has an image acquisition device (for example, a camera) and a display medium (for example, a screen). The method may include the following steps:

Step 1001: Determine the position and posture of the device in space.

Various feasible methods can be used for this. For example: visual markers can be arranged in the space and the device's position and posture determined by analyzing the marker images the device collects; a three-dimensional model or point cloud of the real scene can be built and the device's position and posture determined by analyzing the scene images it collects; a high-precision gyroscope can be used; radio beacons can be arranged in the space and the device's position and posture determined by analyzing the radio signals it receives; satellite positioning signals can determine the device's position while a gyroscope determines its posture; or any combination of the above can be used.
Step 1002: Obtain the spatial position set for the live person image.

For example, service personnel in government affairs halls, bank branches, exhibition halls, scenic spots, shopping malls, supermarkets, airports, stations, and the like (referred to in this document as "live broadcasters") can provide live person images in real time, used to give content explanations to device users, answer their inquiries, and so on. Using live person images, service personnel can explain to users remotely and in real time and answer their questions without close face-to-face contact, and without being confined to a fixed location.

The spatial position of the live person image (that is, where the image is presented in space) can be represented or defined by the spatial position of a single point on the image, the spatial positions of multiple points (for example, points on the image's outline), or the spatial position of the entire image region. For example, if the live person image is rectangular, its spatial position can be defined by the position coordinates of the rectangle's center point, by the coordinates of one of its corners (upper left, lower left, upper right, or lower right), by the coordinates of two opposite corners (for example, upper left and lower right, or lower left and upper right), and so on.

Before obtaining the spatial position set for the live person image, the live person image to be presented on the device can be determined in various ways. In one embodiment, it can be determined from the device's position in space and, optionally, its posture. For example, the device can scan a visual marker installed in an exhibition hall to determine its location (and optionally its posture) in the hall; a query based on that location and posture then determines the live person image currently available for presentation (for example, one introducing a particular exhibit).

In one embodiment, other information may be used instead; for example, the identification information of the visual marker obtained by the device may be used in a query to determine the live person image that can currently be presented.

In one embodiment, several live person images may be available, and the device user selects which one to present. For example, a user currently in a government affairs hall can be notified that live person images covering a variety of services are available and can choose the one of interest according to their needs (for example, according to the business they want to handle).

In one embodiment, the live person images can be filtered based on information about the device or its user (for example, the user's age, gender, or occupation), so that images matching the user's likely preferences are presented.

In one embodiment, after the live person image to be presented has been determined, or before it is received, the device may send a corresponding instruction or message to the live broadcaster who provides it, so that the broadcaster can start broadcasting and send the live person image to the device.

In one embodiment, one live broadcaster may be associated with several live person images; for example, one broadcaster may be responsible for the live person images corresponding to several exhibits in an exhibition hall. In that case, the instruction or message sent to the broadcaster can identify the corresponding live person image (for example, by including its identification information), so that the broadcaster knows which exhibit to provide the image for.

In one embodiment, one live person image may be associated with several live broadcasters, any idle one of whom may provide it. The device user may select a preferred broadcaster, or the broadcaster who responds first to the user's request may provide the image.

In one embodiment, a posture in space set for the live person image can also be obtained, used for example to define the image's orientation in space.
步骤1003:基于设备的位置和姿态以及直播人物影像的空间位置,确定直播人物影像在设备的显示媒介上的呈现位置。Step 1003: Based on the location and posture of the device and the spatial position of the live person image, determine the presentation position of the live person image on the display medium of the device.
在确定了设备在空间中的位置和姿态之后,实际上可以确定设备的图像采集器件的当前视野范围。进一步地,基于直播人物影像的空间位置可以确定该直播人物影像是否位于设备的图像采集器件的当前视野范围内,以及位于该视野范围内的什么位置,从而可以确定直播人物影像在设备的显示媒介上的呈现位置。After determining the position and posture of the device in space, the current field of view of the image acquisition device of the device can actually be determined. Further, based on the spatial position of the live character image, it can be determined whether the live character image is within the current field of view of the image acquisition device of the device, and where it is located within the field of view, so as to determine the display medium of the live character image on the device The presentation position on the.
In one embodiment, where the live person image has a posture in space, the posture in which it is presented on the display medium of the device may further be determined based on the position and posture of the device and the posture of the live person image.
In one embodiment, a given direction of the live person image may be kept facing the device of the user observing it at all times. For example, for a two-dimensional live person image, its front may always face the user's device, so that even a user at a different location, or one who moves, perceives the person in the image as continuously facing him or her while speaking; a sketch of this orientation follows.
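One common way to realize this is a "billboard" constraint: recompute, each frame, the yaw that turns the front of the image plane toward the device while keeping the image upright. A minimal sketch, assuming a y-up coordinate system and a front that initially faces +z:

```python
import numpy as np

def billboard_yaw(image_pos, device_pos):
    """Yaw angle (rotation about the vertical y axis) that turns the
    front of a 2D live person image toward the observing device.

    Keeps the image upright rather than fully free-rotating; the
    axis conventions are assumptions for illustration.
    """
    d = device_pos - image_pos
    return np.arctan2(d[0], d[2])  # angle from +z toward +x
```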
Step 1004: present the real scene collected by the image acquisition device of the device on the display medium of the device.
The device may collect the real scene in real time through its image acquisition device and present the image of the real scene on its display medium.
Step 1005: receive the live person image and superimpose it at the presentation position on the display medium of the device.
In this way, the live person image is effectively superimposed at a suitable location within the real scene observed through the device, providing the device user with a live person image closely integrated with the real scene, for example to give explanations or answer inquiries.
In one embodiment, the live person image received by the device may have a transparent background (for example, an image with an alpha channel) or no background at all. For example, the live person image may be processed after capture, or during transmission, to produce a transparent-background version that is then sent to the device. In one embodiment, the device may receive a live person image with an opaque background and process it to generate a transparent-background or background-free version. To facilitate this, a monochrome backdrop, such as a green screen, may be arranged behind the person during capture. In this way, the live person image superimposed on the real scene shows only the person, without the original background of the capture location. When the user observes the live person image through the display medium of the device, only the person is visible, as if he or she were actually present in the real scene, which yields a better user experience.
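A minimal chroma-key sketch using OpenCV is shown below. It is one conventional way to obtain the transparent-background image described above, not the patent's prescribed method, and the HSV bounds for "green" would need tuning to the actual backdrop and lighting.

```python
import cv2
import numpy as np

def chroma_key_to_bgra(frame_bgr, lower=(35, 60, 60), upper=(85, 255, 255)):
    """Turn a green-screen frame into a BGRA frame with a transparent
    background.

    The HSV bounds are illustrative defaults, not calibrated values.
    """
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    green = cv2.inRange(hsv, np.array(lower), np.array(upper))
    alpha = cv2.bitwise_not(green)       # person opaque, backdrop clear
    alpha = cv2.medianBlur(alpha, 5)     # soften mask edges slightly
    b, g, r = cv2.split(frame_bgr)
    return cv2.merge((b, g, r, alpha))
```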
In one embodiment, to enable better communication between the device user and the broadcaster, the device may capture at least one of the user's image, voice, or text input and send it to the broadcaster, so that the two parties can interact in real time.
Figure 2 is a schematic diagram of a user watching a live person image in a real scene. The scene includes a shelf 202; user 201 holds device 203 and, through its display medium, watches a live person image arranged in, or embedded into, the scene. The deployment position of the live person image in the scene is indicated, for example, by dashed frame 204. The position of the entire dashed frame 204 in space may be defined by the spatial positions of one or more points on it. The dashed frame 204 may have a preset or default posture, for example perpendicular to the ground by default.
Figure 3 shows broadcaster 302, who provides the live person image for the real scene of Figure 2, and camera device 301, which captures images of broadcaster 302 to generate the live person image.
Figure 4 shows a schematic image presented on the display medium of device 203 of user 201. An image of the real scene (including shelf 202) is obtained through the image acquisition device of device 203 and presented on its display medium. Device 203 also receives the live person image provided by camera device 301 of broadcaster 302 and, according to the position and posture of device 203 and the spatial position set for the live person image, superimposes the transparent-background live person image of broadcaster 302 at the corresponding presentation position on the display medium, thereby blending broadcaster 302 seamlessly into the real scene.
Figure 5 is an example real image illustrating the practical effect of the invention. The real scene shown includes a shelf; when the user observes it with a mobile phone, a transparent-background live person image of a presenter can be superimposed on the scene shown on the phone's screen. The user thus feels as if a real presenter were standing in front of the shelf introducing the goods.
In one embodiment, the live person image may include two or more persons, who may interact verbally or physically to give the user a more detailed explanation.
In one embodiment, at least two live person images may be arranged for the real scene and superimposed on the display medium of the device, either simultaneously or in sequence.
In one embodiment, the live person image may be two-dimensional. In another embodiment, it may be three-dimensional; for example, multiple camera devices placed around the person may capture him or her from different angles to provide a three-dimensional person image.
In one embodiment, the size of the live person image may also be set or adjusted, for example so that the person appears life-sized.
In one embodiment, after the live person image is superimposed on the display medium of the device, changes in the position and posture of the device may be tracked, and the new presentation position of the live person image on the display medium may be determined in real time from the device's new position and posture and the spatial position of the live person image. Similarly, the new presentation posture may be determined in real time from the device's new position and posture and the posture set in space for the live person image. This achieves a convincing augmented-reality effect, making the device user feel that the broadcaster is actually present in the real scene.
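Schematically, the per-frame update can look like the loop below, which reuses the `project_to_screen` helper sketched earlier. The `device`, `display`, and `live_image` interfaces are hypothetical stand-ins for platform tracking and rendering APIs, not a defined implementation.

```python
def render_loop(device, display, live_image, anchor_pos_world):
    """Per-frame update: re-derive the overlay position from the
    device's latest tracked pose so the person appears fixed in space.

    All method names on `device`, `display`, and `live_image` are
    assumed interfaces for illustration.
    """
    while display.is_open():
        frame = device.capture_frame()          # real-scene image
        R_wc, t_wc, K = device.current_pose()   # tracked pose + intrinsics
        uv = project_to_screen(anchor_pos_world, R_wc, t_wc, K,
                               display.size())
        display.draw(frame)                     # show the real scene
        if uv is not None:                      # overlay only if in view
            display.overlay_bgra(live_image.latest_frame(), at=uv)
```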
In one embodiment, after the live person image is superimposed on the display medium of the device, it may be given a fixed presentation position and/or presentation posture on the display medium.
In one embodiment, after the live person image is superimposed on the display medium of the device, it may be given a fixed presentation position and/or posture on the display medium upon the device user's instruction. Thus, even if the user moves (for example, leaves the current location), he or she can continue to watch the live person image at the desired presentation position and/or posture through the display medium. For example, after the live person image is superimposed, the user may change the position and/or posture of the device in space until the image has the desired presentation position and/or posture, and then issue an instruction (for example, by tapping a button shown on the display medium) so that the current presentation position and/or posture remains unchanged thereafter, even if the device subsequently changes position or posture in space.
In one embodiment, the position and posture of the device in space may be determined by means of optical communication devices arranged in the space. An optical communication device is also called an optical tag; the two terms are used interchangeably herein. Optical tags convey information through different light-emitting patterns; they offer long recognition distances and loose requirements on visible-light conditions, and the information they convey can vary over time, providing large information capacity and flexible configuration.
An optical tag typically includes a controller and at least one light source; the controller can drive the light source in different driving modes to convey different information. Figure 6 shows an exemplary optical tag 100 comprising three light sources (a first light source 101, a second light source 102, and a third light source 103). Optical tag 100 further includes a controller (not shown in Figure 6), which selects a corresponding driving mode for each light source according to the information to be conveyed. For example, in different driving modes the controller may use different driving signals to control how a light source emits light, so that when optical tag 100 is photographed with an imaging-capable device, the imaged light sources show different appearances (for example, different colors, patterns, or brightness). By analyzing the imaging of the light sources in optical tag 100, the current driving mode of each light source, and hence the information the tag is conveying at that moment, can be resolved. It will be appreciated that the optical tag of Figure 6 is only an example; an optical tag may differ from it in shape and in the number and/or shape of its light sources.
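As a toy model of this decoding step, the sketch below recovers one bit per captured frame from the mean brightness of a light source's image region. The actual coding scheme (colors, patterns, and so on) is not specified here; thresholded brightness is only an illustration.

```python
import numpy as np

def decode_tag_bits(roi_frames, threshold=128.0):
    """Recover one bit per frame from the mean brightness of the
    imaged light-source region.

    `roi_frames` is assumed to be a sequence of grayscale arrays, each
    cropped to the light source's imaging area; the simple brightness
    threshold is an illustrative stand-in for the real demodulation.
    """
    return [1 if float(np.mean(f)) > threshold else 0 for f in roi_frames]
```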
To provide users with services based on optical tags, each optical tag may be assigned identification information (an ID), used by the tag's manufacturer, manager, or user to uniquely identify it. Typically, the controller in the optical tag drives the light source to convey this identification information outward, and a user can image the tag with a device to obtain it and then access a corresponding service based on it, for example visiting a web page associated with the ID, or obtaining other information associated with the ID (such as the position of the corresponding optical tag). The device can acquire an image containing the optical tag through its image acquisition device and identify the information conveyed by the tag by analyzing the imaging of the tag (or of each of its light sources) in that image.
Information related to each optical tag may be stored on a server. In practice, a large number of optical tags may also be organized into an optical tag network. Figure 7 shows an exemplary optical tag network comprising multiple optical tags and at least one server. The server may store each tag's identification information (ID) and other information, such as service information related to the tag and descriptive information or attributes, for example the tag's position, model, physical size, physical shape, and posture or orientation. Optical tags may also have uniform or default physical size and shape information. A device can use the identified ID of a tag to query the server for other information related to that tag. A tag's position information may refer to its actual position in the physical world, which may be indicated by geographic coordinates. The server may be a software program running on a computing apparatus, a single computing apparatus, or a cluster of computing apparatuses. An optical tag may be offline, that is, it need not communicate with the server; of course, online optical tags that can communicate with the server are also feasible.
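A server lookup of this kind could be as simple as the sketch below. The REST-style endpoint and the returned fields are assumptions for illustration; the patent does not define a query protocol.

```python
import requests

def fetch_tag_info(server_url, tag_id):
    """Look up a tag's stored attributes by its decoded ID.

    The `/tags/{id}` endpoint and the JSON field names are hypothetical.
    """
    resp = requests.get(f"{server_url}/tags/{tag_id}", timeout=5)
    resp.raise_for_status()
    return resp.json()  # e.g. {"position": [...], "pose": [...], "size": ...}
```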
In one embodiment, the device can determine its position relative to an optical tag by capturing an image containing the tag and analyzing it (for example, analyzing the imaged size of the tag, its perspective distortion, and so on); this relative position may include the distance and direction of the device with respect to the tag. In one embodiment, the device can likewise determine its posture relative to the tag by capturing and analyzing such an image; for example, when the imaging position or imaging area of the tag is at the center of the device's imaging field of view, the device may be considered to be facing the tag.
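If the tag's physical corner geometry is known (for example, from its default size and shape information), this relative pose can be computed with a standard perspective-n-point solve, as sketched below with OpenCV's `solvePnP`. This is one established technique consistent with the "imaged size and perspective distortion" analysis described above, not necessarily the method used in practice.

```python
import cv2
import numpy as np

def device_pose_from_tag(tag_corners_3d, tag_corners_px, K, dist=None):
    """Estimate the camera's pose relative to the optical tag.

    tag_corners_3d: the tag's corner points in the tag's own frame
    (known physical geometry); tag_corners_px: the matching detected
    pixel coordinates; K: assumed camera intrinsic matrix.
    """
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(tag_corners_3d, dtype=np.float64),
        np.asarray(tag_corners_px, dtype=np.float64),
        K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation: tag frame -> camera frame
    return R, tvec
```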
In some embodiments, the device can identify the identification information conveyed by an optical tag by scanning it, and can use that information to obtain (for example, by query) the position and posture of the tag in a real-scene coordinate system. The real-scene coordinate system may be, for example, a site coordinate system (established for a room, building, campus, and so on) or the world coordinate system. Then, from the tag's position and posture in the real-scene coordinate system and the device's position or posture relative to the tag, the device's position or posture in the real-scene coordinate system can be determined. Accordingly, the determined position or posture of the device in space may be expressed relative to the optical tag, or in the real-scene coordinate system.
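Chaining the two poses amounts to composing rigid transforms; a minimal sketch, with rotations as 3x3 matrices and translations as 3-vectors, under the convention that each pose expresses a frame's axes and origin in the parent frame:

```python
import numpy as np

def device_pose_in_scene(R_tag_scene, t_tag_scene, R_cam_tag, t_cam_tag):
    """Compose 'tag pose in the scene frame' with 'camera pose relative
    to the tag' to obtain the camera pose in the scene coordinate system.
    """
    R = R_tag_scene @ R_cam_tag                    # orientation in scene
    t = R_tag_scene @ t_cam_tag + t_tag_scene      # position in scene
    return R, t
```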
In one embodiment, the device can identify the identification information conveyed by an optical tag by scanning it, and use that information to determine scene information for the real scene where the tag is located; the scene information may be, for example, a three-dimensional model of the scene, point-cloud information of the scene, information about auxiliary markers around the tag, and so on. Thereafter, based on the determined scene information and images of the real scene collected by the device, the position and/or posture of the device in the real scene can be determined through visual positioning.
After the device's position and/or posture in space has been determined by scanning an optical tag, the device may translate and/or rotate. In that case, various built-in sensors (for example, an accelerometer, magnetometer, orientation sensor, gravity sensor, gyroscope, or camera) may be used, through methods known in the art (for example, inertial navigation, visual odometry, SLAM, VSLAM, or SfM), to measure or track the changes in position and/or posture and thereby determine the device's real-time position and/or posture. In one embodiment, the device may rescan the optical tag whenever it is within the camera's field of view, to correct or re-determine its position or posture information.
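The overall tracking strategy, dead reckoning between tag sightings with an absolute correction whenever the tag is rescanned, can be summarized as below. `imu` and `rescan` are hypothetical interfaces standing in for the sensor-fusion and tag-detection machinery, not real library APIs.

```python
def track_pose(pose, imu, rescan):
    """One tracking step: integrate relative motion, then snap back to
    an absolute fix if the optical tag was rescanned this frame.

    `pose.compose`, `imu.relative_motion_since_last_frame`, and
    `rescan.try_absolute_fix` are assumed interfaces for illustration.
    """
    delta = imu.relative_motion_since_last_frame()  # dead reckoning step
    pose = pose.compose(delta)
    fix = rescan.try_absolute_fix()                 # None if tag not seen
    return fix if fix is not None else pose
```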
In one embodiment, the device may obtain the identification information of an optical tag and then use it to query for and determine the live person image to be presented, as well as the spatial position set for that image. For example, the device may scan an optical tag installed on a supermarket shelf and identify its ID; from that ID it can be determined that the live person image currently to be presented is the one introducing the goods on that shelf, and the spatial position of that live person image can be obtained.
Some embodiments of this application describe a service person as the broadcaster, but it will be appreciated that the application is not limited thereto; the broadcaster may be anyone who wishes to provide a live person image to others, for example a speaker, a presenter, a video-conference participant, a teacher, or a broadcaster using any of various live-streaming apps.
The live person image may also be synthesized or generated by a computer. For example, in one embodiment, a two-dimensional image or a three-dimensional model of person A may be stored in advance, and person A's live image may then be synthesized or generated from person A's real-time motion features, voice features, and so on, together with the stored image or model. In this way, only person A's real-time motion or voice features need to be transmitted, rather than real-time video of person A, which reduces the system's bandwidth requirements and improves efficiency. In one embodiment, a two-dimensional image or three-dimensional model of person A may be stored in advance, and person A's live image may be synthesized or generated from the real-time motion features, voice features, and so on of a different person B, together with the stored image or model of person A. In this way, the person shown in the live person image (for example, person A) can differ from the actual broadcaster (for example, person B). Moreover, the person in the live person image need not be a real person at all, but may be a virtual character such as an animated figure.
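The bandwidth saving comes from sending compact features rather than video frames; schematically, the sender side might look like the sketch below, where the feature extractors and the message format are illustrative assumptions rather than a defined protocol.

```python
def stream_features(capture, extractor, channel):
    """Send per-frame motion and voice features instead of raw video,
    so the receiver can re-animate a stored image or model of the person.

    `capture` yields (frame, audio) pairs; `extractor.body_keypoints`,
    `extractor.voice_features`, and `channel.send` are hypothetical
    interfaces named for illustration only.
    """
    for frame, audio in capture:
        pose_kps = extractor.body_keypoints(frame)   # e.g. 2D joint positions
        voice = extractor.voice_features(audio)      # e.g. spectral frames
        channel.send({"pose": pose_kps, "voice": voice})
```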
In one embodiment of the present invention, the invention may be implemented in the form of a computer program. The computer program may be stored in various storage media (for example, a hard disk, an optical disc, or a flash memory) and, when executed by a processor, can be used to implement the method of the present invention.
In another embodiment of the present invention, the invention may be implemented in the form of an electronic device. The electronic device includes a processor and a memory storing a computer program which, when executed by the processor, can be used to implement the method of the present invention.
References herein to "various embodiments", "some embodiments", "one embodiment", or "an embodiment" mean that a particular feature, structure, or property described in connection with the embodiment is included in at least one embodiment. Thus the appearance of the phrases "in various embodiments", "in some embodiments", "in one embodiment", or "in an embodiment" throughout this document does not necessarily refer to the same embodiment. Furthermore, particular features, structures, or properties may be combined in any suitable manner in one or more embodiments; a feature, structure, or property shown or described in connection with one embodiment may be combined, in whole or in part, with those of one or more other embodiments without limitation, as long as the combination is not illogical or inoperative. Expressions such as "according to A", "based on A", "through A", or "using A" are non-exclusive: "according to A" may cover "according to A only" as well as "according to A and B", unless it is specifically stated to mean "according to A only". For clarity, some illustrative operation steps are described in a certain order, but it will be understood by those skilled in the art that none of these steps is indispensable; some may be omitted or replaced by others, and they need not be performed in the order shown; some may be performed in a different order, or in parallel, as actual needs dictate, as long as the new arrangement is not illogical or inoperative.
Having thus described several aspects of at least one embodiment of the present invention, it will be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be within the spirit and scope of the invention. Although the invention has been described through preferred embodiments, it is not limited to the embodiments described here and also encompasses various changes and variations made without departing from its scope.

Claims (20)

  1. A method for superimposing a live person image onto a real scene, comprising:
    determining the position and posture of a device in space, wherein the device has an image acquisition device and a display medium;
    obtaining a spatial position set for the live person image;
    determining, based on the position and posture of the device and the spatial position of the live person image, a presentation position of the live person image on the display medium of the device;
    presenting, on the display medium of the device, the real scene collected by the image acquisition device of the device; and
    receiving the live person image and superimposing it at the presentation position on the display medium.
  2. The method of claim 1, wherein:
    the live person image received by the device is a live person image with a transparent background or without a background; or
    the device processes the received live person image to generate a live person image with a transparent background or without a background.
  3. The method of claim 1 or 2, further comprising: determining the live person image to be presented for the device.
  4. The method of claim 3, wherein the live person image to be presented for the device is determined from the position of the device in space.
  5. The method of claim 4, wherein the live person image to be presented for the device is determined from the position and posture of the device in space.
  6. The method of claim 1 or 2, further comprising:
    obtaining a posture in space set for the live person image.
  7. The method of claim 6, further comprising:
    determining, based on the position and posture of the device and the posture of the live person image, a presentation posture of the live person image on the display medium of the device.
  8. The method of claim 1 or 2, wherein the front of the live person image is kept facing the device at all times.
  9. The method of claim 1 or 2, further comprising:
    collecting an image, voice, or text input of the user of the device; and
    sending the image, voice, or text input to the broadcaster providing the live person image.
  10. The method of claim 1 or 2, further comprising:
    after superimposing the live person image on the display medium of the device, determining a new presentation position of the live person image on the display medium according to a new position and posture of the device and the spatial position of the live person image.
  11. The method of claim 1 or 2, further comprising:
    after the live person image is superimposed on the display medium of the device, keeping the presentation position of the live person image on the display medium unchanged.
  12. The method of claim 1 or 2, further comprising:
    after the live person image is superimposed on the display medium of the device, keeping the presentation position of the live person image on the display medium unchanged according to an instruction of the user of the device.
  13. The method of claim 1 or 2, wherein determining the position and posture of the device in space comprises:
    scanning, by the device, an optical communication device deployed in the real scene to determine an initial position and posture of the device in space, and continuously tracking changes in the position and posture of the device in space.
  14. The method of claim 13, further comprising:
    obtaining, by the device, identification information of the optical communication device, and determining from the identification information the live person image to be presented for the device.
  15. The method of claim 1 or 2, wherein the live person image comprises a live person image synthesized or generated by a computer.
  16. The method of claim 1 or 2, wherein the live person image is a two-dimensional person image or a three-dimensional person image.
  17. The method of claim 1 or 2, further comprising:
    before receiving the live person image, instructing a broadcaster associated with the live person image to provide it.
  18. A storage medium storing a computer program which, when executed by a processor, can be used to implement the method of any one of claims 1-17.
  19. An electronic device comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, can be used to implement the method of any one of claims 1-17.
  20. A computer program product which, when executed by a processor, can be used to implement the method of any one of claims 1-17.
PCT/CN2021/084372, priority date 2020-04-26, filed 2021-03-31: Method for superimposing live image of person onto real scene, and electronic device (WO2021218547A1)

Applications Claiming Priority (2)

- CN202010336313.X, priority date 2020-04-26
- CN202010336313.XA (CN111242704B), filed 2020-04-26: Method and electronic equipment for superposing live character images in real scene

Publications (1)

- WO2021218547A1 (en)

Family ID: 70871392

Family Applications (1)

- PCT/CN2021/084372, priority date 2020-04-26, filed 2021-03-31: Method for superimposing live image of person onto real scene, and electronic device

Country Status (3)

- CN: CN111242704B (en)
- TW: TWI795762B (en)
- WO: WO2021218547A1 (en)


Also Published As

- CN111242704A, published 2020-06-05
- TWI795762B, published 2023-03-11
- TW202205176A, published 2022-02-01
- CN111242704B, published 2020-12-08


Legal Events

- 121 (EP): The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21796695; country of ref document: EP; kind code of ref document: A1.
- NENP: Non-entry into the national phase. Ref country code: DE.
- 122 (EP): PCT application non-entry in European phase. Ref document number: 21796695; country of ref document: EP; kind code of ref document: A1.