WO2022169413A1 - Image processing method and apparatus, electronic device, and program product - Google Patents

Image processing method and apparatus, electronic device, and program product

Info

Publication number
WO2022169413A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
display
target
target object
real
Prior art date
2021-02-03
Application number
PCT/SG2022/050033
Other languages
English (en)
French (fr)
Inventor
王全
Original Assignee
脸萌有限公司
Priority date
2021-02-03
Application filed by 脸萌有限公司
Publication of WO2022169413A1


Classifications

    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G06V 20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06T 13/00 Animation
    • G06T 2219/004 Annotating, labelling (indexing scheme for manipulating 3D models or images for computer graphics)

Definitions

  • Embodiments of the present disclosure relate to the field of computers, and in particular, to an image processing method, apparatus, electronic device, and program product.
  • BACKGROUND: Augmented Reality (AR) is a technology that seamlessly blends virtual information with the real world. Presenting information by means of augmented reality has become a viable mode of information presentation.
  • In a first aspect, an embodiment of the present disclosure provides an image processing method, including: obtaining a captured image of a real scene; determining a target object image in the captured image, where the target object image is an image including a target object; acquiring a display object associated with the target object; determining, according to the target object image, a target display position of the display object in the captured image; and displaying the display object at the target display position in the captured image.
  • In a second aspect, an embodiment of the present disclosure provides an image processing apparatus, including: a photographing and display module for obtaining a captured image of a real scene; and an identification processing module for determining a target object image in the captured image, where the target object image includes a target object. The identification processing module is also used to acquire a display object associated with the target object and to determine, according to the target object image, a target display position of the display object in the captured image; the photographing and display module is further used to display the display object at the target display position in the captured image.
  • In a third aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory; the memory stores computer-executable instructions, and the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the image processing method described in the first aspect and its various possible designs.
  • In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the image processing method described in the first aspect and its various possible designs.
  • In a fifth aspect, an embodiment of the present disclosure provides a computer program product, including computer instructions which, when executed by a processor, implement the image processing method described in the first aspect and its various possible designs.
  • In a sixth aspect, an embodiment of the present disclosure provides a computer program which, when executed by a processor, implements the image processing method described in the first aspect and its various possible designs.
  • With the image processing method, apparatus, electronic device, and program product provided by the embodiments of the present disclosure, a captured image of a real scene is obtained; a target object image is determined in the captured image, where the target object image is an image including the target object; a display object associated with the target object is acquired; the target display position of the display object in the captured image is determined according to the target object image; and the display object is displayed at the target display position in the captured image. The technical solution provided by these embodiments allows the display object associated with the target object to be displayed directly using augmented reality display technology; because there is no need to rebuild a virtual model, computing resources are saved, display efficiency is improved, and users obtain a better interactive and visual experience.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based;
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 3 is a schematic diagram of a first interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 4 is a schematic diagram of a second interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 5 is a schematic diagram of a third interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 6 is a schematic diagram of a fourth interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 7 is a schematic diagram of a fifth interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 8 is a schematic diagram of a sixth interface of an image processing method provided by an embodiment of the present disclosure;
  • FIG. 9 is a structural block diagram of an image processing apparatus provided by an embodiment of the present disclosure;
  • FIG. 10 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present disclosure.
  • In the prior art, when performing augmented reality display, the terminal first photographs the real scene to obtain the current captured image. Then, using augmented reality technology, a display object including a virtual object is superimposed on the captured image, and the superimposed picture is presented to the user. When superimposing these virtual objects, it is first necessary to construct the associated virtual objects based on the objects in the real scene, and then, based on the constructed virtual objects, display the display objects on the captured image of the real scene.
  • In the prior art, the construction of a virtual object can generally be implemented by simultaneous localization and mapping (SLAM) technology.
  • Specifically, the terminal first scans the environment in real time to construct a virtual object based on the environment (such as a three-dimensional virtual model), then loads the display object to be displayed into the virtual object, and finally superimposes the virtual object loaded with the display object on the real scene to complete the display.
  • However, augmented reality display technology based on SLAM needs to regenerate and reconstruct the virtual objects, including the 3D virtual models, every time a display is performed, and the generation and construction processes require substantial computing resources and time. As a result, SLAM-based augmented reality display requires a long time to display a display object, and its display efficiency is low.
  • In view of this problem, the inventor found through research that a lower-cost approach can be adopted when presenting a display object associated with an object.
  • According to the methods of the embodiments of the present disclosure, image-level image recognition technology can be used directly to display the display object in the real-scene image. Specifically: a captured image of the real scene is obtained; a target object image is determined in the captured image, where the target object image is an image including the target object; a display object associated with the target object is acquired; the target display position of the display object in the captured image is determined according to the target object image; and the display object is displayed at the target display position in the captured image.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based.
  • the network architecture shown in FIG. 1 may specifically include a terminal 1 and a server 2 .
  • The terminal 1 may be a user's mobile phone, a smart home device, a tablet computer, a wearable device, or other hardware that can capture a real scene and present the captured scene. An image processing apparatus may be integrated or installed in the terminal 1; this apparatus is the hardware or software for executing the image processing method of the present disclosure. The image processing apparatus can provide the terminal 1 with an augmented reality display page, and the terminal 1 uses its screen or display components to present that page to the user.
  • the server 2 may specifically be a server or server cluster set in the cloud, and the server or server cluster may store image data related to the image processing method provided by the present disclosure, display object data, and the like.
  • Specifically, when executing the image processing method provided by the present disclosure, the image processing apparatus may also use the network components of the terminal 1 to interact with the server 2, acquire the image data and display object data stored in the server 2, and perform corresponding processing and presentation.
  • the architecture shown in FIG. 1 is applicable to the field of information presentation, in other words, it can be used for presentation of display objects associated with objects in real scenes under various scenarios.
  • For example, the image processing method provided by the present disclosure can be applied to scenes based on augmented reality display, such as combining landmark buildings with augmented reality display technology to present "New Year (Spring Festival) blessings" or "cheering for a place": the "landmark building" in the real scene is first recognized, and the display objects associated with it, including the "New Year (Spring Festival) blessing" or "cheering for a place", are then displayed in the real scene containing the landmark building. In such scenes, the display of these display objects can be realized by the image processing method provided by the present disclosure.
  • For another example, in some "treasure hunt" games based on augmented reality display technology, the image processing method provided by the present disclosure can display the "treasure chest" display object, associated with a clue item in the real scene, within the real scene; or, in some "cultivation" games based on augmented reality display technology, it can display the "farmland" display object, associated with a farmland marker in the real scene, within the real scene.
  • the image processing method provided by the present disclosure can also be applied to an advertising scenario based on augmented reality display.
  • For example, for some commodities or products, the image processing method provided by the present disclosure can realize the presentation of display objects such as associated product reviews or product introductions, thereby providing users with more information about the product and improving the user experience.
  • In addition, in some daily-life scenarios that use the shooting function, the image processing method provided by the present disclosure can also be used to present display objects, thereby showing the user more information about the scene and enriching the user's interactive experience.
  • For a further example, in scenarios such as "scan-to-pay" or "taking pictures" that require turning on the terminal camera for real-scene shooting, the image processing method provided by the present disclosure can be executed alongside, so as to complete the presentation of information in these daily-life scenarios.
  • FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure.
  • Referring to FIG. 2, the image processing method provided by the embodiment of the present disclosure includes: Step 101, obtaining a captured image of a real scene; Step 102, determining a target object image in the captured image, where the target object image is an image including a target object; Step 103, acquiring a display object associated with the target object; Step 104, determining, according to the target object image, the target display position of the display object in the captured image; and Step 105, displaying the display object at the target display position in the captured image.
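As a reading aid, the following is a minimal, self-contained sketch of how steps 101 to 105 could be wired together; it is not the patented implementation, and every helper here (detect_target, fetch_display_object, locate_display, overlay) is a hypothetical placeholder for the stages elaborated in the embodiments below.

```python
import cv2
import numpy as np

def get_captured_image() -> np.ndarray:
    # Step 101: stand-in for the terminal's camera capture.
    return np.zeros((480, 640, 3), dtype=np.uint8)

def detect_target(frame):
    # Step 102: placeholder detection; a real system would match the frame
    # against a reference image (see the feature-matching sketches below).
    return {"bbox": (100, 100, 200, 80)}  # hypothetical result layout

def fetch_display_object(target) -> str:
    # Step 103: in practice the server returns the associated display object.
    return "Click here to send New Year's greetings"

def locate_display(frame, target):
    # Step 104: place the display object just above the detected box.
    x, y, _, _ = target["bbox"]
    return (x, max(20, y - 10))

def overlay(frame, text, pos):
    # Step 105: superimpose the display object on the captured image.
    cv2.putText(frame, text, pos, cv2.FONT_HERSHEY_SIMPLEX, 0.6,
                (255, 255, 255), 1)
    return frame

frame = get_captured_image()
target = detect_target(frame)
if target is not None:
    frame = overlay(frame, fetch_display_object(target),
                    locate_display(frame, target))
```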
  • It should be noted that the execution body of the processing method provided in this embodiment is the aforementioned image processing apparatus; in some embodiments of the present disclosure, this specifically refers to a client or presentation component that can be installed or integrated on a terminal. A user can operate the image processing apparatus through the terminal, so that the image processing apparatus can respond to the operations triggered by the user.
  • First, the terminal obtains a captured image of the real scene, which may be an image obtained by the terminal calling its own shooting component to photograph the current environment, or a real-time image of the real scene obtained by the image processing apparatus through other means. The terminal then performs target object recognition in the captured image to determine whether it contains a target object usable for executing the image processing method provided by the present disclosure, that is, whether a target object image exists in the captured image.
  • FIG. 3 is a schematic diagram of a first interface of an image processing method provided by the disclosed embodiments.
  • In a practical scene, the target object mentioned in the present disclosure can be any object that appears in the real scene, including but not limited to the "landmark buildings", "commodities", and "event QR codes" of the scenarios above. As shown in FIG. 3, the target object here may be a "drink bottle"; by performing recognition processing on the captured image 300, it is determined that the captured image 300 includes the target object image 301 of the "drink bottle".
  • In an optional implementation, when determining the target object, the terminal may perform feature matching between the features of the current captured image and the features of a reference image of the target object. It should be noted that the reference image is generally an image including the target object that is preset in the image database of the server, and it generally contains a standard view of the target object, such as a front view of a landmark building, a front view of a commodity, or an image containing the complete QR code information.
  • FIG. 4 is a schematic diagram of a second interface of an image processing method provided by the disclosed embodiments.
  • Referring to FIG. 4, in order to determine the target object, it is first necessary to obtain an image including the target object, that is, a reference image 302. Only then can the terminal determine the target object image 301 in the captured image 300 based on the reference image 302.
  • The image database of the server often stores to-be-referenced images of a large number of objects, all of which are pre-stored in the image database. In the implementations provided by the present disclosure, in order to recognize the target object in the captured image, it is first necessary to find, among the many to-be-referenced images stored in the image database, the image corresponding to the current captured image to serve as the reference image. Optionally, this process may be implemented based on feature matching: first, extract the global feature of the captured image; then, match the global feature of the captured image against the global features of at least one to-be-referenced image stored in the image database; finally, determine the to-be-referenced image whose features match those of the captured image as the reference image.
  • the global feature of the captured image of the real scene is extracted, and the global feature is used for matching to determine the reference image, which is beneficial to improve the degree of matching with the captured image of the real scene and improve the user experience.
  • the extraction of the global feature can be implemented by using an existing machine learning model, such as a convolutional neural network and the like.
  • In order to recognize, as far as possible, whether the current captured image contains a target object usable for executing the solution of the present disclosure, a global feature representing the global information or global characteristics of the captured image can be extracted during recognition for subsequent matching.
  • Subsequently, when matching the extracted global feature of the captured image against the global features of the to-be-referenced images stored in the image database, the global features of the pre-stored to-be-referenced images may themselves be extracted and stored in advance in order to improve matching efficiency. After the global feature of the captured image is obtained, it can be compared one by one with the pre-stored global features of the to-be-referenced images; the matching process includes, but is not limited to, feature comparison and image-similarity calculation based on the distance between features.
  • Through this matching process, the to-be-referenced image whose global features are most similar to those of the currently captured image can be found in the image database and used as the reference image for this round of processing.
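The retrieval step just described can be illustrated with a short sketch. A normalized color histogram is used here as a deliberately simple stand-in for the global feature (the embodiments suggest a CNN embedding, but any global descriptor fits the same nearest-neighbor pattern); the 8x8x8 binning and L2 distance are assumptions.

```python
import cv2
import numpy as np

def global_feature(image: np.ndarray) -> np.ndarray:
    # Global descriptor of an 8-bit BGR image: a normalized 8x8x8 color histogram.
    hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8],
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def pick_reference(captured: np.ndarray, candidates: list) -> int:
    # Candidate features would normally be precomputed and stored in the
    # server's image database; here they are computed on the fly.
    query = global_feature(captured)
    dists = [np.linalg.norm(query - global_feature(c)) for c in candidates]
    return int(np.argmin(dists))  # index of the best-matching to-be-referenced image
```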
  • In other optional implementations, the reference image may also be acquired in other, more efficient ways, for example based on the current location information of the terminal, so as to filter the to-be-referenced images in the server's image database and obtain the corresponding reference image.
  • Specifically, in some scenes the target object is strongly associated with geographic location, so the current location information of the terminal can also serve as a filtering condition for acquiring the to-be-referenced images; that is, the terminal can upload its current location information to the server and receive the reference image associated with that location returned by the server.
  • FIG. 5 is a schematic diagram of a third interface of an image processing method provided by an embodiment of the present disclosure. The terminal can upload the current location information "a street garden in Beijing" to the server. Referring to FIG. 5, the server can determine, based on the current location information provided by the terminal, that the location corresponds to the reference image 402 of the "farmland QR code". The server then sends the reference image 402 to the terminal, which processes it to determine the target object image 401 of the "farmland QR code" and displays the "farmland" display object 403 corresponding to the "farmland QR code" on the terminal screen.
  • The current location information may include geographic location information, such as geographic coordinates or longitude and latitude, and may also include directional information, such as facing 45 degrees north-west.
  • In the process in which the server determines the corresponding reference image from several pre-stored to-be-referenced images based on the current location information of the terminal, the combination of the geographic location information and the directional information can be used to infer which target objects may appear in the image currently captured by the terminal, and the corresponding reference image is then sent to the terminal.
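A possible server-side filter for this location-based variant is sketched below; the 500 m radius and the (id, lat, lon) record layout are assumptions, and directional filtering would be layered on top in the same way.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # Great-circle distance between two latitude/longitude points, in meters.
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearby_references(lat, lon, records, radius_m=500.0):
    # records: iterable of (reference_image_id, lat, lon) rows in the database;
    # returns the ids of to-be-referenced images near the terminal's location.
    return [rid for rid, rlat, rlon in records
            if haversine_m(lat, lon, rlat, rlon) <= radius_m]
```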
  • In other optional implementations or application scenarios, the above two approaches (based on feature matching and based on location information) can be applied together when acquiring the reference image: for example, at least one to-be-referenced image associated with the current location information may first be received from the server based on the terminal's current location; then the extracted global feature of the captured image is matched against the global features of this set of to-be-referenced images; finally, the to-be-referenced image whose features match those of the captured image is determined as the reference image. The specific implementation is similar in principle to the foregoing process and is not repeated here.
  • Of course, in other optional implementations, the process of recognizing the target object involved in the present disclosure can be implemented based on an existing recognition model, recognition component, or recognition server capable of recognizing whether an image contains the target object.
  • The terminal can send the captured image to the recognition module, recognition component, or recognition server, which performs the recognition processing of whether the captured image includes a target object and returns the result to the terminal. Through the above process, an image including the target object, that is, the target object image, can be recognized in the captured image. After the target object image is determined, the step of acquiring a display object associated with the target object is performed.
  • the display object refers to a static object or a dynamic object that can be presented on the screen of the terminal, such as text information, image information, virtual special effects, etc. associated with the target object.
  • For example, in the aforementioned scenario of combining landmark buildings with augmented reality display technology to present "New Year (Spring Festival) blessings" or "cheering for a place", the text of the blessing or the cheering image can serve as the display object associated with the landmark building; and in the aforementioned "treasure hunt" game, the virtual special effect of the treasure chest can serve as the display object of a certain clue item in the real scene.
  • In an optional implementation, acquiring the display object may be achieved through interaction with the server; that is, after confirming the target object image, the terminal uploads it to the server and receives the display object returned by the server in order to display it.
  • The display object associated with a target object is generally preset; that is, a mapping relationship between the display object and the target object is established in the server, so that after the server receives an image containing the target object, it can send the corresponding display object to the terminal based on the mapping relationship for the terminal to display.
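A minimal sketch of such a preset mapping on the server side follows; the identifiers and contents are purely illustrative.

```python
# Hypothetical server-side table mapping recognized target objects to their
# associated display objects.
DISPLAY_OBJECTS = {
    "landmark_building_shanghai": "Click here to send New Year's greetings",
    "farmland_qr_code": "farmland_effect.webp",
}

def display_object_for(target_object_id: str):
    # Returns the display object configured for the recognized target object,
    # or None if no mapping has been preset.
    return DISPLAY_OBJECTS.get(target_object_id)
```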
  • In addition, synchronously or asynchronously with the step of acquiring the display object, the terminal also determines, according to the target object image, the target display position of the display object in the captured image, which can specifically be implemented through steps 1041 and 1042. Step 1041: perform model adaptation between the target object image and the reference image. Step 1042: determine, based on a preset display position in the reference image, the target display position of the display object in the captured image.
  • the reference image includes a preset display position.
  • Determining the target display position of the display object in the captured image through model adaptation prevents the display object from appearing abruptly in the captured image and improves its display effect.
  • When displaying a display object, it is necessary not only to acquire the content of the display object but also to determine at which position in the captured image it should be displayed. Based on this, when configuring the display object associated with a target object, a corresponding preset display position can also be configured for the display object; for better image positioning, the preset display position can be carried in the reference image to be acquired and processed by the terminal.
  • FIG. 6 is a schematic diagram of a fourth interface of an image processing method provided by an embodiment of the present disclosure. As mentioned above, the reference image is an image containing all the information of the target object. As shown in FIG. 6, the reference image 602 also includes a preset display position 604, which represents the display position, within the reference image 602, of the display object associated with the target object. By performing model adaptation between the target object image 601 and the reference image 602, the target display position 605 of the display object in the captured image is determined based on the preset display position 604 in the reference image 602.
  • The model adaptation of step 1041 may be implemented through image feature matching. First, the terminal determines an image affine transformation matrix according to the reference image and the target object image. The affine transformation matrix can be obtained in a variety of ways; in an optional implementation, it can be obtained through image feature matching technology: first, extract the local features of the target object image; then, match the local features of the target object image with the local features of the reference image to obtain the mapping relationship between the local features; finally, determine the image affine transformation matrix according to the mapping relationship between the local features.
  • As before, the extraction of local features may be implemented with an existing machine learning model, such as a convolutional neural network. Unlike the global case, when determining the image affine transformation matrix, in order to identify the feature correspondence between the target object image and the reference image as fully as possible, local features representing the local information or local characteristics of the target object image can be extracted for subsequent matching.
  • In an optional implementation, the local features of both the target object image and the reference image can be highly robust SIFT (Scale-Invariant Feature Transform) features, as described in "Distinctive Image Features from Scale-Invariant Keypoints"; the data dimension of a SIFT feature may be 128.
  • In an optional implementation, the matching and mapping relationship between the local features of the target object image and the reference image can be obtained with a nearest-neighbor search based on brute force/linear scan, where the distance between matching features can be the standard L2 Euclidean distance; that is, a match is kept only when the distance of the current closest matching feature point is smaller than the distance of the next matching feature point by a certain ratio.
  • In an optional implementation, after the above mapping relationship is obtained, an affine matrix between the images is established based on it; that is, the RANSAC framework is used to obtain the image affine transformation matrix through random sampling over multiple iterations.
  • It can be seen that the target object image can be obtained by multiplying the reference image by the image affine transformation matrix.
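The local-feature pipeline described in the last few paragraphs (128-dimensional SIFT features, brute-force L2 matching with a ratio test, then a RANSAC-estimated affine matrix) maps directly onto standard OpenCV calls; the sketch below is one such rendering, with the 0.75 ratio threshold as an assumption.

```python
import cv2
import numpy as np

def estimate_affine(reference_bgr: np.ndarray, target_bgr: np.ndarray):
    gray_r = cv2.cvtColor(reference_bgr, cv2.COLOR_BGR2GRAY)
    gray_t = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2GRAY)

    sift = cv2.SIFT_create()                    # 128-dimensional SIFT descriptors
    kp_r, des_r = sift.detectAndCompute(gray_r, None)
    kp_t, des_t = sift.detectAndCompute(gray_t, None)

    matcher = cv2.BFMatcher(cv2.NORM_L2)        # brute-force / linear scan, L2 distance
    good = [m for m, n in matcher.knnMatch(des_r, des_t, k=2)
            if m.distance < 0.75 * n.distance]  # keep a match only if the closest
                                                # neighbor beats the next one by a ratio

    src = np.float32([kp_r[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_t[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # RANSAC: random sampling over multiple iterations, as described above.
    matrix, _inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)
    return matrix  # 2x3 affine matrix mapping reference points into the captured image
```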
  • Correspondingly, in step 1042, the terminal performs affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image.
  • When determining the target display position of the display object in the captured image, a reference-coordinate-point method can be used in order to reduce the amount of computation.
  • FIG. 7 is a schematic diagram of a fifth interface of an image processing method provided by an embodiment of the present disclosure. As shown in FIG. 7, the reference image may include a reference base coordinate point 6041 of the preset display position. First, affine transformation may be applied to the reference base coordinate point 6041 according to the image affine transformation matrix to obtain the transformed target base coordinate point 6051; then, according to the target base coordinate point 6051, the target display position 605 of the display object 603 in the captured image is determined. The reference base coordinate point 6041 of the preset display position 604 in the reference image 602 is preset. Finally, the terminal displays the display object at the target display position in the captured image; the image processing apparatus may superimpose the display object at the target display position in the captured image based on augmented reality display technology. This facilitates the seamless combination of the display object and the real scene, enhances the interactive experience, and enriches the display.
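For step 1042, applying the estimated matrix to the reference base coordinate point, and then superimposing the display object there, can be sketched as follows; `matrix` is the 2x3 affine matrix from the previous sketch, and the centered, unblended paste is a simplifying assumption (real AR rendering would also handle alpha blending).

```python
import cv2
import numpy as np

def target_display_point(matrix: np.ndarray, base_point: tuple) -> tuple:
    # Affine-transform the preset reference base coordinate point into the
    # coordinate frame of the captured image.
    pt = np.float32([[base_point]])           # shape (1, 1, 2), as cv2.transform expects
    x, y = cv2.transform(pt, matrix)[0, 0]
    return int(x), int(y)

def overlay_at(frame: np.ndarray, sticker: np.ndarray, center: tuple) -> np.ndarray:
    # Naive superimposition: paste the display object centered on the target
    # display position, clipped to the frame boundaries.
    h, w = sticker.shape[:2]
    x0, y0 = max(center[0] - w // 2, 0), max(center[1] - h // 2, 0)
    x1, y1 = min(x0 + w, frame.shape[1]), min(y0 + h, frame.shape[0])
    frame[y0:y1, x0:x1] = sticker[: y1 - y0, : x1 - x0]
    return frame
```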
  • In addition, in other optional implementations, in order to provide the user with a richer interactive experience, the terminal may also respond to the user's trigger operation on the display object displayed in the captured image and display the result of that trigger operation in the captured image. Since the display objects differ across scenarios, the trigger results differ according to the actual scene. For example, in a "cultivation" game, after the "farmland" display object is shown in the captured image, the user can trigger a "harvest produce" operation so that a "harvest complete" result is shown on the screen, or trigger a "visit friend's farmland" operation so that the friend's farmland is shown on the screen. These examples are only illustrative; more implementations are possible according to actual application scenarios and business requirements, and the present disclosure does not limit this.
  • To further describe the image processing method provided by the present disclosure, the scenario of combining a landmark building with augmented reality display technology to present "New Year (Spring Festival) blessings" is taken as an example below; in this implementation, the target object may be a landmark building. FIG. 8 is a schematic diagram of a sixth interface of an image processing method provided by an embodiment of the present disclosure. As shown in FIG. 8, a user can trigger the shooting function of the terminal so that the terminal calls the shooting component to obtain a captured image 800, and the terminal can determine, based on the current location information, the target object image 801 of the landmark building in the captured image 800. Through the processing method provided by the above implementations, the captured image 800 then displays the display object 803 associated with the target object, namely "Click here to send New Year's greetings" in the figure.
  • Optionally, referring to FIG. 8, the operation result 806 of a trigger operation on the display object 803 may also be displayed in the captured image 800; optionally, the captured image 800 may further display the corresponding landmark name, such as "Shanghai" shown in FIG. 8.
  • With the image processing method provided by the embodiments of the present disclosure, a captured image of a real scene is obtained; a target object image is determined in the captured image, where the target object image is an image including the target object; a display object associated with the target object is acquired; the target display position of the display object in the captured image is determined according to the target object image; and the display object is displayed at the target display position in the captured image. The technical solution provided by this embodiment allows the display object associated with the target object to be displayed directly using augmented reality display technology; because there is no need to rebuild a virtual model, computing resources are saved, display efficiency is improved, and users obtain a better interactive and visual experience.
  • The image processing apparatus includes a photographing and display module 10 and an identification processing module 20. The photographing and display module 10 is used to obtain a captured image of the real scene; the identification processing module 20 is used to determine a target object image in the captured image, where the target object image includes a target object, and is also used to acquire a display object associated with the target object and to determine, according to the target object image, the target display position of the display object in the captured image. The photographing and display module 10 is further configured to display the display object at the target display position in the captured image.
  • In an optional embodiment, when determining the target object image in the captured image, the identification processing module 20 is specifically configured to: acquire a reference image, where the reference image is an image including the target object; and determine the target object image in the captured image according to the reference image. In an optional embodiment, when acquiring the reference image, the identification processing module 20 is specifically configured to: extract the global feature of the captured image; match it against the global feature of at least one to-be-referenced image stored in an image database; and determine the to-be-referenced image that matches the features of the captured image as the reference image.
  • the identification processing module 20 when performing the obtaining of the reference image, is specifically configured to: upload the current location information to the server; and receive the reference image associated with the current location information returned by the server.
  • In an optional embodiment, the reference image includes a preset display position; when determining, according to the target object image, the target display position of the display object in the captured image, the identification processing module 20 is specifically configured to perform model adaptation between the target object image and the reference image, so as to determine the target display position of the display object in the captured image based on the preset display position in the reference image. In an optional embodiment, when performing this model adaptation, the identification processing module 20 is specifically configured to: determine an image affine transformation matrix according to the reference image and the target object image; and perform affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image.
  • In an optional embodiment, when determining the image affine transformation matrix according to the reference image and the target object image, the identification processing module 20 is specifically configured to: extract the local features of the target object image; match the local features of the target object image with the local features of the reference image to obtain the mapping relationship between the local features; and determine the image affine transformation matrix according to the mapping relationship between the local features.
  • In an optional embodiment, the reference image includes a reference base coordinate point of the preset display position; when performing affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image, the identification processing module 20 is specifically configured to: perform affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain the transformed target base coordinate point; and determine, according to the target base coordinate point, the target display position of the display object in the captured image. The reference base coordinate point of the preset display position in the reference image is preset.
  • In an optional embodiment, after determining the target object image in the captured image, the identification processing module 20 is further configured to upload the target object image to the server and receive the display object returned by the server, so that the photographing and display module 10 can display the display object. In an optional embodiment, when displaying the display object at the target display position in the captured image, the photographing and display module 10 is specifically configured to superimpose the display object at the target display position in the captured image based on augmented reality display technology. In an optional embodiment, the photographing and display module 10 is further configured to respond to the user's trigger operation on the display object displayed in the captured image and to display the result of that trigger operation in the captured image.
  • the target object is a landmark building.
  • the photographing and display module 10 is further configured to display the landmark name of the landmark building in the real-scene photographed image.
  • With the image processing apparatus provided by the embodiment of the present disclosure, a captured image of a real scene is obtained; a target object image is determined in the captured image, where the target object image is an image including the target object; a display object associated with the target object is acquired; the target display position of the display object in the captured image is determined according to the target object image; and the display object is displayed at the target display position in the captured image. The technical solution provided by this embodiment allows the display object associated with the target object to be displayed directly using augmented reality display technology; because there is no need to rebuild a virtual model, computing resources are saved, display efficiency is improved, and users obtain a better interactive and visual experience.
  • FIG. 10 shows a schematic structural diagram of an electronic device 900 suitable for implementing an embodiment of the present disclosure, and the electronic device 900 may be a terminal device or a media library.
  • The terminal device may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (PAD), a portable multimedia player (PMP), in-vehicle terminals (such as in-vehicle navigation terminals), and wearable electronic devices, as well as fixed terminals such as digital TVs, desktop computers, and smart home devices.
  • The electronic device shown in FIG. 10 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure. As shown in FIG. 10, the electronic device 900 may include a processing device 901 (e.g., a central processing unit, a graphics processing unit, etc.) for executing the image processing method, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903.
  • In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are also stored. The processing device 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
  • An Input/Output (I/O for short) interface 905 is also connected to the bus 904 .
  • The following devices can be connected to the I/O interface 905: an input device 906 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; a storage device 908 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 909.
  • the communication means 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 10 shows an electronic device 900 having various means, it should be understood that it is not required to implement or possess all of the illustrated means; more or fewer means may alternatively be implemented or provided.
  • the processes described above with reference to the flowcharts may be implemented as computer software programs.
  • For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the methods shown in the flowcharts according to the embodiments of the present disclosure.
  • the computer program may be downloaded and installed from the network via the communication device 909 , or from the storage device 908 , or from the ROM 902 .
  • When the computer program is executed by the processing device 901, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or any combination of the above.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: electric wire, optical cable, RF (Radio Frequency, radio frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to execute the methods shown in the above embodiments.
  • Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or media library.
  • In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • In this regard, each block in the flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts can be implemented with dedicated hardware-based systems that perform the specified functions or operations , or can be implemented using a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the unit does not constitute a limitation on the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
  • the functions described herein above may be performed, at least in part, by one or more hardware logic components.
  • For example, without limitation, exemplary types of hardware logic components that can be used include: field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems-on-a-chip (SOC), complex programmable logic devices (CPLD), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • According to one or more embodiments of the present disclosure, an image processing method includes: obtaining a captured image of a real scene; determining a target object image in the captured image, where the target object image is an image including a target object; acquiring a display object associated with the target object; determining, according to the target object image, a target display position of the display object in the captured image; and displaying the display object at the target display position in the captured image.
  • In one embodiment of the present disclosure, determining the target object image in the captured image includes: acquiring a reference image, where the reference image is an image including the target object; and determining the target object image in the captured image according to the reference image.
  • In one embodiment of the present disclosure, acquiring the reference image includes: extracting a global feature of the captured image; matching the global feature of the captured image against the global feature of at least one to-be-referenced image stored in an image database; and determining the to-be-referenced image that matches the features of the captured image as the reference image.
  • In one embodiment of the present disclosure, acquiring the reference image includes: uploading the current location information to the server; and receiving the reference image associated with the current location information returned by the server.
  • In one embodiment of the present disclosure, determining, according to the target object image, the target display position of the display object in the captured image includes: performing model adaptation between the target object image and the reference image, so as to determine the target display position based on a preset display position included in the reference image. In one embodiment, this includes: determining an image affine transformation matrix according to the reference image and the target object image; and performing affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image.
  • In one embodiment of the present disclosure, determining the image affine transformation matrix according to the reference image and the target object image includes: extracting local features of the target object image; matching the local features of the target object image with local features of the reference image to obtain the mapping relationship between the local features; and determining the image affine transformation matrix according to the mapping relationship between the local features.
  • In one embodiment of the present disclosure, the reference image includes a reference base coordinate point of the preset display position; performing affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image includes: performing affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain the transformed target base coordinate point; and determining, according to the target base coordinate point, the target display position of the display object in the captured image. The reference base coordinate point of the preset display position in the reference image is preset.
  • In one embodiment of the present disclosure, after the target object image is determined in the captured image, the method further includes: uploading the target object image to a server; and receiving the display object returned by the server, so as to display the display object.
  • In one embodiment of the present disclosure, displaying the display object at the target display position in the captured image includes: superimposing and displaying the display object at the target display position in the captured image based on augmented reality display technology.
  • In one embodiment of the present disclosure, the method further includes: responding to a user's trigger operation on the display object displayed in the captured image; and displaying the result of that trigger operation in the captured image.
  • the target object is a landmark building.
  • According to one or more embodiments of the present disclosure, an image processing apparatus includes: a photographing and display module for obtaining a captured image of a real scene; and an identification processing module for determining a target object image in the captured image, where the target object image includes a target object. The identification processing module is also used to acquire a display object associated with the target object and to determine, according to the target object image, the target display position of the display object in the captured image; the photographing and display module is further configured to display the display object at the target display position in the captured image.
  • In one embodiment of the present disclosure, when determining the target object image in the captured image, the identification processing module is specifically configured to: acquire a reference image, where the reference image is an image including the target object; and determine the target object image in the captured image according to the reference image.
  • In one embodiment of the present disclosure, when acquiring the reference image, the identification processing module is specifically configured to: extract the global feature of the captured image; match it against the global feature of at least one to-be-referenced image stored in an image database; and determine the to-be-referenced image that matches the features of the captured image as the reference image.
  • the identification processing module when performing the obtaining of the reference image, is specifically configured to: upload the current location information to the server; and receive the reference image associated with the current location information returned by the server.
  • the reference image includes a preset display position;
  • When determining, according to the target object image, the target display position of the display object in the captured image, the identification processing module is specifically configured to perform model adaptation between the target object image and the reference image, so as to determine the target display position of the display object in the captured image based on the preset display position in the reference image.
  • In one embodiment of the present disclosure, when performing the model adaptation between the target object image and the reference image to determine, based on the preset display position in the reference image, the target display position of the display object in the captured image, the identification processing module is specifically configured to: determine an image affine transformation matrix according to the reference image and the target object image; and perform affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image.
  • In one embodiment of the present disclosure, when determining the image affine transformation matrix according to the reference image and the target object image, the identification processing module is specifically configured to: extract local features of the target object image; match the local features of the target object image with the local features of the reference image to obtain the mapping relationship between the local features; and determine the image affine transformation matrix according to the mapping relationship between the local features.
  • In one embodiment of the present disclosure, the reference image includes a reference base coordinate point of the preset display position; when performing affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the captured image, the identification processing module is specifically configured to: perform affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain the transformed target base coordinate point; and determine, according to the target base coordinate point, the target display position of the display object in the captured image. The reference base coordinate point of the preset display position in the reference image is preset.
  • In one embodiment of the present disclosure, after determining the target object image in the captured image, the identification processing module is further configured to upload the target object image to the server and receive the display object returned by the server, so that the photographing and display module can display the display object.
  • In one embodiment of the present disclosure, when displaying the display object at the target display position in the captured image, the photographing and display module is specifically configured to superimpose and display the display object at the target display position in the captured image based on augmented reality display technology.
  • In one embodiment of the present disclosure, the photographing and display module is further configured to respond to the user's trigger operation on the display object displayed in the captured image and to display the result of that trigger operation in the captured image.
  • the target object is a landmark building.
  • the photographing and display module is further configured to display the landmark name of the landmark building in the real scene photographed image.
  • According to one or more embodiments of the present disclosure, an electronic device includes: at least one processor and a memory; the memory stores computer-executable instructions, and the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method described in any preceding item.
  • According to one or more embodiments of the present disclosure, a computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method described in any preceding item.
  • According to one or more embodiments of the present disclosure, a computer program product includes computer instructions which, when executed by a processor, implement the method described in any preceding item.
  • According to one or more embodiments of the present disclosure, a computer program, when executed by a processor, implements the method described in any preceding item.

Abstract

With the image processing method, apparatus, electronic device, and program product provided by the embodiments of the present disclosure, a captured image of a real scene is obtained; a target object image is determined in the captured image, where the target object image is an image including a target object; a display object associated with the target object is acquired; the target display position of the display object in the captured image is determined according to the target object image; and the display object is displayed at the target display position in the captured image. The technical solution provided by this embodiment allows the display object associated with the target object to be displayed directly using augmented reality display technology; because there is no need to rebuild a virtual model, computing resources are saved, display efficiency is improved, and users obtain a better interactive and visual experience.

Description

图像处 理 方法 、 装置、 电子设备 及 程序 产品 本申请要求于 2021年 02月 03日提交中国专利局、 申请号为 202110152273.8、 申请名称 为 “图像处理方法、 装置、 电子设备及程序产品 ” 的中国专利申请的优先权, 其全部内容通 过引用结合在本申请中。 技术领域 本公开实施例涉及计算机领域, 尤其涉及一种图像处理方法、 装置、 电子设备及程序产 品。 背景技术 增强现实 (Augmented Reality, 简称 AR)技术是一种将虚拟信息与真实世界巧妙融合的技 术。 利用增强现实进行信息呈现成为一种可能的信息呈现方式, 在现有技术中, 通过增强现 实技术显示与实景中某物体相关联的虚拟对象是常见的现有技术。 在对这些虚拟对象进行显 示时, 往往需要对这些虚拟对象进行重新构建。 显然的, 这样的方式使得展示虚拟对象需要大量的运算资源, 显示效率不高, 普适性较 差。 发明内容 针对上述问题, 本公开实施例提供了一种图像处理方法、 装置、 电子设备及程序产品。 第一方面, 本公开实施例提供一种图像处理方法, 包括: 获得实景拍摄图像; 在所述实景拍摄图像中确定目标物体图像, 所述目标物体图像为包括目标物体的图像; 获取与所述目标物体相关联的显示对象; 根据所述目标物体图像, 确定所述显示对象在所述实景拍摄图像中的目标显示位置; 以 及 将所述显示对象显示在所述实景拍摄图像中的所述目标显示位置。 第二方面, 本公开实施例提供一种图像处理装置, 包括: 拍摄显示模块, 用于获得实景拍摄图像; 识别处理模块, 用于在所述实景拍摄图像中确定目标物体图像, 所述目标物体图像包括 目标物体; 还用于获取与所述目标物体相关联的显示对象; 根据所述目标物体图像, 确定所 述显示对象在所述实景拍摄图像中的目标显示位置; 所述拍摄显示模块还用于将所述显示对象显示在所述实景拍摄图像中的所述目标显示位 置。 第三方面, 本公开实施例提供一种电子设备, 包括: 至少一个处理器和存储器; 所述存储器存储计算机执行指令; 所述至少一个处理器执行所述存储器存储的计算机执行指令, 使得所述至少一个处理器 执行如上第一方面以及第一方面各种可能的涉及所述的图像处理方法。 第四方面, 本公开实施例提供一种计算机可读存储介质, 所述计算机可读存储介质中存 储有计算机执行指令, 当处理器执行所述计算机执行指令时, 实现如上第一方面以及第一方 面各种可能的设计所述的图像处理方法。 第五方面, 本公开实施例提供一种计算机程序产品, 包括计算机指令, 该计算机指令被 处理器执行时, 实现如上第一方面以及第一方面各种可能的设计所述的图像处理方法。 第六方面, 本公开实施例提供一种计算机程序, 该计算机指令被处理器执行时, 实现如 上第一方面以及第一方面各种可能的设计所述的图像处理方法。 本公开实施例提供的图像处理方法、 装置、 电子设备及程序产品, 通过获得实景拍摄图 像 ; 在所述实景拍摄图像中确定目标物体图像, 所述目标物体图像为包括目标物体的图像; 获取与所述目标物体相关联的显示对象; 根据所述目标物体图像, 确定所述显示对象在所述 实景拍摄图像中的目标显示位置; 以及将所述显示对象显示在所述实景拍摄图像中的所述目 标显示位置; 本实施例提供的技术方案, 使得在利用增强现实显示技术对与目标物体相关联 的显示对象直接进行显示, 由于不必再重新构建虚拟模型, 从而节约了运算资源, 提高了显 示效率, 也使用户能够得到更好的交互体验和视觉体验。 附图说明 为了更清楚地说明本公开实施例或现有技术中的技术方案, 下面将对实施例或现有技术 描述中所需要使用的附图作一简单地介绍, 显而易见地, 下面描述中的附图是本公开的一些 实施例, 对于本领域普通技术人员来讲, 在不付出创造性劳动性的前提下, 还可以根据这些 附图获得其他的附图。 图 1为本公开所基于的一种网络架构的示意图; 图 2为本公开实施例提供的一种图像处理方法的流程示意图; 图 3为公开实施例提供的一种图像处理方法的第一界面示意图; 图 4为公开实施例提供的一种图像处理方法的第二界面示意图; 图 5为公开实施例提供的一种图像处理方法的第三界面示意图; 图 6为公开实施例提供的一种图像处理方法的第四界面示意图; 图 7为公开实施例提供的一种图像处理方法的第五界面示意图; 图 8为公开实施例提供的一种图像处理方法的第六界面示意图; 图 9为本公开实施例提供的图像处理装置的结构框图; 图 10为本公开实施例提供的电子设备的硬件结构示意图。 具体实施方式 为使本 公开实施例的目的、 技术方案和优点更加清楚, 下面将结合本公开实施例中 的附图, 对本公开实施例中的技术方案进行清楚、 完整地描述, 显然, 所描述的实施例 是本公开一部分实施例, 而不是全部的实施例。 基于本公开中的实施例, 本领域普通技 术人员在没有作出创造性劳动前提下所获得 的所有其他实施例, 都属于本公开保护的范 围。 增强现实 (Augmented Reality, 简称 AR)技术是一种将虚拟信息与真实世界巧妙融合 的技术。 利 用增强现实进行信息呈现成为一种可能的信息呈现方式, 在现有技术中, 在进行 增强现实的显示时, 终端将会先对现实场景的实景进行拍摄, 以获得当前的实景拍摄图 像。 然后, 利用增强现实技术, 将包括虚拟对象的显示对象叠加在实景拍摄图像上, 并 将叠加后画面呈现给用户。 在 对这些虚拟对象进行叠加显示时, 首先需要基于实景中的物体构建相关联的虚拟 对象, 然后基于构建得到的虚拟对象, 将显示对象显示在实景拍摄图像上。 在 现有技术中, 对于虚拟对象的构建一般可通过同步定位与地图构建 (Simultaneous Localization and Mapping, 简称 SLAM)技术实现。 具体来说, 首先通过终端实时扫描环 境, 以构建基于环境的虚拟对象 (如三维虚拟模型), 然后将需要显示的显示对象加载在 虚拟对象中, 最后, 将加载有显示对象的虚拟对象叠加显示在现实场景中, 完成显示。 但 是, 基于 SLAM技术的增强现实显示技术, 在每一次显示时均需要重新对包括三 维虚拟模型在内的虚拟对象进行生成和构建 , 而生成过程和构建过程需要大量的运算资 源和时间成本, 这也就使得基于 SLAM技术的增强现实显示技术在对显示对象进行显示 时所需要的显示时间较长, 显示效率较低。 针对这样 的问题, 发明人通过研究后, 创造性地发现, 在呈现与物体相关联的显示 对象时, 可采用技术成本较低的方式。 根据本公开实施例的方法, 可直接利用图像层级 的图像识别技术以实现在实景图像中对显示对象的显示。 具体的, 通过获得实景拍摄图像; 在所述实景拍摄图像中确定目标物体图像, 所述 目标物体图像为包括 目标物体的图像; 获取与所述目标物体相关联的显示对象; 根据所 述目标物体图像, 确定所述显示对象在所述实景拍摄图像中的目标显示位置; 以及将所 述显示对象显示在所述实景拍摄图像中的所述目标显示位置。 本 实施例提供的技术方案, 使得在利用增强现实显示技术对与目标物体相关联的显 示对象直接进行显示, 由于不必再重新构建虚拟模型, 从而节约了运算资源, 提高了显 示效率, 也使用户能够得到更好的交互体验和视觉体验。 参考 图 b 图 1为本公开所基于的一种网络架构的示意图, 该图 1所示网络架构具体 可包括终端 1以及服务器 2。 其中, 终端 1 具体可为用户手机、 智能家居设备、 平板电脑、 可穿戴设备等可用于 拍摄实景并且展现拍摄的实景的硬件设备, 其终端 1 内可集成或安装有图像处理装置, 该图像处理装置为用于执行本公开图像处理方法硬件或软件,该图像处理装置可为终端 1 提供增强现实显示的展示页面, 并且, 终端 1 利用其屏幕或显示组件向用户显示图像处 理置所提供的增强现实显示的展示页面。 服 务器 2 可具体为设置在云端的服务器或者服务器集群, 其服务器或服务器集群中 可存储有与本公开提供的图像处理方法相关的图像数据、 以及显示对象数据等。 具体 的, 在执行本公开提供的图像处理方法时, 图像处理装置还可利用终端 1 的网 络组件与服务器 2进行交互, 获取服务器 2中存储的图像数据和显示对象数据, 并进行 相应的处理和展示。 图 1 所示架构可适用于信息呈现领域, 换句话说, 其可用于在各类场景下的实景中 与物体相关联的显示对象的呈现。 举例来说 , 
For example, the image processing method provided by the present disclosure can be applied to scenarios based on augmented reality display. In scenarios that combine a landmark building with augmented reality display technology to present "New Year (Spring Festival) blessings" or "cheering for some place", the "landmark building" in the real scene may first be recognized, and display objects associated with the "landmark building", including the "New Year (Spring Festival) blessings" or the "cheering for some place" content, are then displayed in the real scene containing the "landmark building". In this scenario, the display of those display objects can be realized by the image processing method provided by the present disclosure.

As another example, in some "treasure hunt" games based on augmented reality display technology, the image processing method provided by the present disclosure can display, in the real scene, the "treasure chest" display object that the game associates with a clue item in the real scene; or, in some "nurturing" games based on augmented reality display technology, the method can display, in the real scene, the "farmland" display object that the game associates with a farmland marker in the real scene.

As a further example, the image processing method provided by the present disclosure may also be applied to advertising scenarios based on augmented reality display. For some goods or products, the method can present associated display objects such as product reviews or product introductions, thereby providing the user with more information about the product and improving the user experience. In addition, in everyday scenarios that use the capture function, the image processing method provided by the present disclosure can also be used to present display objects, showing the user more information about the scene and enriching the interactive experience. For instance, in scenarios such as "scan-to-pay" or "taking pictures" that require turning on the terminal camera for real-scene capture, the image processing method provided by the present disclosure can be executed alongside, so as to complete the presentation of information in these daily scenarios.

The image processing method provided by the present disclosure is further described below.

In a first aspect, FIG. 2 is a schematic flowchart of an image processing method provided by an embodiment of the present disclosure. Referring to FIG. 2, the image processing method provided by the embodiment of the present disclosure includes:

Step 101: obtain a real-scene captured image;
Step 102: determine a target object image in the real-scene captured image, the target object image being an image including a target object;
Step 103: acquire a display object associated with the target object;
Step 104: determine, according to the target object image, a target display position of the display object in the real-scene captured image;
Step 105: display the display object at the target display position in the real-scene captured image.

It should be noted that the execution subject of the processing method provided by this embodiment is the aforementioned image processing apparatus which, in some embodiments of the present disclosure, specifically refers to a client or presentation end that can be installed or integrated on a terminal. The user can operate the image processing apparatus through the terminal, so that the image processing apparatus responds to operations triggered by the user.

First, the terminal obtains a real-scene captured image. This may be an image obtained by the terminal invoking its own capture component to photograph the current environment, or a real-time image of the real scene obtained by the image processing apparatus through other channels. The terminal then performs target object recognition in the real-scene captured image to determine whether the image contains a target object usable for executing the image processing method provided by the present disclosure, that is, whether an image of a target object exists in the real-scene captured image.

FIG. 3 is a schematic diagram of a first interface of an image processing method provided by an embodiment of the disclosure. In practical scenarios, the target object mentioned in the present disclosure may be any object appearing in the real scene, including but not limited to the "landmark building", "goods", and "activity QR code" of the foregoing scenarios. As shown in FIG. 3, the target object here may be a "beverage bottle": recognition processing is performed on the real-scene captured image 300 to determine that the real-scene captured image 300 contains a target object image 301 of the target object "beverage bottle".

In an optional implementation, the terminal may determine the target object by matching features of the current real-scene captured image against features of a reference image of the target object. It should be noted that the reference image is generally an image including the target object that is preset in the image database of the server; it generally includes a standard view of the target object, such as a front view of a landmark building, a front view of goods, or an image containing the complete QR code information, and so on.

FIG. 4 is a schematic diagram of a second interface of an image processing method provided by an embodiment of the disclosure. Referring to FIG. 4, in order to determine the target object, an image including the target object, namely a reference image 302, must first be acquired; only then can the terminal determine the target object image 301 in the real-scene captured image 300 based on the reference image 302. The server's image database typically stores to-be-referenced images of a large number of objects, all of which are stored in the image database in advance. In the implementation provided by the present disclosure, in order to recognize the target object in the real-scene captured image, the image corresponding to the current real-scene captured image must first be found among the many to-be-referenced images stored in the image database, to serve as the reference image.

Optionally, this process can be realized through feature matching: first, extract a global feature of the real-scene captured image; then, match the global feature of the real-scene captured image against the global features of at least one to-be-referenced image stored in the image database; finally, determine the to-be-referenced image whose features match those of the real-scene captured image as the reference image. By extracting the global feature of the real-scene captured image and using it for matching to determine the reference image, the embodiments of the present disclosure improve the degree of matching with the real-scene captured image and improve the user experience.

Specifically, global feature extraction may be implemented with existing machine learning models, such as convolutional neural networks. To recognize, as far as possible, whether the current real-scene captured image contains a target object usable for the solution of the present disclosure, a global feature representing the global information or global characteristics of the image may be extracted from the real-scene captured image for the subsequent matching. Subsequently, in the process of matching the extracted global feature of the real-scene captured image against the global features of at least one to-be-referenced image stored in the image database, the global features of the pre-stored to-be-referenced images may themselves be extracted and stored in advance to improve matching efficiency. After the global feature of the real-scene captured image is obtained, it may be compared one by one against the pre-stored global features of the to-be-referenced images; the matching processing includes, but is not limited to, feature comparison and image similarity computation based on distances between features. Through this matching processing, the to-be-referenced image whose global feature is most similar to that of the currently captured real-scene image can be found in the image database, to serve as the reference image for this round of processing.
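As a concrete illustration of the global-feature matching just described, the following sketch assumes the global descriptors of the to-be-referenced images have already been extracted (for example, by a convolutional neural network) and stored. Cosine similarity stands in for the "image similarity computation based on distances between features"; the descriptor dimension, threshold, and identifiers are illustrative assumptions, not values mandated by the disclosure.

```python
import numpy as np

def best_reference(query_feat, db_feats, db_ids, min_sim=0.6):
    """Pick the stored image whose global feature best matches the query.

    db_feats holds one precomputed global descriptor per to-be-referenced
    image. Returns the matching image id, or None if nothing is close
    enough to serve as the reference image.
    """
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q                       # one cosine similarity per candidate
    best = int(np.argmax(sims))
    return db_ids[best] if sims[best] >= min_sim else None

rng = np.random.default_rng(0)
db_feats = rng.normal(size=(3, 512)).astype(np.float32)  # 3 stored images
db_ids = ["landmark-a", "bottle-b", "qr-c"]
# A query descriptor close to the second stored image.
query = db_feats[1] + 0.05 * rng.normal(size=512).astype(np.float32)
print(best_reference(query, db_feats, db_ids))           # prints: bottle-b
```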
In other optional implementations, the reference image may be acquired in more efficient ways, for example by filtering the to-be-referenced images in the server's image database based on the terminal's current position information to obtain the corresponding reference image. Specifically, in some scenarios the target object is strongly associated with geographic position information, so the terminal's current position information may also serve as a filtering condition for acquiring the to-be-referenced images; that is, the terminal may upload the current position information to the server and receive the reference image associated with the current position information returned by the server.

FIG. 5 is a schematic diagram of a third interface of an image processing method provided by an embodiment of the disclosure. The terminal may upload the current position information "a certain street garden in Beijing" to the server. Referring to FIG. 5, the server may determine, based on the current position information provided by the terminal, that a reference image 402 of a "farmland QR code" corresponds to that position, and then send the reference image 402 to the terminal for processing, so that the terminal determines the target object image 401 of the "farmland QR code" and displays on the terminal screen the "farmland" display object 403 corresponding to the "farmland QR code".

The above current position information may include geographic position information, such as geographic coordinates and latitude/longitude, and may also include directional position information, such as facing northwest at 45 degrees. When the server determines the corresponding reference image from the pre-stored to-be-referenced images based on the terminal's current position information, combining the geographic position information with the directional position information allows it to infer which target object is likely to be present in the real-scene image currently captured by the terminal, and then to send the corresponding reference image to the terminal.

In other optional implementations or application scenarios, the above two manners (feature-based matching and position-based filtering) can be applied jointly to the acquisition of the reference image. For example, at least one to-be-referenced image associated with the current position information may first be received from the server based on the terminal's current position information; then the extracted global feature of the real-scene captured image is matched against the global features of the at least one to-be-referenced image; finally, the to-be-referenced image whose features match those of the real-scene captured image is determined as the reference image. The specific implementation is similar in principle to the foregoing process and is not repeated here.

Of course, in other optional implementations, the process of recognizing the target object involved in the present disclosure may be implemented based on an existing recognition model, recognition component, or recognition server capable of recognizing whether an image includes a target object. The terminal may send the real-scene captured image to the recognition module, recognition component, or recognition server, and the corresponding recognition module, recognition component, or recognition server performs the recognition of whether the real-scene captured image includes a target object and returns the processing result to the terminal.

In the above ways, an image including a target object, namely the target object image, can be recognized in the real-scene captured image. After the determination of the target object image is completed, the step of acquiring the display object associated with the target object is further performed.

The display object refers to a static or dynamic object that is associated with the target object and can be presented on the terminal screen, such as text information, image information, or a virtual special effect. For example, in the aforementioned scenario combining a landmark building with augmented reality display technology to present "New Year (Spring Festival) blessings" or "cheering for some place", the text of the blessing or the image information of the cheering can serve as the display object associated with the landmark building; and in the aforementioned "treasure hunt" game, the virtual special effect of a treasure chest can serve as the display object of a clue item in the real scene.

In an optional implementation, acquiring the display object may be realized through interaction with the server: after confirming the target object image, the terminal uploads the target object image to the server and receives the display object returned by the server, so as to perform display of the display object. Specifically, the display object associated with the target object is generally preset, that is, a mapping relationship between the display object and the target object is established in the server, so that after the server receives a target object image including the target object, it can send the display object corresponding to the target object to the terminal based on the mapping relationship, for the terminal to display.

In addition, synchronously or asynchronously with the step of acquiring the display object, the terminal also determines, according to the target object image, the target display position of the display object in the real-scene captured image, which may specifically be implemented through steps 1041 and 1042:

Step 1041: perform model adaptation between the target object image and the reference image;
Step 1042: determine the target display position of the display object in the real-scene captured image based on the preset display position in the reference image.

Here, the reference image includes a preset display position. Determining the target display position of the display object in the real-scene captured image through model matching prevents the display object from appearing abruptly in the real-scene captured image and improves the display effect of the display object in the real-scene captured image. When displaying the display object, not only must the content of the display object be acquired, but the position in the real-scene captured image at which the display object is displayed must also be determined. On this basis, when the associated display object is configured for the target object, a corresponding preset display position can also be configured for the display object; and for better image positioning, the preset display position can be carried in the reference image so that the terminal acquires and processes it.

Specifically, FIG. 6 is a schematic diagram of a fourth interface of an image processing method provided by an embodiment of the disclosure. As mentioned above, the reference image is an image including the complete information of the target object; as shown in FIG. 6, the reference image 602 further includes a preset display position 604, which indicates the display position, within the reference image 602, of the display object associated with the target object. Model adaptation is performed between the target object image 601 and the reference image 602, so as to determine, based on the preset display position 604 in the reference image 602, the target display position 605 of the display object in the real-scene captured image.

The model adaptation of the images in step 1041 can be implemented through image feature matching. First, the terminal may determine an image affine transformation matrix according to the reference image and the target object image. The affine transformation matrix can be obtained in various ways; in an optional implementation, it is obtained through image feature matching technology: first, extract local features of the target object image; then, match the local features of the target object image against the local features of the reference image to obtain a mapping relationship between the local features; finally, determine the image affine transformation matrix according to the mapping relationship between the local features.

Similar to the foregoing, local feature extraction may be implemented with existing machine learning models, such as convolutional neural networks. Different from the foregoing, in determining the image affine transformation matrix, in order to identify as fully as possible the feature correspondence between the target object image and the reference image, local features representing local information or local characteristics of the target object image may be extracted for the subsequent matching.

In an optional implementation, the local features of both the target object image and the reference image may be highly robust SIFT features (scale-invariant feature transform, as described in "Distinctive Image Features from Scale-Invariant Keypoints"), whose data dimension may be 128. In an optional implementation, the matching and mapping relationship between the local features of the target object image and the reference image can be obtained through brute-force/linear-scan nearest-neighbor search, where the distance between matched features may be the standard L2 Euclidean distance, requiring that the distance of the current nearest matching feature point be smaller than the distance of the second-nearest matching feature point by a given ratio. In an optional implementation, after the above mapping relationship is obtained, the affine matrix between the images is established based on it: within the RANSAC framework, the image affine transformation matrix is obtained through random sampling over multiple iterations. It can be understood that applying the image affine transformation matrix to the points of the reference image maps them onto the target object image.
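The local-feature matching and random-sampling affine fit described above can be sketched with OpenCV as follows. This is a sketch under assumptions rather than the disclosed implementation itself: the file names, the 0.75 ratio threshold, and the anchor coordinates are invented for illustration, and cv2.estimateAffine2D is used as one readily available random-sampling (RANSAC) affine fit.

```python
import cv2
import numpy as np

# Assumed inputs: the reference image (which carries the preset display
# position) and the live real-scene capture, both available on disk.
ref = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
live = cv2.imread("live_capture.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                   # 128-dimensional local features
kp_ref, des_ref = sift.detectAndCompute(ref, None)
kp_live, des_live = sift.detectAndCompute(live, None)

# Brute-force nearest neighbours under the L2 Euclidean distance; a match
# is kept only when it is closer than the second-best by a fixed ratio.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des_ref, des_live, k=2)
        if m.distance < 0.75 * n.distance]

src = np.float32([kp_ref[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp_live[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# Robust affine fit by random sampling: applying A to points of the
# reference image maps them onto the target object image.
A, inliers = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)

# Affine-transform the preset display position (the reference base
# coordinate point) to obtain the target display position.
anchor = np.float32([[[120.0, 40.0]]])     # illustrative coordinates
target_pos = cv2.transform(anchor, A)[0, 0]
print("target display position:", target_pos)
```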
Correspondingly, in step 1042, the terminal performs an affine transformation on the preset display position in the reference image according to the image affine transformation matrix, to obtain the target display position of the display object in the real-scene captured image.

When determining the target display position of the display object in the real-scene captured image, a base coordinate point may be used to reduce the amount of computation. FIG. 7 is a schematic diagram of a fifth interface of an image processing method provided by an embodiment of the disclosure; as shown in FIG. 7, the reference image may include a reference base coordinate point 6041 of the preset display position. The reference base coordinate point 6041 may first be affine-transformed according to the image affine transformation matrix to obtain a transformed target base coordinate point 6051; then, according to the target base coordinate point 6051, the target display position 605 of the display object 603 in the real-scene captured image is determined. The reference base coordinate point 6041 of the preset display position 604 included in the reference image 602 is preset.

Finally, the terminal displays the display object at the target display position in the real-scene captured image. The image processing apparatus may, based on augmented reality display technology, display the display object in an overlaid manner at the target display position in the real-scene captured image. This facilitates a seamless combination of the display object with the real scene, improves the interactive experience, and enriches the display.

In addition, in other optional implementations, in order to provide the user with more interaction, the terminal may also respond to a trigger operation by the user on the display object displayed in the real-scene captured image, and display the trigger result for the display object in the real-scene captured image. Since the display objects differ across scenarios, the trigger results will differ according to the actual scenario. For example, in a "nurturing" game, after the "farmland" display object is displayed in the real-scene image, the user may issue a "harvest crops" trigger operation so that the "harvest complete" trigger result is displayed on the screen, or issue a "visit friend's farmland" trigger operation so that the "friend's farmland" trigger result is displayed on the screen. The above examples are merely illustrative; many more implementations are possible in combination with the present disclosure according to actual application scenarios and actual business requirements, and the present disclosure imposes no limitation in this respect.
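A trigger operation of this kind can be handled with a simple hit test against the screen rectangle occupied by the display object; the sketch below is a hypothetical illustration (the rectangle, tap coordinates, and result text are all invented), not the disclosed implementation.

```python
def hit(tap_xy, obj_rect):
    # obj_rect is the (x, y, w, h) screen rectangle currently occupied
    # by the display object at its target display position.
    tx, ty = tap_xy
    x, y, w, h = obj_rect
    return x <= tx <= x + w and y <= ty <= y + h

# A tap at (560, 240) lands on a display object drawn at (500, 200).
if hit((560, 240), (500, 200, 200, 80)):
    print("show trigger result, e.g. 'harvest complete'")
```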
To further describe the image processing method provided by the present disclosure, the solution is detailed below taking as an example the scenario of combining a landmark building with augmented reality display technology to present "New Year (Spring Festival) blessings"; in this implementation, the target object may be a landmark building. FIG. 8 is a schematic diagram of a sixth interface of an image processing method provided by an embodiment of the disclosure. As shown in FIG. 8, the user may trigger the capture function of the terminal so that the terminal invokes its capture component to obtain a real-scene captured image 800, and the terminal may determine, based on the current position information, the target object image 801 of the landmark building in the real-scene captured image 800. Through the processing method provided by the foregoing implementations, the real-scene captured image 800 here displays a display object 803 associated with the target object, namely "Tap here to send New Year blessings" in the figure. Optionally, referring to FIG. 8, the real-scene captured image 800 may also display an operation result 806 of the trigger operation performed on the display object 803; optionally, the real-scene captured image 800 may also display the corresponding landmark name, such as Shanghai as shown in FIG. 8.

With the image processing method provided by the embodiments of the present disclosure, a real-scene captured image is obtained; a target object image, being an image including a target object, is determined in the real-scene captured image; a display object associated with the target object is acquired; a target display position of the display object in the real-scene captured image is determined according to the target object image; and the display object is displayed at the target display position in the real-scene captured image. With this technical solution, the display object associated with the target object is displayed directly by means of augmented reality display technology; since no virtual model needs to be reconstructed, computing resources are saved, display efficiency is improved, and the user obtains a better interactive and visual experience.

Corresponding to the image processing method of the above embodiments, FIG. 9 is a structural block diagram of the image processing apparatus provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 9, the image processing apparatus includes a capture and display module 10 and a recognition processing module 20. The capture and display module 10 is configured to obtain a real-scene captured image. The recognition processing module 20 is configured to determine a target object image in the real-scene captured image, the target object image including a target object, and is further configured to acquire a display object associated with the target object and to determine, according to the target object image, the target display position of the display object in the real-scene captured image. The capture and display module 10 is further configured to display the display object at the target display position in the real-scene captured image.

In an optional embodiment, when determining the target object image in the real-scene captured image, the recognition processing module 20 is specifically configured to: acquire a reference image, the reference image being an image including the target object; and determine the target object image in the real-scene captured image according to the reference image.

In an optional embodiment, when acquiring the reference image, the recognition processing module 20 is specifically configured to: extract a global feature of the real-scene captured image; match the global feature of the real-scene captured image against the global features of at least one to-be-referenced image stored in an image database; and determine the to-be-referenced image whose features match those of the real-scene captured image as the reference image.

In an optional embodiment, when acquiring the reference image, the recognition processing module 20 is specifically configured to: upload current position information to a server; and receive the reference image associated with the current position information returned by the server.

In an optional embodiment, the reference image includes a preset display position; when determining the target display position of the display object in the real-scene captured image according to the target object image, the recognition processing module 20 is specifically configured to: perform model adaptation between the target object image and the reference image, so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image.

In an optional embodiment, when performing the model adaptation between the target object image and the reference image so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image, the recognition processing module 20 is specifically configured to: determine an image affine transformation matrix according to the reference image and the target object image; and perform an affine transformation on the preset display position in the reference image according to the image affine transformation matrix, to obtain the target display position of the display object in the real-scene captured image.

In an optional embodiment, when determining the image affine transformation matrix according to the reference image and the target object image, the recognition processing module 20 is specifically configured to: extract local features of the target object image; match the local features of the target object image against the local features of the reference image to obtain a mapping relationship between the local features; and determine the image affine transformation matrix according to the mapping relationship between the local features.

In an optional embodiment, the reference image includes a reference base coordinate point of the preset display position; when performing the affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the real-scene captured image, the recognition processing module 20 is specifically configured to: perform an affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain a transformed target base coordinate point; and determine the target display position of the display object in the real-scene captured image according to the target base coordinate point.

In an optional embodiment, the reference base coordinate point of the preset display position included in the reference image is preset.

In an optional embodiment, after determining the target object image in the real-scene captured image, the recognition processing module 20 is further configured to upload the target object image to a server and to receive the display object returned by the server, so that the capture and display module 10 performs the display of the display object.

In an optional embodiment, when displaying the display object at the target display position in the real-scene captured image, the capture and display module 10 is specifically configured to: display, based on augmented reality display technology, the display object in an overlaid manner at the target display position in the real-scene captured image.

In an optional embodiment, the capture and display module 10 is further configured to respond to a trigger operation by the user on the display object displayed in the real-scene captured image, and to display the trigger result for the display object in the real-scene captured image.

In an optional embodiment, the target object is a landmark building.

In an optional embodiment, the capture and display module 10 is further configured to display the landmark name of the landmark building in the real-scene captured image.

With the image processing apparatus provided by the embodiments of the present disclosure, a real-scene captured image is obtained; a target object image, being an image including a target object, is determined in the real-scene captured image; a display object associated with the target object is acquired; a target display position of the display object in the real-scene captured image is determined according to the target object image; and the display object is displayed at the target display position in the real-scene captured image. With this technical solution, the display object associated with the target object is displayed directly by means of augmented reality display technology; since no virtual model needs to be reconstructed, computing resources are saved, display efficiency is improved, and the user obtains a better interactive and visual experience.

The electronic device provided by this embodiment can be used to execute the technical solutions of the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.

Referring to FIG. 10, which shows a schematic structural diagram of an electronic device 900 suitable for implementing embodiments of the present disclosure, the electronic device 900 may be a terminal device or a media library. The terminal device may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDA), tablet computers (PAD), portable multimedia players (PMP), vehicle-mounted terminals (for example, vehicle navigation terminals), and wearable electronic devices, as well as fixed terminals such as digital TVs (televisions), desktop computers, and smart home devices. The electronic device shown in FIG. 10 is merely an embodiment and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.

As shown in FIG. 10, the electronic device 900 may include a processing apparatus 901 (for example, a central processing unit, a graphics processing unit, etc.) for executing the image processing method, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage apparatus 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the electronic device 900. The processing apparatus 901, the ROM 902, and the RAM 903 are connected to one another through a bus 904, and an input/output (I/O) interface 905 is also connected to the bus 904.

Generally, the following apparatuses may be connected to the I/O interface 905: an input apparatus 906 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output apparatus 907 including, for example, a liquid crystal display (LCD), speaker, vibrator, and the like; a storage apparatus 908 including, for example, a magnetic tape, hard disk, and the like; and a communication apparatus 909. The communication apparatus 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 10 shows an electronic device 900 having various apparatuses, it should be understood that not all of the illustrated apparatuses are required to be implemented or provided; more or fewer apparatuses may alternatively be implemented or provided.

In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts according to the embodiments of the present disclosure. In such embodiments, the computer program may be downloaded and installed from a network through the communication apparatus 909, or installed from the storage apparatus 908, or installed from the ROM 902. When the computer program is executed by the processing apparatus 901, the above functions defined in the methods of the embodiments of the present disclosure are performed.

It should be noted that the above computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium, and may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: an electric wire, an optical cable, RF (radio frequency), and the like, or any suitable combination of the above.

The above computer-readable medium may be contained in the above electronic device, or may exist separately without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to execute the methods shown in the above embodiments.

Computer program code for performing the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or media library. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings; for example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The name of a unit does not in some cases constitute a limitation on the unit itself; for example, a first acquisition unit may also be described as "a unit for acquiring at least two Internet Protocol addresses".

The functions described herein above may be executed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems on a chip (SOC), complex programmable logic devices (CPLD), and so on.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in combination with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium, and may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the above. More specific examples of the machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

The following are some embodiments of the present disclosure.

In a first aspect, according to one or more embodiments of the present disclosure, an image processing method includes: obtaining a real-scene captured image; determining a target object image in the real-scene captured image, the target object image being an image including a target object; acquiring a display object associated with the target object; determining, according to the target object image, a target display position of the display object in the real-scene captured image; and displaying the display object at the target display position in the real-scene captured image.

In an optional embodiment, determining the target object image in the real-scene captured image includes: acquiring a reference image, the reference image being an image including the target object; and determining the target object image in the real-scene captured image according to the reference image.

In an optional embodiment, acquiring the reference image includes: extracting a global feature of the real-scene captured image; matching the global feature of the real-scene captured image against the global features of at least one to-be-referenced image stored in an image database; and determining the to-be-referenced image whose features match those of the real-scene captured image as the reference image.

In an optional embodiment, acquiring the reference image includes: uploading current position information to a server; and receiving the reference image associated with the current position information returned by the server.

In an optional embodiment, the reference image includes a preset display position; determining, according to the target object image, the target display position of the display object in the real-scene captured image includes: performing model adaptation between the target object image and the reference image, so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image.

In an optional embodiment, performing the model adaptation between the target object image and the reference image so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image includes: determining an image affine transformation matrix according to the reference image and the target object image; and performing an affine transformation on the preset display position in the reference image according to the image affine transformation matrix, to obtain the target display position of the display object in the real-scene captured image.

In an optional embodiment, determining the image affine transformation matrix according to the reference image and the target object image includes: extracting local features of the target object image; matching the local features of the target object image against the local features of the reference image to obtain a mapping relationship between the local features; and determining the image affine transformation matrix according to the mapping relationship between the local features.

In an optional embodiment, the reference image includes a reference base coordinate point of the preset display position; performing the affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the real-scene captured image includes: performing an affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain a transformed target base coordinate point; and determining the target display position of the display object in the real-scene captured image according to the target base coordinate point.

In an optional embodiment, the reference base coordinate point of the preset display position included in the reference image is preset.

In an optional embodiment, after determining the target object image in the real-scene captured image, the method further includes: uploading the target object image to a server; and receiving the display object returned by the server, so as to perform display of the display object.
In an optional embodiment, displaying the display object at the target display position in the real-scene captured image includes: displaying, based on augmented reality display technology, the display object in an overlaid manner at the target display position in the real-scene captured image.

In an optional embodiment, the method further includes: responding to a trigger operation by a user on the display object displayed in the real-scene captured image; and displaying the trigger result for the display object in the real-scene captured image.

In an optional embodiment, the target object is a landmark building.

In an optional embodiment, the method further includes: displaying the landmark name of the landmark building in the real-scene captured image.

In a second aspect, according to one or more embodiments of the present disclosure, an image processing apparatus includes: a capture and display module configured to obtain a real-scene captured image; and a recognition processing module configured to determine a target object image in the real-scene captured image, the target object image including a target object, and further configured to acquire a display object associated with the target object and to determine, according to the target object image, the target display position of the display object in the real-scene captured image; the capture and display module is further configured to display the display object at the target display position in the real-scene captured image.

In an optional embodiment, when determining the target object image in the real-scene captured image, the recognition processing module is specifically configured to: acquire a reference image, the reference image being an image including the target object; and determine the target object image in the real-scene captured image according to the reference image.

In an optional embodiment, when acquiring the reference image, the recognition processing module is specifically configured to: extract a global feature of the real-scene captured image; match it against the global features of at least one to-be-referenced image stored in an image database; and determine the to-be-referenced image whose features match those of the real-scene captured image as the reference image.

In an optional embodiment, when acquiring the reference image, the recognition processing module is specifically configured to: upload current position information to a server; and receive the reference image associated with the current position information returned by the server.

In an optional embodiment, the reference image includes a preset display position; when determining the target display position of the display object in the real-scene captured image according to the target object image, the recognition processing module is specifically configured to: perform model adaptation between the target object image and the reference image, so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image.

In an optional embodiment, when performing the model adaptation between the target object image and the reference image so as to determine the target display position based on the preset display position in the reference image, the recognition processing module is specifically configured to: determine an image affine transformation matrix according to the reference image and the target object image; and perform an affine transformation on the preset display position in the reference image according to the image affine transformation matrix, to obtain the target display position of the display object in the real-scene captured image.

In an optional embodiment, when determining the image affine transformation matrix according to the reference image and the target object image, the recognition processing module is specifically configured to: extract local features of the target object image; match them against the local features of the reference image to obtain a mapping relationship between the local features; and determine the image affine transformation matrix according to the mapping relationship between the local features.

In an optional embodiment, the reference image includes a reference base coordinate point of the preset display position; when performing the affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the real-scene captured image, the recognition processing module is specifically configured to: perform an affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain a transformed target base coordinate point; and determine the target display position of the display object in the real-scene captured image according to the target base coordinate point.

In an optional embodiment, the reference base coordinate point of the preset display position included in the reference image is preset.

In an optional embodiment, after determining the target object image in the real-scene captured image, the recognition processing module is further configured to upload the target object image to a server and to receive the display object returned by the server, so that the capture and display module performs the display of the display object.

In an optional embodiment, when displaying the display object at the target display position in the real-scene captured image, the capture and display module is specifically configured to: display, based on augmented reality display technology, the display object in an overlaid manner at the target display position in the real-scene captured image.

In an optional embodiment, the capture and display module is further configured to respond to a trigger operation by the user on the display object displayed in the real-scene captured image, and to display the trigger result for the display object in the real-scene captured image.

In an optional embodiment, the target object is a landmark building.

In an optional embodiment, the capture and display module is further configured to display the landmark name of the landmark building in the real-scene captured image.

In a third aspect, according to one or more embodiments of the present disclosure, an electronic device includes: at least one processor; and a memory; the memory stores computer-executable instructions, and the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method of any of the preceding items.

In a fourth aspect, according to one or more embodiments of the present disclosure, a computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method of any of the preceding items.

In a fifth aspect, according to one or more embodiments of the present disclosure, a computer program product includes computer instructions which, when executed by a processor, implement the method of any of the preceding items.

In a sixth aspect, according to one or more embodiments of the present disclosure, a computer program, when its instructions are executed by a processor, implements the method of any of the preceding items.

The above description is merely a description of the preferred embodiments of the present disclosure and of the technical principles employed. Those skilled in the art should understand that the disclosure scope involved in the present disclosure is not limited to technical solutions formed by specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by substituting the above features with the technical features having similar functions disclosed in (but not limited to) the present disclosure.

In addition, although the operations are depicted in a specific order, this should not be understood as requiring that these operations be performed in the specific order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments may also be implemented in combination in a single embodiment; conversely, various features described in the context of a single embodiment may also be implemented in multiple embodiments separately or in any suitable sub-combination.

Although the subject matter has been described in language specific to structural features and/or method logical actions, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or actions described above; rather, the specific features and actions described above are merely example forms of implementing the claims.

Claims

1. An image processing method, wherein the method comprises: obtaining a real-scene captured image; determining a target object image in the real-scene captured image, the target object image being an image comprising a target object; acquiring a display object associated with the target object; determining, according to the target object image, a target display position of the display object in the real-scene captured image; and displaying the display object at the target display position in the real-scene captured image.
2. The image processing method according to claim 1, wherein determining the target object image in the real-scene captured image comprises: acquiring a reference image, wherein the reference image is an image comprising the target object; and determining the target object image in the real-scene captured image according to the reference image.
3. The image processing method according to claim 2, wherein acquiring the reference image comprises: extracting a global feature of the real-scene captured image; matching the global feature of the real-scene captured image against global features of at least one to-be-referenced image stored in an image database; and determining a to-be-referenced image whose features match those of the real-scene captured image as the reference image.
4. The image processing method according to claim 2, wherein acquiring the reference image comprises: uploading current position information to a server; and receiving a reference image associated with the current position information returned by the server.
5. The image processing method according to claim 2, wherein the reference image comprises a preset display position; and determining, according to the target object image, the target display position of the display object in the real-scene captured image comprises: performing model adaptation between the target object image and the reference image, so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image.
6. The image processing method according to claim 5, wherein performing the model adaptation between the target object image and the reference image so as to determine, based on the preset display position in the reference image, the target display position of the display object in the real-scene captured image comprises: determining an image affine transformation matrix according to the reference image and the target object image; and performing an affine transformation on the preset display position in the reference image according to the image affine transformation matrix, to obtain the target display position of the display object in the real-scene captured image.
7. The image processing method according to claim 6, wherein determining the image affine transformation matrix according to the reference image and the target object image comprises: extracting local features of the target object image; matching the local features of the target object image against local features of the reference image to obtain a mapping relationship between the local features; and determining the image affine transformation matrix according to the mapping relationship between the local features.
8. The image processing method according to claim 6, wherein the reference image comprises a reference base coordinate point of the preset display position; and performing the affine transformation on the preset display position in the reference image according to the image affine transformation matrix to obtain the target display position of the display object in the real-scene captured image comprises: performing an affine transformation on the reference base coordinate point according to the image affine transformation matrix to obtain a transformed target base coordinate point; and determining the target display position of the display object in the real-scene captured image according to the target base coordinate point.
9. The image processing method according to claim 8, wherein the reference base coordinate point of the preset display position comprised in the reference image is preset.
10. The image processing method according to claim 1, wherein after determining the target object image in the real-scene captured image, the method further comprises: uploading the target object image to a server; and receiving the display object returned by the server, so as to perform display of the display object.
11. The image processing method according to claim 1, wherein displaying the display object at the target display position in the real-scene captured image comprises: displaying, based on an augmented reality display technology, the display object in an overlaid manner at the target display position in the real-scene captured image.
12. The image processing method according to claim 1, further comprising: responding to a trigger operation by a user on the display object displayed in the real-scene captured image; and displaying a trigger result for the display object in the real-scene captured image.
13. The image processing method according to claim 1, wherein the target object is a landmark building.
14. The image processing method according to claim 13, further comprising: displaying a landmark name of the landmark building in the real-scene captured image.
15. An image processing apparatus, comprising: a capture and display module configured to obtain a real-scene captured image; and a recognition processing module configured to determine a target object image in the real-scene captured image, the target object image comprising a target object, and further configured to acquire a display object associated with the target object and to determine, according to the target object image, a target display position of the display object in the real-scene captured image; wherein the capture and display module is further configured to display the display object at the target display position in the real-scene captured image.
16. An electronic device, comprising: at least one processor; and a memory; wherein the memory stores computer-executable instructions; and the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the method according to any one of claims 1 to 14.
17. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1 to 14.
18. A computer program product comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the method according to any one of claims 1 to 14.
19. A computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 14.
PCT/SG2022/050033 2021-02-03 2022-01-25 Image processing method and apparatus, electronic device, and program product WO2022169413A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110152273.8 2021-02-03
CN202110152273.8A CN114863305A (zh) 2021-02-03 2021-02-03 Image processing method and apparatus, electronic device, and program product

Publications (1)

Publication Number Publication Date
WO2022169413A1 true WO2022169413A1 (zh) 2022-08-11

Family

ID=82623488

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2022/050033 WO2022169413A1 (zh) 2021-02-03 2022-01-25 Image processing method and apparatus, electronic device, and program product

Country Status (2)

Country Link
CN (1) CN114863305A (zh)
WO (1) WO2022169413A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111640184A (zh) * 2020-06-05 2020-09-08 上海商汤智能科技有限公司 Method and apparatus for reproducing an ancient building, electronic device, and storage medium
CN111815781A (zh) * 2020-06-30 2020-10-23 北京市商汤科技开发有限公司 Augmented reality data presentation method, apparatus, device, and computer storage medium
CN111970557A (zh) * 2020-09-01 2020-11-20 深圳市慧鲤科技有限公司 Image display method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN114863305A (zh) 2022-08-05

Similar Documents

Publication Publication Date Title
US20220321636A1 (en) Platform for Constructing and Consuming Realm and Object Feature Clouds
US9558559B2 (en) Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system
WO2022083383A1 (zh) Image processing method and apparatus, electronic device, and computer-readable storage medium
US20230360337A1 (en) Virtual image displaying method and apparatus, electronic device and storage medium
WO2021184952A1 (zh) Augmented reality processing method and apparatus, storage medium, and electronic device
US20150187139A1 (en) Apparatus and method of providing augmented reality
WO2021082801A1 (zh) Augmented reality processing method, apparatus, and system, storage medium, and electronic device
US11893702B2 (en) Virtual object processing method and apparatus, and storage medium and electronic device
WO2020248900A1 (zh) Panoramic video processing method and apparatus, and storage medium
US20220386061A1 (en) Audio processing method and apparatus, readable medium, and electronic device
WO2019109828A1 (zh) AR service processing method, apparatus, server, mobile terminal, and storage medium
EP4246435A1 (en) Display method and apparatus based on augmented reality, and device and storage medium
KR20120071444A (ko) Method of providing advertisement using augmented reality, and system, apparatus, and terminal therefor
WO2022088819A1 (zh) Video processing method, video processing apparatus, and storage medium
US11869195B2 (en) Target object controlling method, apparatus, electronic device, and storage medium
WO2022088908A1 (zh) Video playback method and apparatus, electronic device, and storage medium
WO2022068364A1 (zh) Information interaction method, first terminal device, server, and second terminal device
CN112288878A (zh) Augmented reality preview method and preview apparatus, electronic device, and storage medium
WO2022169413A1 (zh) Image processing method and apparatus, electronic device, and program product
WO2022227918A1 (zh) Video processing method, device, and electronic device
US20240007590A1 (en) Image processing method and apparatus, and electronic device, and computer readable medium
CN113535280A (zh) Pattern drawing method, apparatus, and device, computer-readable storage medium, and product
CN113837918A (zh) Method and apparatus for implementing rendering isolation with multiple processes
US11836437B2 (en) Character display method and apparatus, electronic device, and storage medium
WO2023182935A2 (zh) Image processing method and apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22750125

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18264245

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22750125

Country of ref document: EP

Kind code of ref document: A1