CN114638885A - Intelligent space labeling method and system, electronic equipment and storage medium

Intelligent space labeling method and system, electronic equipment and storage medium

Info

Publication number
CN114638885A
CN114638885A
Authority
CN
China
Prior art keywords
video image
coordinate system
information
equipment
frame
Prior art date
Legal status
Pending
Application number
CN202210265682.3A
Other languages
Chinese (zh)
Inventor
袁浩亮
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202210265682.3A
Publication of CN114638885A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/70: Determining position or orientation of objects or cameras
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30204: Marker

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention relates to the technical field of remote assistance, and discloses an intelligent spatial annotation method and system, an electronic device, and a storage medium. The method comprises the following steps: capturing a video image and recording its position information; sending the video image and the corresponding position information to an AI terminal, so that a target intelligent recognition model preset in the AI terminal recognizes a target object in the video image to obtain recognition information; receiving the annotation information for the video image determined by the AI terminal based on the recognition information; and displaying the annotation information in real time on a display screen through calculation based on the annotation information. The embodiment is mainly applied to remote-assisted target recognition: the frame of the target object is determined by the target intelligent recognition model, the annotation information corresponding to the frame is obtained by calculation, and only the annotation information is then sent to the AR device for display on its screen, which improves transmission efficiency and ensures real-time performance.

Description

Intelligent space labeling method and system, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of remote assistance, is mainly applied to the field of target recognition, and particularly relates to an augmented-reality-based intelligent spatial annotation method and system, an electronic device, and a storage medium.
Background
Existing remote assistance typically involves an AR device and an expert terminal. The expert terminal receives images transmitted in real time by the AR device; an expert annotates the images, for example with a marking frame at the location where a problem occurs and/or other auxiliary text, and the images together with the frames are then sent back to the AR device so that field personnel can consult them to resolve the corresponding fault.
This approach is gradually maturing, but it has serious problems. First, after each annotation the image together with the marking frame must be sent to the AR device, so the approach is limited by transmission efficiency and its real-time performance is relatively poor. Second, it cannot follow position adjustments of the AR device: when field personnel turn by some angle, the marking frame does not track the new orientation, so they must search for the object corresponding to the frame using the returned image and frame. In addition, drawing the marking frame generally requires the assistance of an expert, which, in remote assistance for target recognition, greatly wastes the expert's time.
Disclosure of Invention
In view of these defects, the embodiments of the invention disclose an intelligent spatial annotation method and system, an electronic device, and a storage medium, in which only compact annotation information needs to be transmitted, improving transmission efficiency.
The first aspect of the embodiment of the invention discloses an intelligent space labeling method, which comprises the following steps:
collecting a video image and recording the position information of the video image;
sending the video image and the corresponding position information to an AI terminal so that a target intelligent identification model preset in the AI terminal identifies a target object of the video image to obtain identification information;
receiving annotation information corresponding to the video image determined by the AI terminal based on the identification information;
and displaying the annotation information in real time in a display screen through calculation based on the annotation information.
As a preferred embodiment, in the first aspect of the embodiments of the present invention, acquiring a video image, and recording position information of the video image, includes:
collecting a video image of a space where the AR equipment is located through a camera;
acquiring real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, real-time space model data of the space where the AR device is located, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane;
and defining the real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane as the position information of the video image.
As a preferred embodiment, in the first aspect of the embodiments of the present invention, acquiring a video image, and recording position information of the video image, includes:
the method comprises the steps of collecting a video image at a preset time interval, combining each video image and position information thereof to form a data packet, and setting collection time corresponding to the video image for the data packet.
As a preferred embodiment, in the first aspect of the embodiment of the present invention, the identifying a target object of the video image by using a preset target intelligent identification model in the AI terminal to obtain identification information includes:
and inputting the video image into a pre-trained target intelligent recognition model to obtain frame information, wherein the frame information is recognition information.
As a preferred embodiment, in the first aspect of the embodiment of the present invention, the determining, by the AI terminal, the annotation information corresponding to the video image based on the identification information includes:
the AI terminal calculates world coordinate system coordinates of the four corners of the video image according to the shooting coordinate system coordinates of the four corners on the rear clipping plane:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device;
the AI terminal calculates the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie, from the shooting coordinate system coordinate of the AR device:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device;
and the AI terminal automatically generates marking information based on the world coordinate system coordinates of the four corners, the world coordinate system coordinates of the center position of the frame on the plane of the four corners of the video image, and the shape and the size of the frame.
As a preferred embodiment, in the first aspect of the embodiments of the present invention, displaying the annotation information in real time on a display screen by calculation based on the annotation information includes:
calculating the spatial line equation connecting the world coordinate system coordinate of the frame center on the plane of the four corners of the video image and the world coordinate system coordinate of the AR device;
calculating the world coordinate system coordinate of the object surface corresponding to the center position of the frame from the spatial line equation and the real-time space model data of the space where the AR device is located;
and inputting the world coordinate system coordinate of the object surface corresponding to the center position of the frame, together with the shape and size of the frame, into a world coordinate system display module, for display on the display screen of the AR device.
A second aspect of an embodiment of the present invention provides an AR device, including: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory for executing the method for labeling an intelligent space according to the first aspect of the embodiment of the present invention.
A third aspect of the embodiments of the present invention provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, where the computer program enables a computer to execute the method for intelligent spatial annotation according to the first aspect of the embodiments of the present invention.
A fourth aspect of the embodiments of the present invention provides an intelligent spatial labeling method, including:
receiving a video image acquired by AR equipment and position information of the video image;
inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image to obtain recognition information;
determining annotation information corresponding to the video image based on the identification information;
and sending the labeling information to the AR equipment so that the AR equipment displays the labeling information in real time in a display screen through calculation based on the labeling information.
As a preferred embodiment, in the fourth aspect of the embodiments of the present invention, receiving a video image captured by an AR device and position information of the video image includes:
receiving a video image of a space where the AR equipment is located, wherein the video image is acquired by the AR equipment through a camera on the AR equipment;
and receiving position information corresponding to the video image, wherein the position information comprises real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane.
As a preferred embodiment, in the fourth aspect of the embodiment of the present invention, inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image, and obtaining recognition information includes:
inputting the video image into a pre-trained target intelligent recognition model to obtain frame information, wherein the frame information is recognition information;
determining annotation information corresponding to the video image based on the identification information, including:
calculating world coordinate system coordinates of the four corners of the video image from their shooting coordinate system coordinates on the back clipping plane:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device;
calculating the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie, from the shooting coordinate system coordinate of the AR device:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device;
and automatically generating marking information based on the world coordinate system coordinates of the four corners, the world coordinate system coordinates of the center position of the frame on the plane of the four corners of the video image, and the shape and the size of the frame.
As a preferred embodiment, in the fourth aspect of the embodiment of the present invention, sending the annotation information to the AR device, so that the AR device displays the annotation information in real time on a display screen by calculation based on the annotation information, includes:
sending the labeling information to AR equipment;
the AR device calculates the spatial line equation connecting the world coordinate system coordinate of the frame center on the plane of the four corners of the video image and the world coordinate system coordinate of the AR device;
the AR device calculates the world coordinate system coordinate of the object surface corresponding to the center position of the frame from the spatial line equation and the real-time space model data of the space where it is located;
and the AR device inputs the world coordinate system coordinate of the object surface corresponding to the center position of the frame, together with the shape and size of the frame, into a world coordinate system display module, and displays them on its display screen.
A fifth aspect of the present invention provides an AI terminal, comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute the method for labeling smart space according to the fourth aspect of the embodiment of the present invention.
A sixth aspect of the present invention is a computer-readable storage medium storing a computer program, where the computer program makes a computer execute the method for intelligent spatial annotation according to the fourth aspect of the embodiments of the present invention.
A seventh aspect of the present embodiment provides an intelligent spatial annotation system, where the system includes an AR device and an AI terminal;
the AR equipment is used for collecting video images and recording position information of the video images; sending the video image and the corresponding position information to an AI terminal;
the AI terminal is used for receiving the video image and the corresponding position information, inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image and obtain recognition information, determining annotation information based on the recognition information and the position information, and then sending the annotation information to the AR device;
and the AR equipment is also used for displaying the labeling information in real time in a display screen of the AR equipment through calculation based on the labeling information.
An eighth aspect of the embodiments of the present invention discloses a computer program product, which, when running on a computer, causes the computer to execute the method for labeling an intelligent space disclosed in the first aspect or the fourth aspect of the embodiments of the present invention.
A ninth aspect of the present invention discloses an application publishing platform, where the application publishing platform is configured to publish a computer program product, and when the computer program product runs on a computer, the computer is enabled to execute the intelligent space labeling method disclosed in the first aspect or the fourth aspect of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
the embodiment of the invention is mainly applied to the field of remote-assisted target identification, the frame of the target object is determined through the target intelligent identification model, the labeling information corresponding to the frame is obtained through calculation, and then only the labeling information is sent to the AR equipment for being displayed in a display screen, so that the transmission efficiency is improved, and the real-time property is ensured.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flowchart of an intelligent spatial annotation method disclosed in the first embodiment of the present invention;
FIG. 2 is a schematic diagram of the coordinates relevant to a video image disclosed in the first embodiment of the present invention;
FIG. 3 is a schematic flowchart of an intelligent spatial annotation method disclosed in the second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an intelligent spatial annotation system disclosed in the third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device disclosed in the fourth embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", "third", "fourth", and the like in the description and the claims of the present invention are used for distinguishing different objects, and are not used for describing a specific order. The terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments of the invention disclose an intelligent spatial annotation method and system, an electronic device, and a storage medium. On the one hand, the frame of the target object is recognized automatically by artificial intelligence, without expert assistance; on the other hand, only the annotation information is sent to the AR device for display on its screen, which improves transmission efficiency and ensures real-time performance, and the displayed position of the frame follows position adjustments of the AR device. The method and system are therefore suitable for remote assistance in target recognition, and are described in detail below with reference to the drawings.
Example one
Referring to fig. 1, fig. 1 is a schematic flowchart of an intelligent spatial annotation method disclosed in the first embodiment of the present invention. The method is applied to an AR device; that is, the execution subject is the AR device, including its related components, related devices, or software. As shown in fig. 1, the intelligent spatial annotation method includes the following steps:
s110, collecting a video image and recording the position information of the video image.
Because the video image captured by the AR device is sent to the AI terminal to be recognized by the target intelligent recognition model, which yields the frame of the target object and in turn the annotation information of the target object, in a preferred embodiment the AR device first searches for connectable AI terminals. If any exist, the user of the AR device operates it to select one AI terminal and apply for a connection; once the connection succeeds, the two ends can exchange information, including images and text, and of course also voice and the like.
The AR device may capture a video image through its video acquisition unit, such as a camera, and obtain the position information corresponding to that image. In other embodiments, the video acquisition unit may capture data continuously, for example one video image per preset interval such as 100 ms; the position information corresponding to each captured image is packed with the image to form a data packet, the capture time of the image is set on the packet, and the packet is sent to the AI terminal.
Continuous acquisition is essentially the same as capturing a single image and sending it to the AI terminal. The difference is that the AI terminal receives a video stream formed by the successive images and performs target recognition on each image in the stream; since each image corresponds to one data packet, its position information is determined accordingly.
Taking the capture of one video image as an example, the device obtains the real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, real-time space model data of the space, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane.
The space model data in the real-time space model data refers to the surface model data of all objects in the space that are opaque to visible light. It is collected by a spatial scanning unit of the AR device at the device's camera point.
The coordinate transformation matrix between the shooting coordinate system and the world coordinate system of the AR device can be determined from the relevant camera parameters. The real-time three-dimensional world coordinates of the space can be acquired by sensors integrated in the AR device. The shooting coordinate system coordinates of the four corners of the video image on the back clipping plane are set manually.
Among these data, the real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane are defined as the position information of the video image, and are sent to the AI terminal together with the video image.
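Concretely, the video image and its position information can be bundled into the single timestamped packet described above. Below is a minimal Python sketch of such a packet; the class and field names (PositionInfo, FramePacket, and so on) are illustrative assumptions, not structures defined by this embodiment.

```python
from dataclasses import dataclass, field
import time
import numpy as np

@dataclass
class PositionInfo:
    # Real-time 3D world coordinate of the AR device's shooting point
    device_world_pos: np.ndarray   # shape (3,)
    # Transformation matrix Trans between the shooting (camera)
    # coordinate system and the AR device's world coordinate system
    trans: np.ndarray              # shape (4, 4), homogeneous
    # Shooting-coordinate-system coordinates of the four image
    # corners (P1..P4) on the back clipping plane
    corners_cam: np.ndarray        # shape (4, 4), homogeneous rows
    # Pixel dimensions of the captured image (height, width)
    image_size: tuple

@dataclass
class FramePacket:
    image: np.ndarray              # H x W x 3 video frame
    position: PositionInfo
    capture_time: float = field(default_factory=time.time)
```

With, say, a 100 ms capture interval, the AR device would emit one such packet per frame, letting the AI terminal match every recognition result to the pose at capture time.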
And S120, sending the video image and the corresponding position information to an AI terminal so that a target intelligent recognition model preset in the AI terminal can recognize a target object of the video image to obtain recognition information.
The target intelligent recognition model is an artificial intelligence model trained in advance for the target object, preferably a neural network model; a typical choice is a YOLO model.
Inputting the video image into the pre-trained target intelligent recognition model yields the two-dimensional image coordinates of each point of the frame of the target object, from which the frame information of the target object, including the position, size, and shape of the frame, can be obtained; this frame information is recorded as the recognition information.
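As an illustration only (the embodiment names YOLO as a typical model but prescribes no library), a detector could be queried roughly as follows; the ultralytics package and the yolov8n.pt weights are assumptions of this sketch, and any detector that returns bounding boxes would serve.

```python
from ultralytics import YOLO  # assumed dependency, not named in the patent

model = YOLO("yolov8n.pt")    # stand-in for the pre-trained recognition model

def recognize(image):
    """Return frame (bounding-box) information for targets in one image."""
    result = model(image)[0]                    # inference on a single image
    frames = []
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()   # 2D image coordinates
        frames.append({
            "center": ((x1 + x2) / 2.0, (y1 + y2) / 2.0),  # frame position
            "size": (x2 - x1, y2 - y1),                    # width, height
            "shape": "rectangle",
            "confidence": float(box.conf[0]),
        })
    return frames
```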
And S130, receiving the marking information corresponding to the video image determined by the AI terminal based on the identification information.
The AI terminal then calculates, from the frame information and the position information, the annotation information of the video image.
Specifically, first, the calculation processing unit of the AI terminal calculates the world coordinate system coordinates of the four corners of the video image (i.e., the world coordinates of P1-P4 in fig. 2) from their shooting coordinate system coordinates on the back clipping plane (i.e., P1-P4 in fig. 2), as follows:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device.
Then, the calculation processing unit of the AI terminal calculates the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device.
Finally, the AI terminal sends to the AR device the calculated world coordinate system coordinates of the four corners of the video image, the world coordinate system coordinate of the frame center on the plane of those corners, and the shape and size of the frame.
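In code, both of the above calculations are single matrix products. A numpy sketch, assuming homogeneous row vectors multiplied on the right by Trans to match PW_i = PC_i × Trans (the concrete corner and center values below are placeholders):

```python
import numpy as np

def to_world(points_cam: np.ndarray, trans: np.ndarray) -> np.ndarray:
    """Apply PW_i = PC_i x Trans under a row-vector convention.

    points_cam: N x 4 homogeneous shooting-coordinate-system points
    trans: 4 x 4 coordinate transformation matrix Trans
    """
    return points_cam @ trans

# P1..P4: four image corners on the back clipping plane (illustrative values)
corners_cam = np.array([[-1.0, -1.0, 1.0, 1.0],
                        [ 1.0, -1.0, 1.0, 1.0],
                        [ 1.0,  1.0, 1.0, 1.0],
                        [-1.0,  1.0, 1.0, 1.0]])
trans = np.eye(4)                        # placeholder for the real Trans
corners_world = to_world(corners_cam, trans)        # PW_1..PW_4

cc = np.array([0.2, -0.1, 1.0, 1.0])     # frame-center point CC (illustrative)
center_world = to_world(cc[None, :], trans)[0]      # CW = CC x Trans
```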
It should be noted that the target intelligent recognition model of the AI terminal and the calculation module that computes the annotation information of the video image may be deployed in different devices or integrated in the same device; when they are in different devices, the two devices communicate in a wired or wireless manner to complete the data interaction.
And S140, displaying the annotation information in real time in a display screen through calculation based on the annotation information.
First, the AR device computes the spatial line equation connecting two points: the world coordinate system coordinate of the frame center on the plane of the four corners of the video image (Q2 in fig. 2) and the real-time three-dimensional world coordinate of the AR device's shooting point (Q1 in fig. 2). From this equation and the real-time space model data of the space where the AR device is located, it then calculates the world coordinate system coordinate of the object surface corresponding to the frame center.
A typical calculation substitutes the coordinates of each point scanned into the space model into the line equation; a point that satisfies the equation is the world coordinate system coordinate of the object surface corresponding to the frame center.
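A minimal sketch of that search, assuming the space model is available as a point cloud of scanned surface points (the embodiment leaves the model representation open): cast the ray from the shooting point Q1 through the frame-center world coordinate Q2 and keep the nearest scanned point lying on it.

```python
import numpy as np

def surface_hit(q1, q2, surface_pts, tol=0.01):
    """Nearest space-model point on the ray from Q1 through Q2.

    q1: shooting-point world coordinate, shape (3,)
    q2: frame-center world coordinate on the image plane, shape (3,)
    surface_pts: N x 3 world coordinates from the space scan
    tol: distance (same units as the coordinates) within which a
         scanned point counts as satisfying the line equation
    """
    d = q2 - q1
    d = d / np.linalg.norm(d)            # ray direction
    v = surface_pts - q1                 # vectors from Q1 to each sample
    t = v @ d                            # projection of each sample on the ray
    ahead = t > 0                        # keep only points in front of Q1
    # perpendicular distance of each sample from the ray, i.e. how well
    # the line equation is satisfied at that sample
    off_ray = np.linalg.norm(v - np.outer(t, d), axis=1)
    mask = ahead & (off_ray < tol)
    if not mask.any():
        return None                      # ray misses the scanned model
    return surface_pts[mask][np.argmin(t[mask])]
```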
Then, the world coordinate system coordinate of the object surface corresponding to the frame center, any added text, and the proportionally scaled shape and size of the frame are input into the world coordinate system display module and displayed on the display screen of the AR device.
The world coordinate system display module can also correct the display position of the frame on the AR device's screen in real time, according to the received world coordinate of the object surface corresponding to the frame center and the real-time three-dimensional world coordinate of the AR device's shooting point. A typical world coordinate system display module is the one built into Microsoft HoloLens 2.
It follows that when the real-time three-dimensional world coordinate of the AR device's shooting point changes, the display position of the frame on the screen changes accordingly.
If the AI terminal sends no new annotation information, the previously sent annotation remains in effect on the AR device's screen: when the device's shooting point turns away from the object-surface world coordinate corresponding to the frame center, the frame disappears; when the pose is adjusted back so that the coordinate is matched again, the frame reappears. If the AI terminal sends new annotation information, it overwrites the AR device's current annotation, and a new frame is presented after recalculation.
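This run-time behavior amounts to a small per-frame update loop: re-check the stored annotation against the current pose, show or hide the frame accordingly, and replace the annotation wholesale when a new one arrives. A hedged sketch follows; the display object and its in_view/draw_frame/hide methods are assumptions standing in for whatever the platform's world coordinate system display module actually provides.

```python
def update_display(annotation, camera_pose, display):
    """Per-frame refresh of the spatial annotation on the AR screen.

    annotation: dict with the object-surface world coordinate, the
                frame shape/size, and optional text, or None
    camera_pose: current world pose of the AR device's shooting point
    display: platform world-coordinate display module (assumed API)
    """
    if annotation is None:
        display.clear()
        return
    if display.in_view(annotation["world_pos"], camera_pose):
        # pose matches again: the frame reappears at the surface point
        display.draw_frame(annotation["world_pos"],
                           annotation["shape"], annotation["size"],
                           text=annotation.get("text"))
    else:
        display.hide()   # frame disappears while the device looks away

def on_new_annotation(state, new_annotation):
    # new annotation information overwrites the current one outright
    state["annotation"] = new_annotation
```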
Example two
Referring to fig. 3, fig. 3 is a schematic flowchart of an intelligent spatial annotation method disclosed in the second embodiment of the present invention. The method is applied to an AI terminal; that is, the execution subject is the AI terminal, which may be a server with sufficient processing capability, a tablet computer, a mobile phone, or an ordinary desktop or notebook computer, configured with the related software in addition to the hardware. As shown in fig. 3, the intelligent spatial annotation method includes the following steps:
s210, receiving a video image collected by the AR equipment and position information of the video image.
The AI terminal receives a video image of the space where the AR device is located, captured by the AR device through its own camera, and receives the position information corresponding to the video image, including the real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane. This process corresponds to step S110 in the first embodiment.
S220, inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image, and obtaining recognition information.
The target intelligent recognition model is an artificial intelligence model trained in advance for the target object, preferably a neural network model; a typical choice is a YOLO model.
The video image is input into the pre-trained target intelligent recognition model to obtain the two-dimensional image coordinates of each point of the frame of the target object, from which the frame information, including the position, size, and shape of the frame, is obtained and recorded as the recognition information. This process corresponds to step S120 in the first embodiment.
S230, determining the annotation information corresponding to the video image based on the recognition information.
Specifically, world coordinate system coordinates of four corners of the video image are calculated from the shooting coordinate system coordinates of the four corners on the rear clipping plane:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device;
the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie is calculated from the shooting coordinate system coordinate of the AR device:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device;
and automatically generating marking information based on the world coordinate system coordinates of the four corners, the world coordinate system coordinates of the center position of the frame on the plane of the four corners of the video image, and the shape and the size of the frame.
This procedure corresponds to step S130 in the first embodiment.
S240, sending the annotation information to the AR device, so that the AR device displays the annotation information in real time on its screen through calculation based on the annotation information.
The AR device calculates the spatial line equation connecting the world coordinate system coordinate of the frame center on the plane of the four corners of the video image and the world coordinate system coordinate of the AR device; from this equation and the real-time space model data of the space where it is located, it calculates the world coordinate system coordinate of the object surface corresponding to the frame center; it then inputs this coordinate, together with the shape and size of the frame, into the world coordinate system display module and displays them on its screen.
This process corresponds to step S140 in the first embodiment.
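Pulling steps S210 to S240 together, the AI terminal's handling of one data packet reduces to a short pipeline. The sketch below reuses recognize and to_world from the earlier sketches; center_on_corner_plane and the image_size field are assumed helpers for a step the embodiment does not spell out (lifting the 2D frame center onto the corner plane), and the network layer is elided.

```python
import numpy as np

def center_on_corner_plane(center_2d, pos):
    """Assumed helper: lift the 2D frame center onto the plane of the
    four corners, expressed in the shooting coordinate system
    (homogeneous). Bilinear interpolation between the corner points
    P1..P4 (assumed order: tl, tr, br, bl) is one natural choice."""
    u, v = center_2d
    h, w = pos.image_size
    fx, fy = u / w, v / h            # normalized image position
    c = pos.corners_cam
    top = c[0] + fx * (c[1] - c[0])
    bot = c[3] + fx * (c[2] - c[3])
    return top + fy * (bot - top)

def handle_packet(packet, send_to_ar):
    """AI-terminal pipeline: receive -> recognize -> annotate -> send."""
    pos = packet.position
    frames = recognize(packet.image)             # S220: frame information
    if not frames:
        return                                   # no target in this image
    frame = max(frames, key=lambda f: f["confidence"])
    # S230: lift corner and frame-center coordinates into world space
    corners_world = to_world(pos.corners_cam, pos.trans)   # PW_1..PW_4
    cc = center_on_corner_plane(frame["center"], pos)
    center_world = to_world(cc[None, :], pos.trans)[0]     # CW
    send_to_ar({                                 # S240: annotation info only
        "corners_world": corners_world.tolist(),
        "center_world": center_world.tolist(),
        "shape": frame["shape"],
        "size": frame["size"],
    })
```

Note that only the small annotation dictionary crosses the network, not the image, which is exactly the transmission saving the embodiments claim.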
EXAMPLE III
Embodiment three discloses an intelligent spatial annotation system; referring to fig. 4, the system includes an AR device 310 and an AI terminal 320. The AR device 310 and the AI terminal 320 establish communication, and the information exchanged includes, but is not limited to, images, text, voice, and other data such as position information.
The AR device 310 is configured to collect a video image and record position information of the video image; sending the video image and the corresponding position information to an AI terminal;
the AI terminal 320 is configured to receive the video image and the corresponding location information, input the video image into a preset target intelligent recognition model to recognize a target object of the video image, obtain recognition information, determine annotation information based on the recognition information and the location information, and then send the frame to the AR device;
the AR device 310 is further configured to display the annotation information in real time on a display screen of the AR device by calculation based on the annotation information.
Example four
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 5, the electronic device may be an AR device, an AI terminal, or a combination thereof. The electronic device may include:
a memory 410 storing executable program code; a processor 420 coupled to the memory 410;
the processor 420 calls the executable program code stored in the memory 410 to perform part or all of the steps of the intelligent space labeling method in the first or second embodiment.
The embodiment of the invention discloses a computer-readable storage medium which stores a computer program, wherein the computer program enables a computer to execute part or all of the steps in the intelligent space labeling method in the first embodiment or the second embodiment.
The embodiment of the invention also discloses a computer program product, wherein when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the intelligent space labeling method in the first embodiment or the second embodiment.
The embodiment of the invention also discloses an application publishing platform, wherein the application publishing platform is used for publishing the computer program product, and when the computer program product runs on a computer, the computer is enabled to execute part or all of the steps in the intelligent space labeling method in the first embodiment or the second embodiment.
In various embodiments of the present invention, it should be understood that the sequence numbers of the processes do not mean the execution sequence necessarily in order, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated units, if implemented as software functional units and sold or used as a stand-alone product, may be stored in a computer accessible memory. Based on such understanding, the technical solution of the present invention, which is a part of or contributes to the prior art in essence, or all or part of the technical solution, can be embodied in the form of a software product, which is stored in a memory and includes several requests for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute part or all of the steps of the method according to the embodiments of the present invention.
In the embodiments provided herein, it should be understood that "B corresponding to a" means that B is associated with a from which B can be determined. It should also be understood, however, that determining B from a does not mean determining B from a alone, but may also be determined from a and/or other information.
Those of ordinary skill in the art will appreciate that some or all of the steps of the methods of the embodiments may be implemented by a program instructing the associated hardware; the program may be stored in a computer-readable storage medium, including a Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, magnetic tape memory, or any other computer-readable medium that can be used to carry or store data.
The above detailed description is provided for an intelligent spatial labeling method, system, electronic device and storage medium disclosed in the embodiments of the present invention, and the specific examples are applied herein to explain the principles and embodiments of the present invention, and the descriptions of the above embodiments are only used to help understand the method and its core ideas of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (15)

1. An intelligent spatial annotation method, comprising:
collecting a video image and recording the position information of the video image;
sending the video image and the corresponding position information to an AI terminal so that a target intelligent identification model preset in the AI terminal identifies a target object of the video image to obtain identification information;
receiving annotation information corresponding to the video image determined by the AI terminal based on the identification information;
and displaying the annotation information in real time in a display screen through calculation based on the annotation information.
2. The intelligent spatial annotation process of claim 1 wherein capturing a video image and recording the location information of said video image comprises:
collecting a video image of a space where the AR equipment is located through a camera;
acquiring real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, real-time space model data of the space where the AR device is located, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane;
and defining the real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane as the position information of the video image.
3. The intelligent spatial annotation process of claim 1 wherein capturing a video image and recording the location information of said video image comprises:
the method comprises the steps of collecting a video image at a preset time interval, combining each video image and position information thereof to form a data packet, and setting collection time corresponding to the video image for the data packet.
4. The intelligent spatial annotation method of claim 2, wherein the step of recognizing the target object of the video image by a target intelligent recognition model preset in an AI terminal to obtain recognition information comprises:
and inputting the video image into a pre-trained target intelligent recognition model to obtain frame information, wherein the frame information is recognition information.
5. The intelligent spatial annotation method of claim 4, wherein the AI terminal determines annotation information corresponding to the video image based on the identification information, comprising:
the AI terminal calculates world coordinate system coordinates of the four corners of the video image from their shooting coordinate system coordinates on the back clipping plane:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device;
the AI terminal calculates the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie, from the shooting coordinate system coordinate of the AR device:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device;
and the AI terminal automatically generates marking information based on the world coordinate system coordinates of the four corners, the world coordinate system coordinates of the center position of the frame on the plane of the four corners of the video image, and the shape and the size of the frame.
6. The smart spatial annotation process of claim 5 wherein displaying said annotation information in real time on a display screen by computation based on said annotation information comprises:
calculating the spatial line equation connecting the world coordinate system coordinate of the frame center on the plane of the four corners of the video image and the world coordinate system coordinate of the AR device;
calculating the world coordinate system coordinate of the object surface corresponding to the center position of the frame from the spatial line equation and the real-time space model data of the space where the AR device is located;
and inputting the world coordinate system coordinate of the object surface corresponding to the center position of the frame, together with the shape and size of the frame, into a world coordinate system display module, for display on the display screen of the AR device.
7. An AR device, comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory for executing a smart space annotation method according to any one of claims 1 to 6.
8. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform a smart spatial annotation method according to any one of claims 1 to 6.
9. An intelligent spatial annotation method, comprising:
receiving a video image acquired by AR equipment and position information of the video image;
inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image to obtain recognition information;
determining annotation information corresponding to the video image based on the identification information;
and sending the labeling information to the AR equipment so that the AR equipment displays the labeling information in real time in a display screen through calculation based on the labeling information.
10. The intelligent spatial annotation method of claim 9, wherein receiving the video image captured by the AR device and the location information of the video image comprises:
receiving a video image of a space where the AR equipment is located, wherein the video image is acquired by the AR equipment through a camera on the AR equipment;
and receiving position information corresponding to the video image, wherein the position information comprises real-time three-dimensional world coordinates of the space where the AR device is located, the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device, and the shooting coordinate system coordinates of the four corners of the video image on the back clipping plane.
11. The smart spatial labeling method of claim 10,
inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image to obtain recognition information, wherein the recognition information comprises the following steps:
inputting the video image into a pre-trained target intelligent recognition model to obtain frame information, wherein the frame information is recognition information;
determining annotation information corresponding to the video image based on the identification information, including:
calculating world coordinate system coordinates of the four corners of the video image from their shooting coordinate system coordinates on the back clipping plane:
PW_i = PC_i × Trans
where PW_i is the world coordinate system coordinate of the i-th corner, PC_i is the shooting coordinate system coordinate of the i-th corner on the back clipping plane, and Trans is the coordinate transformation matrix between the shooting coordinate system of the AR device and the world coordinate system of the AR device;
calculating the world coordinate system coordinate of the center position of the frame on the plane where the four corners of the video image lie, from the shooting coordinate system coordinate of the AR device:
CW = CC × Trans
where CW is the world coordinate system coordinate of the frame center on the plane of the four corners of the video image, and CC is the shooting coordinate system coordinate of the AR device;
and automatically generating marking information based on the world coordinate system coordinates of the four corners, the world coordinate system coordinates of the center position of the frame on the plane of the four corners of the video image, and the shape and the size of the frame.
12. The smart spatial annotation method of claim 11, wherein sending the annotation information to an AR device to enable the AR device to display the annotation information in real time on a display screen by calculation based on the annotation information comprises:
sending the labeling information to AR equipment;
the AR device calculates the spatial line equation connecting the world coordinate system coordinate of the frame center on the plane of the four corners of the video image and the world coordinate system coordinate of the AR device;
the AR equipment calculates world coordinate system coordinates of the object surface corresponding to the center position of the frame according to the space linear equation and real-time space model data of the space where the AR equipment is located;
and the AR device inputs the world coordinate system coordinate of the object surface corresponding to the center position of the frame, together with the shape and size of the frame, into a world coordinate system display module, and displays them on the display screen of the AR device.
13. An AI terminal, comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory for executing a smart space annotation method according to any one of claims 9 to 12.
14. A computer-readable storage medium storing a computer program, wherein the computer program causes a computer to perform a method of intelligent spatial annotation according to any one of claims 9 to 12.
15. An intelligent spatial labeling system is characterized by comprising AR equipment and an AI terminal;
the AR equipment is used for collecting video images and recording position information of the video images; sending the video image and the corresponding position information to an AI terminal;
the AI terminal is used for receiving the video image and the corresponding position information, inputting the video image into a preset target intelligent recognition model to recognize a target object of the video image and obtain recognition information, determining annotation information based on the recognition information and the position information, and then sending the annotation information to the AR device;
and the AR equipment is also used for displaying the labeling information in real time in a display screen of the AR equipment through calculation based on the labeling information.
CN202210265682.3A 2022-03-17 2022-03-17 Intelligent space labeling method and system, electronic equipment and storage medium Pending CN114638885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210265682.3A CN114638885A (en) 2022-03-17 2022-03-17 Intelligent space labeling method and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210265682.3A CN114638885A (en) 2022-03-17 2022-03-17 Intelligent space labeling method and system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114638885A (en) 2022-06-17

Family

ID=81949963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210265682.3A Pending CN114638885A (en) 2022-03-17 2022-03-17 Intelligent space labeling method and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114638885A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115171197A (en) * 2022-09-01 2022-10-11 广州市森锐科技股份有限公司 High-precision image information identification method, system, equipment and storage medium
CN115171197B (en) * 2022-09-01 2023-05-16 广州市森锐科技股份有限公司 High-precision image information identification method, system, equipment and storage medium
CN115713664A (en) * 2022-12-06 2023-02-24 浙江中测新图地理信息技术有限公司 Intelligent marking method and device for fire-fighting acceptance check
CN118314522A (en) * 2024-05-31 2024-07-09 深圳市维象智能科技有限公司 Automatic glue filling identification method and device based on visual identification

Similar Documents

Publication Publication Date Title
US11410415B2 (en) Processing method for augmented reality scene, terminal device, system, and computer storage medium
CN114638885A (en) Intelligent space labeling method and system, electronic equipment and storage medium
CN110012209B (en) Panoramic image generation method and device, storage medium and electronic equipment
CN111340864A (en) Monocular estimation-based three-dimensional scene fusion method and device
EP3550479A1 (en) Augmented-reality-based offline interaction method and apparatus
CN108114471B (en) AR service processing method and device, server and mobile terminal
CN110853095B (en) Camera positioning method and device, electronic equipment and storage medium
EP3748533A1 (en) Method, apparatus, and storage medium for obtaining object information
CN110555876B (en) Method and apparatus for determining position
CN112419388A (en) Depth detection method and device, electronic equipment and computer readable storage medium
US20240331245A1 (en) Video processing method, video processing apparatus, and storage medium
CN112083801A (en) Gesture recognition system and method based on VR virtual office
CN114442805A (en) Monitoring scene display method and system, electronic equipment and storage medium
CN111582240A (en) Object quantity identification method, device, equipment and medium
CN115278084A (en) Image processing method, image processing device, electronic equipment and storage medium
CN115278014A (en) Target tracking method, system, computer equipment and readable medium
CN110427936B (en) Wine storage management method and system for wine cellar
CN114764897B (en) Behavior recognition method, behavior recognition device, terminal equipment and storage medium
CN117111025A (en) Target-based data processing method, laser radar and system
CN114723923B (en) Transmission solution simulation display system and method
EP4439447A1 (en) Method and apparatus for repositioning target object, storage medium and electronic apparatus
CN116778550A (en) Personnel tracking method, device and equipment for construction area and storage medium
CN113286082B (en) Target object tracking method, target object tracking device, electronic equipment and storage medium
CN113329137B (en) Picture transmission method, device, computer equipment and computer readable storage medium
CN116363725A (en) Portrait tracking method and system for display device, display device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination