CN117495698A - Flying object identification method, system, intelligent terminal and computer readable storage medium - Google Patents

Flying object identification method, system, intelligent terminal and computer readable storage medium

Info

Publication number
CN117495698A
CN117495698A (application CN202410001150.8A)
Authority
CN
China
Prior art keywords
flying object
image frame
image
identified
coordinate system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410001150.8A
Other languages
Chinese (zh)
Inventor
李家悦
郑明炀
苏小杭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Zhuohang Special Equipment Co ltd
Original Assignee
Fujian Zhuohang Special Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Zhuohang Special Equipment Co ltd filed Critical Fujian Zhuohang Special Equipment Co ltd
Priority to CN202410001150.8A priority Critical patent/CN117495698A/en
Publication of CN117495698A publication Critical patent/CN117495698A/en
Pending legal-status Critical Current

Abstract

The application relates to the technical field of computer vision, and provides a flying object identification method, system, intelligent terminal and computer readable storage medium, wherein the method comprises the following steps: splicing a first image frame and a second image frame to obtain a scene fusion image; inputting the scene fusion image into an identification model to obtain the identified flying object output by the identification model; and acquiring the spatial coordinates of the identified flying object under a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively. For an intelligent terminal whose carrier has a small volume and low power consumption, the application can rapidly identify the flying object without manual intervention and carry out stereoscopic vision reconstruction of the identified flying object, so that identification accuracy and efficiency are both achieved at low cost, the workload of monitoring and similar tasks is reduced, and working efficiency is improved.

Description

Flying object identification method, system, intelligent terminal and computer readable storage medium
Technical Field
The present disclosure relates to the field of computer vision, and in particular, to a method and system for identifying a flying object, an intelligent terminal, and a computer readable storage medium.
Background
The identification and indication of flying objects is an important function of security monitoring, mixed (augmented) reality, and other applications. Intelligent, rapid recognition of a flying object and indication of its position are the basis for subsequent handling. The objects that typically need to be identified include personnel, vehicles, buildings, and the like. At present, flying objects are mainly identified from video or images captured by a camera, and are then calibrated with the aid of laser and radar ranging. Such schemes are limited by two factors. First, when two or more cameras are used, the flying object may be positioned repeatedly because different cameras perceive its position with slight differences. Second, the radiation distance of a ranging system such as laser or radar is positively correlated with its power consumption. Therefore, in view of carrier size and power consumption, flying object identification in smart wearable devices or smartphones is generally implemented with a combination of an RGB camera and a depth camera or a short-range laser radar, and the small perception range leads to low identification accuracy and efficiency.
Disclosure of Invention
The application provides a flying object identification method, a system, an intelligent terminal and a computer readable storage medium, to address the defect in the prior art that, for low-power configurations, the small perception range leads to low identification accuracy and efficiency.
In a first aspect, the present application provides a method for identifying a flying object, which is applied to an intelligent terminal, and includes:
splicing the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively obtained by shooting by two cameras with parallel optical axes in the binocular camera;
inputting the scene fusion image into an identification model to obtain the identified flying object output by the identification model; the identification model is trained on sample scene images and the identified-flying-object category labels annotated on the sample scene images;
acquiring spatial coordinates of the identified flying object under a phase plane coordinate system based on two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to a base line formed between the two cameras with parallel optical axes in the binocular camera.
According to the method for identifying the flying object provided by the application, the method for acquiring the space coordinates of the identified flying object under the phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively comprises the following steps:
Determining parallax information based on the X-axis coordinate value in the two-dimensional image coordinates of the identified flying object and the depth information of the identified flying object;
and determining the space coordinates of the identified flying object under a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively and the parallax information.
According to the method for identifying the flying object provided by the application, the step of splicing the first image frame and the second image frame to obtain the scene fusion image comprises the following steps:
respectively extracting features of the first image frame and the second image frame to obtain a first feature point set and a second feature point set;
and matching based on the first characteristic point set and the second characteristic point set to obtain the scene fusion image.
According to the method for identifying the flying object provided by the application, after the space coordinates of the identified flying object under the phase plane coordinate system are obtained, the method further comprises the following steps:
generating a three-dimensional space model of the identified flying object under a camera coordinate system based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information;
Acquiring the position coordinates of the identified flying object relative to a camera according to the space coordinates of the identified flying object under the phase plane coordinate system and the three-dimensional space model of the identified flying object under the camera coordinate system;
based on the position coordinates of the identified flying object relative to the camera and the camera attitude data, calculating the position coordinates of the identified flying object under an earth coordinate system;
the camera coordinate system is established by taking the intelligent terminal as a coordinate origin; the first characteristic depth information comprises depth information corresponding to each characteristic point in the first characteristic point set; the second feature depth information includes depth information corresponding to each feature point in the second feature point set.
According to the method for identifying the flying object provided by the application, the generating of the three-dimensional space model of the identified flying object under the camera coordinate system based on the first feature point set, the second feature point set, the first feature depth information and the second feature depth information comprises the following steps:
generating sparse point cloud information based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information;
And encrypting and smoothing the sparse point cloud information to form the three-dimensional space model of the identified flying object under the camera coordinate system.
According to the method for identifying the flying object, the identification model comprises a feature extraction layer and a classification layer;
correspondingly, the step of inputting the scene fusion image into the recognition model to acquire the recognized flying object output by the recognition model comprises the following steps:
inputting the scene fusion image into the feature extraction layer to obtain a feature vector output by the feature extraction layer;
and inputting the feature vector to the classification layer, and processing by adopting a multi-head attention mechanism to obtain the identified flying object output by the classification layer.
According to the method for identifying the flying object provided by the application, before the first image frame and the second image frame are spliced to obtain the scene fusion image, the method further comprises the following steps:
and obtaining an internal reference matrix of the binocular camera through calibration so as to calculate the position coordinate of the identified flying object under the earth coordinate system based on the position coordinate of the identified flying object relative to the camera, the camera attitude data and the internal reference matrix of the binocular camera.
In a second aspect, the present application further provides a flyer identification system, provided in an intelligent terminal, the system including:
the binocular fusion module is used for splicing the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively obtained by shooting by two cameras with parallel optical axes in the binocular camera;
the flyer identification module is used for inputting the scene fusion image into an identification model to acquire an identified flyer output by the identification model; the recognition model is trained on sample scene images and the identified-flyer category labels annotated on the sample scene images;
the flying object positioning module is used for acquiring space coordinates of the identified flying object under a phase plane coordinate system based on two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to a base line formed between two cameras with parallel optical axes in the binocular camera.
In a third aspect, the present application further provides an intelligent terminal, including an intelligent terminal body provided with a binocular camera, and a vision processor communicatively connected with the binocular camera; the visual processor executes a program to implement the method for identifying the flying object according to any one of the above;
The intelligent terminal body comprises a mobile phone terminal, augmented reality glasses, and mixed reality glasses.
In a fourth aspect, the present application also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of identifying a flying object as described in any one of the above.
According to the flying object identification method and system provided by the application, after the first image frame and the second image frame respectively acquired by the binocular camera are fused, only the scene fusion image that retains the overlapping portion is used as the input of the identification model, and the output is the identified flying object framed in the image; the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame are then used to derive its spatial coordinates in the phase plane coordinate system for stereoscopic vision reconstruction. For an intelligent terminal whose carrier has a small volume and low power consumption, the flying object can be rapidly identified without manual intervention and stereoscopic vision reconstruction of the identified flying object can be carried out, so that identification accuracy and efficiency are both achieved at low cost, the workload of monitoring and similar tasks is reduced, and working efficiency is improved.
Drawings
For a clearer description of the present application or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description below are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for identifying a flying object provided by the application.
Fig. 2 is a schematic structural diagram of the flyer identification system provided in the present application.
Fig. 3 is a schematic structural diagram of an intelligent terminal provided in the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first," "second," and the like in the description of the present application, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type and not limited to the number of objects, e.g., the first object may be one or more. It is to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises" and "comprising" indicate the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Optionally, fig. 1 is a schematic flow chart of a method for identifying a flying object provided in the present application. As shown in fig. 1, the method for identifying a flying object provided in the embodiment of the present application is applied to an intelligent terminal, and the method includes steps 101 to 103:
And step 101, splicing the first image frame and the second image frame to obtain a scene fusion image. The first image frame and the second image frame are respectively shot by two cameras with parallel optical axes in the binocular camera.
It should be noted that, the execution subject of the flyer identification method is a flyer identification system, and the system may be integrated on a control chip in the intelligent terminal.
The flyer identification method is suitable for identifying flyers through electronic equipment equipped with the flyer identification system, and further determining the true geographic positions of the identified flying objects.
The electronic device described above may be implemented in various forms. For example, the electronic device described in the embodiments of the present application may be a smart terminal device integrating a flyer recognition system and a video capture system, such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation system, a smart bracelet, a smart watch, a digital camera, or another mobile terminal.
The electronic device described in the embodiments of the present application may also be a fixed terminal, such as a desktop computer, provided with a flyer identification system, where the terminal needs to be communicatively connected with the video acquisition system. In the following, it is assumed that the electronic device is a mobile terminal. However, it will be understood by those skilled in the art that the configuration according to the embodiment of the present application can also be applied to a fixed terminal, except for elements used particularly for mobile purposes.
The video acquisition system can be combined according to the installation and use characteristics of the intelligent terminal, for example, the two RGB cameras are parallel in optical axis and are rigidly connected, and the two RGB cameras are integrated into a binocular camera with fixed relative position between the two cameras.
The first image frame refers to a video image acquired in real time by an RGB camera on the left side of the binocular camera.
The second image frame refers to a video image acquired in real time by the RGB camera on the right side in the binocular camera.
Specifically, the flyer recognition system performs feature extraction on a first image frame and a second image frame which are acquired by the binocular camera at the same time, and performs registration by using the extracted features, so that a scene fusion image with non-overlapping portions in the two images removed can be obtained.
Step 102, inputting the scene fusion image into a recognition model, and obtaining the identified flying object output by the recognition model. The recognition model is trained on sample scene images and the identified-flying-object category labels annotated on the sample scene images.
It should be noted that the recognition model may be a neural network model, and the structure and parameters of the neural network include, but are not limited to, the input layer, the hidden layer, the number of output layers, and the weight parameters of each layer. The kind and structure of the neural network are not particularly limited in the embodiments of the present application. For example, the recognition model may be a neural network model consisting of an input layer, a hidden layer, and an output layer, wherein:
the input layer directly receives and digitally encodes the scene fusion image at the forefront portion of the overall network.
The hidden layer may have one or more layers and the input vector is calculated by means of weighted summation of its neurons.
The output layer is the last layer and is used for decoding the vector obtained after the weighted summation, and different classes of flying objects are framed from the scene fusion image.
Each piece of sample data includes a sample scene image and the identified-flying-object category label annotated on that image. The sample data are divided into a training set and a test set according to a certain proportion.
Illustratively, the ratio of the training set to the test set in the sample data includes, but is not limited to, 9:1, 8:2, etc., to which the embodiments of the present application are not particularly limited.
In step 102, the flyer identification system initializes the weight coefficients between the layers of the constructed identification model, inputs a set of sample scene images from the training set and the flyer category labels annotated on them into the neural network under the current weight coefficients, and sequentially calculates the output of each node of the input layer, the hidden layer and the output layer. The weight coefficients between the nodes of the input layer and the hidden layer are then corrected by gradient descent according to the accumulated error between the final output of the output layer and the annotated category. This process is repeated until all samples in the training set have been traversed, yielding the weight coefficients of the input layer and the hidden layer.
And the flyer identification system restores the identification model in the step 102 according to the weight coefficients of the neural network input layer and the hidden layer, and inputs the scene fusion image obtained in the step 101 into the trained identification model, so that the identified flyers of various categories identified by the scene fusion image can be obtained.
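As a minimal illustration of the training procedure described above, the following PyTorch-style sketch fits a generic classifier to sample scene images with annotated category labels by gradient descent. The model structure, layer sizes, class count, learning rate and the train_loader are assumptions introduced only for illustration; they are not the patent's concrete implementation.

```python
import torch
import torch.nn as nn

# Hypothetical recognition model: input layer -> hidden layer -> output layer,
# mirroring the structure sketched above (sizes are assumptions).
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 128 * 128, 256),  # input layer -> hidden layer
    nn.ReLU(),
    nn.Linear(256, 10),             # hidden layer -> output layer (10 assumed categories)
)

criterion = nn.CrossEntropyLoss()                          # error vs. annotated labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # gradient descent

def train_one_epoch(train_loader):
    """Traverse all samples in the training set once, correcting the weights."""
    for images, labels in train_loader:    # scene fusion images + category labels
        optimizer.zero_grad()
        outputs = model(images)            # forward pass through all layers
        loss = criterion(outputs, labels)  # accumulated error against the labels
        loss.backward()                    # back-propagate the error
        optimizer.step()                   # correct the weight coefficients
```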
Step 103, based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame, acquiring the space coordinates of the identified flying object under a phase plane coordinate system.
The phase plane coordinate system is established according to a base line formed between two cameras with parallel optical axes in the binocular camera.
Specifically, in step 103, after the flying objects have been clearly identified, the flying object identification system obtains the two-dimensional image coordinates of the center of each identified flying object in the image coordinate systems of the first image frame and the second image frame respectively, and converts them into the spatial coordinates of the identified flying object in the phase plane coordinate system, so that stereoscopic imaging can be performed with the binocular camera.
The phase plane coordinate system is a virtual imaging plane created for a binocular camera, the base line formed between two cameras with parallel optical axes in the binocular camera extends to form an X axis, the left boundary of a picture shot by a left camera extends to form a Y axis, and finally a straight line perpendicular to an XY plane passing through an intersection point between the X axis and the Y axis is taken as a Z axis.
After the first image frame and the second image frame respectively acquired by the binocular camera are fused, only the scene fusion image that retains the overlapping portion is used as the input of the identification model, and the output is the identified flying object framed in the image; the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame are then used to derive its spatial coordinates in the phase plane coordinate system for stereoscopic vision reconstruction. For an intelligent terminal whose carrier has a small volume and low power consumption, the flying object can be rapidly identified without manual intervention and stereoscopic vision reconstruction of the identified flying object can be carried out, so that identification accuracy and efficiency are both achieved at low cost, the workload of monitoring and similar tasks is reduced, and working efficiency is improved.
On the basis of any one of the above embodiments, acquiring spatial coordinates of the identified flying object in a phase plane coordinate system based on two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame, respectively, includes: and determining parallax information based on the X-axis coordinate value in the two-dimensional image coordinates of the identified flying object and the depth information of the identified flying object.
Specifically, in step 103, assume that the C1 camera coordinate system in which the first image frame lies is O1x1y1z1, the C2 camera coordinate system in which the second image frame lies is O2x2y2z2, the focal length is f, and the distance between the two cameras of the binocular camera is d. The center point P of the identified flying object is (x1, y1, z1) in the C1 camera coordinate system and (x2, y2, z2) in the C2 camera coordinate system. Its projection through the optical center of the left camera of the binocular camera onto the phase plane coordinate system is (u1, v1, 0), and its projection through the optical center of the right camera is (u2, v2, 0).
According to the similar-triangle principle, the following relations hold for the left camera and the right camera:
u1 = f · x1 / z1, v1 = f · y1 / z1 (1)
u2 = f · x2 / z2, v2 = f · y2 / z2 (2)
Assuming that, ideally, parallax between the C1 camera coordinate system and the C2 camera coordinate system exists only along the X axis on which the base line lies, the relationship of the phase plane coordinate system OXYZ to the C1 camera coordinate system and the C2 camera coordinate system can be expressed as follows:
X = x1, Y = y1, Z = z1 = z2, x2 = x1 − d (3)
Combining the two projection formulas with this relationship, the parallax information D can then be calculated by the following formula:
D = u1 − u2 = f · d / Z (4)
where Z is the depth information of the identified flying object.
And determining the space coordinates of the identified flying object under a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively and the parallax information.
Specifically, the flyer identification system combines formula (4) and formula (1) to obtain the spatial coordinates (X, Y, Z) of the identified flying object under the phase plane coordinate system, where the conversion formula is as follows:
X = u1 · d / (u1 − u2), Y = v1 · d / (u1 − u2), Z = f · d / (u1 − u2) (5)
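A small numerical sketch of this conversion is given below. The function name and the sample values of f, d, u1, v1 and u2 are illustrative assumptions only; f is taken in pixels and d in metres.

```python
import numpy as np

def phase_plane_coordinates(u1, v1, u2, f, d):
    """Recover (X, Y, Z) in the phase plane coordinate system from a matched
    image point pair, using disparity D = u1 - u2 and Z = f * d / D."""
    disparity = u1 - u2
    if disparity == 0:
        raise ValueError("zero disparity: point is at infinity or mismatched")
    Z = f * d / disparity
    X = u1 * d / disparity
    Y = v1 * d / disparity
    return np.array([X, Y, Z])

# Illustrative values: disparity = 40 px  ->  Z = 800 * 0.12 / 40 = 2.4 m
print(phase_plane_coordinates(u1=650.0, v1=360.0, u2=610.0, f=800.0, d=0.12))
```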
according to the method and the device for identifying the three-dimensional coordinates of the flying object, the parallax information is determined through identifying the two-dimensional image coordinates of the flying object in the first image frame and the second image frame respectively, and the three-dimensional coordinates of the flying object can be recovered by utilizing the parallax information and the image point coordinates of the flying object which are matched in the first image frame and the second image frame respectively. The visual field range of the camera of the intelligent terminal can be expanded, so that the flexibility of the vision system is enhanced.
On the basis of any one of the above embodiments, stitching the first image frame and the second image frame to obtain a scene fusion image includes: and respectively carrying out feature extraction on the first image frame and the second image frame to obtain a first feature point set and a second feature point set.
Specifically, in step 101, the flyer recognition system performs feature point extraction on the first image frame, and integrates feature points that do not change with rotation, scale, and illumination changes into a first feature point set. And simultaneously, extracting the characteristic points of the second image frame to obtain a second characteristic point set.
Algorithms that can be used for feature point extraction include, but are not limited to, the Scale-Invariant Feature Transform (SIFT) algorithm, the Speeded-Up Robust Features (SURF) algorithm, and Oriented FAST and Rotated BRIEF (ORB), which is not specifically limited in this embodiment of the present application.
And matching based on the first characteristic point set and the second characteristic point set to obtain the scene fusion image.
Specifically, the flyer identification system finds out matched feature point pairs by carrying out similarity measurement on the first feature point set and the second feature point set, and after the spatial mapping relation between two image frames is determined, the overlapped parts of two images with the same plane in space can be associated through homography transformation to obtain a scene fusion image.
It should be noted that, for applications with stricter response-time requirements, image fusion may be omitted; one of the binocular cameras is selected as the basis, and only the binocular overlapping portion is retained and then used as the input image of the recognition model.
According to the method and the device for identifying the scene fusion image, the respective characteristic points are extracted from the first image frame and the second image frame, the characteristic point sets of the two images are matched, the spatial mapping relation between the two images is optimized by affine transformation/perspective transformation and the like after optimal matching is obtained, one of the images is deformed into the same spatial layout as the other image, and then the images are spliced into the scene fusion image, so that identification accuracy is guaranteed.
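An OpenCV-based sketch of this fusion step follows. The description leaves the feature algorithm open, so ORB features, the RANSAC threshold, and the simple half-and-half blend are assumptions; a production system would additionally crop to the binocular overlapping region.

```python
import cv2
import numpy as np

def fuse_stereo_frames(left_frame, right_frame):
    """Extract feature point sets from both frames, match them, and warp the
    right frame onto the left one via a homography to obtain a fused image."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(left_frame, None)   # first feature point set
    kp2, des2 = orb.detectAndCompute(right_frame, None)  # second feature point set

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

    # Homography mapping right-image points onto the left image (RANSAC-robust).
    src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    h, w = left_frame.shape[:2]
    warped_right = cv2.warpPerspective(right_frame, H, (w, h))
    # Simple blend of the aligned frames as the scene fusion image.
    return cv2.addWeighted(left_frame, 0.5, warped_right, 0.5, 0)
```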
On the basis of any one of the foregoing embodiments, after the acquiring the spatial coordinates of the identified flying object in the phase plane coordinate system, the method further includes: and generating a three-dimensional space model of the identified flying object under a camera coordinate system based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information.
The camera coordinate system is established by taking the intelligent terminal as a coordinate origin; the first characteristic depth information comprises depth information corresponding to each characteristic point in the first characteristic point set; the second feature depth information includes depth information corresponding to each feature point in the second feature point set.
It should be noted that, the flyer identification system also needs to establish communication connection with the ranging sensing system to obtain the first feature depth information, where the first feature depth information includes depth information corresponding to each feature point in the first feature point set, and integrate the first feature depth information. And similarly, obtaining second characteristic depth information corresponding to the second characteristic point set.
Specifically, after step 103, the flyer identification system finds a matched pair of feature points by performing similarity measurement on the first feature point set and the second feature point set, and then respectively combines depth values recorded in the first feature depth information and depth values recorded in the second feature depth information with the matched pair of feature points to construct a three-dimensional space model for identifying the flyer under a camera coordinate system established by taking the intelligent terminal as an origin.
And acquiring the position coordinates of the identified flying object relative to the camera according to the space coordinates of the identified flying object under the phase plane coordinate system and the three-dimensional space model of the identified flying object under the camera coordinate system.
Specifically, the flying object recognition system obtains the position coordinates of the recognition flying object relative to the camera by combining the spatial coordinates of the recognition flying object in the phase plane coordinate system obtained in the step 103 and converting the spatial coordinates recorded in the three-dimensional space model.
And calculating the position coordinate of the identified flying object under the earth coordinate system based on the position coordinate of the identified flying object relative to the camera and the camera attitude data.
Specifically, the flying object identification system, in cooperation with BeiDou, GPS, a magnetic compass and an inertial measurement device, performs coordinate system conversion on the position coordinates of the identified flying object relative to the camera according to the camera attitude data acquired for the binocular camera, so that the position coordinates of the identified flying object under the earth coordinate system can be calculated.
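A sketch of this final conversion is given below. It assumes the camera attitude is expressed as yaw/pitch/roll angles and that the terminal's BeiDou/GPS fix has already been converted to a local east-north-up (ENU) frame; the angle convention, frame names and sample values are assumptions, and the raw satellite/IMU processing is omitted.

```python
import numpy as np

def rotation_from_attitude(yaw, pitch, roll):
    """Camera-to-world rotation from attitude angles (radians),
    using an assumed Z-Y-X (yaw-pitch-roll) convention."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    return Rz @ Ry @ Rx

def camera_to_world(p_camera, attitude, terminal_position_enu):
    """Convert the identified flying object's camera-frame position
    to a local ENU frame anchored at the terminal's BeiDou/GPS fix."""
    R = rotation_from_attitude(*attitude)
    return R @ np.asarray(p_camera) + np.asarray(terminal_position_enu)

# Illustrative call: object about 2.4 m in front of the camera, terminal at the ENU origin.
print(camera_to_world([1.95, 1.08, 2.4], attitude=(0.1, 0.0, 0.0),
                      terminal_position_enu=[0.0, 0.0, 0.0]))
```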
In this embodiment of the application, after the three-dimensional space model of the identified flying object under the camera coordinate system is formed based on the matched feature points between the binocular images and the corresponding depth information, the position coordinates of the flying object relative to the camera are derived from the spatial coordinates of the flying object under the phase plane coordinate system and the coordinates of the three-dimensional space model under the camera coordinate system, and the position coordinates of the flying object under the earth coordinate system are then calculated in combination with the camera attitude data. The intelligent terminal can thus be used for positioning and real-time tracking of flying objects, and can also identify flying objects in visual blind areas, improving the situational awareness of the end user.
On the basis of any one of the above embodiments, generating the three-dimensional space model of the identified flying object under the camera coordinate system based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information includes: generating sparse point cloud information based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information.
Specifically, the flyer identification system finds matched feature point pairs in the first feature point set and the second feature point set by similarity measurement; the matched points, together with the depth values recorded in the first feature depth information and the second feature depth information, serve as projections of real three-dimensional space points onto the image planes. The feature points are triangulated: a triangle is constructed using epipolar geometry information to determine the position of the three-dimensional space point, which is then projected a second time under the camera pose, that is, the calculated virtual pixel point coordinates are re-projected. The coordinate difference between the two is taken as the reprojection error, and the sum of the reprojection errors of the matched point pairs is minimized to obtain the optimal camera pose parameters and the coordinates of the three-dimensional space points, thereby forming sparse point cloud information.
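A compact OpenCV sketch of the triangulation and reprojection-error step is shown below. P1 and P2 are assumed to be the 3x4 projection matrices of the rectified left and right cameras, and the point arrays are placeholders; the pose-optimisation loop that would minimise the error is omitted.

```python
import cv2
import numpy as np

def triangulate_and_reproject(P1, P2, pts_left, pts_right):
    """Triangulate matched feature points into 3D and measure the mean
    reprojection error in the left image; pts_left / pts_right are 2xN
    arrays of matched pixel coordinates."""
    points_4d = cv2.triangulatePoints(P1, P2, pts_left, pts_right)
    points_3d = (points_4d[:3] / points_4d[3]).T            # N x 3 Euclidean points

    # Re-project into the left camera and compare with the observed pixels.
    homogeneous = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    reprojected = P1 @ homogeneous.T
    reprojected = (reprojected[:2] / reprojected[2]).T       # N x 2 pixel coordinates
    errors = np.linalg.norm(reprojected - pts_left.T, axis=1)
    return points_3d, errors.mean()
```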
And densifying and smoothing the sparse point cloud information to form the three-dimensional space model of the identified flying object under the camera coordinate system.
Specifically, the flying object recognition system densifies and smooths the sparse point cloud information, eliminating discontinuities such as holes, to form the three-dimensional space model of the identified flying object under the camera coordinate system.
In this embodiment of the application, sparse point cloud information is generated from the matched feature points between the binocular images and the corresponding depth information, and the sparse point cloud information is densified and smoothed to form a three-dimensional space model that preserves the foreground points of the identified flying object. A cost function is constructed from the error between the observed pixel coordinates and the position obtained by projecting the 3D point under the currently estimated pose, so that both the computation error of the homography matrix and the measurement error of the image points are taken into account, improving the positioning accuracy of the flying object.
On the basis of any one of the above embodiments, the recognition model includes a feature extraction layer and a classification layer.
Specifically, the hidden layer of the recognition model includes a feature extraction layer and a classification layer, wherein:
and the feature extraction layer is used for reducing the dimension of the scene fusion image.
And the classification layer is used for mapping the feature vectors after dimension reduction to different flying object categories.
Correspondingly, the step of inputting the scene fusion image into the recognition model to acquire the recognized flying object output by the recognition model comprises the following steps: and inputting the scene fusion image to the feature extraction layer to obtain the feature vector output by the feature extraction layer.
Specifically, in step 102, the flyer recognition system extracts features from the scene fusion image by using the convolutional neural network deployed in the feature extraction layer, and obtains feature vectors after dimension reduction.
And inputting the feature vector to the classification layer, and processing by adopting a multi-head attention mechanism to obtain the identified flying object output by the classification layer.
Specifically, the flyer recognition system inputs the feature vector output by the feature extraction layer into the Transformer network deployed in the classification layer, and global features are further learned by the multi-head attention mechanism of the Transformer network, so that each point recorded in the feature vector can interact with other points to recognize one category of flying object in the image; the recognition of other flying objects can then be reduced to semantically associated categories according to their semantic descriptions, until all flying objects are recognized.
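A minimal PyTorch sketch of such a classification layer is given below, assuming the feature extraction layer yields a sequence of feature vectors. The embedding size, number of heads and class count are assumptions, and a full Transformer encoder would add feed-forward and normalisation sublayers omitted here.

```python
import torch
import torch.nn as nn

class AttentionClassifier(nn.Module):
    """Classification layer: multi-head self-attention over the feature
    vectors from the feature extraction layer, followed by a class mapping."""
    def __init__(self, embed_dim=256, num_heads=8, num_classes=10):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, features):
        # features: (batch, sequence_length, embed_dim) from the extraction layer
        attended, _ = self.attn(features, features, features)  # global interaction
        pooled = attended.mean(dim=1)                          # aggregate the sequence
        return self.fc(pooled)                                 # map to flying object classes

logits = AttentionClassifier()(torch.randn(2, 49, 256))  # e.g. a 7x7 feature map flattened
print(logits.shape)  # torch.Size([2, 10])
```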
According to the embodiment of the application, the neural network model is utilized to extract the characteristics of the scene fusion image, and the obtained characteristic vectors are subjected to classification mapping by adopting a multi-head attention mechanism, so that different types of identified flying objects are obtained. Global information can be effectively obtained through a multi-head attention mechanism and can be mapped to a plurality of spaces, so that the expression capacity of the model is enhanced, and the efficiency of identifying flying objects is improved.
On the basis of any one of the above embodiments, before splicing the first image frame and the second image frame to obtain the scene fusion image, the method further includes: obtaining an internal reference matrix of the binocular camera through calibration, so as to calculate the position coordinate of the identified flying object under the earth coordinate system based on the position coordinate of the identified flying object relative to the camera, the camera attitude data and the internal reference matrix of the binocular camera.
Specifically, prior to step 101, the flyer identification system uses the Zhang Zhengyou calibration method or another calibration method to determine the internal reference matrix of each camera in the binocular camera, which serves as an initial estimate of the camera pose in subsequent rectification and coordinate system conversion.
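A sketch of this calibration step using OpenCV's cv2.calibrateCamera, which follows Zhang's planar-target approach, is shown below; the checkerboard size, square size and image list are placeholders.

```python
import cv2
import numpy as np

def calibrate_single_camera(gray_images, board_size=(9, 6), square_size=0.025):
    """Estimate one camera's internal reference matrix from checkerboard views."""
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_size  # checkerboard square size in metres (assumed)

    obj_points, img_points = [], []
    for gray in gray_images:
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, gray_images[0].shape[::-1], None, None)
    return camera_matrix, dist_coeffs
```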
In this embodiment of the application, calibration and correction are carried out before flying object recognition, which avoids barrel and pincushion distortion in the captured images and improves the accuracy of flying object recognition.
Fig. 2 is a schematic structural diagram of the flyer identification system provided in the present application. As shown in fig. 2, on the basis of any one of the above embodiments, the flyer identification system provided in the embodiment of the present application is disposed in an intelligent terminal, and the system includes: a binocular fusion module 210, a flyer recognition module 220, and a flyer positioning module 230, wherein:
the binocular fusion module 210 is configured to splice the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively obtained by shooting by two cameras with parallel optical axes in the binocular camera;
the flyer identification module 220 is configured to input the scene fusion image into an identification model, and obtain an identified flyer output by the identification model; the identification model is trained on sample scene images and the identified-flying-object category labels annotated on the sample scene images;
a flyer positioning module 230, configured to obtain spatial coordinates of the identified flyer in a phase plane coordinate system based on two-dimensional image coordinates of the identified flyer in the first image frame and the second image frame, respectively; the phase plane coordinate system is established according to a base line formed between two cameras with parallel optical axes in the binocular camera.
Specifically, binocular fusion module 210, flyer identification module 220, and flyer positioning module 230 are electrically connected in sequence.
The binocular fusion module 210 performs feature extraction on the first image frame and the second image frame acquired by the binocular camera at the same time, and performs registration by using the extracted features, so as to obtain a scene fusion image from which a non-overlapping portion in the two images is removed.
The flyer identification module 220 restores the identification model according to the weight coefficients of the neural network input layer and the hidden layer, and inputs the scene fusion image obtained by the binocular fusion module 210 into the trained identification model, so that the identified flyers of each category identified by the scene fusion image can be obtained.
After the flying object positioning module 230 clearly identifies the flying objects, two-dimensional image coordinates of the center of each identified flying object in the image coordinate system where the first image frame and the second image frame are respectively located are respectively obtained, and based on the two-dimensional image coordinates, the spatial coordinates of the identified flying objects in the phase plane coordinate system are converted, so that the binocular camera can be used for stereoscopic imaging.
Optionally, the flyer positioning module 230 includes a parallax information determining unit and a flyer positioning unit, wherein:
And the parallax information determining unit is used for determining parallax information based on the X-axis coordinate value in the two-dimensional image coordinates of the identification flying object and the depth information of the identification flying object.
And the flying object positioning unit is used for determining the space coordinates of the identified flying object under a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame and the parallax information.
Optionally, the binocular fusion module 210 includes a feature point extraction unit and a registration unit, wherein:
and the characteristic point extraction unit is used for respectively carrying out characteristic extraction on the first image frame and the second image frame to obtain a first characteristic point set and a second characteristic point set.
And the registration unit is used for matching based on the first characteristic point set and the second characteristic point set to obtain the scene fusion image.
Optionally, the system further comprises a three-dimensional reconstruction module, a coordinate system conversion module, and a real world positioning module, wherein:
the three-dimensional reconstruction module is used for generating a three-dimensional space model of the identified flying object under a camera coordinate system based on the first feature point set, the second feature point set, the first feature depth information and the second feature depth information.
And the coordinate system conversion module is used for acquiring the position coordinates of the identification flying object relative to the camera according to the space coordinates of the identification flying object under the phase plane coordinate system and the three-dimensional space model of the identification flying object under the camera coordinate system.
And the real world positioning module is used for calculating the position coordinates of the identified flying object under the earth coordinate system based on the position coordinates of the identified flying object relative to the camera and the camera attitude data.
The camera coordinate system is established by taking the intelligent terminal as a coordinate origin; the first characteristic depth information comprises depth information corresponding to each characteristic point in the first characteristic point set; the second feature depth information includes depth information corresponding to each feature point in the second feature point set.
Optionally, the three-dimensional reconstruction module includes a point cloud acquisition unit and a three-dimensional reconstruction unit, wherein:
and the point cloud acquisition unit is used for generating sparse point cloud information based on the first characteristic point set, the second characteristic point set, the first characteristic depth information and the second characteristic depth information.
And the three-dimensional reconstruction unit is used for densifying and smoothing the sparse point cloud information to form the three-dimensional space model of the identified flying object under the camera coordinate system.
Optionally, the recognition model includes a feature extraction layer and a classification layer.
Accordingly, the flyer recognition module 220 includes a feature extraction unit and a classification unit, wherein:
and the feature extraction unit is used for inputting the scene fusion image into the feature extraction layer and obtaining the feature vector output by the feature extraction layer.
And the classifying unit is used for inputting the characteristic vector to the classifying layer and processing the characteristic vector by adopting a multi-head attention mechanism to acquire the identified flying object output by the classifying layer.
Optionally, the system further comprises a calibration module, wherein:
and the calibration module is used for obtaining the internal reference matrix of the binocular camera through calibration so as to calculate the position coordinate of the identified flying object under the earth coordinate system based on the position coordinate of the identified flying object relative to the camera, the camera gesture data and the internal reference matrix of the binocular camera.
The embodiment of the application provides a flyer identification system for executing the flyer identification method, and its implementation mode is consistent with the implementation mode of the flyer identification method provided by the application, and can achieve the same beneficial effects, and the detailed description is omitted here.
After the first image frame and the second image frame respectively acquired by the binocular camera are fused, only the scene fusion image that retains the overlapping portion is used as the input of the identification model, and the output is the identified flying object framed in the image; the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame are then used to derive its spatial coordinates in the phase plane coordinate system for stereoscopic vision reconstruction. For an intelligent terminal whose carrier has a small volume and low power consumption, the flying object can be rapidly identified without manual intervention and stereoscopic vision reconstruction of the identified flying object can be carried out, so that identification accuracy and efficiency are both achieved at low cost, the workload of monitoring and similar tasks is reduced, and working efficiency is improved.
Fig. 3 is a schematic structural diagram of an intelligent terminal provided in the present application. As shown in fig. 3, on the basis of any one of the above embodiments, the intelligent terminal provided in the embodiments of the present application includes an intelligent terminal body 300 provided with a binocular camera 310, and a vision processor 320 communicatively connected to the binocular camera 310. The vision processor 320, when executing a program, implements the method for identifying flying objects as described in any one of the above.
The intelligent terminal body 300 comprises a mobile phone terminal, augmented reality glasses, and mixed reality glasses.
Specifically, the smart terminal uses the smart terminal body 300 as a mounting platform, and a binocular camera 310 and a vision processor 320 which are in communication connection are disposed on the platform.
The vision processor 320 performs feature extraction on the first image frame and the second image frame acquired by the binocular camera 310 at the same time, and performs registration by using the extracted features, so as to obtain a scene fusion image from which a non-overlapping portion in the two images is removed.
The vision processor 320 restores the recognition model according to the weight coefficients of the neural network input layer and the hidden layer, and inputs the obtained scene fusion image into the trained recognition model, so that the recognition flying object of each category recognized by the scene fusion image can be obtained.
After the vision processor 320 has clearly identified the flying objects, the two-dimensional image coordinates of the center of each identified flying object in the image coordinate systems of the first image frame and the second image frame are respectively obtained, and the spatial coordinates of the identified flying objects in the phase plane coordinate system are converted from them, so that stereoscopic imaging can be performed with the binocular camera.
It is understood that the smart terminal body 300 may be a handheld smart terminal, such as a mobile phone, a smart watch, or a tablet computer. The smart terminal body 300 may also be a wearable smart terminal, such as augmented reality glasses or mixed reality glasses. The embodiment of the present application is not particularly limited thereto.
After the first image frame and the second image frame respectively acquired by the binocular camera are fused, only the scene fusion image that retains the overlapping portion is used as the input of the identification model, and the output is the identified flying object framed in the image; the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame are then used to derive its spatial coordinates in the phase plane coordinate system for stereoscopic vision reconstruction. For an intelligent terminal whose carrier has a small volume and low power consumption, the flying object can be rapidly identified without manual intervention and stereoscopic vision reconstruction of the identified flying object can be carried out, so that identification accuracy and efficiency are both achieved at low cost, the workload of monitoring and similar tasks is reduced, and working efficiency is improved.
Further, the logic instructions may be embodied in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application may be embodied essentially, or in the part contributing to the prior art, or in part, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
In another aspect, the present application further provides a computer program product, where the computer program product includes a computer program, where the computer program may be stored on a non-transitory computer readable storage medium, where the computer program, when executed by a processor, is capable of executing the method for identifying a flying object provided by the above methods, and the method is applied to an intelligent terminal, and the method includes:
splicing the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively obtained by shooting by two cameras with parallel optical axes in the binocular camera;
inputting the scene fusion image into an identification model to obtain the identified flying object output by the identification model; the identification model is trained on sample scene images and the identified-flying-object category labels annotated on the sample scene images;
acquiring spatial coordinates of the identified flying object under a phase plane coordinate system based on two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to a base line formed between two cameras with parallel optical axes in the binocular camera.
In yet another aspect, the present application further provides a non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the method for identifying a flying object provided by the above methods, and the method is applied to an intelligent terminal, and the method includes:
stitching the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively captured by two cameras with parallel optical axes in the binocular camera;
inputting the scene fusion image into a recognition model to obtain an identified flying object output by the recognition model; the recognition model is trained based on sample scene images and the flying object category labels annotated on the sample scene images;
acquiring spatial coordinates of the identified flying object in a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to the baseline formed between the two cameras with parallel optical axes in the binocular camera.
The system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the solution without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in each embodiment or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. A flying object identification method, characterized in that it is applied to an intelligent terminal and comprises the following steps:
stitching the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively captured by two cameras with parallel optical axes in the binocular camera;
inputting the scene fusion image into a recognition model to obtain an identified flying object output by the recognition model; the recognition model is trained based on sample scene images and the flying object category labels annotated on the sample scene images;
acquiring spatial coordinates of the identified flying object in a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to the baseline formed between the two cameras with parallel optical axes in the binocular camera.
2. The flying object identification method of claim 1, wherein the acquiring spatial coordinates of the identified flying object in a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively comprises:
determining parallax information based on the X-axis coordinate values in the two-dimensional image coordinates of the identified flying object and on the depth information of the identified flying object;
and determining the spatial coordinates of the identified flying object in the phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively and on the parallax information.
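As an illustrative complement to this claim, the sketch below shows one common way to turn per-pixel parallax (disparity) into spatial coordinates using OpenCV's reprojection matrix Q; the intrinsics, baseline and the constant disparity map are assumed example values, not parameters disclosed here.

    import cv2
    import numpy as np

    # Assumed example intrinsics and baseline; the disparity map is a constant
    # toy value so the reprojected depth can be checked by hand (fx*B/d = 2.4 m).
    fx, cx, cy, baseline = 800.0, 320.0, 240.0, 0.12
    Q = np.float32([[1, 0, 0, -cx],
                    [0, 1, 0, -cy],
                    [0, 0, 0,  fx],
                    [0, 0, 1.0 / baseline, 0]])

    disparity = np.full((480, 640), 40.0, dtype=np.float32)
    points_3d = cv2.reprojectImageTo3D(disparity, Q)   # (H, W, 3) array of X, Y, Z
    print(points_3d[240, 320])                          # roughly [0, 0, 2.4]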
3. The flying object identification method of claim 1, wherein the stitching the first image frame and the second image frame to obtain the scene fusion image comprises:
respectively extracting features of the first image frame and the second image frame to obtain a first feature point set and a second feature point set;
and matching based on the first feature point set and the second feature point set to obtain the scene fusion image.
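The following sketch, based on ORB features in OpenCV, is one possible way to obtain and match the first and second feature point sets and fuse the overlapping content; the actual extractor and fusion rule used by the claimed method may differ.

    import cv2
    import numpy as np

    def fuse_by_features(left, right, min_matches=10):
        orb = cv2.ORB_create(nfeatures=1000)
        kp1, des1 = orb.detectAndCompute(left, None)    # first feature point set
        kp2, des2 = orb.detectAndCompute(right, None)   # second feature point set
        if des1 is None or des2 is None:
            raise RuntimeError("no features found in one of the frames")
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
        if len(matches) < min_matches:
            raise RuntimeError("not enough matches to fuse the frames")
        src = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        h, w = left.shape[:2]
        warped = cv2.warpPerspective(right, H, (w, h))      # right frame mapped onto left
        return cv2.addWeighted(left, 0.5, warped, 0.5, 0)   # blended overlapping content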
4. The flying object identification method of claim 3, further comprising, after the acquiring the spatial coordinates of the identified flying object in the phase plane coordinate system:
generating a three-dimensional space model of the identified flying object under a camera coordinate system based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information;
acquiring the position coordinates of the identified flying object relative to the camera according to the spatial coordinates of the identified flying object in the phase plane coordinate system and the three-dimensional space model of the identified flying object in the camera coordinate system;
calculating, based on the position coordinates of the identified flying object relative to the camera and the camera attitude data, the position coordinates of the identified flying object in an earth coordinate system;
wherein the camera coordinate system is established with the intelligent terminal as the coordinate origin; the first feature depth information comprises the depth information corresponding to each feature point in the first feature point set; and the second feature depth information comprises the depth information corresponding to each feature point in the second feature point set.
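As a hedged illustration of the last step, the sketch below converts a position expressed relative to the camera into the earth coordinate system using a yaw/pitch/roll attitude; the attitude representation and the example values are assumptions for illustration only.

    import numpy as np

    def rotation_from_ypr(yaw, pitch, roll):
        # Z-Y-X (yaw, pitch, roll) rotation matrix built from the attitude angles.
        cy, sy = np.cos(yaw), np.sin(yaw)
        cp, sp = np.cos(pitch), np.sin(pitch)
        cr, sr = np.cos(roll), np.sin(roll)
        rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
        ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
        rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
        return rz @ ry @ rx

    def camera_to_earth(p_camera, camera_position_earth, yaw, pitch, roll):
        # p_earth = R * p_camera + t, where R comes from the camera attitude and
        # t is the camera's own position in the earth coordinate system.
        r = rotation_from_ypr(yaw, pitch, roll)
        return r @ np.asarray(p_camera) + np.asarray(camera_position_earth)

    # Example: a point 2.4 m straight ahead of a camera pitched 90 degrees ends up
    # (approximately) 2.4 m along the earth X axis.
    print(camera_to_earth([0.0, 0.0, 2.4], [0.0, 0.0, 0.0], 0.0, np.pi / 2, 0.0))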
5. The flying object identification method of claim 4, wherein the generating the three-dimensional space model of the identified flying object in the camera coordinate system based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information, comprises:
generating sparse point cloud information based on the first feature point set and the second feature point set, and the first feature depth information and the second feature depth information;
and densifying and smoothing the sparse point cloud information to form the three-dimensional space model of the identified flying object in the camera coordinate system.
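The sketch below illustrates one way to densify and smooth a sparse cloud with the Open3D library (statistical outlier removal followed by Poisson surface reconstruction); the parameter values are assumptions, and the claimed method may use a different densification scheme.

    import numpy as np
    import open3d as o3d

    sparse = np.random.rand(500, 3)                      # stand-in for the sparse cloud
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(sparse)

    # Smoothing/cleaning: drop points that are statistical outliers.
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

    # Densification: Poisson surface reconstruction interpolates a continuous
    # surface between the sparse samples, yielding the three-dimensional model.
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)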
6. The flying object identification method of claim 1, wherein the recognition model comprises a feature extraction layer and a classification layer;
correspondingly, the inputting the scene fusion image into the recognition model to obtain the identified flying object output by the recognition model comprises:
inputting the scene fusion image into the feature extraction layer to obtain a feature vector output by the feature extraction layer;
and inputting the feature vector into the classification layer and processing it with a multi-head attention mechanism to obtain the identified flying object output by the classification layer.
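For illustration, the following PyTorch sketch shows a recognition model with a convolutional feature-extraction layer and a classification layer that applies a multi-head attention mechanism; the backbone, layer sizes and number of classes are assumptions, not the disclosed architecture.

    import torch
    import torch.nn as nn

    class FlyingObjectClassifier(nn.Module):
        def __init__(self, num_classes=4, dim=256, heads=8):
            super().__init__()
            self.backbone = nn.Sequential(                 # feature extraction layer
                nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((8, 8)),
            )
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.head = nn.Linear(dim, num_classes)        # classification layer

        def forward(self, x):
            feats = self.backbone(x)                       # (B, dim, 8, 8)
            tokens = feats.flatten(2).transpose(1, 2)      # (B, 64, dim) feature vectors
            attended, _ = self.attn(tokens, tokens, tokens)
            return self.head(attended.mean(dim=1))         # class scores per image

    model = FlyingObjectClassifier()
    scores = model(torch.randn(1, 3, 224, 224))            # e.g. a fused scene image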
7. The flying object identification method according to any one of claims 1 to 6, further comprising, before the stitching of the first image frame and the second image frame to obtain the scene fusion image:
obtaining an intrinsic parameter matrix of the binocular camera through calibration, so as to calculate the position coordinates of the identified flying object in the earth coordinate system based on the position coordinates of the identified flying object relative to the camera, the camera attitude data and the intrinsic parameter matrix of the binocular camera.
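The sketch below shows one standard way to obtain an intrinsic parameter matrix from chessboard views with OpenCV; the board geometry and image file names are hypothetical. For a binocular rig, the same procedure would be run for each camera, and cv2.stereoCalibrate could then refine the pair jointly.

    import cv2
    import numpy as np

    PATTERN = (9, 6)          # inner corners of the calibration chessboard (assumed)
    SQUARE_M = 0.025          # chessboard square size in metres (assumed)

    objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_M

    obj_points, img_points, image_size = [], [], None
    for path in ["calib_left_00.png", "calib_left_01.png"]:   # hypothetical file names
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, PATTERN)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
            image_size = gray.shape[::-1]

    rms, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, image_size, None, None)
    print("intrinsic parameter matrix K:\n", K)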
8. A flying object identification system, characterized in that it is provided in an intelligent terminal and comprises:
a binocular fusion module, used for stitching the first image frame and the second image frame to obtain a scene fusion image; the first image frame and the second image frame are respectively captured by two cameras with parallel optical axes in the binocular camera;
a flying object identification module, used for inputting the scene fusion image into a recognition model to obtain an identified flying object output by the recognition model; the recognition model is trained based on sample scene images and the flying object category labels annotated on the sample scene images;
a flying object positioning module, used for acquiring spatial coordinates of the identified flying object in a phase plane coordinate system based on the two-dimensional image coordinates of the identified flying object in the first image frame and the second image frame respectively; the phase plane coordinate system is established according to the baseline formed between the two cameras with parallel optical axes in the binocular camera.
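As a structural illustration of this system claim, the sketch below wires three placeholder callables into the binocular fusion, identification and positioning modules; it sketches the composition only, not the modules' internal implementation.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class FlyingObjectIdentificationSystem:
        fuse: Callable        # binocular fusion module
        recognize: Callable   # flying object identification module
        locate: Callable      # flying object positioning module

        def process(self, left_frame, right_frame):
            fused = self.fuse(left_frame, right_frame)        # scene fusion image
            detections = self.recognize(fused)                # identified flying objects
            return [self.locate(d, left_frame, right_frame) for d in detections]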
9. An intelligent terminal, characterized by comprising an intelligent terminal body provided with a binocular camera, and a vision processor communicatively connected to the binocular camera; the vision processor implements the flying object identification method according to any one of claims 1 to 7 when executing a program;
the intelligent terminal body comprises a mobile phone terminal, augmented reality glasses and mixed reality glasses.
10. A non-transitory computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the flying object identification method according to any one of claims 1 to 7.
CN202410001150.8A 2024-01-02 2024-01-02 Flying object identification method, system, intelligent terminal and computer readable storage medium Pending CN117495698A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410001150.8A CN117495698A (en) 2024-01-02 2024-01-02 Flying object identification method, system, intelligent terminal and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410001150.8A CN117495698A (en) 2024-01-02 2024-01-02 Flying object identification method, system, intelligent terminal and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN117495698A true CN117495698A (en) 2024-02-02

Family

ID=89681484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410001150.8A Pending CN117495698A (en) 2024-01-02 2024-01-02 Flying object identification method, system, intelligent terminal and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN117495698A (en)

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570908A (en) * 2016-11-14 2017-04-19 中北大学 Stereoscopic vision apparatus for testing scattering area of mother-son flyer
US20190204084A1 (en) * 2017-09-29 2019-07-04 Goertek Inc. Binocular vision localization method, device and system
CN110390719A (en) * 2019-05-07 2019-10-29 香港光云科技有限公司 Based on flight time point cloud reconstructing apparatus
CN113296133A (en) * 2020-04-01 2021-08-24 易通共享技术(广州)有限公司 Device and method for realizing position calibration based on binocular vision measurement and high-precision positioning fusion technology
CN111856448A (en) * 2020-07-02 2020-10-30 山东省科学院海洋仪器仪表研究所 Marine obstacle identification method and system based on binocular vision and radar
WO2022052366A1 (en) * 2020-09-08 2022-03-17 奥比中光科技集团股份有限公司 Fused depth measurement method and measurement device
CN115115768A (en) * 2021-03-18 2022-09-27 上海宝信软件股份有限公司 Object coordinate recognition system, method, device and medium based on stereoscopic vision
CN113160298A (en) * 2021-03-31 2021-07-23 奥比中光科技集团股份有限公司 Depth truth value acquisition method, device and system and depth camera
CN113524194A (en) * 2021-04-28 2021-10-22 重庆理工大学 Target grabbing method of robot vision grabbing system based on multi-mode feature deep learning
CN113269820A (en) * 2021-05-26 2021-08-17 北京地平线信息技术有限公司 Method and device for generating space geometric information estimation model
WO2022247414A1 (en) * 2021-05-26 2022-12-01 北京地平线信息技术有限公司 Method and apparatus for generating space geometry information estimation model
CN113954085A (en) * 2021-09-08 2022-01-21 重庆大学 Intelligent positioning and control method of welding robot based on binocular vision and linear laser sensing data fusion
CN113850195A (en) * 2021-09-27 2021-12-28 杭州东信北邮信息技术有限公司 AI intelligent object identification method based on 3D vision
CN114596382A (en) * 2022-02-19 2022-06-07 复旦大学 Binocular vision SLAM method and system based on panoramic camera
CN115546741A (en) * 2022-09-29 2022-12-30 华南理工大学 Binocular vision and laser radar unmanned ship marine environment obstacle identification method
CN115880368A (en) * 2022-09-30 2023-03-31 国网湖南省电力有限公司 Method and system for detecting obstacle of power grid inspection unmanned aerial vehicle and storage medium
CN115497077A (en) * 2022-10-24 2022-12-20 广西柳工机械股份有限公司 Carriage attitude recognition system, carriage attitude recognition method, electronic device and storage medium
CN116205947A (en) * 2023-01-03 2023-06-02 哈尔滨工业大学 Binocular-inertial fusion pose estimation method based on camera motion state, electronic equipment and storage medium
CN117115271A (en) * 2023-08-31 2023-11-24 山东大学 Binocular camera external parameter self-calibration method and system in unmanned aerial vehicle flight process

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
4SNJ_VISION263C: "Vision recognition project: a detailed explanation of spatial positioning technology with binocular cameras", pages 1 - 16, Retrieved from the Internet <URL:https://www.elecfans.com/kongzhijishu/jiqishijue/2032522.html> *
LIANG LE: "Research on volume measurement methods for irregular objects based on binocular stereo vision", China Masters' Theses Full-text Database, Information Science and Technology, no. 08, 15 August 2019 (2019-08-15), pages 5 - 20 *
DONG BAOLEI et al.: "Research on hovering accuracy measurement of unmanned aerial vehicles based on binocular vision", Computer Engineering and Applications, no. 04, 14 September 2017 (2017-09-14) *
ZHU JIGUI et al.: "Key technologies in optical coordinate measurement systems", Infrared and Laser Engineering, no. 03, 25 June 2007 (2007-06-25), pages 296 - 299 *
GUO ZIXING et al.: "Design and implementation of an augmented reality system based on binocular vision", Electronic Design Engineering, no. 23, 5 December 2018 (2018-12-05) *
GAO SHAOBO: "Research on container pose measurement based on binocular vision", China Masters' Theses Full-text Database, Information Science and Technology, no. 02, 15 February 2023 (2023-02-15), pages 10 - 24 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination