WO2018210305A1 - Image recognition and tracking method and apparatus, smart terminal, and readable storage medium - Google Patents

Image recognition and tracking method and apparatus, smart terminal, and readable storage medium

Info

Publication number
WO2018210305A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
mark pattern
smart terminal
pattern
rotation angle
Application number
PCT/CN2018/087282
Other languages
English (en)
French (fr)
Inventor
孙星 (Sun Xing)
郭晓威 (Guo Xiaowei)
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Publication of WO2018210305A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras

Definitions

  • the present application relates to the field of Internet application technologies, and in particular, to an image recognition and tracking method, an apparatus, an intelligent terminal, and a readable storage medium.
  • With the rapid development of Internet application technologies, a smart terminal estimates the pose corresponding to an obtained image by performing target tracking on that image, and implements various interactive applications based on the image according to the obtained pose. The pose corresponding to the image describes the translation and rotation in physical space of the physical target corresponding to the captured image.
  • Target tracking is implemented based on feature-point tracking. In general, the more feature points are tracked and the more complex the feature-point descriptors, the better the tracking effect and the more accurate the estimated pose; however, the running speed also becomes slower. Existing target tracking is therefore limited by a conflict between time performance and tracking effect.
  • Tracking based solely on feature points inevitably forces a trade-off between time performance and tracking effect. In applications of existing target tracking on smart terminals, to obtain better time performance under limited processing resources, mostly simple feature points are used to achieve faster feature-point extraction and tracking speeds; the tracking accuracy, however, is then very low, and target tracking on a smart terminal cannot achieve both a good tracking effect and good time performance.
  • In view of this, an object of the present application is to provide an image recognition and tracking method and apparatus that address the defect in the existing technology that tracking effect and time performance cannot be guaranteed at the same time.
  • An image recognition and tracking method, comprising: obtaining a recognition result of a mark pattern in an image captured by a smart terminal; locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image; and forming a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in the smart terminal.
  • In one exemplary embodiment, locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain the translation information in space of the mark pattern in the image, includes:
  • locating the mark pattern in the image through the mark pattern indicated as recognized by the recognition result;
  • performing target tracking on the located mark pattern, and obtaining, from the performed target tracking, a translation distance of the mark pattern in the image on the horizontal plane of space and a zoom size of the mark pattern relative to a pre-stored mark image;
  • calculating a vertical distance in space of the mark pattern in the image according to the zoom size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
  • In one exemplary embodiment, before the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the image, the method further includes: obtaining sensor data output by a plurality of sensors when the smart terminal captures the image; and performing a multi-sensor fusion algorithm on the sensor data to calculate a rotation angle of the smart terminal in space, the rotation angle being output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
  • In one exemplary embodiment, performing the multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space includes: obtaining the sensor data output by the plurality of sensors when the smart terminal captures the image; and calculating, from the sensor data, the rotation angle of the smart terminal itself in space.
  • In one exemplary embodiment, calculating, from the sensor data, the rotation angle of the smart terminal itself in space includes: integrating the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to each orientation in space; and performing an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to each orientation in space.
  • In one exemplary embodiment, obtaining the recognition result of the mark pattern in the image captured by the smart terminal includes: the smart terminal continuously performing image capture and obtaining the recognition result of the mark pattern in the currently captured image; and before the locating and target tracking, the method further includes: performing, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image, the perspective image being used to perform target tracking of the currently captured image.
  • In one exemplary embodiment, performing, relative to the image in which the mark pattern was first recognized, the perspective-transform pre-processing of the currently captured image to obtain the perspective image includes: acquiring the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, and taking that rotation angle as an initial rotation angle; calculating an included angle between the currently captured image and the image in which the mark pattern was first recognized, according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle; and performing a perspective transformation of the currently captured image through the included angle to obtain the perspective image.
  • In one exemplary embodiment, after the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the image, the method further includes: performing projection of a preset image into the image according to the pose matrix of the mark pattern in the image.
  • In one exemplary embodiment, obtaining the recognition result of the mark pattern in the image captured by the smart terminal includes: obtaining the recognition result of the mark pattern through matching of the image with a preset mark pattern, or through a user-specified trigger of the mark pattern in the captured image.
  • In one exemplary embodiment, forming the pose matrix of the mark pattern in the image from the translation information and the rotation angle output by multi-sensor fusion in the smart terminal includes: taking the movement in space of the mark pattern indicated by the translation information, and the rotation indicated by the rotation angle, as elements that together form the pose matrix of the mark pattern.
  • An image recognition and tracking device characterized in that the device comprises:
  • a recognition result obtainer configured to obtain a recognition result of the mark pattern in the image captured by the smart terminal
  • a target tracker configured to locate a mark pattern in the image according to the recognition result of the mark pattern, and perform target tracking by the positioned mark pattern to obtain translation information of the mark pattern in the image in the space;
  • a pose obtainer configured to form a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in the smart terminal.
  • the target tracker includes:
  • a mark locator configured to locate the mark pattern in the image through the mark pattern indicated as recognized by the recognition result;
  • a tracking executor configured to perform target tracking on the located mark pattern, and obtain, from the performed target tracking, a translation distance of the mark pattern in the image on the horizontal plane of space and a zoom size of the mark pattern relative to a pre-stored mark image;
  • a translation information former configured to calculate a vertical distance in space of the mark pattern in the image according to the zoom size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
  • the apparatus further includes:
  • a data obtainer configured to obtain sensor data output by the plurality of sensors when the smart terminal captures the image
  • a multi-sensor fuser configured to perform a multi-sensor fusion algorithm on the sensor data to calculate a rotation angle of the smart terminal in space, the rotation angle being output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
  • the multi-sensor fuser is configured to: obtain the sensor data output by the plurality of sensors when the smart terminal captures the image; and calculate, from the sensor data, the rotation angle of the smart terminal itself in space.
  • In calculating, from the sensor data, the rotation angle of the smart terminal itself in space, the multi-sensor fuser is configured to: integrate the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to each orientation in space; and perform an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to each orientation in space.
  • the recognition result obtainer is further configured to continuously perform image capture by the smart terminal and obtain a recognition result of the mark pattern in the currently captured image;
  • the device also includes:
  • the perspective transformer is configured to perform, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image used for target tracking of the currently captured image.
  • the perspective transformer comprises:
  • an initial rotation obtainer configured to acquire the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, taking that rotation angle as an initial rotation angle;
  • a rotation transformer configured to calculate an included angle between the currently captured image and the image in which the mark pattern was first recognized, according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle;
  • an image perspective transformer configured to perform a perspective transformation of the currently captured image through the included angle to obtain the perspective image.
  • the apparatus further includes:
  • a projector configured to perform projection of the preset virtual scene image in the image according to a pose matrix of the mark pattern in the image.
  • An intelligent terminal comprising:
  • a processor; and a memory having stored thereon computer-readable instructions that, when executed by the processor, implement the image recognition and tracking method described above.
  • a computer readable storage medium having stored thereon a computer program that, when executed by a processor, implements an image recognition tracking method as described above.
  • For a captured image, the recognition result of the mark pattern in the image captured by the smart terminal is first obtained; the mark pattern in the image is then located according to the recognition result of the mark pattern, and target tracking is performed on the located mark pattern to obtain the translation information in space of the mark pattern in the image; finally, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the image.
  • With multi-sensor fusion in the smart terminal, the rotation angle no longer needs to be obtained through the target tracking process, which improves time performance as a whole, and also avoids the limitation that a rotation angle obtained through the target tracking process is very inaccurate or even impossible to calculate.
  • The combination of the target tracking process and multi-sensor fusion guarantees a fast tracking speed while also providing strong stability and accuracy, so tracking effect and time performance can both be achieved.
  • FIG. 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment;
  • FIG. 2 is a flowchart describing details of calculating, from the sensor data, the rotation angle of the smart terminal in space, according to an exemplary embodiment;
  • FIG. 3 is a flowchart describing details of step 230, according to an exemplary embodiment;
  • FIG. 4 is a flowchart describing details of step 130, according to the corresponding embodiment of FIG. 1;
  • FIG. 5 is a flowchart describing details of the step of performing, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image used for target tracking of the currently captured image, according to an exemplary embodiment;
  • FIG. 6 is a flowchart describing details of step 110, according to the corresponding embodiment of FIG. 1;
  • FIG. 7 is a block diagram of an augmented reality system in a smart terminal, according to an exemplary embodiment;
  • FIG. 8 is an implementation framework diagram of the image tracking step in the corresponding embodiment of FIG. 7;
  • FIG. 9 is a block diagram of an image recognition and tracking apparatus according to an exemplary embodiment;
  • FIG. 10 is a block diagram describing details of the target tracker, according to the corresponding embodiment of FIG. 9;
  • FIG. 11 is a block diagram of an image recognition and tracking apparatus according to another exemplary embodiment;
  • FIG. 12 is a block diagram describing details of the perspective transformer, according to an exemplary embodiment;
  • FIG. 13 is a block diagram of an apparatus, according to an exemplary embodiment.
  • FIG. 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment. In one exemplary embodiment, the image recognition and tracking method, as shown in FIG. 1, may include the following steps.
  • In step 110, a recognition result of the mark pattern in the image captured by the smart terminal is obtained.
  • The smart terminal is configured to perform the image recognition and tracking method of the present application; the smart terminal first acquires the image it has captured. It should be noted here that the image capture process of the smart terminal may be performed at the current moment or in advance. Through the image recognition and tracking process of the present application, a recognition and tracking process is performed for the image, whether the image was captured by the smart terminal just now or in advance.
  • It can be understood that the smart terminal performing image capture and the smart terminal performing the recognition and tracking process on the captured image may be the same smart terminal or two different smart terminals.
  • In the latter case, the smart terminal performing image capture passes the captured image and corresponding other data, such as the sensor data described later, to the smart terminal performing the recognition and tracking process.
  • In one exemplary embodiment, the image captured by the smart terminal is obtained through a shooting component configured in the smart terminal, for example, a built-in or external camera.
  • The captured image is thus a real scene image.
  • Specifically, as the smart terminal shoots the real environment, the image captured by the smart terminal reflects the real scene in the form of an image.
  • It can be understood that the real scene is a scene in the real environment; that is to say, the captured image is captured by the smart terminal carried by the user shooting the real environment.
  • It may be captured at the current moment or obtained in advance; this is not limited here.
  • The mark pattern is a pattern specified in advance; for the image, the mark pattern is present in the image through shooting a mark placed in the real environment.
  • The recognition result of the mark pattern in the image indicates whether the mark pattern exists in the image and the position of the mark pattern in the image. Specifically, when the mark pattern exists in the image, the obtained recognition result indicates that the mark pattern is recognized; at this point, target tracking can be performed on the image.
  • It should be noted that the recognition result of the mark pattern in the image may be output by matching the image with a preset mark image, or may be output after the mark pattern in the image is recognized in some other way; this is not limited here, as long as the recognition result of the mark pattern in the image can be obtained from the output information.
  • In a specific implementation of an exemplary embodiment, the recognition result of the mark pattern in the image captured by the smart terminal is obtained through matching of the image with a preset mark pattern, or through a user-specified trigger of the mark pattern in the captured image.
  • This configuration of how the recognition result of the mark pattern in the captured image is obtained provides multiple options for the smart terminal's image recognition and tracking process, so that it can flexibly adapt to various scenes and to the performance of the smart terminal, improving the reliability of image recognition and tracking.
  • In step 130, the mark pattern in the image is located according to the recognition result of the mark pattern, and target tracking is performed on the located mark pattern to obtain translation information in space of the mark pattern in the image.
  • Target tracking of the image refers to performing a target tracking process on the image; the target referred to here is the mark pattern in the image. Therefore, target tracking of the image must be triggered by the recognition result of the mark pattern, to avoid performing invalid target tracking and wasting computing resources, and to improve processing efficiency.
  • The triggered target tracking of the image is the process of computing the pose of the mark pattern in the image. In the exemplary embodiments of the present application, the mark pattern pose obtained by target tracking will include the translation information in space of the mark pattern in the image.
  • The space referred to corresponds to the space in which the real scene is located, that is, the physical space; it is the three-dimensional space constructed by the smart terminal. Therefore, the translation information in space of the mark pattern in the image is relative to the three orientations in that space.
  • The three orientations in space are the directions pointed to by the three coordinate axes of the constructed spatial coordinate system.
  • The spatial coordinate system is constructed relative to the physical space, and a transformation exists between the physical coordinate system and the constructed spatial coordinate system, so that the captured real environment can be accurately mapped into the space and the translation information in space obtained accurately.
  • The spatial coordinate system is a three-dimensional coordinate system comprising mutually perpendicular x, y, and z coordinate axes; the translation information in space of the mark pattern in the image is the translation distance computed with the x, y, and z coordinate axes as the reference.
  • Specifically, the translation information in space of the mark pattern in the image indicates the movement of the mark pattern in space; the translation information includes the translation distance on the horizontal plane of space and the vertical distance.
  • The translation distances on the horizontal plane of space correspond to the orientations of two coordinate axes in space, respectively.
  • Target tracking of the image can predict the translation of the mark pattern along the three orientations in space quickly and stably, so target tracking in the image is still performed, but the rotation information of the mark pattern in the image is no longer computed; this greatly improves the time performance and speed of target tracking.
  • In a specific implementation of an exemplary embodiment, the algorithm for implementing target tracking in the image may be a single-target tracking algorithm, may use continuous matching when the matching speed is fast enough, may be implemented using deep learning, or may even be a multi-target tracking algorithm; these are not enumerated one by one here.
  • In step 150, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into a pose matrix of the mark pattern in the image.
  • The smart terminal refers to the terminal device that performs the image recognition and tracking process of the present application; for example, the smart terminal may be a portable mobile terminal such as a smartphone or a tablet computer.
  • Various sensors are installed in the smart terminal. The sensor data output by the plurality of sensors while the smart terminal performs image capture can therefore be fused to obtain a rotation angle reflecting the rotation of the smart terminal. Because the smart terminal captures the image while outputting sensor data through the installed sensors, the rotation angle output by the multi-sensor fusion can be used to describe the rotation in the pose corresponding to the mark pattern in the image.
  • Multi-sensor fusion in the smart terminal can compute the rotation angle quickly and accurately, and cooperates with the target tracking of the image, avoiding the problem that the rotation angle computed in target tracking is inaccurate or even impossible to compute, thereby guaranteeing efficiency and accuracy.
  • After the translation information in space of the mark pattern is obtained through target tracking of the image, this translation information and the rotation angle output by the multi-sensor fusion together form the pose matrix of the mark pattern in the image.
  • Forming the pose matrix of the mark pattern in the image from the translation information and the rotation angle means taking the movement in space of the mark pattern indicated by the translation information, and the rotation indicated by the rotation angle, as elements that together form the pose matrix of the mark pattern.
  • For example, the movement in space of the mark pattern indicated by the translation information can be represented by the distances obtained relative to the three coordinate-axis orientations in space, and the rotation angle can likewise be represented by the rotation angles about each of the three coordinate axes in space, so that a six-degree-of-freedom mark pattern pose matrix can be obtained.
  • The distances corresponding to the three coordinate axes in space and the rotation angles about the three coordinate axes are then used as the elements of the matrix to form the pose matrix of the mark pattern.
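  • As a minimal sketch of this composition (the function names and the x-y-z Euler-angle convention below are illustrative assumptions, not taken from the patent), the three distances and the three per-axis rotation angles can be assembled into a homogeneous 4x4 pose matrix:

```python
import numpy as np

def rotation_from_angles(rx, ry, rz):
    """Compose a 3x3 rotation matrix from per-axis angles (radians)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def pose_matrix(translation, angles):
    """Six degrees of freedom -> homogeneous 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = rotation_from_angles(*angles)  # rotation from sensor fusion
    T[:3, 3] = translation                     # translation from target tracking
    return T
```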
  • The pose matrix of the mark pattern in the image describes the pose change of the mark pattern relative to its initial pose.
  • The computed pose matrix makes the subsequently implemented business scenario match the pose matrix, and thus adapt to the image and the mark pattern in the image.
  • For example, for a subsequently implemented augmented reality business scenario, the virtual scene image is projected according to the computed pose matrix, so that the virtual scene image projected into the image is adapted to its pose matrix, ensuring the accuracy and adaptability of the subsequent business scenario's implementation.
  • Under the combined effect of image target tracking and multi-sensor fusion, image recognition and tracking in the smart terminal can be computed very quickly while maintaining good tracking accuracy. This guarantees real-time performance on the smart terminal, that is, the mobile terminal, without affecting the tracking effect, making local real-time mark pattern tracking on the mobile terminal a reality with better stability.
  • In one exemplary embodiment, before step 150, the image recognition and tracking method may further include the following steps: obtaining sensor data output by a plurality of sensors when the smart terminal captures the image; and performing a multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space, the rotation angle being output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
  • As mentioned above, multi-sensor fusion operates on the data output by the plurality of sensors in the smart terminal, that is, the sensor data. The multi-sensor fusion algorithm is executed on the sensor data to compute the rotation angles of the smart terminal corresponding to the three orientations in space; the computed rotation angles correspond to the rotation in space of the mark pattern in the image.
  • Further, FIG. 2 is a flowchart describing details of calculating, from the sensor data, the rotation angle of the smart terminal in space, according to an exemplary embodiment. As shown in FIG. 2, this may include the following steps.
  • In step 210, sensor data output by a plurality of sensors when the smart terminal captures the image are obtained.
  • The capture of the image and the acquisition of the sensor data in the smart terminal are performed simultaneously, to ensure that the collected sensor data correspond to the attitude at which the smart terminal captured the image, so that calculations made from the sensor data are accurate with respect to the image.
  • In addition, data acquisition by the multiple sensors in the smart terminal may also be performed with the attitude held while the smart terminal captures the image, or before the smart terminal captures the image; this is not limited here.
  • The multiple sensors in the smart terminal that output sensor data when a real image is captured refer to all sensors in the smart terminal that can be used to compute the rotation angle of the smart terminal.
  • In one exemplary embodiment, the number of sensors is three.
  • The sensor data relate to the multiple sensors associated with the rotation of the smart terminal. For example, the sensor data may be data output by a gyroscope, an accelerometer, and a gravity sensor in the smart terminal.
  • When the image is captured by the smart terminal, the sensors perform data acquisition to output the sensor data associated with the image.
  • In step 230, the rotation angle of the smart terminal itself in space is calculated from the sensor data.
  • The rotation-angle calculation in space is performed on the sensor data to obtain the rotation angle relative to each orientation.
  • The orientations referred to are the coordinate-axis directions of the three-dimensional coordinate system constructed in the space.
  • FIG. 3 is a flowchart describing the details of step 230, according to an exemplary embodiment. Step 230 may include the following steps.
  • In step 231, the angular velocity in the sensor data is integrated to obtain coarse rotation values of the smart terminal relative to each orientation in space.
  • The sensor data include angular velocity, acceleration, and gravity-direction information.
  • The angular velocity is acquired by the gyroscope in the smart terminal, the acceleration is acquired by the accelerometer, and the gravity-direction information is acquired by the gravity sensor; that is, the sensor data are obtained by the plurality of sensors specified in the smart terminal.
  • The angular velocity extracted from the sensor data is first integrated to obtain coarse values of the rotation of the smart terminal relative to each orientation in space.
  • It should be noted that the device error of the gyroscope keeps accumulating during integration; therefore, an accurate value cannot be obtained in this way, and only coarse values of the rotation relative to each orientation are obtained.
  • In step 233, an auxiliary rotation-angle calculation is performed on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data, to obtain the rotation angles of the smart terminal relative to each orientation in space.
  • The coarse rotation values are refined using the acceleration and gravity-direction information in the sensor data to finally obtain accurate rotation angles.
  • The acceleration and gravity-direction information in the sensor data record the position and movement of the smart terminal. Therefore, Kalman filtering is performed with the acceleration and gravity-direction information as an aid, reducing the error in the coarse rotation values and yielding rotation angles whose error is greatly reduced.
  • Specifically, the acceleration and gravity-direction information in the sensor data are used as auxiliary inputs and sent to the Kalman filter together with the angular velocity; the Kalman filter then outputs the rotation angles of the smart terminal itself relative to each orientation in space.
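  • The sketch below illustrates the two-stage idea described above: integrate the angular velocity for coarse rotation values, then correct the drift-prone tilt axes with the acceleration and gravity information. The patent specifies a Kalman filter; a simpler complementary-filter blend is substituted here only to keep the correction step readable, and all names are illustrative assumptions:

```python
import numpy as np

def fuse(prev_angles, gyro, accel, dt, alpha=0.98):
    """prev_angles: (roll, pitch, yaw) in radians; gyro: rad/s; accel: m/s^2."""
    # 1) Coarse values: integrate angular velocity (drifts as gyro error accumulates).
    roll, pitch, yaw = prev_angles + gyro * dt
    # 2) Auxiliary calculation: the gravity direction gives an absolute tilt reference.
    ax, ay, az = accel
    roll_acc = np.arctan2(ay, az)
    pitch_acc = np.arctan2(-ax, np.hypot(ay, az))
    # 3) Blend: trust the integration short-term, gravity long-term.
    roll = alpha * roll + (1 - alpha) * roll_acc
    pitch = alpha * pitch + (1 - alpha) * pitch_acc
    return np.array([roll, pitch, yaw])  # yaw keeps only the integrated value
```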
  • FIG. 4 is a flowchart describing the details of step 130, according to the corresponding embodiment of FIG. 1.
  • Step 130 may include the following steps.
  • In step 131, the mark pattern in the image is located through the mark pattern indicated as recognized by the recognition result.
  • In step 133, target tracking is performed on the located mark pattern, and the translation distance of the mark pattern on the horizontal plane of space and the zoom size of the mark pattern relative to a pre-stored mark image are obtained from the performed target tracking.
  • After the recognition result of the mark pattern in the image is obtained through step 110, and under the indication that the mark pattern is recognized in the image, the target tracking process of the image is triggered and the mark pattern in the image is located.
  • Through the target tracking process, the moving distances of the mark pattern in the left-right and up-down directions on the horizontal plane can be obtained, that is, the translation distance of the mark pattern on the horizontal plane of space.
  • The mark image is an image stored in advance that contains the mark pattern; the mark pattern in the captured image is therefore in a proportional relationship with the mark image, that is, it corresponds to a zoom size, and this zoom size is obtained through the target tracking process.
  • In step 135, the vertical distance in space of the mark pattern in the image is calculated according to the zoom size and the size of the mark image; the vertical distance and the translation distance form the translation information.
  • The size of the pre-stored mark image is obtained, and with the help of the zoom size, the vertical distance in space of the mark pattern, that is, the translation distance along the vertical direction of space, is calculated; the vertical distance and the translation distance on the horizontal plane of space together form the translation information.
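  • A minimal sketch of this depth recovery, assuming a pinhole camera model in which apparent size is inversely proportional to distance; the reference distance is an assumed calibration value, not something given in the patent:

```python
def vertical_distance(zoom_size, reference_distance=1.0):
    """zoom_size: tracked pattern size / pre-stored mark image size."""
    # Pinhole model: if the pattern appears at half its stored size, it is twice as far.
    return reference_distance / zoom_size
```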
  • At the same time, the multi-sensor fusion algorithm implemented in the smart terminal quickly and accurately computes the rotation angles of the smart terminal along the three orientations in space, which likewise belong to the pose of the captured image and of the mark pattern in the image.
  • In one exemplary embodiment, step 110 further includes: the smart terminal continuously performing image capture and obtaining a recognition result of the mark pattern in the currently captured image.
  • Before step 130, the image recognition and tracking method further includes the following step:
  • relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image is performed to obtain a perspective image, which is used for target tracking of the currently captured image.
  • The recognition and tracking implemented for images is performed during the continuous image capture of the smart terminal; the images subjected to recognition and tracking are the images the smart terminal continuously captures.
  • That is, the obtained image is one frame of the images being captured, and as shooting continues, the next frame image will also undergo the recognition and tracking process.
  • The translation information in space of the mark pattern in the image is obtained through the target tracking process of the image. To ensure the accuracy of the translation information, to further simplify the target tracking process, and to improve processing speed and efficiency, the image is optimized first; that is, the perspective-transform pre-processing is performed.
  • The perspective image obtained by this pre-processing is what undergoes the target tracking process of the image, so that the perspective image used for target tracking is consistent in spatial angular pose with the image in which the mark pattern was first recognized; the target tracking thus implemented no longer needs to consider the rotation angle and obtains the translation information directly.
  • It should be noted that the pose matrix obtained by the image recognition and tracking method is relative to the initial pose, that is, the pose corresponding to the image in which the mark pattern was first recognized.
  • FIG. 5 is a flowchart describing details of the step of performing, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image used for target tracking of the currently captured image, according to an exemplary embodiment. This step may include the following steps.
  • In step 301, the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized is acquired, and this rotation angle is taken as the initial rotation angle.
  • In step 303, the included angle between the currently captured image and the image in which the mark pattern was first recognized is calculated according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle.
  • Through the multi-sensor fusion, the rotation angle of the current image is obtained; the initial rotation angle corresponding to the mark pattern in the first-recognized image is acquired, and the included angle is obtained by computing the difference between the rotation angle of the current image and the initial rotation angle.
  • A perspective image is then obtained by performing a perspective transformation of the currently captured image through the included angle.
  • Performing the perspective transformation on the captured image essentially corrects the captured image, canceling the rotation and distortion error between it and the image in which the mark pattern was first recognized, thereby facilitating subsequent target tracking.
  • In subsequent target tracking, this perspective image, rather than the image itself, is used to perform the target tracking process of the image.
  • After the translation information is obtained, it can be combined with the obtained included angle, which likewise corresponds to the three orientations in space, to form the pose matrix; in this way a six-degree-of-freedom camera pose matrix can be obtained quickly.
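  • A hedged sketch of this correction step, assuming OpenCV and known camera intrinsics K (an assumed calibration input): a pure rotation by the included angle induces the homography H = K R K^-1, which warps the current frame back to the angular pose of the first-recognized image:

```python
import cv2
import numpy as np

def deroll_frame(frame, included_angles, K):
    """included_angles: (rx, ry, rz) difference from the initial rotation angle."""
    rvec = np.asarray(included_angles, dtype=np.float64).reshape(3, 1)
    R, _ = cv2.Rodrigues(rvec)    # rotation vector -> rotation matrix
    H = K @ R @ np.linalg.inv(K)  # homography induced by a pure rotation
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))
```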
  • In one exemplary embodiment, step 110 in the embodiment shown in FIG. 1 may include:
  • performing matching between the image captured by the smart terminal and the pre-stored mark image, recognizing whether a mark pattern exists in the image, and obtaining the recognition result of the mark pattern in the image.
  • That is, the image is matched against the preset mark image; if the image matches the mark image, the mark pattern exists in the image and the mark pattern in the image is recognized, so the corresponding recognition result is obtained.
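  • The patent does not prescribe a particular matcher; as one possible implementation, the sketch below matches ORB features between the captured frame and the pre-stored mark image, with the ratio-test threshold and the minimum match count chosen as illustrative assumptions:

```python
import cv2

def recognize_mark(frame_gray, mark_gray, min_matches=25):
    """Return True if the pre-stored mark image is matched in the frame."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(mark_gray, None)
    kp2, des2 = orb.detectAndCompute(frame_gray, None)
    if des1 is None or des2 is None:
        return False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = matcher.knnMatch(des1, des2, k=2)
    good = [p[0] for p in pairs
            if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    return len(good) >= min_matches  # the recognition result
```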
  • FIG. 6 is a flowchart describing the details of step 110, according to the corresponding embodiment of FIG. 1.
  • Step 110 may include the following steps.
  • In step 111, a user instruction specifying the mark pattern in the captured image is received.
  • Recognition of whether the mark pattern exists in the image can also be realized through interaction with the user. Specifically, the captured image is displayed; the user can then view the captured image and confirm whether a mark pattern is present. If there is a mark pattern, the user triggers a designation operation on the mark pattern in the image; correspondingly, the smart terminal generates, in response to the designation operation, a user instruction specifying the mark pattern in the image.
  • The designation operation triggered by the user in the image may be an operation in which the user selects the mark pattern in the displayed image, or another operation, which is not limited here.
  • In step 113, the recognition result of the mark pattern in the image is obtained according to the user instruction.
  • Through user interaction, recognition of the mark pattern in the image is implemented simply and quickly, further improving the accuracy and efficiency of image recognition and tracking.
  • In one exemplary embodiment, after step 150, the image recognition and tracking method further includes:
  • performing projection of a preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
  • With the pose matrix of the mark pattern in the image, the virtual scene image can be projected into the image accordingly, thereby realizing various business scenarios, for example, building a personal augmented reality system in the smart terminal, or building augmented reality support for office systems.
  • For example, a mark image corresponding to a specific location on a desk or in an office in the real environment can be recognized, and business scenarios such as mail and conference notifications can be generated according to the pose matrix of the mark image; this can also be applied to remote and video conferences to create a sense of presence for the conference participants.
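  • A minimal sketch of the projection step, assuming camera intrinsics K and the 4x4 pose matrix formed above (the names are illustrative): a 3D point of the preset virtual scene is mapped into pixel coordinates of the image:

```python
import numpy as np

def project_point(point_3d, pose, K):
    """Project a virtual-scene point through the mark pattern's pose matrix."""
    p = pose @ np.append(point_3d, 1.0)  # world -> camera coordinates
    uv = K @ p[:3]
    return uv[:2] / uv[2]                # pixel coordinates for the overlay
```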
  • Through the exemplary embodiments described above, fast mark recognition and tracking can be implemented on the mobile terminal, that is, the smart terminal, making local real-time mark pattern recognition and tracking on the smart terminal a reality, with better stability and higher tracking accuracy than conventional feature-point tracking.
  • FIG. 7 is a block diagram of an augmented reality system in a smart terminal, according to an exemplary embodiment.
  • The smart terminal continuously shoots the real environment, continuously capturing images to form a video image.
  • The continuously captured images are frame images in the video image displayed by the augmented reality system in the smart terminal.
  • After one frame image is captured, matching against the mark image is performed, as shown in step 410 in the system framework. If the match succeeds, step 430 is performed for image tracking, and the pose matrix of the frame image is finally obtained, enabling projection of the virtual scene image for this frame image in the smart terminal's augmented reality system, thereby performing the augmented reality display.
  • When step 410 is performed and the mark image is not matched, the next frame image is awaited; likewise, once the image tracking step is performed successfully, the next frame image is awaited.
  • In this way, recognition and tracking are continuously performed for the captured images.
  • FIG. 8 is an implementation framework diagram of the image tracking step in the corresponding embodiment of FIG. 7.
  • As shown in FIG. 8, for each frame image in the video image, the corresponding perspective image is obtained with the cooperation of the multiple sensors, as shown in step 510 of the implementation framework.
  • Single-target tracking is then performed with the perspective image as input, and the pose matrix of the perspective transformation is obtained, as in step 530.
  • Because the pose matrix of the perspective transformation is obtained with the rotation eliminated, it describes only the translation information of the mark pattern in the image and does not include the rotation angle.
  • The rotation angle obtained by multi-sensor fusion, together with the pose matrix of the perspective transformation, forms the pose matrix of the mark pattern in the image. This provides a fast and stable implementation for the augmented reality system in the smart terminal, further accelerating performance, so trade-offs between time performance and recognition are no longer required.
  • The multiple sensors in the smart terminal are fully fused, and the stability and accuracy of tracking are kept at a high level.
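  • Tying the framework together, the per-frame sketch below combines the illustrative pieces above; all names, including the `state` container and its single-target `tracker`, are assumptions for illustration. Sensor fusion supplies the rotation, tracking on the de-rotated frame supplies the translation, and the two form the pose matrix:

```python
def track_frame(frame, sensors, state):
    # Rotation: multi-sensor fusion runs for every frame.
    state.angles = fuse(state.angles, sensors.gyro, sensors.accel, sensors.dt)
    if not state.recognized:
        # The recognition result of the mark pattern triggers tracking.
        if recognize_mark(frame, state.mark_image):
            state.recognized = True
            state.initial_angles = state.angles.copy()  # initial rotation angle
        return None
    # Perspective pre-processing: cancel rotation relative to the first image.
    included = state.angles - state.initial_angles
    warped = deroll_frame(frame, included, state.K)
    # Translation: single-target tracking on the de-rotated frame.
    tx, ty, zoom = state.tracker.update(warped)
    tz = vertical_distance(zoom)
    # Pose matrix: translation from tracking plus rotation from fusion.
    return pose_matrix((tx, ty, tz), state.angles)
```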
  • The following are apparatus embodiments of the present application, which may be configured to perform the embodiments of the image recognition and tracking method described above.
  • For details not disclosed in the apparatus embodiments, please refer to the embodiments of the image recognition and tracking method of the present application.
  • FIG. 9 is a block diagram of an image recognition and tracking device according to an exemplary embodiment.
  • the image recognition and tracking device may include, but is not limited to, a recognition result obtainer 710, a target tracker 730, and a pose obtainer 750.
  • the recognition result obtainer 710 is configured to obtain a recognition result of the mark pattern in the image captured by the smart terminal.
  • the target tracker 730 is configured to locate the mark pattern in the image according to the recognition result of the mark pattern, and perform target tracking by the positioned mark pattern to obtain translation information of the mark pattern in the space in the image.
  • The pose obtainer 750 is configured to form a pose matrix of the mark pattern in the image from the translation information and the rotation angle output by multi-sensor fusion in the smart terminal.
  • FIG. 10 is a block diagram depicting details of a target tracker, shown in accordance with the corresponding embodiment of Figure 9.
  • the target tracker 730 may include, but is not limited to, a tag locator 731, a tracking executor 733, and a panning information former 735.
  • the mark locator 731 is configured to recognize the mark pattern in the positioned image by the mark pattern indicated by the recognition result.
  • the tracking actuator 733 is configured to perform target tracking according to the positioned marking pattern, and obtain the translation distance of the marking pattern in the spatial plane of the image and the scaling of the marking pattern relative to the pre-stored marking image from the target tracking performed.
  • the translation information former 735 is configured to calculate a vertical distance, a vertical distance, and a translation distance of the mark pattern in the image according to the zoom size and the size of the mark image to form translation information.
  • FIG. 11 is a block diagram of an image recognition and tracking apparatus according to another exemplary embodiment, including but not limited to a data obtainer 810 and a multi-sensor fuser 830.
  • The data obtainer 810 is configured to obtain sensor data output by the plurality of sensors when the smart terminal captures the image.
  • The multi-sensor fuser 830 is configured to perform a multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space; the rotation angle is output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
  • the recognition result obtainer 710 is further configured to continuously perform image capture by the smart terminal and obtain a recognition result of the mark pattern in the currently captured image.
  • The image recognition and tracking apparatus also includes a perspective transformer.
  • The perspective transformer is configured to perform, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image used for target tracking of the currently captured image.
  • FIG. 12 is a block diagram describing details of the perspective transformer, according to an exemplary embodiment.
  • The perspective transformer 910 may include, but is not limited to, an initial rotation obtainer 911, a rotation transformer 913, and an image perspective transformer 915.
  • The initial rotation obtainer 911 is configured to acquire the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, taking that rotation angle as the initial rotation angle.
  • The rotation transformer 913 is configured to calculate the included angle between the currently captured image and the image in which the mark pattern was first recognized, according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle.
  • The image perspective transformer 915 is configured to obtain the perspective image by performing a perspective transformation of the currently captured image through the included angle.
  • In one exemplary embodiment, the recognition result obtainer 710 shown in FIG. 9 is further configured to perform matching between the captured image and the mark image, recognize whether a mark pattern exists in the image, and obtain the recognition result of the mark pattern in the image.
  • The image recognition and tracking apparatus further includes, but is not limited to, a projector.
  • The projector is configured to perform projection of the preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
  • FIG. 13 is a block diagram of an apparatus, according to an exemplary embodiment.
  • The device 900 may be the smart terminal 110 in the implementation environment referred to above.
  • the smart terminal 110 may be a terminal device such as a smartphone or a tablet.
  • device 900 can include one or more of the following components: processing component 902, memory 904, power component 906, multimedia component 908, audio component 910, sensor component 914, and communication component 916.
  • Processing component 902 typically controls the overall operation of device 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations, and the like.
  • Processing component 902 can include one or more processors 918 to execute instructions to perform all or part of the steps of the methods described above.
  • In addition, processing component 902 can include one or more modules to facilitate interaction between processing component 902 and other components.
  • processing component 902 can include a multimedia module to facilitate interaction between multimedia component 908 and processing component 902.
  • Memory 904 is configured to store various types of data to support operation at device 900. Examples of such data include instructions for any application or method operating on device 900.
  • The memory 904 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
  • Also stored in memory 904 are one or more modules configured to be executed by the one or more processors 918 to perform all or part of the steps of the methods described above.
  • Power component 906 provides power to various components of device 900.
  • Power component 906 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 900.
  • The multimedia component 908 includes a screen that provides an output interface between the device 900 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel. If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • In some embodiments, the screen may also include an Organic Light-Emitting Diode (OLED) display.
  • the audio component 910 is configured to output and/or input an audio signal.
  • the audio component 910 includes a microphone (Microphone, MIC for short) that is configured to receive an external audio signal when the device 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 904 or transmitted via communication component 916.
  • audio component 910 also includes a speaker configured to output an audio signal.
  • Sensor assembly 914 includes one or more sensors for providing device 900 with various aspects of status assessment.
  • For example, sensor component 914 can detect an open/closed state of device 900 and the relative positioning of components, and can also detect a change in position of device 900 or one of its components and a temperature change of device 900.
  • the sensor component 914 can also include a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 916 is configured to facilitate wired or wireless communication between device 900 and other devices.
  • the device 900 can access a wireless network based on a communication standard, such as WiFi (Wireless Fidelity).
  • communication component 916 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • In one exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communication.
  • For example, the NFC module can be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth technology, and other technologies.
  • In an exemplary embodiment, the device 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), digital signal processors, digital signal processing devices, programmable logic devices, field programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components, configured to perform the methods described above.
  • The present application further provides a smart terminal, which can be used in the foregoing implementation environment to perform all or part of the steps of the image recognition and tracking method shown in any of FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, and FIG. 6.
  • the smart terminal includes:
  • a memory configured to store processor executable instructions
  • a processor configured to execute: obtaining a recognition result of a mark pattern in an image captured by the smart terminal; locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image; and forming a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in the smart terminal.
  • In an exemplary embodiment, a storage medium is also provided, which is a computer-readable storage medium, for example, a transitory or non-transitory computer-readable storage medium including instructions.
  • The storage medium is, for example, the memory 904 including instructions that are executable by the processor 918 of the device 900 to perform the methods described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

The present application discloses an image recognition and tracking method. The method includes: obtaining a recognition result of a mark pattern in an image; locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image; and forming a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in a smart terminal.

Description

Image recognition and tracking method and apparatus, smart terminal, and readable storage medium
Technical Field
The present application relates to the field of Internet application technologies, and in particular to an image recognition and tracking method and apparatus, a smart terminal, and a readable storage medium.
Background
With the rapid development of Internet application technologies, a smart terminal estimates the pose corresponding to an obtained image by performing target tracking on the image, and implements various interactive applications based on the image according to the obtained pose. The pose corresponding to the image describes the translation and rotation in physical space of the physical target corresponding to the captured image.
Target tracking is implemented based on feature-point tracking. In general, the more feature points are tracked and the more complex the feature-point descriptors, the better the tracking effect and the more accurate the estimated pose; however, the running speed also becomes slower. That is, existing target tracking is limited by a conflict between time performance and tracking effect.
Tracking based solely on feature points inevitably forces a trade-off between time performance and tracking effect. In applications of existing target tracking on smart terminals, in order to obtain better time performance and because of limited processing resources, mostly simple feature points are used to achieve faster feature-point extraction and tracking speeds; the tracking accuracy, however, is then very low, and target tracking on a smart terminal cannot achieve both a good tracking effect and good time performance.
Summary
In order to solve the technical problem in the related art that the implementation of target tracking in images cannot achieve both tracking effect and time performance, an object of the present application is to provide an image recognition and tracking method and apparatus that address the defect in the existing technology that tracking effect and time performance cannot be guaranteed at the same time.
An image recognition and tracking method, the method comprising:
obtaining a recognition result of a mark pattern in an image captured by a smart terminal;
locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image;
forming a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in the smart terminal.
In one exemplary embodiment, locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain the translation information in space of the mark pattern in the image, includes:
locating the mark pattern in the image through the mark pattern indicated as recognized by the recognition result;
performing target tracking on the located mark pattern, and obtaining, from the performed target tracking, a translation distance of the mark pattern in the image on the horizontal plane of space and a zoom size of the mark pattern relative to a pre-stored mark image;
calculating a vertical distance in space of the mark pattern in the image according to the zoom size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
In one exemplary embodiment, before the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the real scene image, the method further includes:
obtaining sensor data output by a plurality of sensors when the smart terminal captures the image;
performing a multi-sensor fusion algorithm on the sensor data to calculate a rotation angle of the smart terminal in space, the rotation angle being output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
In one exemplary embodiment, performing the multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space includes:
obtaining the sensor data output by the plurality of sensors when the smart terminal captures the image;
calculating, from the sensor data, the rotation angle of the smart terminal itself in space.
In one exemplary embodiment, calculating, from the sensor data, the rotation angle of the smart terminal itself in space includes:
integrating the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to each orientation in space;
performing an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to each orientation in space.
In one exemplary embodiment, obtaining the recognition result of the mark pattern in the image captured by the smart terminal includes:
the smart terminal continuously performing image capture and obtaining the recognition result of the mark pattern in the currently captured image;
before the locating of the mark pattern in the image according to the recognition result of the mark pattern and the target tracking performed on the located mark pattern to obtain the translation information in space of the mark pattern in the image, the method further includes:
performing, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image, the perspective image being used for target tracking of the currently captured image.
In one exemplary embodiment, performing, relative to the image in which the mark pattern was first recognized, the perspective-transform pre-processing of the currently captured image to obtain the perspective image used for target tracking of the currently captured image includes:
acquiring the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, and taking that rotation angle as an initial rotation angle;
calculating an included angle between the currently captured image and the image in which the mark pattern was first recognized, according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle;
performing a perspective transformation of the currently captured image through the included angle to obtain the perspective image.
In one exemplary embodiment, after the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the image, the method further includes:
performing projection of a preset image into the image according to the pose matrix of the mark pattern in the image.
In one exemplary embodiment, obtaining the recognition result of the mark pattern in the image captured by the smart terminal includes:
obtaining the recognition result of the mark pattern through matching of the image with a preset mark pattern, or through a user-specified trigger of the mark pattern in the captured image.
In one exemplary embodiment, forming the pose matrix of the mark pattern in the image from the translation information and the rotation angle output by multi-sensor fusion in the smart terminal includes:
taking the movement in space of the mark pattern indicated by the translation information, and the rotation indicated by the rotation angle, as elements that together form the pose matrix of the mark pattern.
An image recognition and tracking apparatus, characterized in that the apparatus comprises:
a recognition result obtainer configured to obtain a recognition result of a mark pattern in an image captured by a smart terminal;
a target tracker configured to locate the mark pattern in the image according to the recognition result of the mark pattern, and perform target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image;
a pose obtainer configured to form a pose matrix of the mark pattern in the image from the translation information and a rotation angle output by multi-sensor fusion in the smart terminal.
In one exemplary embodiment, the target tracker includes:
a mark locator configured to locate the mark pattern in the image through the mark pattern indicated as recognized by the recognition result;
a tracking executor configured to perform target tracking on the located mark pattern, and obtain, from the performed target tracking, a translation distance of the mark pattern in the image on the horizontal plane of space and a zoom size of the mark pattern relative to a pre-stored mark image;
a translation information former configured to calculate a vertical distance in space of the mark pattern in the image according to the zoom size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
In one exemplary embodiment, the apparatus further includes:
a data obtainer configured to obtain sensor data output by a plurality of sensors when the smart terminal captures the image;
a multi-sensor fuser configured to perform a multi-sensor fusion algorithm on the sensor data to calculate a rotation angle of the smart terminal in space, the rotation angle being output by the multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
In one exemplary embodiment, the multi-sensor fuser is configured to:
obtain the sensor data output by the plurality of sensors when the smart terminal captures the image;
calculate, from the sensor data, the rotation angle of the smart terminal itself in space.
In one exemplary embodiment, in calculating, from the sensor data, the rotation angle of the smart terminal itself in space, the multi-sensor fuser is configured to:
integrate the angular velocity in the sensor data to obtain coarse rotation values of the smart terminal relative to each orientation in space;
perform an auxiliary rotation-angle calculation on the coarse rotation values according to the acceleration and gravity-direction information in the sensor data to obtain the rotation angles of the smart terminal relative to each orientation in space.
In one exemplary embodiment, the recognition result obtainer is further configured such that the smart terminal continuously performs image capture and obtains the recognition result of the mark pattern in the currently captured image;
the apparatus further includes:
a perspective transformer configured to perform, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image, the perspective image being used for target tracking of the currently captured image.
In one exemplary embodiment, the perspective transformer includes:
an initial rotation obtainer configured to acquire the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, taking that rotation angle as an initial rotation angle;
a rotation transformer configured to calculate an included angle between the currently captured image and the image in which the mark pattern was first recognized, according to the rotation angle output by the multi-sensor fusion in the smart terminal and the initial rotation angle;
an image perspective transformer configured to perform a perspective transformation of the currently captured image through the included angle to obtain the perspective image.
In one exemplary embodiment, the apparatus further includes:
a projector configured to perform projection of a preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
A smart terminal, comprising:
a processor; and
a memory having stored thereon computer-readable instructions that, when executed by the processor, implement the image recognition and tracking method described above.
A computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the image recognition and tracking method described above.
The technical solutions provided by the embodiments of the present application may include the following beneficial effects:
For a captured image, the recognition result of the mark pattern in the image captured by the smart terminal is first obtained; the mark pattern in the image is then located according to the recognition result of the mark pattern, and target tracking is performed on the located mark pattern to obtain the translation information in space of the mark pattern in the image; finally, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into the pose matrix of the mark pattern in the image. With multi-sensor fusion in the smart terminal, the rotation angle no longer needs to be obtained through the target tracking process, which improves time performance as a whole and also avoids the limitation that a rotation angle obtained through the target tracking process is very inaccurate or even impossible to calculate. The combination of the target tracking process and multi-sensor fusion guarantees a fast tracking speed while also providing strong stability and accuracy, so tracking effect and time performance can both be achieved.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present application.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.
FIG. 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment;
FIG. 2 is a flowchart describing details of calculating, from the sensor data, the rotation angle of the smart terminal in space, according to an exemplary embodiment;
FIG. 3 is a flowchart describing details of step 230, according to an exemplary embodiment;
FIG. 4 is a flowchart describing details of step 130, according to the corresponding embodiment of FIG. 1;
FIG. 5 is a flowchart describing details of the step of performing, relative to the image in which the mark pattern was first recognized, perspective-transform pre-processing of the currently captured image to obtain a perspective image used for target tracking of the currently captured image, according to an exemplary embodiment;
FIG. 6 is a flowchart describing details of step 110, according to the corresponding embodiment of FIG. 1;
FIG. 7 is a block diagram of an augmented reality system in a smart terminal, according to an exemplary embodiment;
FIG. 8 is an implementation framework diagram of the image tracking step in the corresponding embodiment of FIG. 7;
FIG. 9 is a block diagram of an image recognition and tracking apparatus according to an exemplary embodiment;
FIG. 10 is a block diagram describing details of the target tracker, according to the corresponding embodiment of FIG. 9;
FIG. 11 is a block diagram of an image recognition and tracking apparatus according to another exemplary embodiment;
FIG. 12 is a block diagram describing details of the perspective transformer, according to an exemplary embodiment;
FIG. 13 is a block diagram of an apparatus, according to an exemplary embodiment.
具体实施方式
这里将详细地对示例性实施例执行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本申请相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本申请的一些方面相一致的装置和方法的例子。
FIG. 1 is a flowchart of an image recognition and tracking method according to an exemplary embodiment. In an exemplary embodiment, as shown in FIG. 1, the image recognition and tracking method may include the following steps.
In step 110, a recognition result of a mark pattern in an image captured by a smart terminal is obtained.
Here, the smart terminal is configured to perform the image recognition and tracking method of the present application, and it first acquires the image it has captured. It should be noted that the image capture by the smart terminal may be performed at the present moment or in advance. Through the image recognition and tracking process of the present application, recognition and tracking are performed for an image that the smart terminal has captured either in real time or beforehand.
Of course, it can be understood that the smart terminal performing the image capture and the smart terminal currently performing recognition and tracking on the captured image may be the same smart terminal or two different smart terminals. In the case of two different smart terminals, the smart terminal performing the image capture needs to transmit the captured image and the corresponding other data, for example the sensor data mentioned later, to the smart terminal performing the recognition and tracking process.
In an exemplary embodiment, the image captured by the smart terminal is obtained through a shooting component configured in the smart terminal, for example, any of various built-in or external cameras. The captured image is therefore a real-scene image.
Specifically, as the smart terminal shoots the real environment, the image it captures reflects the real scene in the form of an image. It can be understood that the real scene is the scene in the real environment. That is, the captured image is obtained by the smart terminal carried by the user shooting the real environment; it should be further noted that it may be captured at the present moment or obtained in advance, which is not limited here.
The mark pattern is a pre-designated pattern. For an image, the mark pattern is present in the image as a result of shooting a marker arranged in the real environment.
The recognition result of the mark pattern in the image indicates whether the mark pattern exists in the image and where the mark pattern is located in the image. Specifically, when the mark pattern exists in the image, the obtained recognition result indicates that the mark pattern has been recognized, and target tracking can then be performed on this image.
It should be noted that the recognition result of the mark pattern in the image may be output by matching the image against a preset mark image, or output after the recognition of the mark pattern in the image is completed in some other way; this is not limited here, as long as the recognition result of the mark pattern in the image can be obtained from the output information.
In a specific implementation of an exemplary embodiment, the recognition result of the mark pattern in the image captured by the smart terminal is obtained either by matching the image against a preset mark pattern, or upon a trigger by which the user designates the mark pattern in the captured image.
This configurable way of obtaining the recognition result of the mark pattern in the captured image provides multiple approaches for the image recognition and tracking process of the smart terminal, so that it can flexibly adapt to various scenarios and to the performance of the smart terminal, improving the reliability of image recognition and tracking.
In step 130, the mark pattern in the image is located according to the recognition result of the mark pattern, and target tracking is performed on the located mark pattern to obtain translation information in space of the mark pattern in the image.
Here, target tracking of the image refers to performing a target tracking process on the image, the target being the mark pattern in the image. Target tracking of the image is therefore necessarily triggered under the effect of the recognition result of the mark pattern, so as to avoid invalid target tracking, avoid wasting computing resources, and improve processing efficiency.
The triggered target tracking of the image is a process of computing the pose of the mark pattern in the image. In the exemplary embodiments of the present application, the pose of the mark pattern obtained by target tracking includes the translation information in space of the mark pattern in the image.
The space referred to corresponds to the space in which the real scene is located, i.e., the physical space; it is the three-dimensional space constructed by the smart terminal. The translation information in space of the mark pattern in the image is therefore defined relative to three orientations in space.
The three orientations in space are the directions of the three coordinate axes of the constructed spatial coordinate system. The spatial coordinate system is constructed relative to the physical space, and a transformation exists between the physical coordinate system and the constructed spatial coordinate system; only in this way can the captured real environment be accurately mapped into the space and the translation information in space be accurately obtained.
The spatial coordinate system is a three-dimensional coordinate system with mutually perpendicular x, y, and z axes. The translation information in space of the mark pattern in the image consists of translation distances computed with respect to the x, y, and z axes.
Specifically, the translation information in space of the mark pattern in the image indicates the movement of the mark pattern in space, and includes a translation distance on the spatial horizontal plane and a vertical distance. The translation distance on the spatial horizontal plane corresponds to two of the coordinate-axis orientations in space.
Target tracking of the image can quickly and stably predict the translation of the mark pattern along the three orientations in space; target tracking in the image is therefore still performed, but the rotation information of the mark pattern in the image is no longer computed, which greatly improves the time performance and speed of target tracking.
In a specific implementation of an exemplary embodiment, the algorithm used for target tracking in the image may be a single-target tracking algorithm, continuous matching when the matching speed is fast enough, a deep learning approach, or even a multi-target tracking algorithm, which are not enumerated one by one here.
In step 150, the translation information and the rotation angle output by multi-sensor fusion in the smart terminal are formed into a pose matrix of the mark pattern in the image.
Here, the smart terminal refers to the terminal device performing the image recognition and tracking process of the present application; for example, it may be a portable mobile terminal such as a smartphone or a tablet computer. Various sensors are installed in the smart terminal. The sensor data output by the plurality of sensors in the smart terminal used for image capture can therefore be fused to obtain a rotation angle reflecting the rotation undergone by the smart terminal. Since the smart terminal both captures the image and outputs sensor data through its installed sensors, the rotation angle output by multi-sensor fusion can be used to describe the rotation in the pose corresponding to the mark pattern in the image.
Multi-sensor fusion in the smart terminal can compute the rotation angle quickly and accurately and, in cooperation with target tracking of the image, avoids the problem that the rotation angle computed in target tracking is inaccurate or even impossible to compute, thereby ensuring both efficiency and accuracy.
After the translation information in space of the mark pattern is obtained through target tracking of the image, this translation information and the rotation angle output by multi-sensor fusion together form the pose matrix of the mark pattern in the image.
Forming the translation information and the rotation angle together into the pose matrix of the mark pattern in the image means taking the movement in space of the mark pattern indicated by the translation information and the rotation indicated by the rotation angle as elements to jointly form the pose matrix of the mark pattern.
For example, the movement in space of the mark pattern indicated by the translation information can be expressed by the distances obtained relative to the three coordinate-axis orientations in space, and the rotation angle can likewise be expressed by the rotation angles about each of the three coordinate axes in space; a pose matrix of the mark pattern with six degrees of freedom can thus be obtained.
In this way, the distances corresponding to the three coordinate axes in space and the rotation angles about the three coordinate axes are taken as matrix elements to constitute the pose matrix of the mark pattern.
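By way of illustration only, and not as part of the original disclosure, the following minimal Python sketch shows how such a six-degree-of-freedom pose matrix could be assembled from the three translation distances and the three axis rotation angles; the Z-Y-X Euler-angle convention and all function names are assumptions of the sketch, since the embodiments do not prescribe a particular parameterization:

```python
import numpy as np

def euler_to_rotation(rx, ry, rz):
    """3x3 rotation matrix from per-axis angles (radians), applied in
    Z-Y-X order. The order is an assumption of this sketch."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def make_pose_matrix(tx, ty, tz, rx, ry, rz):
    """4x4 homogeneous pose matrix: the rotation block comes from the
    fused sensor angles, the translation column from target tracking."""
    pose = np.eye(4)
    pose[:3, :3] = euler_to_rotation(rx, ry, rz)
    pose[:3, 3] = [tx, ty, tz]
    return pose
```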
The pose matrix of the mark pattern in the image describes the pose change undergone by the mark pattern relative to its initial pose. The computed pose matrix ensures that the subsequently implemented business scenario matches the pose matrix and is thus adapted to the image and to the mark pattern in the image.
For example, for a subsequently implemented augmented reality business scenario, the virtual scene image is projected according to the computed pose matrix, so that the virtual scene image projected into the image is adapted to its pose matrix, ensuring the precision and adaptability of the subsequent business implementation.
In the exemplary embodiments described above, driven by target tracking of the image and multi-sensor fusion, image recognition and tracking in the smart terminal can complete its computation at very high speed while maintaining good tracking accuracy; this guarantees real-time performance on the smart terminal, i.e., the mobile side, without compromising the tracking quality, making local real-time mark pattern tracking on the mobile side a reality with better stability.
In an exemplary embodiment, before step 150, the image recognition and tracking method may further include the following steps.
Sensor data output by a plurality of sensors when the smart terminal captures the image is obtained.
A multi-sensor fusion algorithm is performed on the sensor data to calculate the rotation angle of the smart terminal in space; the rotation angle is output by multi-sensor fusion and is used to form the pose matrix of the mark pattern in the image.
As described above, multi-sensor fusion operates on the data output by the plurality of sensors in the smart terminal, i.e., the sensor data. The multi-sensor fusion algorithm is executed on the sensor data to calculate the rotation angles of the smart terminal in space corresponding to each of the three orientations; the calculated rotation angles correspond to the rotation undergone in space by the mark pattern in the image.
Further, FIG. 2 is a flowchart describing details of calculating, from the sensor data, the rotation angle of the smart terminal in space according to an exemplary embodiment. As shown in FIG. 2, this may include the following steps.
In step 210, sensor data output by a plurality of sensors when the smart terminal captures the image is obtained.
Here, in order to predict the pose of the mark pattern in the image under multi-sensor fusion, the capture of the image and the collection of sensor data in the smart terminal are performed simultaneously, to ensure that the collected sensor data corresponds to the attitude of the smart terminal when capturing the image, so that computations based on the sensor data are accurate with respect to the image.
In addition, the data collection by the plurality of sensors in the smart terminal may also be performed while the smart terminal holds its attitude at image capture, or performed before the smart terminal captures the image with the attitude then held for capture, which is not limited here.
The plurality of sensors in the smart terminal used to output sensor data when capturing the real image refers to all sensors in the smart terminal that can be used to compute the rotation angle of the smart terminal. In an exemplary embodiment, the number of sensors is three.
The sensor data relates to the plurality of sensors associated with the rotation of the smart terminal. For example, the sensor data may be the data output by the gyroscope, the accelerometer, and the gravity sensor in the smart terminal.
When the smart terminal captures an image, the sensors collect data so as to output the sensor data associated with this image.
In step 230, the rotation angle of the smart terminal itself in space is calculated from the sensor data.
Here, the rotation angle in space is calculated from the sensor data to obtain a rotation angle relative to each orientation. It should be noted that the orientations referred to are the coordinate-axis directions of the three-dimensional coordinate system constructed in space.
Specifically, FIG. 3 is a flowchart describing details of step 230 according to an exemplary embodiment. As shown in FIG. 3, step 230 may include the following steps.
In step 231, the angular velocity in the sensor data is integrated to obtain rough rotation values of the smart terminal relative to each orientation in space.
Here, the sensor data includes angular velocity, acceleration, and gravity direction information. The angular velocity is collected by the gyroscope in the smart terminal, the acceleration by the accelerometer, and the gravity direction information by the gravity sensor.
As image capture proceeds, sensor data is likewise obtained from the designated sensors in the smart terminal. The angular velocity is extracted from the sensor data and first integrated, yielding rough rotation values of the smart terminal relative to each orientation in space.
That is, during the integration of the angular velocity, the device error of the gyroscope keeps accumulating; an accurate value therefore cannot be obtained, and only rough rotation values relative to each orientation are obtained.
For example, it can be understood that there are three coordinate-axis directions in space, namely the directions of the x, y, and z axes of the coordinate system established for this space; these three directions are the orientations referred to, and the angular velocity in the sensor data is integrated for each of them to obtain the rough rotation value for each orientation.
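As a brief, hedged illustration of this integration step, the Python sketch below accumulates angular-velocity samples into rough per-axis rotation values; the uniform sampling interval dt and the (N, 3) array layout are assumptions of the sketch:

```python
import numpy as np

def integrate_gyro(gyro_samples, dt):
    """Integrate angular velocity (rad/s) into rough rotation values.

    gyro_samples: (N, 3) array of angular velocities about the x, y, z axes.
    Returns an (N, 3) array of accumulated angles. Gyroscope bias
    accumulates along with the signal, so these values are only rough.
    """
    return np.cumsum(gyro_samples * dt, axis=0)
```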
In step 233, an auxiliary calculation of the rotation angle is performed on the rough rotation values according to the acceleration and gravity direction information in the sensor data, to obtain the rotation angles of the smart terminal in space relative to each orientation.
Here, after the rough rotation values of the smart terminal relative to each orientation in space are obtained, the acceleration and gravity direction information in the sensor data are used to refine these rough rotation values, so as to finally obtain accurate rotation angles.
Both the acceleration and the gravity direction information in the sensor data record the position and movement of the smart terminal; with the acceleration and gravity direction information as assistance, Kalman filtering is therefore performed to reduce the error in the rough rotation values and obtain rotation angles whose error is greatly reduced.
Specifically, after the rough rotation values of the smart terminal relative to each orientation in space are computed, the acceleration and gravity direction information in the sensor data serve as auxiliary inputs and are fed, together with the angular velocity, into a Kalman filter, which outputs the rotation angles of the smart terminal itself in space relative to each orientation.
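To make the fusion concrete, here is a minimal per-axis Kalman filter sketch in Python. It assumes a scalar state (the angle about one axis), uses the gyro rate as the prediction input and a gravity-referenced tilt angle derived from the accelerometer as the measurement; the noise parameters and the simple state model are illustrative assumptions rather than the filter design of the embodiments:

```python
class AngleKalman:
    """Scalar Kalman filter for one rotation axis: predict with the
    integrated gyro rate, correct with an angle referenced to gravity."""

    def __init__(self, q=1e-4, r=1e-2):
        self.angle = 0.0  # state: estimated angle (rad)
        self.p = 1.0      # state covariance
        self.q = q        # process noise (gyro drift)
        self.r = r        # measurement noise (accelerometer jitter)

    def update(self, gyro_rate, accel_angle, dt):
        # Predict: integrate the angular velocity (the rough rotation value).
        self.angle += gyro_rate * dt
        self.p += self.q
        # Correct: blend in the gravity-referenced angle measurement.
        k = self.p / (self.p + self.r)  # Kalman gain
        self.angle += k * (accel_angle - self.angle)
        self.p *= 1.0 - k
        return self.angle
```

For the roll axis, for example, accel_angle could be computed as atan2(ay, az) from the accelerometer reading while the terminal is not accelerating strongly.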
FIG. 4 is a flowchart describing details of step 130 according to the embodiment corresponding to FIG. 1. As shown in FIG. 4, step 130 may include the following steps.
In step 131, the mark pattern in the image is located upon the recognition result indicating that the mark pattern has been recognized.
In step 133, target tracking is performed according to the located mark pattern, and the translation distance of the mark pattern in the image on the spatial horizontal plane and the scaling size of the mark pattern relative to the pre-stored mark image are obtained from the target tracking.
Here, after the recognition result of the mark pattern in the image is obtained in step 110, and under the indication that the mark pattern in the image has been recognized, the target tracking process of the image is triggered and the mark pattern in the image is located.
In the target tracking process of the image, the movement distances of the image on the spatial horizontal plane corresponding to the left-right and up-down orientations can be obtained; these constitute the translation distance of the mark pattern on the spatial horizontal plane.
As mentioned above, the mark image is a pre-stored image containing the mark pattern; the mark pattern in the image therefore bears a certain proportional relationship to the mark image, i.e., corresponds to a scaling size, which is obtained by performing the target tracking process.
In step 135, the vertical distance in space of the mark pattern in the image is calculated according to the scaling size and the size of the mark image, and the vertical distance and the translation distance form the translation information.
Here, the size of the pre-stored mark image is acquired, and the vertical distance of the mark pattern in space, i.e., the translation distance in the vertical direction in space, is calculated with the aid of the scaling size, so that the vertical distance and the translation distance on the spatial horizontal plane together form the translation information.
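As a sketch of this calculation under a pinhole-camera assumption (apparent size inversely proportional to distance), where ref_distance is a hypothetical calibration value, namely the distance at which the stored mark image corresponds to a scaling size of 1.0, and neither name is taken from the embodiments:

```python
def vertical_distance(scale, ref_distance):
    """Distance of the marker along the camera axis from the tracked scale.

    Under a pinhole model the apparent size is inversely proportional to
    distance: scale = observed_size / stored_size, so the distance is
    ref_distance / scale.
    """
    return ref_distance / scale
```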
Through the exemplary embodiments described above, the multi-sensor fusion algorithm is implemented in the smart terminal, quickly and accurately computing the rotation angles of the smart terminal along the three orientations in space, which also constitute the pose of the captured image and of the mark pattern in the image.
In another exemplary embodiment, step 110 further includes: continuously capturing images by the smart terminal and obtaining the recognition result of the mark pattern in the currently captured image.
Correspondingly, before step 130, the image recognition and tracking method further includes the following step.
Perspective transformation preprocessing is performed on the currently captured image relative to the image in which the mark pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image.
It should first be noted here that the recognition and tracking implemented for images takes place during the continuous image capture performed by the smart terminal. In other words, the image on which recognition and tracking is performed is each image continuously captured by the smart terminal.
For example, when the real scene is shot frame by frame, the obtained image is one frame of the shooting; as the shooting continues, the next frame will likewise undergo the recognition and tracking process.
As described above, the translation information in space of the mark pattern in the image is obtained through the target tracking process of the image. To ensure the accuracy of the translation information, further simplify the target tracking process, and improve processing speed and efficiency, the image is optimized before target tracking is performed on it, i.e., perspective transformation preprocessing is performed.
The target tracking process of the image is performed on the perspective image obtained through the perspective transformation preprocessing, so that the perspective image used for target tracking is consistent in spatial angular attitude with the image in which the mark pattern was first recognized; the rotation angle therefore no longer needs to be considered in the target tracking, and the translation information can be obtained directly.
It should be added that the pose matrix obtained by the image recognition and tracking method is relative to the initial pose, i.e., the pose corresponding to the image in which the mark pattern was first recognized.
Further, FIG. 5 is a flowchart describing details of the step of performing perspective transformation preprocessing on the currently captured image relative to the image in which the mark pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image, according to an exemplary embodiment.
As shown in FIG. 5, this step may include the following steps.
In step 301, the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized is acquired and taken as the initial rotation angle.
In step 303, the included angle between the currently captured image and the image in which the mark pattern was first recognized is calculated from the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle.
Here, for an image, after multi-sensor fusion in the smart terminal is completed for it, the rotation angle of this image is obtained. The initial rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized is acquired for this image, and the included angle is obtained by computing the difference between the rotation angle of this image and the initial rotation angle.
In step 305, perspective transformation is performed on the currently captured image by means of the included angle to obtain the perspective image.
Here, performing perspective transformation on the captured image essentially rectifies the captured image, eliminating the rotation and distortion errors between it and the image in which the mark pattern was first recognized, thereby facilitating the subsequent target tracking.
After the perspective image is obtained, this perspective image, rather than the image itself, is used to perform the target tracking process of the image.
To obtain the pose matrix of the mark pattern in the image, the obtained included angle, which likewise corresponds to the three orientations in space, can be combined with the translation information to form the pose matrix; in this way a camera pose matrix with six degrees of freedom can be obtained quickly.
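A minimal sketch of this rectification with OpenCV follows, assuming known camera intrinsics K and the included angle expressed as a rotation matrix R_delta (for example via the euler_to_rotation helper sketched earlier); for a pure rotation the warp is the homography H = K R_delta K^-1, and the direction convention of R_delta is an assumption of the sketch:

```python
import cv2
import numpy as np

def rectify_rotation(frame, K, R_delta):
    """Warp the current frame so that its viewing direction matches the
    first marker image, leaving only translation for the tracker.

    K: 3x3 camera intrinsic matrix (assumed known from calibration).
    R_delta: 3x3 rotation matrix for the included angle between the
    current frame and the first marker frame (from the fused angles).
    """
    H = K @ R_delta @ np.linalg.inv(K)  # rotation-only homography
    h, w = frame.shape[:2]
    return cv2.warpPerspective(frame, H, (w, h))
```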
In an exemplary embodiment, step 110 of the embodiment shown in FIG. 1 may include:
matching the image captured by the smart terminal against the pre-stored mark image, recognizing whether the mark pattern exists in the image, and obtaining the recognition result of the mark pattern in the image.
Here, before target tracking of the image, the image is matched against the preset mark image. If the image matches the mark image, the mark pattern exists in the image and is recognized, and the corresponding recognition result is obtained.
If the image does not match the mark image, the mark pattern does not exist in the image; the subsequent target tracking is not performed, and the next image is awaited.
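The embodiments do not prescribe a particular matching algorithm; one plausible Python sketch uses ORB features with Lowe's ratio test, where the 0.75 ratio threshold and the minimum match count are illustrative choices:

```python
import cv2

def marker_recognized(frame_gray, marker_gray, min_matches=20):
    """Decide whether the pre-stored mark image appears in the frame."""
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(marker_gray, None)
    kp2, des2 = orb.detectAndCompute(frame_gray, None)
    if des1 is None or des2 is None:
        return False  # no features found in one of the images
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des1, des2, k=2)
    # Lowe's ratio test: keep matches clearly better than the runner-up.
    good = [pair[0] for pair in matches
            if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
    return len(good) >= min_matches
```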
FIG. 6 is a flowchart describing details of step 110 according to the embodiment corresponding to FIG. 1. As shown in FIG. 6, step 110 may include the following steps.
In step 111, a user instruction designating the mark pattern in the captured image is received.
Here, whether the mark pattern exists in the image can be determined through interactive recognition with the user. Specifically, the captured image is displayed; the user can then view the captured image and confirm whether the mark pattern exists in it. If the mark pattern exists, the user triggers a designation operation on the mark pattern in the image, and the smart terminal correspondingly responds to this designation operation by generating a user instruction designating the mark pattern in the image.
For example, the user's trigger of the mark pattern designation operation in the image may be an operation of box-selecting the mark pattern in the displayed image, or of course some other operation, which is not limited here.
In step 113, the recognition result of the mark pattern in the image is obtained according to the user instruction.
Through the exemplary embodiments described above, the mark pattern in the image is recognized simply and quickly by way of user interaction, which further improves the accuracy and efficiency of image recognition and tracking.
In an exemplary embodiment, after step 150, the image recognition and tracking method further includes:
projecting a preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
Here, after the pose matrix of the mark pattern in the image is obtained, the virtual scene image can be projected into the image on this basis, thereby implementing various business scenarios; a personal augmented reality system can thus be built on the smart terminal, as can an enterprise-oriented augmented reality assisted office system, among others.
For example, for an enterprise-oriented augmented reality assisted office system, when the mark image is recognized and tracked and the mark pattern corresponds to a desk or to a specific location in an office in the real environment, business scenarios such as e-mail, meeting notifications, and video conferencing can be generated according to the pose matrix of the mark image; it can also be applied in remote meetings to create the realistic sense that meeting partners are right beside the user.
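To illustrate the projection step, the sketch below maps the corners of a square virtual overlay through the pose matrix and the camera intrinsics to find the quad in which the virtual scene image should be drawn; the overlay size and the planar-overlay assumption are illustrative only:

```python
import numpy as np

def project_overlay_corners(pose, K, half_size=0.1):
    """Project the corners of a square virtual overlay into the frame.

    pose: 4x4 pose matrix of the mark pattern (as formed in step 150),
    assumed here to map marker coordinates into camera coordinates.
    K: 3x3 camera intrinsic matrix, assumed known from calibration.
    Returns a (4, 2) array of pixel coordinates for drawing the overlay.
    """
    corners = np.array([[-half_size, -half_size, 0.0, 1.0],
                        [ half_size, -half_size, 0.0, 1.0],
                        [ half_size,  half_size, 0.0, 1.0],
                        [-half_size,  half_size, 0.0, 1.0]]).T
    cam_pts = (pose @ corners)[:3]  # marker corners in camera coordinates
    pix = K @ cam_pts               # apply the intrinsics
    return (pix[:2] / pix[2]).T     # perspective divide -> pixel coords
```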
Through the exemplary embodiments described above, fast mark recognition and tracking can be achieved on the mobile side, i.e., the smart terminal, making local real-time mark pattern recognition and tracking on the smart terminal a reality, with better stability and higher tracking accuracy than traditional feature point tracking.
Taking the implementation of an augmented reality system in a smart terminal as an example, the exemplary embodiments described above are now illustrated in combination with a product.
FIG. 7 is a framework diagram of an augmented reality system in a smart terminal according to an exemplary embodiment. The smart terminal continuously shoots the real environment and keeps capturing images, thereby forming a video stream.
That is, each continuously captured image is one frame of the video displayed with augmented reality in the smart terminal.
As shown in FIG. 7, after a frame is captured, it is matched against the mark image, as shown in step 410 of the system framework. If it matches, step 430 is executed to track the image, and the pose matrix of this frame is finally obtained, so that the virtual scene image can be projected for this frame in the augmented reality system of the smart terminal and augmented reality display can be performed.
When step 410 is executed and the mark image is not matched, the next frame is awaited; if the image tracking step is executed successfully, the next frame is likewise awaited.
By analogy, recognition and tracking is performed continuously for the captured images.
FIG. 8 is an implementation framework diagram of the image tracking step in the embodiment corresponding to FIG. 7. For each frame of the video, the corresponding perspective image is obtained with the cooperation of the multiple sensors, as shown in step 510 of the implementation framework.
Single-target tracking is then performed with the perspective image as input to obtain the pose matrix of the perspective transformation, as in step 530. The pose matrix of the perspective transformation is obtained with rotation eliminated; it therefore describes only the translation information of the mark pattern in the image and contains no rotation angle.
At this point, with the cooperation of the multiple sensors, the rotation angle obtained by multi-sensor fusion forms, together with this pose matrix of the perspective transformation, the pose matrix in space of the mark pattern in the image. This provides the fastest and most stable implementation for the augmented reality system in the smart terminal, further accelerating performance while eliminating the need to trade off time performance against recognition quality.
Through the implementation described above, the multiple sensors in the smart terminal are fully fused, and both the stability and the accuracy of tracking are kept at a very high level.
The following are apparatus embodiments of the present application, which may be configured to perform the above image recognition and tracking method embodiments of the present application. For details not disclosed in the apparatus embodiments of the present application, please refer to the image recognition and tracking method embodiments of the present application.
FIG. 9 is a block diagram of an image recognition and tracking apparatus according to an exemplary embodiment. As shown in FIG. 9, the image recognition and tracking apparatus may include, but is not limited to: a recognition result obtainer 710, a target tracker 730, and a pose obtainer 750.
The recognition result obtainer 710 is configured to obtain a recognition result of a mark pattern in an image captured by a smart terminal.
The target tracker 730 is configured to locate the mark pattern in the image according to the recognition result of the mark pattern, and to perform target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image.
The pose obtainer 750 is configured to form the translation information and a rotation angle output by multi-sensor fusion in the smart terminal into a pose matrix of the mark pattern in the image.
FIG. 10 is a block diagram describing details of the target tracker according to the embodiment corresponding to FIG. 9. As shown in FIG. 10, the target tracker 730 may include, but is not limited to: a mark locator 731, a tracking executor 733, and a translation information former 735.
The mark locator 731 is configured to locate the mark pattern in the image upon the recognition result indicating that the mark pattern has been recognized.
The tracking executor 733 is configured to perform target tracking according to the located mark pattern, and to obtain from the target tracking the translation distance of the mark pattern in the image on the spatial horizontal plane and the scaling size of the mark pattern relative to the pre-stored mark image.
The translation information former 735 is configured to calculate the vertical distance in space of the mark pattern in the image according to the scaling size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
FIG. 11 is a block diagram of an image recognition and tracking apparatus according to another exemplary embodiment; this image recognition and tracking apparatus further includes, but is not limited to: a data obtainer 810 and a multi-sensor fuser 830.
The data obtainer 810 is configured to obtain sensor data output by a plurality of sensors when the smart terminal captures the image.
The multi-sensor fuser 830 is configured to perform a multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space, the rotation angle being output by multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
In another exemplary embodiment, the recognition result obtainer 710 is further configured to obtain the recognition result of the mark pattern in the currently captured image as the smart terminal continuously captures images.
The image recognition and tracking apparatus further includes a perspective transformer. The perspective transformer is configured to perform perspective transformation preprocessing on the currently captured image relative to the image in which the mark pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image.
Further, FIG. 12 is a block diagram describing details of the perspective transformer according to an exemplary embodiment. As shown in FIG. 12, the perspective transformer 910 may include, but is not limited to: an initial rotation obtainer 911, a rotation transformer 913, and an image perspective transformer 915.
The initial rotation obtainer 911 is configured to acquire the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, taking this rotation angle as the initial rotation angle.
The rotation transformer 913 is configured to calculate, from the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the mark pattern was first recognized.
The image perspective transformer 915 is configured to perform perspective transformation on the currently captured image by means of the included angle to obtain the perspective image.
In an exemplary embodiment, the recognition result obtainer 710 shown in FIG. 9 is further configured to match the captured image against the mark image, recognize whether the mark pattern exists in the image, and obtain the recognition result of the mark pattern in the image.
In another exemplary embodiment, the image recognition and tracking apparatus further includes, but is not limited to, a projector. The projector is configured to project a preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
FIG. 13 is a block diagram of an apparatus according to an exemplary embodiment. For example, the apparatus 900 may be the smart terminal 110 in the implementation environment shown in FIG. 1; for example, the smart terminal 110 may be a terminal device such as a smartphone or a tablet computer.
Referring to FIG. 13, the apparatus 900 may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, a sensor component 914, and a communication component 916.
The processing component 902 generally controls the overall operations of the apparatus 900, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 902 may include one or more processors 918 to execute instructions to complete all or part of the steps of the methods described above. In addition, the processing component 902 may include one or more modules to facilitate interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation at the apparatus 900. Examples of such data include instructions for any application or method operated on the apparatus 900. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. The memory 904 also stores one or more modules configured to be executed by the one or more processors 918 to complete all or part of the steps of any of the methods shown in FIG. 3, FIG. 4, FIG. 5, and FIG. 6.
The power component 906 provides power to the various components of the apparatus 900. The power component 906 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 900.
The multimedia component 908 includes a screen providing an output interface between the apparatus 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel. If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action but also detect the duration and pressure associated with the touch or swipe operation. The screen may also include an organic light-emitting display (OLED).
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC) configured to receive external audio signals when the apparatus 900 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 904 or transmitted via the communication component 916. In some embodiments, the audio component 910 further includes a speaker configured to output audio signals.
The sensor component 914 includes one or more sensors for providing the apparatus 900 with status assessments of various aspects. For example, the sensor component 914 may detect the open/closed state of the apparatus 900 and the relative positioning of components; the sensor component 914 may also detect a change in position of the apparatus 900 or of a component of the apparatus 900, as well as a temperature change of the apparatus 900. In some embodiments, the sensor component 914 may also include a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the apparatus 900 and other devices. The apparatus 900 may access a wireless network based on a communication standard, such as WiFi (Wireless Fidelity). In an exemplary embodiment, the communication component 916 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth technology, and other technologies.
In an exemplary embodiment, the apparatus 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors, digital signal processing devices, programmable logic devices, field-programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components, configured to perform the methods described above.
Optionally, the present application further provides a smart terminal, which may be used in the implementation environment shown in FIG. 1 to perform all or part of the steps of the image recognition and tracking method shown in any of FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, and FIG. 6. The smart terminal includes:
a processor;
a memory configured to store processor-executable instructions;
wherein the processor is configured to perform:
obtaining a recognition result of a mark pattern in an image captured by the smart terminal;
locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image;
forming the translation information and a rotation angle output by multi-sensor fusion in the smart terminal into a pose matrix of the mark pattern in the image.
The specific manner in which the processor of the apparatus in this embodiment performs operations has been described in detail in the embodiments of the image recognition and tracking method for the smart terminal, and will not be elaborated here.
In an exemplary embodiment, a storage medium is further provided. The storage medium is a computer-readable storage medium, for example, a transitory or non-transitory computer-readable storage medium including instructions, such as the memory 904 including instructions, which can be executed by the processor 918 of the apparatus 900 to complete the above methods.
It should be understood that the present application is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (18)

1. An image recognition and tracking method, the method comprising:
    obtaining a recognition result of a mark pattern in an image captured by a smart terminal;
    locating the mark pattern in the image according to the recognition result of the mark pattern, and performing target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image;
    forming the translation information and a rotation angle output by multi-sensor fusion in the smart terminal into a pose matrix of the mark pattern in the image.
2. The method according to claim 1, wherein locating the mark pattern in the image according to the recognition result of the mark pattern and performing target tracking on the located mark pattern to obtain the translation information in space of the mark pattern in the image comprises:
    locating the mark pattern in the image upon the recognition result indicating that the mark pattern has been recognized;
    performing target tracking according to the located mark pattern, and obtaining from the target tracking a translation distance of the mark pattern in the image on the spatial horizontal plane and a scaling size of the mark pattern relative to a pre-stored mark image;
    calculating a vertical distance in space of the mark pattern in the image according to the scaling size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
3. The method according to claim 1, wherein before forming the translation information and the rotation angle output by multi-sensor fusion in the smart terminal into the pose matrix of the mark pattern in the real-scene image, the method further comprises:
    obtaining sensor data output by a plurality of sensors when the smart terminal captures the image;
    performing a multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space, the rotation angle being output by multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
4. The method according to claim 3, wherein performing the multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space comprises:
    obtaining the sensor data output by the plurality of sensors when the smart terminal captures the image;
    calculating, from the sensor data, the rotation angle of the smart terminal itself in space.
5. The method according to claim 4, wherein calculating, from the sensor data, the rotation angle of the smart terminal itself in space comprises:
    integrating the angular velocity in the sensor data to obtain rough rotation values of the smart terminal relative to each orientation in space;
    performing an auxiliary calculation of the rotation angle on the rough rotation values according to the acceleration and gravity direction information in the sensor data, to obtain the rotation angles of the smart terminal in space relative to each orientation.
6. The method according to claim 1, wherein obtaining the recognition result of the mark pattern in the image captured by the smart terminal comprises:
    continuously capturing images by the smart terminal and obtaining the recognition result of the mark pattern in the currently captured image;
    and wherein before locating the mark pattern in the image according to the recognition result of the mark pattern and performing target tracking on the located mark pattern to obtain the translation information in space of the mark pattern in the image, the method further comprises:
    performing perspective transformation preprocessing on the currently captured image relative to the image in which the mark pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image.
7. The method according to claim 6, wherein performing perspective transformation preprocessing on the currently captured image relative to the image in which the mark pattern was first recognized to obtain the perspective image used for target tracking of the currently captured image comprises:
    acquiring the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, and taking the rotation angle as an initial rotation angle;
    calculating, from the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the mark pattern was first recognized;
    performing perspective transformation on the currently captured image by means of the included angle to obtain the perspective image.
8. The method according to claim 1, wherein after forming the translation information and the rotation angle output by multi-sensor fusion in the smart terminal into the pose matrix of the mark pattern in the image, the method further comprises:
    projecting a preset image into the image according to the pose matrix of the mark pattern in the image.
9. The method according to claim 1, wherein obtaining the recognition result of the mark pattern in the image captured by the smart terminal comprises:
    obtaining the recognition result of the mark pattern either by matching the image against a preset mark pattern, or upon a trigger by which the user designates the mark pattern in the captured image.
10. The method according to claim 1, wherein forming the translation information and the rotation angle output by multi-sensor fusion in the smart terminal into the pose matrix of the mark pattern in the image comprises:
    taking the movement in space of the mark pattern indicated by the translation information and the rotation indicated by the rotation angle as elements to jointly form the pose matrix of the mark pattern.
11. An image recognition and tracking apparatus, wherein the apparatus comprises:
    a recognition result obtainer configured to obtain a recognition result of a mark pattern in an image captured by a smart terminal;
    a target tracker configured to locate the mark pattern in the image according to the recognition result of the mark pattern, and to perform target tracking on the located mark pattern to obtain translation information in space of the mark pattern in the image;
    a pose obtainer configured to form the translation information and a rotation angle output by multi-sensor fusion in the smart terminal into a pose matrix of the mark pattern in the image.
12. The apparatus according to claim 11, wherein the target tracker comprises:
    a mark locator configured to locate the mark pattern in the image upon the recognition result indicating that the mark pattern has been recognized;
    a tracking executor configured to perform target tracking according to the located mark pattern, and to obtain from the target tracking a translation distance of the mark pattern in the image on the spatial horizontal plane and a scaling size of the mark pattern relative to a pre-stored mark image;
    a translation information former configured to calculate a vertical distance in space of the mark pattern in the image according to the scaling size and the size of the mark image, the vertical distance and the translation distance forming the translation information.
13. The apparatus according to claim 11, wherein the apparatus further comprises:
    a data obtainer configured to obtain sensor data output by a plurality of sensors when the smart terminal captures the image;
    a multi-sensor fuser configured to perform a multi-sensor fusion algorithm on the sensor data to calculate the rotation angle of the smart terminal in space, the rotation angle being output by multi-sensor fusion and used to form the pose matrix of the mark pattern in the image.
14. The apparatus according to claim 11, wherein the recognition result obtainer is further configured to obtain the recognition result of the mark pattern in the currently captured image as the smart terminal continuously captures images;
    the apparatus further comprises:
    a perspective transformer configured to perform perspective transformation preprocessing on the currently captured image relative to the image in which the mark pattern was first recognized, to obtain a perspective image used for target tracking of the currently captured image.
15. The apparatus according to claim 14, wherein the perspective transformer comprises:
    an initial rotation obtainer configured to acquire the rotation angle corresponding to the mark pattern in the image in which the mark pattern was first recognized, taking the rotation angle as an initial rotation angle;
    a rotation transformer configured to calculate, from the rotation angle output by multi-sensor fusion in the smart terminal and the initial rotation angle, the included angle between the currently captured image and the image in which the mark pattern was first recognized;
    an image perspective transformer configured to perform perspective transformation on the currently captured image by means of the included angle to obtain the perspective image.
16. The apparatus according to claim 11, wherein the apparatus further comprises:
    a projector configured to project a preset virtual scene image into the image according to the pose matrix of the mark pattern in the image.
17. A smart terminal, comprising:
    a processor; and
    a memory storing computer-readable instructions which, when executed by the processor, implement the image recognition and tracking method according to any one of claims 1 to 10.
18. A computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the image recognition and tracking method according to any one of claims 1 to 10.
PCT/CN2018/087282 2017-05-18 2018-05-17 Image recognition and tracking method and apparatus, smart terminal, and readable storage medium WO2018210305A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710351693.2 2017-05-18
CN201710351693.2A CN107194968B (zh) 2017-05-18 2017-05-18 Image recognition and tracking method and apparatus, smart terminal, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2018210305A1 true WO2018210305A1 (zh) 2018-11-22

Family

ID=59875263

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/087282 WO (zh) 2018-05-17 Image recognition and tracking method and apparatus, smart terminal, and readable storage medium

Country Status (2)

Country Link
CN (1) CN107194968B (zh)
WO (1) WO2018210305A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194968B (zh) * 2017-05-18 2024-01-16 腾讯科技(上海)有限公司 Image recognition and tracking method and apparatus, smart terminal, and readable storage medium
CN110120060B (zh) * 2018-02-06 2023-07-14 广东虚拟现实科技有限公司 Marker recognition method and apparatus, and recognition and tracking system
WO2019154169A1 (zh) * 2018-02-06 2019-08-15 广东虚拟现实科技有限公司 Method for tracking an interactive device, storage medium, and electronic device
CN108366238A (zh) * 2018-02-08 2018-08-03 广州视源电子科技股份有限公司 Image processing method and system, readable storage medium, and electronic device
CN108734736B (zh) 2018-05-22 2021-10-26 腾讯科技(深圳)有限公司 Camera pose tracking method, apparatus, device, and storage medium
CN111476876B (zh) * 2020-04-02 2024-01-16 北京七维视觉传媒科技有限公司 Three-dimensional image rendering method, apparatus, device, and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110050562A1 (en) * 2009-08-27 2011-03-03 Schlumberger Technology Corporation Visualization controls
CN106569591A (zh) * 2015-10-26 2017-04-19 苏州梦想人软件科技有限公司 Tracking method and tracking system based on computer vision tracking and sensor tracking
CN106681510A (zh) * 2016-12-30 2017-05-17 光速视觉(北京)科技有限公司 Pose recognition device, virtual reality display device, and virtual reality system
CN107194968A (zh) * 2017-05-18 2017-09-22 腾讯科技(上海)有限公司 Image recognition and tracking method and apparatus, smart terminal, and readable storage medium

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06174487A (ja) * 1992-12-10 1994-06-24 Haruo Nonin Attitude detection device
US6978167B2 (en) * 2002-07-01 2005-12-20 Claron Technology Inc. Video pose tracking system and method
CN1864176A (zh) * 2003-10-30 2006-11-15 日本电气株式会社 Estimation system, estimation method, and estimation program for estimating the state of an object
CN100458359C (zh) * 2006-03-02 2009-02-04 浣石 Long-distance in-plane small-displacement measurement system
JP2009236532A (ja) * 2008-03-26 2009-10-15 Seiko Epson Corp Positioning method, program, and positioning device
KR100962557B1 (ko) * 2009-02-05 2010-06-11 한국과학기술원 Augmented reality implementation apparatus and augmented reality implementation method
KR101613418B1 (ko) * 2014-05-29 2016-04-21 주식회사 고영테크놀러지 Optical tracking system and method of calculating the posture and position of a marker part of an optical tracking system
KR101615086B1 (ko) * 2014-05-29 2016-04-27 주식회사 고영테크놀러지 Optical tracking system and method of calculating the posture of a marker part of an optical tracking system
CN104243833B (zh) * 2014-09-30 2017-12-08 精宸智云(武汉)科技有限公司 Camera attitude adjustment method and device
KR101584080B1 (ko) * 2015-04-10 2016-01-11 (주)코어센스 Acceleration signal processing method for a three-dimensional rotation motion sensor
JP6565465B2 (ja) * 2015-08-12 2019-08-28 セイコーエプソン株式会社 Image display device, computer program, and image display system
CN105222772B (zh) * 2015-09-17 2018-03-16 泉州装备制造研究所 High-precision motion trajectory detection system based on multi-source information fusion
WO2017050761A1 (en) * 2015-09-21 2017-03-30 Navigate Surgical Technologies, Inc. System and method for determining the three-dimensional location and orientation of identification markers
KR102462799B1 (ko) * 2015-11-05 2022-11-03 삼성전자주식회사 Posture estimation method and posture estimation apparatus
CN106257911A (zh) * 2016-05-20 2016-12-28 上海九鹰电子科技有限公司 Image stabilization method and device for video images
CN105953796A (zh) * 2016-05-23 2016-09-21 北京暴风魔镜科技有限公司 Stable motion tracking method and device based on fusion of a smartphone monocular camera and an IMU
CN105931275A (zh) * 2016-05-23 2016-09-07 北京暴风魔镜科技有限公司 Stable motion tracking method and device based on fusion of a mobile monocular camera and an IMU
CN106250839B (zh) * 2016-07-27 2019-06-04 徐鹤菲 Iris image perspective correction method and device, and mobile terminal
CN106292721A (zh) * 2016-09-29 2017-01-04 腾讯科技(深圳)有限公司 Method, device, and system for controlling an aircraft to track a target object
CN106441138B (zh) * 2016-10-12 2019-02-26 中南大学 Deformation monitoring method based on visual measurement


Also Published As

Publication number Publication date
CN107194968B (zh) 2024-01-16
CN107194968A (zh) 2017-09-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18801469; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 18801469; Country of ref document: EP; Kind code of ref document: A1)