WO2018212514A1

WO2018212514A1 - Method and apparatus for processing 360-degree image

Info

Publication number: WO2018212514A1
Application number: PCT/KR2018/005440
Authority: WO
Inventors: 사-가리가앨버트; 반디니앨레산드로; 매스트리토마소
Original assignee: Samsung Electronics Co., Ltd.
Priority date: 2017-05-18
Filing date: 2018-05-11
Publication date: 2018-11-22

Abstract

Provided is a 360-degree image processing method for: acquiring a plurality of motion vectors for a 360-degree image, determining at least one motion vector, which indicates global rotation of the 360-degree image, among the plurality of motion vectors through filtering; performing three-dimensional-transformation on the determined at least one motion vector so as to acquire three-dimensional rotation information of the 360-degree image, and correcting distortion, of the 360-degree image, caused by shaking on the basis of the acquired three-dimensional rotation information.

Description

Method and apparatus for processing 360 degree image

The present disclosure relates to a method for processing a 360 degree image, a device for processing a 360 degree image, and a recording medium on which a program for performing the method is recorded.

As image processing technology develops, methods of providing a 360 degree image are being actively researched as one way of giving the user a more realistic image. One problem in providing a 360 degree image is so-called virtual reality (VR) sickness, in which a user feels symptoms similar to motion sickness while watching a 360 degree image. VR sickness may arise when the user receives contradictory sensory input while watching a 360 degree image. VR sickness can be alleviated by correcting undesirable camera movements to stabilize the image.

Image stabilization can be performed in a post-processing step, and most image stabilization techniques have to perform two separate tasks. First, unintended camera movement must be detected and suppressed in the estimated camera trajectory. Second, a new image sequence must be generated using the stabilized camera trajectory and the original image sequence. However, it is difficult to estimate the camera trajectory in an uncalibrated single-view imaging system, and it is also difficult to reliably generate new images from the stabilized camera view. Therefore, further research is required to stabilize 360 degree images.

The disclosed embodiments provide a method and apparatus for processing a 360 degree image that can stabilize the image by converting motion vectors of the 360 degree image into rotation information and using that information to correct distortion caused by shaking in the 360 degree image.

According to one or more exemplary embodiments, a method of processing a 360 degree image may include: obtaining a plurality of motion vectors for the 360 degree image; determining, through filtering, at least one motion vector representing global rotation of the 360 degree image among the plurality of motion vectors; obtaining three-dimensional (3D) rotation information of the 360 degree image by three-dimensionally transforming the determined at least one motion vector; and correcting distortion of the 360 degree image due to shaking based on the obtained 3D rotation information.

In the method of processing a 360-degree image according to an embodiment of the present disclosure, the determining of the at least one motion vector may include removing, from the plurality of motion vectors, a motion vector included in a predetermined region that depends on the type of projection.

In the method of processing a 360 degree image according to an embodiment, the determining of the at least one motion vector may include: generating a mask based on an edge detected from the 360 degree image; determining a region of the 360 degree image in which no texture exists by applying the generated mask to the 360 degree image; and removing, from the plurality of motion vectors, a motion vector included in the region in which no texture exists.

In the method of processing a 360 degree image according to an embodiment, the determining of the at least one motion vector may include: detecting at least one moving object from the 360 degree image through a preset object detection process; and removing, from the plurality of motion vectors, a motion vector associated with the detected object.

In the method of processing a 360-degree image according to an embodiment of the present disclosure, the determining of the at least one motion vector may include determining, as motion vectors representing global rotation, those motion vectors among the plurality of motion vectors that are positioned opposite each other on the unit sphere onto which the 360-degree image is projected, are parallel, have opposite signs, and have magnitudes within a specific threshold of each other.

In the method of processing a 360-degree image according to an embodiment, the obtaining of the 3D rotation information may include: classifying the determined at least one motion vector into a plurality of bins, each corresponding to a specific direction and a specific magnitude range; selecting the bin containing the most motion vectors among the plurality of bins; and converting the direction and the distance of the selected bin to obtain the 3D rotation information.

In the method of processing a 360-degree image according to an embodiment of the present disclosure, the obtaining of the 3D rotation information may include obtaining the 3D rotation information by applying a weighted average to the directions and distances of the selected bin and a plurality of bins adjacent to the selected bin.

In the method of processing a 360-degree image according to an embodiment, the obtaining of the 3D rotation information may obtain, as the 3D rotation information, a rotation value for minimizing the sum of the determined at least one motion vector.

In the method of processing a 360-degree image according to an embodiment, the obtaining of the 3D rotation information may include obtaining the 3D rotation information based on the plurality of motion vectors using a previously generated learning network model.

According to an embodiment, the method of processing a 360 degree image may further include acquiring sensor data generated by sensing shaking that occurs while the 360 degree image is captured by a photographing apparatus, and the correcting of the distortion of the 360 degree image may include correcting the distortion by combining the acquired sensor data and the 3D rotation information.

According to an embodiment, an apparatus for processing a 360 degree image may include a memory configured to store one or more instructions, and a processor configured to execute the one or more instructions stored in the memory, wherein the processor obtains a plurality of motion vectors for the 360 degree image, determines, through filtering, at least one motion vector representing global rotation of the 360 degree image among the plurality of motion vectors, three-dimensionally transforms the determined at least one motion vector to obtain 3D rotation information of the 360 degree image, and corrects distortion of the 360 degree image due to shaking based on the obtained 3D rotation information.

FIG. 1 is a diagram illustrating formats in which a 360 degree image is stored, according to an exemplary embodiment.

FIG. 2 is a flowchart illustrating a method of processing a 360-degree image by the image processing apparatus according to an exemplary embodiment.

FIG. 3 is a flowchart illustrating, in more detail, a method of processing a 360-degree image by the image processing apparatus according to an exemplary embodiment.

FIG. 4 is a diagram for describing a motion vector in a 360 degree image, according to an exemplary embodiment.

FIG. 5 is a diagram for describing a method of removing, by an image processing apparatus through filtering, a motion vector of a predetermined region from a plurality of motion vectors, according to an exemplary embodiment.

FIG. 6 is a diagram for describing a method of removing, by the image processing apparatus through filtering, a motion vector included in a texture-free area, according to an exemplary embodiment.

FIG. 7 is a diagram for describing a method of removing, by the image processing apparatus through filtering, a motion vector determined not to represent global rotation, according to an exemplary embodiment.

FIG. 8 is a flowchart for describing a method of determining, by the image processing apparatus through filtering, a motion vector indicating global rotation, according to an exemplary embodiment.

FIG. 9 is a flowchart illustrating a method of converting a motion vector into 3D rotation by an image processing apparatus, according to an exemplary embodiment.

FIG. 10 illustrates motion vectors of a 360 degree image, according to an exemplary embodiment.

FIG. 11 is a table for describing a result of classifying a plurality of motion vectors into a plurality of bins, according to an exemplary embodiment.

FIG. 12 illustrates a histogram of the plurality of motion vectors classified in FIG. 11, according to an exemplary embodiment.

FIG. 13 is a flowchart for describing a method by which an image processing apparatus determines rotation information for a 360 degree image by combining rotation information acquired based on motion vectors with sensing data about shaking.

FIG. 14 is a block diagram of an image processing apparatus according to an exemplary embodiment.

FIG. 15 is a diagram for describing at least one processor, according to an exemplary embodiment.

FIG. 16 is a block diagram of a data learner, according to an exemplary embodiment.

FIG. 17 is a block diagram of a data recognizer according to an exemplary embodiment.

FIG. 18 is a block diagram of an image processing apparatus according to another exemplary embodiment.

Terms used herein will be briefly described and the present invention will be described in detail.

The terms used in the present invention have been selected as widely used general terms as possible in consideration of the functions in the present invention, but this may vary according to the intention or precedent of the person skilled in the art, the emergence of new technologies and the like. In addition, in certain cases, there is also a term arbitrarily selected by the applicant, in which case the meaning will be described in detail in the description of the invention. Therefore, the terms used in the present invention should be defined based on the meanings of the terms and the contents throughout the present invention, rather than the names of the simple terms.

Terms including ordinal numbers such as first and second may be used to describe various components, but the components are not limited by the terms. The terms are only used to distinguish one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as the second component, and similarly, the second component may also be referred to as the first component. The term "and/or" includes any one of a plurality of related items or a combination of a plurality of related items.

When a part of the specification is said to "include" a component, this means that it may further include other components rather than excluding them, unless otherwise stated. In addition, the term "unit" as used herein refers to software or a hardware component such as a field programmable gate array (FPGA) or an application specific integrated circuit (ASIC), and a "unit" performs certain roles. However, a "unit" is not limited to software or hardware. A "unit" may be configured to reside in an addressable storage medium and may be configured to run on one or more processors. Thus, as an example, a "unit" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and "units" may be combined into a smaller number of components and "units" or further separated into additional components and "units".

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art may easily implement the present invention. As those skilled in the art would realize, the described embodiments may be modified in various different ways, all without departing from the spirit or scope of the present invention. In the drawings, parts irrelevant to the description are omitted in order to clearly describe the present invention, and like reference numerals designate like parts throughout the specification.

Referring to FIG. 1, a 360 degree image may be stored in various formats. For example, according to a unit sphere representation, the pixels that make up a frame of a 360 degree image can be indexed in a three-dimensional coordinate system that defines the location of each pixel on the surface of the virtual sphere 110.

However, this is merely an example, and equivalent two-dimensional representations such as cube map projection 120 or equirectangular projection 130 may be used according to another example. In cube map projection 120, image data for each face of the virtual cube may be stored as a two-dimensional image over a 90° × 90° field of view. In equirectangular projection 130, image data may be stored as a single two-dimensional image over a 360° × 180° field of view.
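
The relationship between the equirectangular format and the unit sphere representation can be illustrated with a short sketch. The function name `equirect_to_sphere` and the axis convention (z pointing up, the frame center mapping to the positive x axis) are assumptions made for illustration only; the disclosure does not fix a particular convention.

```python
import math

def equirect_to_sphere(u, v, width, height):
    """Map pixel (u, v) of an equirectangular frame to a unit-sphere point.

    Assumed convention: the frame spans 360 x 180 degrees, u grows to the
    right, v grows downward, and z is the vertical axis.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)

# The center of a 36 x 18 pixel frame maps to the point on the +x axis.
center = equirect_to_sphere(18, 9, 36, 18)
```

Under this convention a horizontal shift in the frame corresponds to a change in longitude only, which is why horizontal motion maps naturally to a rotation about the vertical axis.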

Meanwhile, in FIG. 1, the labels 'top', 'bottom', 'front', 'back', 'left' and 'right' indicate the areas of the 360-degree image that correspond to each other across the equivalent projections described above. However, the formats shown in FIG. 1 are just examples, and according to another exemplary embodiment, the 360 degree image may be stored in a format different from the formats shown in FIG. 1.

In operation S210, the image processing apparatus may acquire a plurality of motion vectors for the 360 degree image. According to one embodiment, an example of a motion vector in two-dimensional image data for a 360 degree image is shown in FIG. 4.

The motion vector is information describing the displacement of a predetermined area 411 of the image between the reference frame 401 and the current frame 402. In the present embodiment, the frame immediately preceding the current frame is selected as the reference frame 401, but in another embodiment, the motion vector may be calculated using a non-adjacent frame as the reference frame. In the present embodiment, to take full advantage of the wide field of view of a frame of the 360-degree image, motion vectors may be obtained at points uniformly distributed throughout the frame.

Meanwhile, although the 2D motion vector V is illustrated in FIG. 4, according to another embodiment, a plurality of 3D motion vectors may be obtained. For example, when image data for the current frame is stored using the unit sphere representation shown in FIG. 1, a 3D motion vector may be obtained.

The plurality of motion vectors obtained in this embodiment are motion vectors previously generated while encoding the image data of the frame of the 360 degree image. Motion vectors can generally be generated and stored in existing video encoding processes such as MPEG 4.2 or H.264 encoding. During encoding of the image, the motion vector can be used to compress the image data by reusing the blocks of the previous frame to draw the next frame. A detailed description of the method for generating the motion vector will be omitted.

Meanwhile, the previously generated motion vector may be retrieved from the stored 360 degree image file. Reusing motion vectors in this way can reduce the overall processing burden. According to another embodiment, when the 360 degree image file does not include a motion vector, the motion vector may be generated in step S210.

In operation S220, the image processing apparatus may determine at least one motion vector indicating global rotation of the 360 degree image among the plurality of motion vectors through filtering.

Here, 'global rotation' refers to a rotation that affects the image throughout the frame, unlike a local rotation that affects only part of the image. Global rotation can be the result of the camera being rotated while the image is being captured, or of a large portion of the frame moving around the camera in the same way. For example, if a 360-degree image is taken from a moving vehicle, the rotation of the vehicle can cause global rotation in the background, and the rotation of the camera itself can cause global rotation in the background and in all parts of the vehicle visible in the foreground. A rotation can be regarded as 'global rotation' when it affects a significant portion of the frame.

Examples of motion vectors that do not represent global rotation include motion vectors associated with relatively small moving objects in the scene, and motion vectors associated with static objects that do not appear to rotate when the camera rotates because they are fixed relative to the camera.

The image processing apparatus according to an embodiment may perform filtering to remove a motion vector included in a predetermined region among the plurality of motion vectors. This will be described later in more detail with reference to FIG. 5.

Also, the image processing apparatus according to another embodiment may perform filtering by generating a mask based on edges detected from the 360 degree image and applying the generated mask to the 360 degree image, thereby removing motion vectors included in a texture-free region of the 360 degree image. This will be described later in more detail with reference to FIG. 6.

According to another exemplary embodiment, the image processing apparatus may perform filtering to remove a motion vector associated with a moving object in a 360 degree image.

The image processing apparatus according to another embodiment may perform filtering by determining whether motion vectors located on opposite sides of the unit sphere satisfy a specific condition, and thereby determining whether each motion vector indicates global rotation. This will be described later in more detail with reference to FIG. 7.

Meanwhile, the image processing apparatus may combine two or more of the above-described filtering methods to remove a motion vector that does not represent global rotation among the plurality of motion vectors. As an example, other filtering methods may be used. Other embodiments in which the motion vector may be filtered may include, but are not limited to, static object filtering, background flow subtraction, and manual filtering. In static object filtering, static objects that do not change their position from one frame to the next may be detected, and motion vectors associated with the static objects may be filtered. Examples of static object types that can occur in 360-degree images include black pixels on the lens or the user's finger in front of the camera.

In the background flow subtraction, background pixels moving at a constant rate in the entire image may be excluded, assuming that they do not contain useful information for calculating the stabilization rotation. Manual filtering may include a human operator that manually filters the motion vector.

In operation S230, the image processing apparatus may obtain 3D rotation information about the 360 degree image by 3D transforming the determined at least one motion vector.

The image processing apparatus according to an embodiment may classify the determined at least one motion vector into a plurality of bins, each corresponding to a specific direction and a specific magnitude range. The image processing apparatus may obtain the 3D rotation information by converting the direction and the distance of the bin including the most motion vectors among the plurality of bins. However, this is merely an example, and according to another example, the image processing apparatus may obtain the 3D rotation information by applying a weighted average to the directions and distances of the bin including the most motion vectors and the plurality of bins adjacent to that bin.
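
The bin-based selection can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `dominant_motion`, the number of direction bins, and the magnitude bin width are all hypothetical, and the sketch stops at the dominant 2D motion without converting it to a rotation.

```python
import math
from collections import Counter

def dominant_motion(vectors, n_dir_bins=8, mag_bin_width=2.0):
    """Return (direction_rad, magnitude) at the center of the fullest bin.

    vectors: iterable of (dx, dy) motion vectors in pixels.
    n_dir_bins and mag_bin_width are illustrative values only.
    """
    counts = Counter()
    for dx, dy in vectors:
        mag = math.hypot(dx, dy)
        ang = math.atan2(dy, dx) % (2.0 * math.pi)
        d_bin = int(ang / (2.0 * math.pi) * n_dir_bins) % n_dir_bins
        m_bin = int(mag // mag_bin_width)
        counts[(d_bin, m_bin)] += 1
    if not counts:
        raise ValueError("no motion vectors to classify")
    (d_bin, m_bin), _ = counts.most_common(1)[0]
    # Report the center of the winning bin as the representative motion.
    direction = (d_bin + 0.5) * 2.0 * math.pi / n_dir_bins
    magnitude = (m_bin + 0.5) * mag_bin_width
    return direction, magnitude

# Three similar vectors outvote one outlier.
direction, magnitude = dominant_motion(
    [(5.0, 1.0), (5.2, 0.8), (4.9, 1.1), (-3.0, 7.0)]
)
```

Picking the fullest bin behaves like taking a mode, so a few vectors from independently moving objects do not pull the estimate the way a plain mean would. The weighted-average variant would additionally blend in the neighboring bins.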

The image processing apparatus according to another exemplary embodiment may obtain, as 3D rotation information, a rotation value for minimizing the sum of the determined at least one motion vector.

According to another exemplary embodiment, the image processing apparatus may obtain 3D rotation information based on a plurality of motion vectors using a previously generated learning network model.

For example, humans can stabilize their gaze and keep it level by analyzing the image shift (similar to motion vectors) that movement relative to the environment produces as the body rotates. Similar behavior can be observed in simpler organisms, such as flies, which have relatively few neurons.

Neurons can convert sensory information into a format corresponding to their motor system requirements. Thus, in an AI-based embodiment, a machine learning mechanism may be used to mimic the behavior of living things and to obtain sensor rotational transformations using motion vectors as input data. In addition, in an AI based embodiment, a machine learning system may be used, such as a learning network model trained with a pattern of motion vectors in a frame having a particular rotation. Such mechanisms tend to mimic living beings and may receive a plurality of motion vectors as inputs and output an overall rotation for stabilizing a 360 degree image.

In operation S240, the image processing apparatus may correct the distortion of the 360 degree image due to the shaking based on the obtained 3D rotation information.

The image processing apparatus according to the exemplary embodiment may correct the distortion of the 360 degree image due to the shaking by rotating the 360 degree image according to the 3D rotation information. In addition, the image processing apparatus may render and display the corrected 360-degree image, or encode and store it for later playback.
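
For the special case of a rotation about the vertical axis, rotating an equirectangular frame reduces to a circular horizontal shift of its pixel columns. The sketch below covers only that case, and the function name `rotate_yaw` and the sign convention are assumptions for illustration; rotations about the other two axes require resampling the image on the sphere and are not shown.

```python
def rotate_yaw(frame, yaw_deg):
    """Rotate an equirectangular frame about the vertical axis.

    frame: list of rows, each row a list of pixel values.
    A positive yaw_deg shifts image content to the right (assumed sign).
    """
    width = len(frame[0])
    shift = round(yaw_deg / 360.0 * width) % width
    # Circular shift: columns pushed off the right edge re-enter on the left.
    return [row[-shift:] + row[:-shift] for row in frame]

frame = [[0, 1, 2, 3]]
stabilized = rotate_yaw(frame, -90)  # undo a shake that moved content right by 90 degrees
```

Because the frame covers the full 360° horizontally, no pixels are lost by the shift, which is one reason rotational stabilization suits 360 degree images.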

According to an embodiment, all the steps of the method disclosed in FIG. 3 may be performed in the same apparatus, or the steps may be performed in different apparatuses. The steps of FIG. 3 may be performed by software or hardware, depending on the embodiment. When one or more steps are performed in software, an apparatus for performing the method disclosed in FIG. 3 may include a processing unit comprising one or more processors, and a computer-readable memory storing computer program instructions executable by the processing unit to perform the method.

In operation S310, the image processing apparatus may acquire a plurality of motion vectors for the current frame of the 360 degree image.

The image processing apparatus according to an embodiment may obtain a plurality of motion vectors by searching for a motion vector from a stored 360 degree image file or generating a motion vector at a point uniformly distributed throughout the frame.

Meanwhile, step S310 may correspond to step S210 described above with reference to FIG. 2.

In operation S320, the image processing apparatus may perform filtering on the plurality of motion vectors. In particular, in operation S320, the motion vector may be filtered to remove a motion vector that does not represent global rotation of the 360 degree image.

For example, the image processing apparatus may perform filtering to remove a motion vector associated with a relatively small moving object in a frame, or a motion vector associated with a static object that does not appear to rotate when the camera rotates because it is fixed relative to the camera. Examples of various methods of filtering the motion vectors will be described in more detail later with reference to FIGS. 5 to 7.

Meanwhile, according to another embodiment, the motion vector may not be filtered, in which case step S320 may be omitted.

In operation S330, the image processing apparatus may convert the motion vector into 3D rotation.

According to an embodiment, after filtering the plurality of motion vectors to remove those that do not represent global rotation, the image processing apparatus may convert the remaining motion vectors into a three-dimensional rotation that can be applied to the current frame to stabilize the 360 degree image.

For example, when the 360 degree image is stored as two-dimensional image data via equirectangular projection, a predefined transform can be used to convert the motion vector into a three-dimensional rotation. The transform may be predefined based on the geometry of the two-dimensional projection. In this embodiment, a transform according to Equation 1 below can be used.

[Equation 1]

In Equation 1, Rx, Ry, and Rz represent rotations in degrees about the x, y, and z axes, respectively, width represents the total width of the field of view in pixels, and height represents the total height of the field of view in pixels. The motion vector v can be expressed as, for example, (13, 8), representing a displacement of 13 pixels along the x axis and 8 pixels along the y axis. In this embodiment, it is assumed that the frame width in the horizontal direction is 36 pixels, which corresponds to 10° per pixel.

Therefore, using Equation 1 above, the horizontal component of the motion vector can be converted into equivalent rotation about the z axis of (360/36) * 13 = 130 degrees. In addition, the vertical component of the motion vector may be converted into equivalent rotation about the x or y axis depending on the position of the motion vector in the frame.
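
The horizontal part of this conversion can be checked with a few lines of code. The function name `horizontal_to_rz` is hypothetical, and only the horizontal component is covered; the vertical component is left out because its mapping to the x or y axis depends on the position of the motion vector in the frame and is not spelled out in the text.

```python
def horizontal_to_rz(vx_pixels, width_pixels):
    """Rotation about the z axis, in degrees, for a horizontal shift.

    Assumes an equirectangular frame whose width spans the full 360 degrees.
    """
    degrees_per_pixel = 360.0 / width_pixels
    return degrees_per_pixel * vx_pixels

# The worked example from the text: a 36-pixel-wide frame and v = (13, 8).
rz = horizontal_to_rz(13, 36)
```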

The overall rotation required to stabilize the 360 degree image can be expressed as a three-dimensional rotation, that is, a rotation in three-dimensional space. The rotation can be represented by three separate rotational components about mutually perpendicular axes, for example the x, y and z axes shown in the drawings. The rotation obtained in step S330 may be referred to as a stabilizing rotation, since it can effectively correct camera shake and thereby stabilize the 360 degree image.

The overall rotation applied to stabilize the 360 degree image can be determined in various ways. For example, each motion vector may be converted to an equivalent rotation as described above, and the average rotation over the entire frame (e.g., the mean or the mode) may be taken as the overall rotation. In some embodiments, a Gaussian or median filter may be used when taking the average, to take into account neighboring values around the mean or mode value. Also, according to another embodiment, an average motion vector can be calculated for the entire frame, and the average motion vector can be converted to the overall rotation using a predefined transformation.
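
The difference between these averaging choices can be sketched with the standard library. The function name `overall_rotation`, the `step` used to quantize values before taking the mode, and the sample data are illustrative assumptions; the sketch handles a single rotation component.

```python
from statistics import mean, median, mode

def overall_rotation(rotations_deg, method="median", step=1.0):
    """Combine per-vector rotations (degrees) into one overall value.

    step is a hypothetical quantization width used only for the mode,
    so that nearly equal rotations fall into the same bucket.
    """
    if method == "mean":
        return mean(rotations_deg)
    if method == "median":
        return median(rotations_deg)
    if method == "mode":
        return mode([round(r / step) * step for r in rotations_deg])
    raise ValueError("unknown method: " + method)

# Four vectors agree on about 2 degrees; one outlier reports 30 degrees.
samples = [2.0, 2.1, 1.9, 2.0, 30.0]
by_mean = overall_rotation(samples, "mean")
by_median = overall_rotation(samples, "median")
by_mode = overall_rotation(samples, "mode")
```

With data like this, the mean is pulled toward the outlier while the median and mode stay near the consensus value, which suggests why the mode or a median filter may be preferred when some unfiltered vectors remain.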

Meanwhile, Equation 1 described above may be modified as needed in other embodiments. For example, when the 360 degree image is stored in a 3D format such as a unit sphere representation, Equation 1 described above may be modified.

In operation S340, the image processing apparatus may provide 3D rotation to the image processing unit to generate a stabilized image.

In operation S350, the image processing apparatus may generate a stabilized image by applying 3D rotation to image data of the current frame.

In addition, the image processing apparatus may render and display the stabilized image or encode and store it for later playback. In some embodiments, the stabilized image may be encoded using interframe compression. In this embodiment, more effective compression may be achieved based on the rotation applied to the stabilized image data. The image stabilization process described above modifies the frames of the original 360-degree image in a way that minimizes the difference between two consecutive frames of image data. This allows the encoder to reuse more information from previous frames, so that lower bit rates can be used when performing interframe compression. As a result, the number of key frames generated can be reduced, and thus the compression rate can be improved.

Meanwhile, according to another exemplary embodiment, the analysis for determining the rotation for stabilizing an image may be performed in a first image processing apparatus, and the generating of the stabilized image may be performed by a second image processing apparatus that is physically separate from the first image processing apparatus. For example, in some embodiments, the first image processing apparatus may set the value of a 3D rotation parameter in the metadata associated with the 360 degree image according to the determined rotation.

In operation S340, the first image processing apparatus may provide the metadata and associated image data to the second image processing apparatus through an appropriate mechanism such as a broadcast signal or a network connection. The second image processing apparatus may obtain the value of the 3D rotation parameter from the metadata to determine the rotation. Thereafter, in operation S350, the second image processing apparatus may generate the stabilized 360 degree image by applying the rotation defined by the 3D rotation parameter to the 360 degree image. In addition, the second image processing apparatus according to the exemplary embodiment may generate the stabilized 360 degree image by applying a rotation and/or translation defined by a camera control input to the rotated image data before rendering it.

Referring to FIG. 5, in an equirectangular projection, distances in the upper region 511 and the lower region 512 tend to be exaggerated, so that when equirectangular projection is used, the motion vectors in the upper region 511 and the lower region 512 of the frame 500 can include potentially large errors.

Accordingly, when an equirectangular projection is used, the image processing apparatus according to an embodiment may remove the motion vectors of the upper region 511 and the lower region 512 from the plurality of motion vectors when calculating the rotation for stabilization of a 360 degree image.
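
A minimal sketch of this region-based filtering follows. The function name `drop_polar_vectors`, the tuple layout `(x, y, dx, dy)`, and the `band_fraction` value are assumptions; the text does not state how large the upper and lower regions are.

```python
def drop_polar_vectors(vectors, height, band_fraction=0.2):
    """Discard motion vectors located in the top or bottom band of the frame.

    vectors: list of (x, y, dx, dy), with (x, y) the pixel position.
    band_fraction: hypothetical share of the frame height treated as
    unreliable at each of the top and bottom edges.
    """
    top_limit = height * band_fraction
    bottom_limit = height * (1.0 - band_fraction)
    return [mv for mv in vectors if top_limit <= mv[1] < bottom_limit]

vectors = [(10, 5, 3, 0), (10, 50, 3, 0), (10, 95, 3, 0)]
kept = drop_polar_vectors(vectors, height=100)
```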

Referring to FIG. 6, the image processing apparatus may generate a mask by performing edge detection on a frame and dilating the result. The image processing apparatus may apply the mask to the frame to remove the texture-free area, which is an area substantially free of texture.

In the example shown in FIG. 6, the black pixels in the mask represent areas where no edges are detected, which may mean areas that are substantially free of texture. For example, the mask may be thresholded to include only pixel values of 1 or 0, where 1 may represent white pixels and 0 may represent black pixels. The image processing apparatus may perform filtering by comparing the position of the motion vector in the 360 degree image with the pixel value of the mask and discarding the motion vector when the mask has the pixel value 0 at the position.
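
The mask-based filtering can be sketched in plain Python. The edge detector here is a simple forward-difference gradient with a threshold, chosen only as a stand-in because the text does not name a particular detector; the function names, the threshold, and the dilation radius are likewise hypothetical.

```python
def edge_mask(gray, threshold=10):
    """Binary mask: 1 where the local intensity gradient exceeds threshold."""
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = gray[y][min(x + 1, w - 1)] - gray[y][x]
            gy = gray[min(y + 1, h - 1)][x] - gray[y][x]
            if abs(gx) + abs(gy) > threshold:
                mask[y][x] = 1
    return mask

def dilate(mask, radius=1):
    """Grow every 1-pixel into a (2*radius+1) square neighborhood."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                    for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[yy][xx] = 1
    return out

def filter_by_mask(vectors, mask):
    """Keep motion vectors (x, y, dx, dy) whose position has mask value 1."""
    return [mv for mv in vectors if mask[mv[1]][mv[0]] == 1]

# A frame with one vertical edge between a dark and a bright half.
gray = [[0, 0, 0, 0, 100, 100, 100, 100] for _ in range(5)]
mask = dilate(edge_mask(gray))
kept = filter_by_mask([(0, 2, 1, 0), (3, 2, 1, 0), (7, 0, 1, 0)], mask)
```

Dilation widens the band around each detected edge, so motion vectors that sit just next to textured structure are retained rather than discarded.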

Meanwhile, the present embodiment has been described in which the motion vector of the texture-free region is removed through filtering. However, according to another embodiment, the motion vector may be filtered from another type of region that may include an unreliable motion vector. It may be. Examples of other types of regions that may include unreliable motion vectors may include regions exhibiting chaotic movement such as foliage or smoke.

Referring to FIG. 7, the image processing apparatus may perform filtering by using the fact that a global rotation in a 360 degree image generates motion vectors having similar magnitudes and opposite directions on opposite sides of the unit sphere. Specifically, the image processing apparatus may compare one or more motion vectors on or near the unit sphere with one or more corresponding motion vectors on the opposite side of the sphere, referred to as "mirror points," to determine whether the motion vectors are related to the global rotation.

When two motion vectors on opposite sides of the sphere have magnitudes within a specific threshold of each other (eg, ± 10%), are parallel to each other, and have opposite signs, the image processing apparatus may determine that they are motion vectors indicating the global rotation. When the image processing apparatus determines that a motion vector represents the global rotation, the image processing apparatus may use it to determine the rotation for stabilizing the 360 degree image.
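The mirror-point test can be sketched as follows, treating each motion vector as a 3D vector tangent to the unit sphere. The function name `indicates_global_rotation` and the tolerance `angle_tol` used for the parallelism check are hypothetical; only the ± 10% magnitude threshold comes from the description above.

```python
import numpy as np

def indicates_global_rotation(v, v_mirror, magnitude_tol=0.10, angle_tol=1e-2):
    """Check whether a motion vector and its mirror-point counterpart are
    consistent with a global rotation: similar magnitude, parallel, and
    pointing in opposite directions."""
    v = np.asarray(v, dtype=float)
    v_mirror = np.asarray(v_mirror, dtype=float)
    n1 = np.linalg.norm(v)
    n2 = np.linalg.norm(v_mirror)
    if n1 == 0.0 or n2 == 0.0:
        return False
    # Magnitudes within the relative threshold (eg, 10%).
    similar = abs(n1 - n2) <= magnitude_tol * max(n1, n2)
    # Parallel with opposite sign: cosine of the angle close to -1.
    cosine = float(np.dot(v, v_mirror)) / (n1 * n2)
    opposite = cosine <= -1.0 + angle_tol
    return bool(similar and opposite)
```

For a small rotation with rotation vector ω, a point p on the sphere moves by approximately ω × p, and its mirror point -p moves by -(ω × p), which is why the two vectors are expected to be equal in magnitude and opposite in direction.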

Steps S810 to S890 described with reference to FIG. 8 may be performed between steps S310 and S330 described above with reference to FIG. 3.

In operation S810, the image processing apparatus may filter out the motion vectors of at least one region from among the plurality of motion vectors for the 360 degree image. For example, when an equirectangular projection is used for the 360 degree image, the image processing apparatus may remove the motion vectors in an upper region and a lower region of the 360 degree image through filtering.

In operation S820, the image processing apparatus may generate a mask for filtering the texture-free area. For example, the image processing apparatus may generate the mask by performing edge detection on the 360 degree image and dilating the result.

In operation S830, the image processing apparatus may apply the mask to the current frame to filter out the motion vectors of the texture-free region. For example, the image processing apparatus may perform the filtering by comparing the position of each motion vector in the 360 degree image with the pixel value of the mask, and removing the motion vector if the mask has the pixel value 0 (an area where no edge is detected) at that position.

In operation S840, the image processing apparatus may detect an object moving in the 360 degree image. The image processing apparatus may detect one or more moving objects within a 360 degree image by using an appropriate object detection algorithm among existing object detection algorithms.

In operation S850, the image processing apparatus may filter a motion vector associated with the moving object. The image processing apparatus may remove the motion vector associated with the moving object among the remaining motion vectors through filtering. The motion vector associated with the moving object can be much larger in size than other motion vectors. Accordingly, the image processing apparatus may filter the motion vector so that the stabilization rotation is not distorted by the large motion vector due to the fast moving object.
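Operations S840 and S850 can be sketched as below, assuming that the object detector reports each moving object as an axis-aligned bounding box `(x0, y0, x1, y1)` in pixel coordinates. The box format and the function name `drop_vectors_in_boxes` are assumptions; the disclosure only states that an appropriate existing object detection algorithm may be used.

```python
import numpy as np

def drop_vectors_in_boxes(positions, vectors, boxes):
    """Remove motion vectors located inside any detected moving-object box.

    positions: N x 2 array of (x, y) positions.
    vectors:   N x 2 array of motion vectors.
    boxes:     iterable of (x0, y0, x1, y1), with x1 and y1 exclusive.
    """
    positions = np.asarray(positions, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    keep = np.ones(len(positions), dtype=bool)
    x, y = positions[:, 0], positions[:, 1]
    for x0, y0, x1, y1 in boxes:
        inside = (x >= x0) & (x < x1) & (y >= y0) & (y < y1)
        keep &= ~inside
    return positions[keep], vectors[keep]
```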

In operation S860, the image processing apparatus may compare motion vectors on opposite sides of the sphere.

In operation S870, the image processing apparatus may determine whether the motion vector corresponds to the global rotation. For example, the image processing apparatus may determine that two motion vectors on opposite sides of the sphere correspond to the global rotation when they have magnitudes within a specific threshold of each other (eg, ± 10%), are parallel to each other, and have opposite signs.

In operation S880, as the image processing apparatus determines that the motion vector corresponds to the global rotation, the image processing apparatus may maintain the motion vector.

In operation S890, as the image processing apparatus determines that the motion vector does not correspond to the global rotation, the image processing apparatus may exclude the motion vector when calculating the rotation.

In operation S910, the image processing apparatus may classify the plurality of motion vectors into a plurality of bins corresponding to a specific size range in a specific direction.

A detailed method of classifying a plurality of motion vectors into a plurality of bins by the image processing apparatus will be described with reference to FIGS. 10 to 12.

Referring to FIG. 10, FIG. 10 illustrates a motion vector for a 360 degree image after applying the mask illustrated in FIG. 6. In the present embodiment, for simplicity of explanation, only the motion vector in the horizontal (x-axis) direction is shown. However, this is merely an example, and the method applied to the present embodiment may be extended to motion vectors of other axes to determine three-dimensional rotation.

Referring to FIG. 11, the distance associated with a particular bin may be converted into an equivalent angle using a predetermined transformation as described above with reference to step S330 of FIG. 3. In the present embodiment, it can be seen that the motion vector has a value between -1 and +12.

Referring to FIG. 12, as a result of the classification, it may be confirmed that the largest number of motion vectors is included in the 20th bin, which corresponds to a distance of 7.

Referring back to FIG. 9, in operation S920, the image processing apparatus may identify the bin including the largest number of motion vectors among the plurality of bins. As described above with reference to FIG. 12, the image processing apparatus may identify that the largest number of motion vectors is included in the bin at a distance of 7.

In operation S930, the image processing apparatus may calculate a rotation based on a weighted average based on the identified bin and the neighboring bin.

The distance 7 corresponding to the bin identified in step S920 described above is equivalent to a rotation of 0.043 radians (2.46 °). The image processing apparatus according to an embodiment may determine a rotation for stabilizing a 360 degree image by converting a distance corresponding to the identified bin into an equivalent rotation by using a predetermined transformation.
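The figures quoted above are consistent with a simple linear conversion in which a horizontal shift of one pixel in an equirectangular frame corresponds to 2π divided by the frame width. The sketch below assumes this conversion and a frame width of 1024 pixels; both are assumptions chosen because they reproduce the stated values, and the predetermined transformation of step S330 may differ.

```python
import math

def distance_to_angle(distance_px, frame_width_px):
    """Convert a horizontal pixel distance in an equirectangular frame to
    radians, assuming the frame width spans a full turn (2*pi)."""
    return distance_px * 2.0 * math.pi / frame_width_px

# Assumed frame width of 1024 pixels (not stated in the description).
angle = distance_to_angle(7, 1024)
print(round(angle, 3), round(math.degrees(angle), 2))  # 0.043 2.46
```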

In the present embodiment, the analysis is performed on a 360 degree image for which the actual camera rotation was measured as 0.04109753 radians. It can be seen that the rotation of 0.043 radians obtained from the identified bin is a reasonable estimate of the actual camera rotation.

Meanwhile, according to another exemplary embodiment, the image processing apparatus may calculate the rotation using a weighted average across the bins identified in step S920 and the plurality of neighboring bins in order to increase the accuracy of the obtained rotation value. As an example of the weighted average, a 3-amplitude Gaussian weighted average may be used. However, this is merely an example, and other types of weighted averages may be used according to other embodiments. Applying the weighted average in this embodiment, an estimated rotation of 0.04266 radians is obtained, which is closer to the actual camera rotation of 0.04109753.
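Operations S910 to S930 can be sketched for the horizontal component as follows. The sketch interprets the weighted average as a Gaussian weighting over the identified bin and its two immediate neighbours, with each bin also weighted by its count; this interpretation, the function name `estimate_shift`, and the parameter `sigma` are assumptions rather than details given in the description.

```python
import numpy as np

def estimate_shift(values, bin_edges, sigma=1.0):
    """Histogram the motion-vector components, find the fullest bin, and
    refine its distance with a Gaussian-weighted average over neighbours.

    Returns (distance of the fullest bin, refined distance).
    """
    counts, edges = np.histogram(values, bins=bin_edges)
    centers = 0.5 * (edges[:-1] + edges[1:])
    peak = int(np.argmax(counts))
    # Identified bin plus one neighbour on each side (clipped at the ends).
    lo = max(peak - 1, 0)
    hi = min(peak + 1, len(counts) - 1)
    idx = np.arange(lo, hi + 1)
    # Gaussian weight by bin offset, multiplied by the bin population.
    gauss = np.exp(-0.5 * ((idx - peak) / sigma) ** 2)
    weights = gauss * counts[idx]
    refined = float(np.sum(weights * centers[idx]) / np.sum(weights))
    return float(centers[peak]), refined
```

The refined distance can then be converted into an equivalent rotation with the same predetermined transformation that is used for the identified bin alone.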

As another alternative to the above-described method of converting motion vectors into a three-dimensional rotation, in another embodiment, the rotation may be determined by summing the plurality of motion vectors vj to determine the entire motion field M for a frame of the 360 degree image, according to Equation 2 below.

[Equation 2]

$$M = \sum_{j} v_j$$

The three-dimensional rotation for stabilizing the 360 degree image may be obtained by determining a rotation R that minimizes the entire motion field as shown in Equation 3 below.

[Equation 3]

$$R = \underset{R}{\arg\min}\; M$$
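One way to make the minimization concrete is a small-angle least-squares fit, sketched below. It assumes that each motion vector v_j is given as a 3D displacement of a point p_j on the unit sphere and that a small rotation with rotation vector ω moves p_j by approximately ω × p_j; the rotation is then the ω that leaves the smallest residual motion. This formulation and the names `skew` and `estimate_rotation_vector` are assumptions for illustration and are not the formula of the disclosure.

```python
import numpy as np

def skew(p):
    """Cross-product matrix: skew(p) @ w equals np.cross(p, w)."""
    x, y, z = p
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def estimate_rotation_vector(points, vectors):
    """Least-squares rotation vector omega with vectors ~ omega x points.

    points:  N x 3 unit vectors (positions on the sphere).
    vectors: N x 3 motion vectors at those positions.
    """
    points = np.asarray(points, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # omega x p = -(p x omega) = -skew(p) @ omega, stacked for all points.
    a = np.concatenate([-skew(p) for p in points], axis=0)
    b = vectors.reshape(-1)
    omega, *_ = np.linalg.lstsq(a, b, rcond=None)
    return omega
```

The direction of ω gives the rotation axis and its length gives the rotation angle in radians; the stabilizing rotation would be the inverse of this rotation.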

In operation S1310, the image processing apparatus may determine at least one motion vector indicating global rotation of the 360 degree image among the plurality of motion vectors with respect to the 360 degree image.

Meanwhile, step S1310 may correspond to step S220 described above with reference to FIG. 2.

In operation S1320, the image processing apparatus may obtain 3D rotation information by converting the determined at least one motion vector.

Meanwhile, step S1320 may correspond to step S230 described above with reference to FIG. 2.

In operation S1330, the image processing apparatus may re-determine the rotation information of the 360 degree image by combining the obtained rotation information with sensor data regarding the shaking, acquired while the 360 degree image is captured.

For example, the image processing apparatus may be set to acquire sensor data regarding shaking of the photographing apparatus while a 360 degree image is captured. The image processing apparatus may consider sensor data when determining rotation. For example, the image processing apparatus may verify rotation information obtained by analyzing the motion vector using the sensor data, or verify rotation information obtained through the sensor data using the rotation information obtained by analyzing the motion vector.

According to another example, the image processing apparatus may merge the sensor data with the rotation information obtained by analyzing the motion vectors. For example, the sensor data and the motion vector analysis result may be merged by applying a weight to each according to the relative error margin of the sensor data with respect to the rotation information obtained by analyzing the motion vectors. This approach may be advantageous in scenarios in which the rotation calculated using the motion vectors may have a larger error than the measurement obtained by the sensor, for example, when the scene has a large area without texture. In this situation, more weight may be given to the sensor data. Sensors, on the other hand, can suffer from drift problems. The drift problem can be mitigated by combining the sensor data with the rotation computed from the motion vectors.
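The weighting by relative error margin can be sketched with inverse-variance weighting, in which each estimate is weighted by the inverse square of its assumed error. This particular weighting scheme, the function name `fuse_rotations`, and the error parameters are assumptions; the description only states that weights are applied according to the relative error margin.

```python
def fuse_rotations(vision_angle, vision_sigma, sensor_angle, sensor_sigma):
    """Merge a motion-vector-based rotation estimate with a sensor
    measurement, giving more weight to the estimate with the smaller
    error margin (inverse-variance weighting, an assumed scheme)."""
    w_vision = 1.0 / vision_sigma ** 2
    w_sensor = 1.0 / sensor_sigma ** 2
    return (w_vision * vision_angle + w_sensor * sensor_angle) / (w_vision + w_sensor)
```

For a scene with a large texture-free area, `vision_sigma` would be set larger than `sensor_sigma`, so the merged value stays close to the sensor measurement.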

FIG. 14 is a block diagram of an image processing apparatus 1400, according to an exemplary embodiment.

Referring to FIG. 14, the image processing apparatus 1400 may include at least one processor 1410 and a memory 1420. However, this is only an exemplary embodiment, and components of the image processing apparatus 1400 are not limited to the above-described example.

At least one processor 1410 may perform the processing method of the 360-degree image described above with reference to FIGS. 1 to 13. For example, the at least one processor 1410 may obtain a plurality of motion vectors for the 360 degree image. The at least one processor 1410 may determine at least one motion vector indicating global rotation of the 360 degree image among the plurality of motion vectors through filtering. In addition, the at least one processor 1410 may obtain the 3D rotation information about the 360 degree image by converting the determined at least one motion vector. The at least one processor 1410 may correct the distortion of the 360 degree image due to the shaking based on the obtained 3D rotation information.

The memory 1420 may store programs (one or more instructions) for processing and controlling the at least one processor 1410. Programs stored in the memory 1420 may be divided into a plurality of modules according to their functions.

According to an embodiment, the memory 1420 may include, as software modules, a data learner and a data recognizer, which will be described later with reference to FIG. 15. In addition, the data learner and the data recognizer may each independently include a learning network model, or may share one learning network model.

FIG. 15 is a diagram for describing at least one processor 1410 according to an exemplary embodiment.

Referring to FIG. 15, at least one processor 1410 may include a data learner 1510 and a data recognizer 1520.

The data learner 1510 may learn a criterion for obtaining 3D rotation information from a plurality of motion vectors for a 360 degree image. The data recognizer 1520 may determine 3D rotation information from the plurality of motion vectors for the 360 degree image based on the criteria learned by the data learner 1510.

At least one of the data learner 1510 and the data recognizer 1520 may be manufactured in the form of at least one hardware chip and mounted on the image processing apparatus. For example, at least one of the data learner 1510 and the data recognizer 1520 may be manufactured in the form of a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as part of an existing general purpose processor (eg, a CPU or an application processor) or a graphics dedicated processor (eg, a GPU) and mounted on the aforementioned various image processing apparatuses.

In this case, the data learner 1510 and the data recognizer 1520 may be mounted in one image processing apparatus, or may each be mounted on separate image processing apparatuses. For example, one of the data learner 1510 and the data recognizer 1520 may be included in the image processing apparatus, and the other may be included in a server. In addition, the data learner 1510 may provide the model information that it has constructed to the data recognizer 1520 via a wired or wireless connection, and data input to the data recognizer 1520 may be provided to the data learner 1510 as additional learning data.

Meanwhile, at least one of the data learner 1510 and the data recognizer 1520 may be implemented as a software module. When at least one of the data learner 1510 and the data recognizer 1520 is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer readable medium. In this case, at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the at least one software module may be provided by an operating system (OS), and others may be provided by a predetermined application.

FIG. 16 is a block diagram of the data learner 1510, according to an exemplary embodiment.

Referring to FIG. 16, the data learner 1510 may include a data acquirer 1610, a preprocessor 1620, a training data selector 1630, a model learner 1640, and a model evaluator 1650. However, this is only an example, and the data learner 1510 may be configured with fewer components than those described above, or may additionally include other components.

The data acquirer 1610 may acquire at least one 360 degree image as learning data. For example, the data acquirer 1610 may acquire at least one 360 degree image from the image processing apparatus including the data learner 1510, or from an external device that can communicate with the image processing apparatus including the data learner 1510.

The preprocessor 1620 may process the acquired at least one 360 degree image into a preset format so that the model learner 1640, which will be described later, can use the acquired at least one 360 degree image for learning.

The training data selector 1630 may select a 360 degree image for learning from the preprocessed data. The selected 360 degree image may be provided to the model learner 1640. The training data selector 1630 may select a 360 degree image for learning from the preprocessed 360 degree images according to the set criteria.

The model learner 1640 may learn a criterion for determining the 3D rotation information from the plurality of motion vectors, including which information from the 360 degree image is to be used in the plurality of layers of the learning network model.

In addition, the model learner 1640 may train the data recognition model, for example, through reinforcement learning using feedback on whether the acquired 360-degree image is suitable for learning.

In addition, when the data recognition model is trained, the model learner 1640 may store the trained data recognition model.

The model evaluator 1650 may input evaluation data into the learning network model and, if the recognition result output for the evaluation data does not satisfy a predetermined criterion, may cause the model learner 1640 to retrain the model. In this case, the evaluation data may be preset data for evaluating the learning network model.

At least one of the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 in the data learner 1510 may be manufactured in the form of at least one hardware chip and mounted on an image processing apparatus. For example, at least one of the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 may be manufactured in the form of a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as part of an existing general purpose processor (eg, a CPU or an application processor) or a graphics dedicated processor (eg, a GPU) and mounted on the above-described various image processing apparatuses.

In addition, the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 may be mounted in one image processing apparatus, or may each be mounted on separate image processing apparatuses. For example, some of the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 may be included in the image processing apparatus, and the rest may be included in a server.

In addition, at least one of the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 may be implemented as a software module. When at least one of the data acquirer 1610, the preprocessor 1620, the training data selector 1630, the model learner 1640, and the model evaluator 1650 is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer readable medium. In this case, at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the at least one software module may be provided by an operating system (OS), and others may be provided by a predetermined application.

FIG. 17 is a block diagram of a data recognizer 1520 according to an embodiment.

Referring to FIG. 17, the data recognizer 1520 according to some embodiments may include a data acquirer 1710, a preprocessor 1720, a recognition data selector 1730, a recognition result provider 1740, and a model updater 1750.

The data acquirer 1710 may acquire at least one 360 degree image, and the preprocessor 1720 may preprocess the acquired at least one 360 degree image. The preprocessor 1720 may process the acquired at least one 360 degree image into a preset format so that the recognition result provider 1740, which will be described later, can use it to determine the 3D rotation information from the plurality of motion vectors. The recognition data selector 1730 may select a motion vector required for determining the 3D rotation information among the plurality of motion vectors included in the preprocessed data. The selected motion vector may be provided to the recognition result provider 1740.

The recognition result provider 1740 may determine 3D rotation information based on the selected motion vector. In addition, the recognition result providing unit 1740 may provide the determined 3D rotation information.

The model updater 1750 may provide information on an evaluation of the 3D rotation information provided by the recognition result provider 1740 to the model learner 1640 described above with reference to FIG. 16, so that the parameters of the layers included in the learning network model are updated based on the evaluation.

Meanwhile, at least one of the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 in the data recognizer 1520 may be manufactured in the form of at least one hardware chip and mounted on an image processing apparatus. For example, at least one of the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 may be manufactured in the form of a dedicated hardware chip for artificial intelligence, or may be manufactured as part of an existing general purpose processor (eg, a CPU or an application processor) or a graphics dedicated processor (eg, a GPU) and mounted on the above-described various image processing apparatuses.

In addition, the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 may be mounted in one image processing apparatus, or may each be mounted on separate image processing apparatuses. For example, some of the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 may be included in the image processing apparatus, and the rest may be included in a server.

In addition, at least one of the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 may be implemented as a software module. When at least one of the data acquirer 1710, the preprocessor 1720, the recognition data selector 1730, the recognition result provider 1740, and the model updater 1750 is implemented as a software module (or a program module including instructions), the software module may be stored in a non-transitory computer readable medium. In this case, at least one software module may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the at least one software module may be provided by an operating system (OS), and others may be provided by a predetermined application.

Referring to FIG. 18, in the present embodiment, the image processing apparatus may include a first device 1800 that analyzes a 360 degree image to determine three-dimensional rotation information, and a second device 1810 that generates a stabilized image based on the rotation provided by the first device 1800. In other embodiments, some or all of the components of the first device 1800 and the second device 1810 may be implemented as a single physical device.

The first device 1800 may include a motion vector obtaining unit 1801 that obtains a plurality of motion vectors for the 360 degree image, and a motion vector conversion unit 1802 that converts the plurality of motion vectors into a three-dimensional rotation and provides the three-dimensional rotation to the image processing unit 1811 included in the second device 1810.

The second device 1810 may include the image processing unit 1811 and a display 1812 that displays the stabilized 360 degree image rendered by the image processing unit 1811. In addition, the second device 1810 may further include an input unit 1813 configured to receive a camera control input defining a rotation and/or translation.

The method according to an embodiment of the present invention may be implemented in the form of program instructions that can be executed by various computer means, and may be recorded on a computer readable medium. The computer readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well known and available to those skilled in the computer software arts. Examples of computer readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

According to an embodiment, a device may include a processor, a memory for storing and executing program data, a persistent storage such as a disk drive, a communication port for communicating with an external device, and a user interface device such as a touch panel, keys, or buttons. Methods implemented by software modules or algorithms may be stored on a computer readable recording medium as computer readable codes or program instructions executable on the processor. The computer readable recording medium may be a magnetic storage medium (eg, read-only memory (ROM), random-access memory (RAM), floppy disk, hard disk, etc.) or an optical reading medium (eg, CD-ROM or DVD (Digital Versatile Disc)). The computer readable recording medium can be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. The medium is readable by the computer, can be stored in the memory, and can be executed by the processor.

In the embodiments illustrated in the drawings, reference numerals have been used, and specific terms have been used to describe the embodiments. However, the present invention is not limited by the specific terms, and the embodiments may include all elements that would ordinarily occur to those skilled in the art.

An embodiment may be represented by functional block configurations and various processing steps. Such functional blocks may be implemented by any number of hardware and/or software configurations that perform particular functions. For example, an embodiment may employ integrated circuit configurations, such as memory, processing, logic, and look-up tables, that may execute various functions under the control of one or more microprocessors or other control devices. Also, an embodiment may employ cores of the same type or of different types, or different types of CPUs. Just as the components of the present invention may be implemented with software programming or software elements, an embodiment may be implemented in a programming or scripting language such as C, C++, Java, or assembler, with the various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. The functional aspects may be implemented as an algorithm running on one or more processors. In addition, an embodiment may employ the prior art for electronic configuration, signal processing, and/or data processing. Terms such as "mechanism", "element", "means", and "configuration" can be used broadly and are not limited to mechanical and physical configurations. The terms may include the meaning of a series of software routines in conjunction with a processor or the like.

Specific implementations described in the embodiments are examples, and do not limit the scope of the embodiments in any way. For brevity of description, descriptions of conventional electronic configurations, control systems, software, and other functional aspects of the systems may be omitted. In addition, the connections or connection members of the lines between the components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example, and in an actual device they may be represented as various replaceable or additional functional connections, physical connections, or circuit connections. In addition, unless a component is specifically described with a term such as "essential" or "important", it may not be a necessary component for the application of the present invention.

In the specification of the embodiments (particularly in the claims), the term "the" and similar referring terms may cover both the singular and the plural. In addition, when a range is described in the embodiments, the description includes the invention to which the individual values belonging to the range are applied (unless stated to the contrary). Finally, unless an order is explicitly stated or the context indicates otherwise, the steps constituting the method according to the embodiment may be performed in any suitable order. The embodiments are not necessarily limited to the order in which the steps are described. The use of all examples or exemplary terms (eg, "etc.") in the embodiments is merely for describing the embodiments in detail, and the scope of the embodiments is not limited by the examples or exemplary terms unless limited by the claims. In addition, one of ordinary skill in the art will appreciate that various modifications, combinations, and changes can be made depending on design conditions and factors within the scope of the appended claims or equivalents thereof.

Claims

A method of processing a 360 degree image, the method comprising:

obtaining a plurality of motion vectors for the 360 degree image;

determining, through filtering, at least one motion vector representing a global rotation of the 360 degree image among the plurality of motion vectors;

obtaining three-dimensional rotation information of the 360 degree image by three-dimensionally transforming the determined at least one motion vector; and

correcting distortion of the 360 degree image caused by shaking, based on the obtained three-dimensional rotation information.
The method of claim 1, wherein the obtaining of the 3D rotation information comprises:

classifying the determined at least one motion vector into a plurality of bins, each corresponding to a specific direction and a specific size range;

selecting the bin containing the most motion vectors among the plurality of bins; and

obtaining the 3D rotation information by converting the direction and distance of the selected bin.
The method of claim 1, wherein the obtaining of the 3D rotation information comprises:

obtaining the 3D rotation information based on the plurality of motion vectors by using a previously generated learning network model.
The method of claim 1, further comprising:

acquiring sensor data generated as a result of sensing shaking that occurs while the 360 degree image is captured by a photographing apparatus,

wherein the correcting of the distortion of the 360 degree image comprises

correcting the distortion of the 360 degree image by combining the acquired sensor data with the 3D rotation information.
An apparatus for processing a 360 degree image, the apparatus comprising:

a memory storing one or more instructions; and

a processor configured to execute the one or more instructions stored in the memory,

wherein the processor is configured to:

acquire a plurality of motion vectors for the 360 degree image;

determine, through filtering, at least one motion vector representing a global rotation of the 360 degree image among the plurality of motion vectors;

obtain 3D rotation information of the 360 degree image by three-dimensionally transforming the determined at least one motion vector; and

correct distortion of the 360 degree image caused by shaking, based on the obtained 3D rotation information.
The apparatus of claim 5, wherein the processor is configured to:

remove, from among the plurality of motion vectors, a motion vector included in a region determined according to the type of projection.
The apparatus of claim 5, wherein the processor is configured to:

generate a mask based on edges detected from the 360 degree image;

apply the generated mask to the 360 degree image to determine a region of the 360 degree image in which no texture exists; and

remove, from among the plurality of motion vectors, a motion vector included in the region in which no texture exists.
The apparatus of claim 5, wherein the processor is configured to:

detect at least one moving object from the 360 degree image through a preset object detection process; and

remove, from among the plurality of motion vectors, a motion vector associated with the detected object.
The apparatus of claim 5, wherein the processor is configured to:

determine, as the motion vectors representing the global rotation, motion vectors that, among the plurality of motion vectors, are located on opposite sides of a unit sphere onto which the 360 degree image is projected, are parallel to each other, have opposite signs, and have magnitudes within a specific threshold of each other.
The apparatus of claim 5, wherein the processor is configured to:

classify the determined at least one motion vector into a plurality of bins, each corresponding to a specific direction and a specific size range;

select the bin including the most motion vectors among the plurality of bins; and

obtain the 3D rotation information by converting the direction and distance of the selected bin.
The apparatus of claim 10, wherein the processor is configured to:

obtain the 3D rotation information by applying a weighted average to the directions and distances of the selected bin and a plurality of bins adjacent to the selected bin.
The apparatus of claim 5, wherein the processor is configured to:

obtain, as the 3D rotation information, a rotation value that minimizes the sum of the determined at least one motion vector.
The apparatus of claim 5, wherein the processor is configured to:

obtain the 3D rotation information based on the plurality of motion vectors by using a previously generated learning network model.
The apparatus of claim 5, wherein the processor is configured to:

acquire sensor data generated as a result of sensing shaking that occurs while the 360 degree image is captured by a photographing apparatus; and

correct the distortion of the 360 degree image by combining the acquired sensor data with the 3D rotation information.
A computer-readable recording medium having recorded thereon a program for executing the method of claim 1 on a computer.