CN110648299A - Image processing method, image processing apparatus, and computer-readable storage medium


Info

Publication number
CN110648299A
CN110648299A (application CN201810670236.4A)
Authority
CN
China
Prior art keywords
image
panoramic
semantic information
local
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810670236.4A
Other languages
Chinese (zh)
Inventor
廖可
张宇鹏
王炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liguang Co
Ricoh Co Ltd
Original Assignee
Liguang Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liguang Co
Priority to CN201810670236.4A
Publication of CN110648299A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/29 Graphical models, e.g. Bayesian networks
    • G06T 5/73
    • G06T 5/80
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging

Abstract

The embodiment of the invention provides an image processing method, an image processing device and a computer readable storage medium, wherein the image processing method comprises the following steps: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.

Description

Image processing method, image processing apparatus, and computer-readable storage medium
Technical Field
The present application relates to the field of image processing, and in particular, to an image processing method, an image processing apparatus, and a computer-readable storage medium.
Background
A multi-sensor imaging system consists of multiple types and/or multiple numbers of sensors located at the same or different positions. After image or video data is acquired by the multi-sensor imaging system, the image or video information from the multiple sensors may be processed to output corresponding image processing results.
In the prior art, when the images acquired by a multi-sensor imaging system include a panoramic image and one or more local images within the range of the panoramic image, the panoramic image and the local images are generally fused to obtain a panoramic fusion image. However, a panoramic fusion image obtained by simply fusing the panoramic image and the local images cannot provide all the application information a user may desire, such as related semantic information and description information.
Disclosure of Invention
To solve the above technical problem, according to an aspect of the present invention, there is provided an image processing method including: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.
According to another aspect of the present invention, there is provided an image processing apparatus comprising: an acquisition unit that acquires a panoramic image and one or more partial images within the panoramic image; the semantic dividing unit acquires panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic dividing areas in the panoramic image; a focus area obtaining unit, which determines one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and obtains detail semantic information according to the determined focus areas; and the description unit is used for obtaining image description information by utilizing the panoramic semantic information and the detail semantic information.
According to another aspect of the present invention, there is provided an image processing apparatus comprising: a processor; and a memory having computer program instructions stored therein, wherein the computer program instructions, when executed by the processor, cause the processor to perform the steps of: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.
According to another aspect of the invention, there is provided a computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the steps of: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.
According to the image processing method, the image processing apparatus, and the computer-readable storage medium of the present invention, panoramic semantic information and detail semantic information can be acquired, respectively, for a panoramic image and for one or more local images within the range of the panoramic image, and image description information can be obtained from the panoramic semantic information and the detail semantic information. The image description information thus obtained takes into account both the panoramic semantic information describing the scene of the panoramic image and the detail semantic information describing the focus areas of the local images, which improves the accuracy of the image description and allows effective application in fields such as automatic driving and robot interaction.
Drawings
The above and other objects, features, and advantages of the present invention will become more apparent from the detailed description of the embodiments of the present invention when taken in conjunction with the accompanying drawings.
Fig. 1 shows a flow chart of an image processing method according to one embodiment of the present invention;
Fig. 2(a) shows a panoramic image according to one embodiment of the present invention; Fig. 2(b) shows an infrared local image according to one embodiment of the present invention; and Fig. 2(c) shows a panoramic fusion image obtained by fusing the panoramic image of Fig. 2(a) and the infrared local image of Fig. 2(b) according to one embodiment of the present invention;
Fig. 3(a) shows a panoramic fusion image according to one embodiment of the present invention; Fig. 3(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 3(a); and Fig. 3(c) is a schematic diagram showing the positions of the sharp region 1 and the blurred region 2 in the coordinate-transformed local fusion image of Fig. 3(b);
Fig. 4(a) shows a panoramic fusion image according to one embodiment of the present invention; Fig. 4(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 4(a); Fig. 4(c) shows the coordinate-transformed local fusion image of Fig. 4(b) being resampled; and Fig. 4(d) shows the resampled local fusion image of Fig. 4(c) being inverse coordinate-transformed to obtain a panoramic image;
Fig. 5 shows a schematic diagram of a panoramic image according to one embodiment of the present invention;
Fig. 6 shows the schematic positions of a local image and a focus area in a panoramic image according to one embodiment of the present invention;
Fig. 7 shows a block diagram of an image processing apparatus according to an embodiment of the present invention; and
Fig. 8 shows a block diagram of an image processing device according to an embodiment of the present invention.
Detailed Description
An image processing method, an image processing apparatus, and a computer-readable storage medium according to embodiments of the present invention will be described below with reference to the accompanying drawings. In the drawings, like reference numerals refer to like elements throughout. It should be understood that the embodiments described herein are merely illustrative and should not be construed as limiting the scope of the invention.
An image processing method according to an embodiment of the present invention will be described below with reference to fig. 1. The image processing method of the embodiment of the present invention may be applied to a still image, a video frame in a video that changes with time, and the like, and is not limited herein. Fig. 1 shows a flow chart of the image processing method 100.
As shown in fig. 1, in step S101, a panoramic image and one or more partial images within the panoramic image are acquired.
In this step, the panoramic image and the one or more local images may be acquired using a multi-sensor imaging system. The series of images acquired by the multi-sensor imaging system may include a panoramic image acquired by a panoramic sensor in the system and one or more local images, within the range of the panoramic image, acquired by one or more local sensors in the system. Here, the panoramic image may be acquired by the panoramic sensor capturing scene image information over, for example, 360 degrees using a wide-angle technique, and may further be mapped into a two-dimensional image through conversion into a longitude and latitude coordinate system. Accordingly, within the scene range covered by the panoramic image, one or more local images can be acquired by the one or more local sensors. The local sensor may be, for example, one or more of a high-definition sensor, an infrared sensor, a light-field sensor, a point-cloud sensor, a stereoscopic-vision sensor, and a laser sensor. By means of these local sensors, corresponding local images can be acquired, for example one or more of a high-definition local image, an infrared local image, a light-field local image, a point-cloud local image, a stereoscopic-vision local image, and a laser local image.
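To make the longitude and latitude mapping above concrete, the short sketch below maps a direction on the capture sphere to a pixel in a two-dimensional equirectangular panorama. This is an illustrative assumption about the projection, not a formula prescribed by the patent; the image size and angle conventions are invented for the example.

```python
# A minimal sketch, assuming a standard equirectangular layout: a direction on
# the capture sphere (longitude, latitude) is mapped to a pixel in the 2-D
# panoramic image. Sizes and conventions are assumptions, not from the patent.
import numpy as np

def sphere_to_equirect(lon_rad, lat_rad, width, height):
    """Map longitude in [-pi, pi) and latitude in [-pi/2, pi/2] to pixel (x, y)."""
    x = (lon_rad + np.pi) / (2.0 * np.pi) * (width - 1)   # longitude -> column
    y = (np.pi / 2.0 - lat_rad) / np.pi * (height - 1)    # latitude  -> row
    return x, y

# The forward direction (lon = 0, lat = 0) lands at the panorama centre.
print(sphere_to_equirect(0.0, 0.0, 2048, 1024))  # -> (1023.5, 511.5)
```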
After the panoramic image and the one or more local images within its scene range are acquired through the multi-sensor system, the one or more local images can further be fused according to the positions at which they were acquired, to obtain local fusion images in one-to-one correspondence with the local images. Finally, the panoramic image and the fused local images can be fused to obtain a panoramic fusion image.
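As a hedged illustration of this fusion step, the sketch below pastes a local image into the panoramic image at a known position with a simple alpha blend. The blend weight, positions, and stand-in images are assumptions; a real system would blend in the panorama's longitude and latitude coordinate system after distortion correction.

```python
# A minimal sketch, assuming rectified images of matching scale: a local image
# is alpha-blended into the panorama at its known position to form a panoramic
# fusion image. Alpha and the positions are illustrative only.
import numpy as np

def fuse_local_into_panorama(panorama, local, top_left, alpha=0.6):
    """Blend `local` over `panorama` with its top-left corner at (row, col)."""
    r, c = top_left
    h, w = local.shape[:2]
    region = panorama[r:r + h, c:c + w].astype(np.float32)
    blended = alpha * local.astype(np.float32) + (1.0 - alpha) * region
    panorama[r:r + h, c:c + w] = blended.astype(panorama.dtype)
    return panorama

pano = np.zeros((1024, 2048, 3), dtype=np.uint8)
ir_local = np.full((256, 256, 3), 200, dtype=np.uint8)  # stand-in infrared image
pano = fuse_local_into_panorama(pano, ir_local, (384, 896))  # central region
```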
Fig. 2 shows a schematic diagram of a panoramic image, a local image, and a panoramic fusion image according to an embodiment of the present invention. Specifically, Fig. 2(a) is a panoramic image acquired by a panoramic sensor in an embodiment of the present invention; Fig. 2(b) shows an infrared local image acquired by an infrared sensor; and Fig. 2(c) shows a panoramic fusion image obtained by fusing the panoramic image in Fig. 2(a) and the infrared local image in Fig. 2(b). As shown in Fig. 2(c), the infrared local image in Fig. 2(b) may be subjected to fusion processing, and the processed local fusion image may be fused into the central region of the panoramic image in Fig. 2(a).
In this step, an independent panoramic image and one or more independent local images may be acquired directly; alternatively, a panoramic fusion image such as that shown in Fig. 2(c) may be obtained at an initial stage, and a separated panoramic image and local images may then be derived from the panoramic fusion image for processing in subsequent steps.
In one example, when the image obtained at the initial stage is a panoramic fusion image, one or more local fusion images may be extracted from the panoramic fusion image based on their positions within it; the local fusion images may then be processed to obtain the local images and/or the panoramic image. In practice, the position of a local fusion image in the panoramic fusion image may optionally be obtained from position information carried with the panoramic fusion image; for example, the position information may be read from the metadata of the panoramic fusion image, or from a related description in its picture file. Once the position of the local fusion image (or local image) in the panoramic fusion image is known, the local fusion image can be separated from the panoramic fusion image. The local fusion image thus acquired is generally one in which the local image has been distorted to fit the longitude and latitude coordinate system of the panoramic image. Therefore, optionally, to obtain an undistorted local image, in one example the one or more local fusion images may be coordinate-transformed to remove the distortion. In another example, the local fusion image may first be coordinate-transformed to remove the distortion; one or more image-related features may then be acquired for the coordinate-transformed local fusion image (for example, a search may start from the center of the coordinate-transformed local fusion image to acquire pixel-level features such as image resolution and/or focus information); finally, the blurred region of the coordinate-transformed local fusion image may be removed according to the acquired image features to obtain the local image required in this step.
Fig. 3 shows a schematic diagram of obtaining a local image from a panoramic fusion image. Fig. 3(a) shows a panoramic fusion image according to one embodiment of the invention, with the position of the local fusion image outlined by a dashed box. Fig. 3(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 3(a) to remove distortion. Further, feature extraction may be performed on the image of Fig. 3(b) to acquire features such as image resolution and/or focus information, and the blurred region may be removed from the coordinate-transformed local fusion image to obtain the local image. The blurred region of the local fusion image may be, for example, a transition region of certain lines or colors in the image, or an edge region of the image. Fig. 3(c) shows the positions of the sharp region 1 and the blurred region 2 in the coordinate-transformed local fusion image of Fig. 3(b): region 1, inside the central box, is the sharp region, and region 2, between the two nested boxes, is the blurred region. In one example, the blurred region 2 may be processed and removed using the extracted image features so that the resulting local image (not shown) is sufficiently sharp.
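A minimal sketch of this separation step follows. The metadata layout (a dict with row/col/height/width) is an assumption, the coordinate transformation that removes distortion is omitted, and the variance of a second difference stands in for the image resolution and focus features described above.

```python
# Sketch: cut the local fusion image out of the panoramic fusion image by its
# stored position, then keep only the sharp core (region 1 of Fig. 3(c)),
# discarding the blurred rim (region 2). All parameters are assumptions.
import numpy as np

def crop_local_fusion(pano_fusion, region):
    """Extract the local fusion image using position metadata (layout assumed)."""
    r, c, h, w = region["row"], region["col"], region["height"], region["width"]
    return pano_fusion[r:r + h, c:c + w]

def sharp_core(img_gray, step=16, thresh=50.0):
    """Grow a centred window outward and stop once the rim turns blurry."""
    h, w = img_gray.shape
    half, best = step, None
    while 2 * half <= min(h, w):
        win = img_gray[h // 2 - half:h // 2 + half, w // 2 - half:w // 2 + half]
        lap = np.diff(win.astype(np.float32), n=2, axis=0)  # crude focus measure
        if best is not None and lap.var() < thresh:
            break  # rim went blurry: keep the previous, sharper window
        best, half = win, half + step
    return best

pano_fusion = np.random.randint(0, 255, (512, 1024), dtype=np.uint8)
local_fusion = crop_local_fusion(
    pano_fusion, {"row": 128, "col": 384, "height": 256, "width": 256})
local_image = sharp_core(local_fusion)  # blurred rim removed
```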
Optionally, when the image acquired at the initial stage is a panoramic fusion image, the panoramic image may also be recovered from the panoramic fusion image and the local fusion image extracted from it by position. Specifically, the one or more local fusion images may first be coordinate-transformed to remove distortion; one or more image-related features may then be acquired for the coordinate-transformed local fusion image, for example pixel-level features such as image resolution and/or focus information acquired from the surroundings of the coordinate-transformed local fusion image. Next, since the sensors used to initially acquire the panoramic image and the local image differ, the images may have different characteristics such as image resolution and focus information, and the resulting panoramic fusion image and local fusion image may likewise differ; the coordinate-transformed local fusion image may therefore be resampled using the acquired image-related features, so that it takes on the same characteristics (image resolution, focus information, etc.) as the panoramic fusion image surrounding it. Finally, the resampled local fusion image is subjected to a coordinate inverse transformation (i.e., the reverse of the coordinate transformation above): it is re-distorted and projected back into the coordinate system of the panoramic fusion image for fusion, so that the region of the original local fusion image in the panoramic fusion image is replaced by the processed local fusion image having the same image characteristics as the panoramic fusion image, yielding the panoramic image. In one example of the invention, the resampling operates differently for different local sensors in a multi-sensor imaging system. For example, a high-definition sensor may require pixel resampling of high-definition data, an infrared sensor may require visible-light acquisition processing of infrared data, a light-field sensor may require averaging and adjustment of the resolution and focus information of light-field data, a point-cloud sensor may require projection and pixel supplementation of point-cloud data, and a stereoscopic-vision sensor may require depth-information removal and resolution adjustment of stereoscopic-vision data. The above resampling processes and methods are only examples; in practical applications any relevant resampling method may be adopted, and no limitation is intended here.
Fig. 4 shows a schematic diagram of acquiring a panoramic image from a panoramic fusion image. Fig. 4(a) shows a panoramic fusion image according to an embodiment of the present invention, in which the position of the local fusion image is outlined by a dashed box. Fig. 4(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 4(a) to remove distortion. Further, feature extraction may be performed around the image of Fig. 4(b) to acquire image features such as image resolution and/or focus information. Fig. 4(c) shows the coordinate-transformed local fusion image of Fig. 4(b) being resampled with the extracted image features. Fig. 4(d) shows the resampled local fusion image of Fig. 4(c) being inverse coordinate-transformed and projected back into the panoramic fusion image of Fig. 4(a), resulting in a panoramic image with consistent image characteristics (e.g., image resolution and focus information).
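The resolution-matching resampling in this procedure can be sketched as below. The pixels-per-degree figures are invented, and the redistortion and projection back into the longitude and latitude system (the coordinate inverse transformation) is left out; only the resolution adjustment is shown.

```python
# Sketch: downsample the undistorted local fusion image so its resolution
# matches what was measured around it in the panoramic fusion image. The
# resolution figures are assumptions; nearest-neighbour sampling keeps it short.
import numpy as np

def estimate_scale(local_res_ppd, pano_res_ppd):
    """Ratio of panorama to local resolution, e.g. in pixels per degree."""
    return pano_res_ppd / local_res_ppd

def resample_to_match(local_img, scale):
    """Nearest-neighbour resampling of the local image by `scale` (<1 shrinks)."""
    h, w = local_img.shape[:2]
    rows = np.clip((np.arange(int(h * scale)) / scale).astype(int), 0, h - 1)
    cols = np.clip((np.arange(int(w * scale)) / scale).astype(int), 0, w - 1)
    return local_img[rows][:, cols]

local = np.random.randint(0, 255, (400, 400, 3), dtype=np.uint8)
matched = resample_to_match(local, estimate_scale(local_res_ppd=20.0,
                                                  pano_res_ppd=5.0))
print(matched.shape)  # -> (100, 100, 3): local detail reduced to panorama scale
```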
In step S102, panoramic semantic information is obtained according to the panoramic image, wherein the panoramic semantic information corresponds to a semantic division area in the panoramic image.
In this step, the panoramic image may be processed to obtain one or more pieces of panoramic semantic information of the panoramic image and corresponding semantic division regions thereof. In one example, the panoramic semantic information and the range of the corresponding semantic division area can be obtained by using image recognition and other technologies. Optionally, information related to the panoramic image, such as background information and/or scene description information of the image, may be further acquired according to the acquired panoramic semantic information.
Fig. 5 shows a schematic diagram of a panoramic image acquired according to an embodiment of the present invention. In the example shown in Fig. 5, the panoramic semantic information of the panoramic image may include labels such as sky, ground, and people obtained by image recognition, each corresponding to a different region range identified in the image; the background information and/or scene description information derived from it may include, for example: outdoors, crowded people, face-to-face conversation, and the like.
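The patent does not name a concrete recognition model for step S102. As one hedged possibility, an off-the-shelf semantic segmentation network such as torchvision's DeepLabV3 produces a per-pixel label map: its class labels (limited to the model's training vocabulary) play the role of the panoramic semantic information, and its labelled regions play the role of the semantic division areas.

```python
# Illustrative only: a pretrained DeepLabV3 used as the panoramic semantic
# divider. The model choice is an assumption; the patent prescribes no network.
import torch
from torchvision.models.segmentation import (deeplabv3_resnet50,
                                             DeepLabV3_ResNet50_Weights)

weights = DeepLabV3_ResNet50_Weights.DEFAULT
model = deeplabv3_resnet50(weights=weights).eval()
preprocess = weights.transforms()

def panoramic_semantics(pano_tensor):
    """Return a label map at the model's working resolution; each value
    identifies one semantic division area (sky, person, etc.)."""
    with torch.no_grad():
        out = model(preprocess(pano_tensor).unsqueeze(0))["out"]
    return out.argmax(dim=1).squeeze(0)  # per-pixel class ids

labels = panoramic_semantics(torch.rand(3, 512, 1024))  # dummy panorama
print(labels.shape, labels.unique())  # region map and its semantic classes
```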
In step S103, one or more focus areas may be determined in the one or more local images according to the panoramic semantic information and its corresponding semantic division areas, and detail semantic information may be acquired from the determined focus areas. Optionally, one or more pieces of focus semantic information related to the one or more local images may be selected from the panoramic semantic information; according to the selected focus semantic information and its corresponding semantic division areas, corresponding regions of interest may be obtained in the one or more local images, through a neural network, image information processing, or the like, to serve as the one or more focus areas, and the corresponding detail semantic information may be acquired from the determined focus areas. Of course, this specific manner of operation is only an example; in practical applications the determined focus area need not lie entirely within the local image. For example, the focus area may only partially overlap the local image region, or may not overlap it at all; correspondingly, the focus semantic information selected for the focus area and the subsequently acquired detail semantic information may be only partially related to the local image, or not related to it at all. In that case, in this step, one or more focus areas may be determined solely from the panoramic semantic information and its corresponding semantic division areas, and the detail semantic information acquired from the determined focus areas.
Fig. 6 shows the schematic positions of a local image and a focus area acquired in a panoramic image according to an embodiment of the present invention. In the example shown in Fig. 6, the local fusion image obtained by fusing the local image acquired by the local sensor lies within the dashed box in the panoramic image, and the infrared image at the lower part of Fig. 6 is an enlarged view of the focus area determined in the local image converted from the local fusion image. From the enlarged infrared image of the focus area in Fig. 6, the corresponding detail semantic information can be obtained, for example: the person's mood is happy.
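A minimal sketch of one way to carry out this selection, under the assumption that both the semantic division areas and the local image can be summarised by bounding boxes in panorama coordinates: the semantic region overlapping the local image most is taken as the focus area. The boxes below are invented for illustration.

```python
# Sketch: rank the semantic division areas by overlap with the local image's
# box and pick the best as the focus area. Boxes are (row0, col0, row1, col1).
def box_overlap(a, b):
    """Intersection area of two boxes."""
    r0, c0 = max(a[0], b[0]), max(a[1], b[1])
    r1, c1 = min(a[2], b[2]), min(a[3], b[3])
    return max(0, r1 - r0) * max(0, c1 - c0)

def pick_focus_area(semantic_boxes, local_box):
    """Choose the semantic division area that overlaps the local image most."""
    return max(semantic_boxes, key=lambda box: box_overlap(box, local_box))

person_box = (300, 900, 600, 1100)  # "people" region from the panorama
sky_box = (0, 0, 200, 2048)         # "sky" region: no overlap with the local image
local_box = (350, 850, 650, 1150)   # where the infrared local image sits
print(pick_focus_area([person_box, sky_box], local_box))  # -> the person box
```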
In step S104, image description information is obtained by using the panoramic semantic information and the detail semantic information. In this step, the panoramic semantic information and the detailed semantic information may be fused, and the image description information may be obtained by combining weights. Optionally, the panoramic semantic information E for describing the scene may be fused with the detail semantic information S for describing details, and the final image description information may be obtained based on different model structures (e.g., weight averaging, bayesian estimation, data fusion neural network, reinforcement learning, etc.).
As described above, the panoramic image and the local images according to embodiments of the present invention may both be still images, or may each be a video frame in a video. When they are video frames, the panoramic image and the local images may be, respectively, a panoramic image in a panoramic video and one or more local images in one or more local videos captured at the same time i. When the panoramic image and the local images are each a video frame captured at the same moment, obtaining the image description information by using the panoramic semantic information and the detail semantic information may include: fusing the panoramic semantic information and the detail semantic information at each moment, and processing the results in time order to obtain image description information that changes over time. That is, the image description information for a video may evolve gradually over time rather than being fixed. Here, the panoramic semantic information in the image description information may be denoted E_i and the detail semantic information S_i, where i is the time. Accordingly, the time series of panoramic semantic information over i may be E_(i-2), E_(i-1), E_i, E_(i+1), …, and the time series of detail semantic information over i may be S_(i-2), S_(i-1), S_i, S_(i+1), …. For example, the image description information may change over time as: two people talk outdoors, are happy at time i-1, begin to quarrel at time i, and so on.
In one example, the image description information using weight averaging may be represented as:
R = W_si·S_i + W_ei·E_i + W_s(i-1)·S_(i-1) + W_e(i-1)·E_(i-1) + …

where R is the weighted-average image semantic information (i.e., the weighted image description information), W_si is the weight of the detail semantic information, W_ei is the weight of the panoramic semantic information, and i is the time.
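Written as code, the weighted average above reduces to the sketch below. The scalar values stand in for semantic information that in practice would be feature vectors or embeddings, and the weights are arbitrary assumptions.

```python
# Sketch of R = W_si*S_i + W_ei*E_i + W_s(i-1)*S_(i-1) + ... with toy scalars.
def weighted_description(S, E, Ws, We):
    """Sum W_s*S + W_e*E over the time window (newest entries first)."""
    return sum(ws * s + we * e for s, e, ws, we in zip(S, E, Ws, We))

S = [0.8, 0.6]                    # detail semantics at times i, i-1
E = [0.4, 0.5]                    # panoramic semantics at times i, i-1
Ws, We = [0.5, 0.2], [0.2, 0.1]   # per-time weights (assumed values)
print(weighted_description(S, E, Ws, We))  # 0.4 + 0.08 + 0.12 + 0.05 = 0.65
```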
In another example, the image description information based on bayesian estimation may be expressed as:
P_r = min L(P(P_si, P_ei))

where P_si is the Bayesian estimate of the detail semantic information, P_ei is the Bayesian estimate of the panoramic semantic information, i is the time, P(P_si, P_ei) is the joint distribution function, and P_r is the minimum-likelihood estimate of the joint distribution P, i.e., the fused image description value.
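The patent leaves L and P abstract. One hedged reading, sketched below, models the joint distribution with independent Gaussian estimates around P_si and P_ei and grid-searches for the fused value that minimises the negative log-likelihood; the variances and estimates are invented.

```python
# Sketch under an assumed Gaussian model: the fused value minimising the
# negative log-likelihood is the variance-weighted blend of the two estimates.
import numpy as np

def neg_log_likelihood(r, p_si, p_ei, var_s=0.04, var_e=0.09):
    """L for a candidate fused value r under independent Gaussian estimates."""
    return (r - p_si) ** 2 / (2 * var_s) + (r - p_ei) ** 2 / (2 * var_e)

p_si, p_ei = 0.8, 0.5  # detail / panoramic estimates (toy values)
grid = np.linspace(0.0, 1.0, 1001)
P_r = grid[np.argmin(neg_log_likelihood(grid, p_si, p_ei))]
print(round(float(P_r), 3))  # ~0.708: pulled toward the lower-variance estimate
```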
In another example, the reinforcement learning based image description information may be expressed as:
tuple(S, A, R, P) = tuple((S_si, S_ei), A, (R_si, R_ei), P)

where tuple(·) denotes the reinforcement-learning four-element system; S, A, and R are inputs and P is the output. S is the environment information or state, divided into the detail semantic information S_si and the panoramic semantic information S_ei; A is the behavior or action taken in the state; R is the reward for each action in each state, divided into the reward R_si contributed by the detail semantic information and the reward R_ei contributed by the panoramic semantic information; and P is the image description information in the current state, or the corresponding behavior function.
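As a structural illustration only, the four-element system can be held in a small data class. The field values below are invented, and a real system would learn P with a reinforcement-learning algorithm rather than store it as text.

```python
# Sketch of the tuple(S, A, R, P) container; names mirror the notation above.
from dataclasses import dataclass

@dataclass
class RLFusion:
    S: tuple  # state: (S_si detail semantics, S_ei panoramic semantics)
    A: str    # behavior or action taken in this state
    R: tuple  # reward split: (R_si from detail, R_ei from panoramic)
    P: str    # output: image description (or behavior function) for the state

step = RLFusion(S=("mood: happy", "outdoor crowd"), A="describe",
                R=(0.7, 0.3), P="two people talk happily outdoors")
print(step.P)
```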
According to the image processing method described above, panoramic semantic information and detail semantic information can be acquired, respectively, for a panoramic image and for one or more local images within its range, and image description information can be obtained from them. The image description information thus obtained takes into account both the panoramic semantic information describing the scene of the panoramic image and the detail semantic information describing the details of the focus areas in the local images, which improves the accuracy of the image description and allows effective application in fields such as automatic driving and robot interaction.
For example, in the field of robot interaction, the prior art can generally obtain only the panoramic semantic information corresponding to a panoramic image, and therefore can only describe the scene; it cannot perform targeted detail semantic analysis of, and response to, a focus area. With the method of the embodiment of the present invention, not only can panoramic semantic information for scene description be obtained, but detail semantic information for a focus area can also be obtained, and the focus area can be changed as needed; both the scene and different focuses within it can thus be described, which helps a robot communicate and react more accurately and specifically within the scene.
Next, an image processing apparatus according to an embodiment of the present invention is described with reference to fig. 7. Fig. 7 shows a block diagram of an image processing apparatus 700 according to an embodiment of the present invention. The image processing apparatus of the embodiment of the present invention may be applied to both a still image and a video frame in a video that changes with time, and is not limited herein. As shown in fig. 7, the image processing apparatus 700 includes an acquisition unit 710, a semantic division unit 720, a focus area acquisition unit 730, and a description unit 740. The apparatus 700 may include other components in addition to these units, however, since these components are not related to the contents of the embodiments of the present invention, illustration and description thereof are omitted herein. Further, since the specific details of the following operations performed by the image processing apparatus 700 according to the embodiment of the present invention are the same as those described above with reference to fig. 1 to 6, a repetitive description of the same details is omitted herein to avoid redundancy.
The acquisition unit 710 of the image processing apparatus 700 in fig. 7 is configured to acquire a panoramic image and one or more partial images within the panoramic image.
The acquisition unit 710 may acquire the panoramic image and the one or more local images using a multi-sensor imaging system. The series of images acquired by the multi-sensor imaging system may include a panoramic image acquired by a panoramic sensor in the system and one or more local images, within the range of the panoramic image, acquired by one or more local sensors in the system. Here, the panoramic image may be acquired by the panoramic sensor capturing scene image information over, for example, 360 degrees using a wide-angle technique, and may further be mapped into a two-dimensional image through conversion into a longitude and latitude coordinate system. Accordingly, within the scene range covered by the panoramic image, one or more local images can be acquired by the one or more local sensors. The local sensor may be, for example, one or more of a high-definition sensor, an infrared sensor, a light-field sensor, a point-cloud sensor, a stereoscopic-vision sensor, and a laser sensor. By means of these local sensors, corresponding local images can be acquired, for example one or more of a high-definition local image, an infrared local image, a light-field local image, a point-cloud local image, a stereoscopic-vision local image, and a laser local image.
After the panoramic image and the one or more local images within its scene range are acquired through the multi-sensor system, the one or more local images can further be fused according to the positions at which they were acquired, to obtain local fusion images in one-to-one correspondence with the local images. Finally, the panoramic image and the fused local images can be fused to obtain a panoramic fusion image.
Fig. 2 shows a schematic diagram of a panoramic image, a local image, and a panoramic fusion image according to an embodiment of the present invention. Specifically, Fig. 2(a) is a panoramic image acquired by a panoramic sensor in an embodiment of the present invention; Fig. 2(b) shows an infrared local image acquired by an infrared sensor; and Fig. 2(c) shows a panoramic fusion image obtained by fusing the panoramic image in Fig. 2(a) and the infrared local image in Fig. 2(b). As shown in Fig. 2(c), the infrared local image in Fig. 2(b) may be subjected to fusion processing, and the processed local fusion image may be fused into the central region of the panoramic image in Fig. 2(a).
In a specific operation process, the acquisition unit 710 may acquire an independent panoramic image and one or more independent local images directly; alternatively, it may obtain a panoramic fusion image such as that shown in Fig. 2(c) at an initial stage and derive the separated panoramic image and local images from the panoramic fusion image for use in the subsequent steps.
In one example, when the image obtained at the initial stage is a panoramic fusion image, one or more local fusion images may be extracted from the panoramic fusion image based on their positions within it; the local fusion images may then be processed to obtain the local images and/or the panoramic image. In practice, the position of a local fusion image in the panoramic fusion image may optionally be obtained from position information carried with the panoramic fusion image; for example, the position information may be read from the metadata of the panoramic fusion image, or from a related description in its picture file. Once the position of the local fusion image (or local image) in the panoramic fusion image is known, the local fusion image can be separated from the panoramic fusion image. The local fusion image thus acquired is generally one in which the local image has been distorted to fit the longitude and latitude coordinate system of the panoramic image. Therefore, optionally, to obtain an undistorted local image, in one example the one or more local fusion images may be coordinate-transformed to remove the distortion. In another example, the local fusion image may first be coordinate-transformed to remove the distortion; one or more image-related features may then be acquired for the coordinate-transformed local fusion image (for example, a search may start from the center of the coordinate-transformed local fusion image to acquire pixel-level features such as image resolution and/or focus information); finally, the blurred region of the coordinate-transformed local fusion image may be removed according to the acquired image features to obtain the required local image.
Fig. 3 shows a schematic diagram of obtaining a local image from a panoramic fusion image. Fig. 3(a) shows a panoramic fusion image according to one embodiment of the invention, with the position of the local fusion image outlined by a dashed box. Fig. 3(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 3(a) to remove distortion. Further, feature extraction may be performed on the image of Fig. 3(b) to acquire features such as image resolution and/or focus information, and the blurred region may be removed from the coordinate-transformed local fusion image to obtain the local image. The blurred region of the local fusion image may be, for example, a transition region of certain lines or colors in the image, or an edge region of the image. Fig. 3(c) shows the positions of the sharp region 1 and the blurred region 2 in the coordinate-transformed local fusion image of Fig. 3(b): region 1, inside the central box, is the sharp region, and region 2, between the two nested boxes, is the blurred region. In one example, the blurred region 2 may be processed and removed using the extracted image features so that the resulting local image (not shown) is sufficiently sharp.
Optionally, when the image acquired at the initial stage is a panoramic fusion image, the panoramic image may also be recovered from the panoramic fusion image and the local fusion image extracted from it by position. Specifically, the one or more local fusion images may first be coordinate-transformed to remove distortion; one or more image-related features may then be acquired for the coordinate-transformed local fusion image, for example pixel-level features such as image resolution and/or focus information acquired from the surroundings of the coordinate-transformed local fusion image. Next, since the sensors used to initially acquire the panoramic image and the local image differ, the images may have different characteristics such as image resolution and focus information, and the resulting panoramic fusion image and local fusion image may likewise differ; the coordinate-transformed local fusion image may therefore be resampled using the acquired image-related features, so that it takes on the same characteristics (image resolution, focus information, etc.) as the panoramic fusion image surrounding it. Finally, the resampled local fusion image is subjected to a coordinate inverse transformation (i.e., the reverse of the coordinate transformation above): it is re-distorted and projected back into the coordinate system of the panoramic fusion image for fusion, so that the region of the original local fusion image in the panoramic fusion image is replaced by the processed local fusion image having the same image characteristics as the panoramic fusion image, yielding the panoramic image. In one example of the invention, the resampling operates differently for different local sensors in a multi-sensor imaging system. For example, a high-definition sensor may require pixel resampling of high-definition data, an infrared sensor may require visible-light acquisition processing of infrared data, a light-field sensor may require averaging and adjustment of the resolution and focus information of light-field data, a point-cloud sensor may require projection and pixel supplementation of point-cloud data, and a stereoscopic-vision sensor may require depth-information removal and resolution adjustment of stereoscopic-vision data. The above resampling processes and methods are only examples; in practical applications any relevant resampling method may be adopted, and no limitation is intended here.
Fig. 4 shows a schematic diagram of acquiring a panoramic image from a panoramic fusion image. Fig. 4(a) shows a panoramic fusion image according to an embodiment of the present invention, in which the position of the local fusion image is outlined by a dashed box. Fig. 4(b) shows the coordinate-transformed local fusion image obtained by coordinate-transforming the local fusion image in Fig. 4(a) to remove distortion. Further, feature extraction may be performed around the image of Fig. 4(b) to acquire image features such as image resolution and/or focus information. Fig. 4(c) shows the coordinate-transformed local fusion image of Fig. 4(b) being resampled with the extracted image features. Fig. 4(d) shows the resampled local fusion image of Fig. 4(c) being inverse coordinate-transformed and projected back into the panoramic fusion image of Fig. 4(a), resulting in a panoramic image with consistent image characteristics (e.g., image resolution and focus information).
The semantic dividing unit 720 obtains panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to a semantic division area in the panoramic image.
The semantic division unit 720 may process the panoramic image to obtain one or more panoramic semantic information of the panoramic image and corresponding semantic division regions thereof. In one example, the panoramic semantic information and the range of the corresponding semantic division area can be obtained by using image recognition and other technologies. Optionally, information related to the panoramic image, such as background information and/or scene description information of the image, may be further acquired according to the acquired panoramic semantic information.
Fig. 5 shows a schematic diagram of a panoramic image acquired according to an embodiment of the present invention. In the example shown in Fig. 5, the panoramic semantic information of the panoramic image may include labels such as sky, ground, and people obtained by image recognition, each corresponding to a different region range identified in the image; the background information and/or scene description information derived from it may include, for example: outdoors, crowded people, face-to-face conversation, and the like.
The focus area obtaining unit 730 may determine one or more focus areas in the one or more local images according to the panoramic semantic information and its corresponding semantic division areas, and acquire detail semantic information from the determined focus areas. Optionally, one or more pieces of focus semantic information related to the one or more local images may be selected from the panoramic semantic information; according to the selected focus semantic information and its corresponding semantic division areas, corresponding regions of interest may be obtained in the one or more local images, through a neural network, image information processing, or the like, to serve as the one or more focus areas, and the corresponding detail semantic information may be acquired from the determined focus areas. Of course, this specific manner of operation is only an example; in practical applications the determined focus area need not lie entirely within the local image. For example, the focus area may only partially overlap the local image region, or may not overlap it at all; correspondingly, the focus semantic information selected for the focus area and the subsequently acquired detail semantic information may be only partially related to the local image, or not related to it at all. In that case, the focus area obtaining unit 730 may determine one or more focus areas solely from the panoramic semantic information and its corresponding semantic division areas, and acquire the detail semantic information from the determined focus areas.
Fig. 6 shows the schematic positions of a local image and a focus area acquired in a panoramic image according to an embodiment of the present invention. In the example shown in Fig. 6, the local fusion image obtained by fusing the local image acquired by the local sensor lies within the dashed box in the panoramic image, and the infrared image at the lower part of Fig. 6 is an enlarged view of the focus area determined in the local image converted from the local fusion image. From the enlarged infrared image of the focus area in Fig. 6, the corresponding detail semantic information can be obtained, for example: the person's mood is happy.
The description unit 740 obtains image description information using the panorama semantic information and the detail semantic information. The description unit 740 may fuse the panoramic semantic information and the detail semantic information, and obtain the image description information by combining weights. Optionally, the panoramic semantic information E for describing the scene may be fused with the detail semantic information S for describing details, and the final image description information may be obtained based on different model structures (e.g., weight averaging, bayesian estimation, data fusion neural network, reinforcement learning, etc.).
As described above, the panoramic image and the local images according to embodiments of the present invention may both be still images, or may each be a video frame in a video. When they are video frames, they may be, respectively, a panoramic image in a panoramic video and one or more local images in one or more local videos captured at the same time i. When the panoramic image and the local images are each a video frame captured at the same moment, obtaining the image description information by using the panoramic semantic information and the detail semantic information may include: fusing the panoramic semantic information and the detail semantic information at each moment, and processing the results in time order to obtain image description information that changes over time. That is, the image description information for a video may evolve gradually over time rather than being fixed. Here, the panoramic semantic information in the image description information may be denoted E_i and the detail semantic information S_i, where i is the time. Accordingly, the time series of panoramic semantic information over i may be E_(i-2), E_(i-1), E_i, E_(i+1), …, and the time series of detail semantic information over i may be S_(i-2), S_(i-1), S_i, S_(i+1), …. In one example, the image description information may change over time as: two people talk outdoors, are happy at time i-1, begin to quarrel at time i, and so on.
In one example, the image description information using weight averaging may be represented as:
R = W_si·S_i + W_ei·E_i + W_s(i-1)·S_(i-1) + W_e(i-1)·E_(i-1) + …

where R is the weighted-average image semantic information (i.e., the weighted image description information), W_si is the weight of the detail semantic information, W_ei is the weight of the panoramic semantic information, and i is the time.
In another example, the image description information based on bayesian estimation may be expressed as:
P_r = min L(P(P_si, P_ei))

where P_si is the Bayesian estimate of the detail semantic information, P_ei is the Bayesian estimate of the panoramic semantic information, i is the time, P(P_si, P_ei) is the joint distribution function, and P_r is the minimum-likelihood estimate of the joint distribution P, i.e., the fused image description value.
In another example, the reinforcement learning based image description information may be expressed as:
tuple(S, A, R, P) = tuple((S_si, S_ei), A, (R_si, R_ei), P)

where tuple(·) denotes the reinforcement-learning four-element system; S, A, and R are inputs and P is the output. S is the environment information or state, divided into the detail semantic information S_si and the panoramic semantic information S_ei; A is the behavior or action taken in the state; R is the reward for each action in each state, divided into the reward R_si contributed by the detail semantic information and the reward R_ei contributed by the panoramic semantic information; and P is the image description information in the current state, or the corresponding behavior function.
According to the image processing apparatus described above, panoramic semantic information and detail semantic information can be acquired, respectively, for a panoramic image and for one or more local images within its range, and image description information can be obtained from them. The image description information thus obtained takes into account both the panoramic semantic information describing the scene of the panoramic image and the detail semantic information describing the details of the focus areas in the local images, which improves the accuracy of the image description and allows effective application in fields such as automatic driving and robot interaction.
For example, in the field of robot interaction, the prior art can generally obtain only the panoramic semantic information corresponding to a panoramic image, and therefore can only describe the scene; it cannot perform targeted detail semantic analysis of, and response to, a focus area. With the apparatus of the embodiment of the present invention, not only can panoramic semantic information for scene description be obtained, but detail semantic information for a focus area can also be obtained, and the focus area can be changed as needed; both the scene and different focuses within it can thus be described, which helps a robot communicate and react more accurately and specifically within the scene.
Next, an image processing apparatus according to an embodiment of the present invention is described with reference to fig. 8. Fig. 8 shows a block diagram of an image processing apparatus 800 according to an embodiment of the present invention. As shown in fig. 8, the apparatus 800 may be a computer or a server.
As shown in Fig. 8, the image processing device 800 includes one or more processors 810 and a memory 820; in addition, the image processing device 800 may include a multi-sensor imaging system, an output device (not shown), and the like, which may be interconnected via a bus system and/or another form of connection mechanism. It should be noted that the components and structure of the image processing device 800 shown in Fig. 8 are only exemplary and not limiting, and the image processing device 800 may have other components and structures as necessary.
The processor 810 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may utilize computer program instructions stored in the memory 820 to perform desired functions, which may include: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.
Memory 820 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 810 to implement the functions of the image processing apparatus of the embodiments of the present invention described above and/or other desired functions, and/or may execute an image processing method according to an embodiment of the present invention. Various applications and various data may also be stored in the computer-readable storage medium.
In the following, a computer readable storage medium according to an embodiment of the present invention is described, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the steps of: acquiring a panoramic image and one or more local images within the panoramic image; acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image; determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detailed semantic information according to the determined focus areas; and obtaining image description information by using the panoramic semantic information and the detail semantic information.
Of course, the above-mentioned embodiments are merely examples and not limitations, and those skilled in the art may combine and recombine steps and apparatuses from the separately described embodiments above to achieve the effects of the present invention according to its concepts; such combined embodiments are also included in the present invention and are not necessarily described here individually.
Note that the advantages and effects mentioned in the present invention are merely examples and not limitations, and must not be considered essential to the various embodiments of the present invention. Furthermore, the foregoing detailed description of the invention is provided for the purpose of illustration and understanding only, and is not intended to limit the invention to the precise forms disclosed.
The block diagrams of devices, apparatuses, and systems in the present invention are given only as illustrative examples and are not intended to require or imply that connections, arrangements, or configurations must be made in the manner shown in the block diagrams. As those skilled in the art will appreciate, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and may be used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, "such as but not limited to."
The flowcharts of steps in the present invention, and the above descriptions of the methods, are likewise given only as illustrative examples, and are not intended to require or imply that the steps of the various embodiments must be performed in the order presented. As those skilled in the art will appreciate, the steps in the above embodiments may be performed in any order. Words such as "thereafter," "then," and "next" are not intended to limit the order of the steps; they are used only to guide the reader through the description of the methods. Furthermore, any reference to an element in the singular, for example using the articles "a," "an," or "the," is not to be construed as limiting the element to the singular.
In addition, the steps and devices of the embodiments above are not limited to implementation in any particular embodiment; in fact, some of them may be combined, according to the concepts of the present invention, to conceive new embodiments, and these new embodiments are also included within the scope of the present invention.
The individual operations of the methods described above may be performed by any suitable means capable of performing the corresponding functions. The means may include various hardware and/or software components and/or modules including, but not limited to, a circuit, an Application Specific Integrated Circuit (ASIC), or a processor.
The various illustrative logical blocks, modules, and circuits described may be implemented or performed with a general purpose processor, a Digital Signal Processor (DSP), an ASIC, a Field Programmable Gate Array (FPGA) or other Programmable Logic Device (PLD), discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any commercially available processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of a method or algorithm described in connection with the invention may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may reside in any form of tangible storage medium. Some examples of storage media that may be used include Random Access Memory (RAM), Read Only Memory (ROM), flash memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, and the like. A storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. A software module may be a single instruction, or many instructions, and may be distributed over several different code segments, among different programs, and across multiple storage media.
The inventive methods herein comprise one or more acts for implementing the described methods. The methods and/or acts may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of actions is specified, the order and/or use of specific actions may be modified without departing from the scope of the claims.
The functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as one or more instructions on a tangible computer-readable medium. A storage medium may be any available tangible medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other tangible medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. As used herein, disk and disc include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
Accordingly, a computer program product may perform the operations presented herein. For example, such a computer program product may be a computer-readable tangible medium having instructions stored (and/or encoded) thereon that are executable by one or more processors to perform the operations described herein. The computer program product may include packaged material.
Software or instructions may also be transmitted over a transmission medium. For example, the software may be transmitted from a website, server, or other remote source using a transmission medium such as coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, or microwave.
Further, modules and/or other suitable means for carrying out the methods and techniques described herein may be downloaded and/or otherwise obtained by a user terminal and/or base station as appropriate. For example, such a device may be coupled to a server to facilitate the transfer of means for performing the methods described herein. Alternatively, the various methods described herein may be provided via storage means (e.g., RAM, ROM, or a physical storage medium such as a CD or floppy disk) so that a user terminal and/or base station can obtain the various methods upon coupling or providing the storage means to the device. Moreover, any other suitable technique for providing the methods and techniques described herein to a device may be utilized.
Other examples and implementations are within the scope and spirit of the invention and the following claims. For example, due to the nature of software, the functions described above may be implemented using software executed by a processor, hardware, firmware, hard wiring, or any combination of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations. Also, as used herein, including in the claims, "or" as used in a list of items beginning with "at least one of" indicates a disjunctive list, such that a list of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
Various changes, substitutions, and alterations to the techniques described herein may be made without departing from the technology of the teachings as defined by the appended claims. Moreover, the scope of the claims is not intended to be limited to the particular aspects of the process, machine, manufacture, composition of matter, means, methods, and acts described above. Processes, machines, manufacture, compositions of matter, means, methods, or acts presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding aspects described herein may be utilized. Accordingly, the appended claims include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or acts.
The previous description of the inventive aspects is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the invention. Thus, the present invention is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the invention to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (13)

1. An image processing method comprising:
acquiring a panoramic image and one or more local images within the panoramic image;
acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image;
determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detail semantic information according to the determined focus areas;
and obtaining image description information by using the panoramic semantic information and the detail semantic information.
2. The method of claim 1, wherein the acquiring the panoramic image and the one or more local images within the panoramic image comprises:
acquiring one or more local fusion images from a panoramic fusion image based on the positions of the one or more local fusion images in the panoramic fusion image, wherein the panoramic fusion image is obtained by fusing the panoramic image and the one or more local images, and the local images correspond one-to-one to the local fusion images in the panoramic fusion image;
and processing the local fusion image to obtain the panoramic image and/or the local image.
3. The method of claim 2, wherein the processing the locally fused image to obtain the panoramic image and/or the local image comprises:
performing coordinate transformation on the one or more local fusion images to obtain the local images.
4. The method of claim 2, wherein the processing the locally fused image to obtain the panoramic image and/or the local image comprises:
performing coordinate transformation on the one or more local fusion images;
resampling the local fusion images after the coordinate transformation;
and performing inverse coordinate transformation on the resampled local fusion images to obtain the panoramic image (an illustrative sketch of this transformation chain follows the claims).
5. The method of claim 1, wherein the obtaining panoramic semantic information from the panoramic image further comprises:
acquiring background information and/or scene description information of the panoramic image according to the panoramic semantic information of the panoramic image and the corresponding semantic division areas.
6. The method of claim 1, wherein the determining one or more focal regions in the one or more local images according to the panoramic semantic information and its corresponding semantic zoning regions comprises:
selecting, from the panoramic semantic information, one or more pieces of focus semantic information related to the local images, and determining one or more focus areas in the one or more local images according to the selected focus semantic information and the corresponding semantic division areas.
7. The method of claim 1, wherein the deriving image description information using the panorama semantic information and the detail semantic information comprises:
fusing the panoramic semantic information and the detail semantic information to obtain the image description information.
8. The method of claim 1, wherein,
the panoramic image and the local image are each a video frame captured at the same moment in a video.
9. The method of claim 8, wherein, when the panoramic image and the local image are each a video frame captured at the same moment in a video, the obtaining image description information by using the panoramic semantic information and the detail semantic information comprises:
fusing the panoramic semantic information and the detail semantic information at each moment, and processing the fused results in time order to obtain image description information that changes over time (an illustrative sketch of this time-ordered processing follows the claims).
10. The method of any one of claims 1-9, wherein
the panoramic image is acquired by a panoramic sensor in a multi-sensor imaging system; and
the one or more local images are acquired by one or more local sensors in the multi-sensor imaging system.
11. An image processing apparatus comprising:
an acquisition unit that acquires a panoramic image and one or more local images within the panoramic image;
a semantic division unit that acquires panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image;
a focus area acquisition unit that determines one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquires detail semantic information according to the determined focus areas;
and a description unit that obtains image description information by using the panoramic semantic information and the detail semantic information.
12. An image processing apparatus comprising:
a processor;
and a memory having computer program instructions stored therein,
wherein the computer program instructions, when executed by the processor, cause the processor to perform the steps of:
acquiring a panoramic image and one or more local images within the panoramic image;
acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image;
determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detail semantic information according to the determined focus areas;
and obtaining image description information by using the panoramic semantic information and the detail semantic information.
13. A computer readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the steps of:
acquiring a panoramic image and one or more local images within the panoramic image;
acquiring panoramic semantic information according to the panoramic image, wherein the panoramic semantic information corresponds to semantic division areas in the panoramic image;
determining one or more focus areas in the one or more local images according to the panoramic semantic information and the corresponding semantic division areas thereof, and acquiring detail semantic information according to the determined focus areas;
and obtaining image description information by using the panoramic semantic information and the detail semantic information.
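The following is a minimal sketch of the processing chain recited in claims 2-4, as referenced above. It is an illustration under stated assumptions, not the claimed implementation: it assumes a position-based crop and nearest-neighbour resampling, and the forward and inverse coordinate mappings are hypothetical injected callables, since the claims do not fix a particular projection.

```python
import numpy as np
from typing import Callable

def crop_local_fusion(pano_fusion: np.ndarray,
                      x: int, y: int, w: int, h: int) -> np.ndarray:
    # Claim 2: extract a local fusion image from the panoramic fusion
    # image based on its known position within that image.
    return pano_fusion[y:y + h, x:x + w]

def to_local_image(local_fusion: np.ndarray,
                   forward_map: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    # Claim 3: a single coordinate transformation recovers the local image.
    return forward_map(local_fusion)

def to_panoramic_image(local_fusion: np.ndarray,
                       forward_map: Callable[[np.ndarray], np.ndarray],
                       inverse_map: Callable[[np.ndarray], np.ndarray],
                       scale: float = 0.5) -> np.ndarray:
    # Claim 4: transform, resample onto the panorama's coarser sampling
    # grid (nearest neighbour here, purely for illustration), then invert.
    transformed = forward_map(local_fusion)
    h, w = transformed.shape[:2]
    rows = np.minimum((np.arange(int(h * scale)) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(int(w * scale)) / scale).astype(int), w - 1)
    resampled = transformed[np.ix_(rows, cols)]
    return inverse_map(resampled)
```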
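Likewise, a sketch of the video case referenced in claim 9: the panoramic and detail semantics are fused at each moment, and the fused results are then processed in time order. The temporal aggregator is an assumption here (the claim leaves it open), so it too is injected as a hypothetical callable.

```python
from typing import Callable, Iterable, List, Tuple

def describe_video(
    moments: Iterable[Tuple[float, object, object]],  # (timestamp, panoramic semantics, detail semantics)
    fuse: Callable[[object, object], object],         # per-moment fusion, as in claim 7
    temporal: Callable[[List[object]], List[str]],    # time-ordered processing step
) -> List[str]:
    # Fuse panoramic and detail semantics separately at each moment...
    fused = [(t, fuse(pano, detail)) for t, pano, detail in moments]
    # ...then process the fused results in time order to obtain
    # image description information that changes over time.
    fused.sort(key=lambda item: item[0])
    return temporal([f for _, f in fused])
```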
CN201810670236.4A 2018-06-26 2018-06-26 Image processing method, image processing apparatus, and computer-readable storage medium Pending CN110648299A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810670236.4A CN110648299A (en) 2018-06-26 2018-06-26 Image processing method, image processing apparatus, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810670236.4A CN110648299A (en) 2018-06-26 2018-06-26 Image processing method, image processing apparatus, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN110648299A true CN110648299A (en) 2020-01-03

Family

ID=68988373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810670236.4A Pending CN110648299A (en) 2018-06-26 2018-06-26 Image processing method, image processing apparatus, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN110648299A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160055237A1 (en) * 2014-08-20 2016-02-25 Mitsubishi Electric Research Laboratories, Inc. Method for Semantically Labeling an Image of a Scene using Recursive Context Propagation
CN106204522A (en) * 2015-05-28 2016-12-07 奥多比公司 The combined depth of single image is estimated and semantic tagger
CN105740402A (en) * 2016-01-28 2016-07-06 百度在线网络技术(北京)有限公司 Method and device for acquiring semantic labels of digital images

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340515A (en) * 2020-03-02 2020-06-26 北京京东振世信息技术有限公司 Characteristic information generation and article tracing method and device
CN111340515B (en) * 2020-03-02 2023-09-26 北京京东振世信息技术有限公司 Feature information generation and article tracing method and device
CN111913343A (en) * 2020-07-27 2020-11-10 微幻科技(北京)有限公司 Panoramic image display method and device
CN111913343B (en) * 2020-07-27 2022-05-20 微幻科技(北京)有限公司 Panoramic image display method and device
WO2022105027A1 (en) * 2020-11-19 2022-05-27 安徽鸿程光电有限公司 Image recognition method and system, electronic device, and storage medium

Similar Documents

Publication Publication Date Title
JP7262659B2 (en) Target object matching method and device, electronic device and storage medium
KR102480245B1 (en) Automated generation of panning shots
US10015469B2 (en) Image blur based on 3D depth information
CN109074632B (en) Image distortion transformation method and apparatus
EP3704508B1 (en) Aperture supervision for single-view depth prediction
JP2015522959A (en) Systems, methods, and media for providing interactive refocusing in images
CN110648299A (en) Image processing method, image processing apparatus, and computer-readable storage medium
US11508038B2 (en) Image processing method, storage medium, image processing apparatus, learned model manufacturing method, and image processing system
WO2019037038A1 (en) Image processing method and device, and server
US20170171456A1 (en) Stereo Autofocus
CN109005334A (en) A kind of imaging method, device, terminal and storage medium
CN110503619B (en) Image processing method, device and readable storage medium
CN112351196B (en) Image definition determining method, image focusing method and device
CN112333379A (en) Image focusing method and device and image acquisition equipment
KR20190120106A (en) Method for determining representative image of video, and electronic apparatus for processing the method
GB2537886A (en) An image acquisition technique
JP6395429B2 (en) Image processing apparatus, control method thereof, and storage medium
CN115314635A (en) Model training method and device for determining defocus amount
CN112203023B (en) Billion pixel video generation method and device, equipment and medium
CN110581977A (en) video image output method and device and three-eye camera
US9232132B1 (en) Light field image processing
CN113163112A (en) Fusion focus control method and system
CN112950698A (en) Depth estimation method, device, medium, and apparatus based on binocular defocused image
CN113055584B (en) Focusing method based on fuzzy degree, lens controller and camera module
JP2004257934A (en) Three-dimensional shape measuring method, three-dimensional shape measuring instrument, processing program, and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20200103