CN114860979A - Image retrieval method and system based on region of interest extraction - Google Patents

Image retrieval method and system based on region of interest extraction

Info

Publication number
CN114860979A
CN114860979A (application CN202210575033.3A)
Authority
CN
China
Prior art keywords
region
original
visual
retrieval
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210575033.3A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Naisilai Technology Co ltd
Original Assignee
Changzhou Naisilai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Naisilai Technology Co ltd filed Critical Changzhou Naisilai Technology Co ltd
Priority to CN202210575033.3A priority Critical patent/CN114860979A/en
Publication of CN114860979A publication Critical patent/CN114860979A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/60Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image retrieval method and system based on region-of-interest extraction, belonging to the technical field of image processing. The method comprises the following steps: constructing a visual fixation model, and extracting a region of interest from an original picture/original video based on the visual fixation model; extracting feature values from the region of interest, and storing the feature values in association with the corresponding original pictures/original videos according to a preset relation to obtain a retrieval database; establishing a retrieval interface and inputting a retrieval instruction; and searching out and storing the original pictures/original videos meeting the requirements from the retrieval database based on the retrieval instruction to obtain a picture/video library. The region-of-interest detection based on the visual attention model incorporates a human visual attention mechanism and better matches the human visual perception process.

Description

Image retrieval method and system based on region of interest extraction
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an image retrieval method and system based on region of interest extraction.
Background
In recent years, with the rapid development of network and multimedia technologies, video and image retrieval technologies have attracted increasing attention. The conventional approach retrieves images by text, annotations and the like; its flow is shown in fig. 4. Image and video data are labeled and annotated manually, the labels are stored in association with the data, and videos and images are then retrieved by searching the label keywords.
However, this approach has several drawbacks: first, as the number of image and video files grows sharply, manual annotation entails an enormous workload; second, each annotator understands an image or video differently, so labels are easily inaccurate and retrieval errors follow; third, it cannot satisfy personalized user needs, such as retrieval based on the low-level visual features of the image.
Disclosure of Invention
To solve the problems of the image retrieval techniques described in the background, namely low efficiency, inaccuracy and inability to meet personalized retrieval requirements, the invention provides an image retrieval technique that extracts regions of interest based on a visual attention model.
The invention adopts the following technical scheme: an image retrieval method based on region of interest extraction at least comprises the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model;
extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; searching out original pictures/original videos meeting the requirements in the retrieval database based on the retrieval instruction and storing them to obtain a picture/video library; the region of interest is extracted as follows:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
In a further embodiment, the step one specifically includes the following steps:
step 101, filtering an original image/original video by using a multi-scale multi-channel filter, extracting visual features, and obtaining a feature map about the visual features; the characteristic diagram at least comprises: a color feature map, a brightness feature map and a direction feature map;
102, selecting a scale according to requirements, defining the center c and the periphery s of each feature map, and respectively calculating the central periphery difference of the scale of the center c and the scale of the periphery s in the corresponding feature map; the result of the central peripheral difference is an attention map corresponding to the visual features;
step 103, normalizing the attention maps to obtain the normalized color attention map $\bar{C}$, the normalized luminance attention map $\bar{I}$ and the normalized direction attention map $\bar{O}$;
step 104, obtaining the saliency map SM by the following formula:

$SM = \frac{1}{3}(\bar{C} + \bar{I} + \bar{O})$
in a further embodiment, the second step specifically includes the following steps:
step 201, obtaining a plurality of saliency degrees in the saliency map, sorting them from strong to weak, and selecting the points corresponding to the top 10 saliency degrees as candidate points;
step 202, comparing the saliency corresponding to the candidate points with a first threshold t in sequence, wherein the candidate points with the saliency greater than the first threshold t are the visual attention focus.
In a further embodiment, the third step specifically includes the following steps:
calculating the Euclidean distance between the visual attention focuses; if the Euclidean distance is smaller than a second threshold d, merging the corresponding two visual attention focuses by the following formula to obtain a merged visual attention focus (X, Y):

$X = \frac{v_{s,i} x_i + v_{s,j} x_j}{v_{s,i} + v_{s,j}}, \quad Y = \frac{v_{s,i} y_i + v_{s,j} y_j}{v_{s,i} + v_{s,j}}$

where $(x_i, y_i)$ and $(x_j, y_j)$ respectively denote the coordinates of the $i$-th and $j$-th visual attention focuses, with $i \neq j$; $v_{s,i}$ and $v_{s,j}$ respectively denote the gray values of the $i$-th and $j$-th visual attention focuses in the saliency map.
In a further embodiment, step 101 is further expressed as:
first, filtering the image with a Gaussian weight matrix and down-sampling to obtain an n-level Gaussian pyramid, then extracting color, luminance and direction features at each scale $\sigma$ of the pyramid to form the corresponding RG($\sigma$), BY($\sigma$), I($\sigma$) and O($\sigma$) feature pyramids, where $\sigma \in [0, n-1]$;
wherein the color features in the color feature map are: red $R = r - (g + b)/2$, green $G = g - (r + b)/2$, blue $B = b - (r + g)/2$ and yellow $Y = (r + g)/2 - |r - g|/2 - b$;
the luminance feature in the luminance feature map is expressed as $I = (r + g + b)/3$;
the direction features are the four orientation features obtained by Gabor wavelet transforms of the luminance feature in the four directions $\theta \in \{0°, 45°, 90°, 135°\}$, where r, g and b are the red, green and blue components of the original image.
In a further embodiment, the center-surround difference between the center scale c and the surround scale s in the corresponding feature map is calculated as:

$RG(c, s) = |(R(c) - G(c)) \ominus (G(s) - R(s))|$
$BY(c, s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))|$
$I(c, s) = |I(c) \ominus I(s)|$
$O(c, s, \theta) = |O(c, \theta) \ominus O(s, \theta)|$

where RG(c, s) denotes the center-surround difference of the red-green color feature map, BY(c, s) that of the blue-yellow color feature map, I(c, s) that of the luminance feature map, and O(c, s, θ) that of the direction feature map; R(c) and R(s), G(c) and G(s), B(c) and B(s), and Y(c) and Y(s) denote the center and surround of the red, green, blue and yellow feature maps respectively; I(c) and I(s) denote the center and surround of the luminance feature map; O(c, θ) and O(s, θ) denote the center and surround of the direction feature map; and $\ominus$ denotes the across-scale difference.
In a further embodiment, the saliency degree is obtained as follows:

$v_s(x, y) = \left| \frac{1}{2\pi\sigma_c^2} e^{-\frac{x^2 + y^2}{2\sigma_c^2}} - \frac{1}{2\pi\sigma_s^2} e^{-\frac{x^2 + y^2}{2\sigma_s^2}} \right|$

where $\sigma_c$ and $\sigma_s$ denote the scale factors of the center c and the surround s respectively, and (x, y) are the coordinates of the pixel in the saliency map.
In a further embodiment, the specific process of searching out the original image/original video meeting the requirement in the retrieval database is as follows:
comparing the similarity between the feature values in the retrieval database and the input retrieval instruction, sorting the similarities from high to low, screening out a preset number of top-ranked feature values, and matching the corresponding original images/original videos based on those feature values.
The image retrieval system based on region-of-interest extraction for implementing the image retrieval method as described above includes:
a first module configured to construct a visual gaze model, and extract an area of interest in an original picture/original video based on the visual gaze model;
the second module is set to extract a characteristic value in the region of interest, and the characteristic value and the corresponding original picture/original video are subjected to associative storage according to a preset relation to obtain a retrieval database;
a third module, configured to establish a search interface and input a search instruction; and searching out and storing original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction to obtain a picture/video library.
In a further embodiment, the first module further comprises a fourth module connected thereto, the fourth module being configured to: processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model; obtaining at least one visual focus of attention in the saliency map using a competition mechanism; and taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
The invention has the following beneficial effects: the region-of-interest detection based on the visual attention model adds a human visual attention mechanism and better matches the human visual perception process. The method obtains a saliency map from the physiological characteristics of the human visual system, obtains attention focuses through a winner-take-all competition mechanism, uses the attention focuses as seed points for region-growing segmentation, and then obtains the region of interest by region growing. This solves the problem that regions of interest extracted by traditional methods are divorced from the user's subjective understanding.
Drawings
Fig. 1 is a flowchart of an image retrieval method based on region of interest extraction according to the present invention.
Fig. 2 is a flow chart of region of interest extraction based on visual fixation model in the present invention.
Fig. 3 is a flowchart of acquiring a saliency map in the present invention.
Fig. 4 is a flow chart of a prior art image retrieval technique based on text annotation.
FIG. 5 is a flow diagram of a prior art content-based image retrieval technique.
Fig. 6 is a flowchart of a prior art region-of-interest based image retrieval technique.
Detailed Description
Content-Based Image Retrieval (CBIR) is a current research focus. CBIR retrieves images by extracting low-level visual features such as color and shape; it is strongly objective and overcomes the shortcomings of conventional image retrieval. The flow is shown in fig. 5. However, it is difficult to derive the high-level semantic features of an image from its low-level visual features, the so-called "semantic gap" problem: the information a user obtains from visual data is inconsistent with the user's own understanding of it. Acquiring the high-level semantics of the image is therefore the key to bridging the semantic gap.
Detecting the region of interest (ROI) of an image is an effective way to obtain its high-level semantics. In recent years, with the development of interest detection technology, many ROI detection methods have been proposed; a typical flow is shown in fig. 6. ROI detection based on human-computer interaction requires the user to participate, so it captures the user's intention accurately, but the interaction process is relatively complex. Fully automatic detection based on visual features, by contrast, is divorced from the user's subjective understanding of the image and, on data with a prominent image background, easily produces the opposite of the intended result.
Therefore, to solve the above technical problems, this embodiment provides an image retrieval method based on region-of-interest extraction. To address the problem that region-of-interest extraction in conventional methods is divorced from the user's subjective understanding, the region-of-interest detection based on the visual attention model adds a human visual attention mechanism, which better matches the human visual perception process. As shown in fig. 1, the method comprises the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model; in the present embodiment, the present invention is applicable to both the analysis search of images and the analysis search of videos.
Extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; searching out original pictures/videos meeting the requirements from the retrieval database based on the retrieval instruction, and storing them to obtain a picture/video library. Further: the similarity between each feature value in the retrieval database and the input retrieval instruction is compared, the similarities are sorted from high to low, a preset number of top-ranked feature values are screened out, and the corresponding original images/original videos are matched based on those feature values.
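As an illustration of this similarity-ranking step, the following sketch assumes the retrieval database is a list of (feature vector, media path) pairs and uses cosine similarity as the comparison measure; both choices are assumptions for illustration, since the embodiment only specifies a similarity comparison sorted from high to low.

```python
# Hypothetical sketch of the similarity-ranking retrieval step.
# Assumes: database = [(feature_vector, media_path), ...]; cosine similarity.
import numpy as np

def retrieve(query_feature, database, top_k=10):
    """Return the top_k (similarity, media_path) pairs, highest first."""
    q = np.asarray(query_feature, dtype=np.float64)
    q = q / (np.linalg.norm(q) + 1e-12)
    scored = []
    for feature, media_path in database:
        f = np.asarray(feature, dtype=np.float64)
        f = f / (np.linalg.norm(f) + 1e-12)
        scored.append((float(q @ f), media_path))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # high to low
    return scored[:top_k]  # the preset number of top-ranked matches
```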
In a further embodiment, the process of extracting the region of interest is shown in fig. 2, and includes:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
The method effectively overcomes the defect that traditional region-growing segmentation requires manually selected seed points, and at the same time alleviates the inaccurate segmentation and overly small regions of interest that result from using a visual attention mechanism alone. The visual fixation model samples the image non-uniformly using the human visual attention mechanism, computes center-surround differences to obtain the feature maps of the image, and fuses them into a saliency map.
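A minimal sketch of the seed-based region growing referred to above, assuming a single-channel image and a fixed intensity tolerance as the homogeneity criterion; the embodiment does not fix this criterion, so the tolerance test is an illustrative choice:

```python
# Sketch of region growing from a visual-attention-focus seed point.
# The |I(p) - I(seed)| <= tol homogeneity test is an assumed criterion.
from collections import deque
import numpy as np

def region_grow(image, seed, tol=15.0):
    """Grow a region from seed = (x, y); return a boolean mask."""
    h, w = image.shape
    x0, y0 = seed
    mask = np.zeros((h, w), dtype=bool)
    mask[y0, x0] = True
    seed_val = float(image[y0, x0])
    queue = deque([(x0, y0)])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connectivity
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and not mask[ny, nx] \
                    and abs(float(image[ny, nx]) - seed_val) <= tol:
                mask[ny, nx] = True
                queue.append((nx, ny))
    return mask
```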
Specifically, step one comprises the following steps:
Step 101, filtering the original image/original video with a multi-scale, multi-channel filter, extracting visual features, and obtaining feature maps of those visual features; the feature maps at least comprise a color feature map, a luminance feature map and a direction feature map. Specifically, the image is first filtered with a Gaussian weight matrix and down-sampled to obtain an n-level Gaussian pyramid, and color, luminance and direction features are extracted at each scale $\sigma$ of the pyramid to form the corresponding RG($\sigma$), BY($\sigma$), I($\sigma$) and O($\sigma$) feature pyramids, where $\sigma \in [0, n-1]$;
wherein the color features in the color feature map are: red $R = r - (g + b)/2$, green $G = g - (r + b)/2$, blue $B = b - (r + g)/2$ and yellow $Y = (r + g)/2 - |r - g|/2 - b$;
the luminance feature in the luminance feature map is expressed as $I = (r + g + b)/3$;
the direction features are the four orientation features obtained by Gabor wavelet transforms of the luminance feature in the four directions $\theta \in \{0°, 45°, 90°, 135°\}$, where r, g and b are the red, green and blue components of the original image.
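The following sketch illustrates step 101 with OpenCV; the pyramid depth, Gabor kernel size and Gabor parameters are illustrative assumptions, not values taken from the embodiment:

```python
# Sketch of step 101: Gaussian pyramid plus color/luminance/orientation features.
import cv2
import numpy as np

def feature_pyramids(bgr_image, n_levels=5):
    pyramids = {"RG": [], "BY": [], "I": [], "O": []}
    level = bgr_image.astype(np.float64)
    for _ in range(n_levels):
        b, g, r = cv2.split(level)
        i = (r + g + b) / 3.0                        # I = (r + g + b)/3
        R = r - (g + b) / 2.0                        # broadly tuned red
        G = g - (r + b) / 2.0                        # broadly tuned green
        B = b - (r + g) / 2.0                        # broadly tuned blue
        Y = (r + g) / 2.0 - np.abs(r - g) / 2.0 - b  # broadly tuned yellow
        pyramids["RG"].append(R - G)
        pyramids["BY"].append(B - Y)
        pyramids["I"].append(i)
        # four Gabor responses of the luminance channel (assumed parameters:
        # 9x9 kernel, sigma=2.0, wavelength=5.0, aspect ratio=1.0)
        orients = []
        for theta_deg in (0, 45, 90, 135):
            kern = cv2.getGaborKernel((9, 9), 2.0, np.deg2rad(theta_deg), 5.0, 1.0)
            orients.append(cv2.filter2D(i, cv2.CV_64F, kern))
        pyramids["O"].append(orients)
        level = cv2.pyrDown(level)  # next, coarser scale of the pyramid
    return pyramids
```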
Step 102, selecting scales as required, defining the center c and the surround s of each feature map, and calculating the center-surround difference between the center scale c and the surround scale s in the corresponding feature map; the result of the center-surround difference is the attention map of the corresponding visual feature. The center-surround difference is calculated as:

$RG(c, s) = |(R(c) - G(c)) \ominus (G(s) - R(s))|$
$BY(c, s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))|$
$I(c, s) = |I(c) \ominus I(s)|$
$O(c, s, \theta) = |O(c, \theta) \ominus O(s, \theta)|$

where RG(c, s) denotes the center-surround difference of the red-green color feature map, BY(c, s) that of the blue-yellow color feature map, I(c, s) that of the luminance feature map, and O(c, s, θ) that of the direction feature map; R(c) and R(s), G(c) and G(s), B(c) and B(s), and Y(c) and Y(s) denote the center and surround of the red, green, blue and yellow feature maps respectively; I(c) and I(s) denote the center and surround of the luminance feature map; O(c, θ) and O(s, θ) denote the center and surround of the direction feature map; and $\ominus$ denotes the across-scale difference.
It should be noted that, since the feature maps have different sizes at different scales, the feature map at the large scale s needs to be interpolated and enlarged to obtain the same size as the feature map at the small scale c when performing the difference.
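A short sketch of the across-scale difference just described: the surround map (coarser scale s) is upsampled to the size of the center map (finer scale c) before the point-wise absolute difference is taken; bilinear interpolation is an assumed choice.

```python
# Sketch of the center-surround (across-scale) difference operator.
import cv2
import numpy as np

def center_surround(center_map, surround_map):
    h, w = center_map.shape
    surround_up = cv2.resize(surround_map, (w, h),
                             interpolation=cv2.INTER_LINEAR)  # enlarge s to c
    return np.abs(center_map - surround_up)
```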
Step 103, normalizing the attention maps to obtain the normalized color attention map $\bar{C}$, the normalized luminance attention map $\bar{I}$ and the normalized direction attention map $\bar{O}$.
Step 104, obtaining the saliency map SM by the following formula:

$SM = \frac{1}{3}(\bar{C} + \bar{I} + \bar{O})$
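A sketch of steps 103 and 104 under a simplifying assumption: the normalization below rescales each attention map to [0, 1], a stand-in for the full normalization operator (which in Itti-style models also promotes maps with few strong peaks); the three channels are then averaged into SM:

```python
# Sketch of steps 103-104: normalize the three attention maps, average into SM.
import numpy as np

def normalize(attention_map):
    lo, hi = attention_map.min(), attention_map.max()
    return (attention_map - lo) / (hi - lo + 1e-12)  # rescale to [0, 1]

def saliency_map(color_map, luminance_map, direction_map):
    return (normalize(color_map)
            + normalize(luminance_map)
            + normalize(direction_map)) / 3.0
```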
Because the attention focus is selective and transferable, selection and shifting of the attention focus are realized through the network competition mechanism of a winner-take-all (WTA) network: all neurons except the most active one are suppressed, so that the focus of attention is directed to the most active location. While the current focus is being found in the saliency map, that location is temporarily inhibited, so that the next most salient region becomes the most active winner when the WTA network moves on to the next focus. Human fixation thus shifts from a strong attention focus to progressively weaker ones, a process known as attention-point shifting. For further screening of attention focuses, a weighted-Euclidean-distance method has been proposed in the prior art, but it is only suitable for single-object images, not for images containing multiple objects.
Therefore, this embodiment adopts a method of comparing the saliency degree of each attention focus with a threshold t, specifically: step 201, obtaining a plurality of saliency degrees in the saliency map, sorting them from strong to weak (and numbering them in that order), and selecting the points corresponding to the top 10 saliency degrees as candidate points. In this embodiment, the saliency degree is obtained as follows:
$v_s(x, y) = \left| \frac{1}{2\pi\sigma_c^2} e^{-\frac{x^2 + y^2}{2\sigma_c^2}} - \frac{1}{2\pi\sigma_s^2} e^{-\frac{x^2 + y^2}{2\sigma_s^2}} \right|$

where $\sigma_c$ and $\sigma_s$ denote the scale factors of the center c and the surround s respectively, and (x, y) are the coordinates of the pixel in the saliency map.
Step 202, comparing the saliency degree of each candidate point with the first threshold t in sequence; the candidate points whose saliency degree is greater than the first threshold t are taken as visual attention focuses.
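The sketch below is one reading of steps 201 and 202: the ten strongest responses in the saliency map are taken as candidate points, and those above the first threshold t become visual attention focuses; the small suppression window, which keeps candidates from piling up on a single peak, is an assumed detail:

```python
# Sketch of steps 201-202: top-10 candidates, then threshold-t screening.
import numpy as np

def attention_focuses(sm, t, n_candidates=10, suppress=11):
    sm = sm.astype(np.float64).copy()
    half = suppress // 2
    focuses = []
    for _ in range(n_candidates):
        y, x = np.unravel_index(np.argmax(sm), sm.shape)
        if sm[y, x] > t:                 # step 202: keep only if above t
            focuses.append((x, y))
        y0, y1 = max(0, y - half), min(sm.shape[0], y + half + 1)
        x0, x1 = max(0, x - half), min(sm.shape[1], x + half + 1)
        sm[y0:y1, x0:x1] = -np.inf       # suppress before the next pick
    return focuses
```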
In order to further improve the accuracy of using attention focuses as seed points for region-growing segmentation, nearby attention focuses must be merged, specifically: calculating the Euclidean distance between the visual attention focuses; if the Euclidean distance is smaller than a second threshold d, the corresponding two visual attention focuses are merged by the following formula to obtain a merged visual attention focus (X, Y):

$X = \frac{v_{s,i} x_i + v_{s,j} x_j}{v_{s,i} + v_{s,j}}, \quad Y = \frac{v_{s,i} y_i + v_{s,j} y_j}{v_{s,i} + v_{s,j}}$

where $(x_i, y_i)$ and $(x_j, y_j)$ respectively denote the coordinates of the $i$-th and $j$-th visual attention focuses, with $i \neq j$; $v_{s,i}$ and $v_{s,j}$ respectively denote the gray values of the $i$-th and $j$-th visual attention focuses in the saliency map.
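A sketch of the merging rule, using the gray-value-weighted combination above; repeating the pass until no pair remains within d is an assumed detail, so that chains of nearby focuses collapse to a single point:

```python
# Sketch of focus merging: fuse focuses closer than threshold d.
import numpy as np

def merge_focuses(focuses, sm, d):
    focuses = list(focuses)
    merged = True
    while merged:                          # repeat until no pair is within d
        merged = False
        for i in range(len(focuses)):
            for j in range(i + 1, len(focuses)):
                (xi, yi), (xj, yj) = focuses[i], focuses[j]
                if np.hypot(xi - xj, yi - yj) < d:
                    vi = float(sm[yi, xi]) + 1e-12   # gray values in SM
                    vj = float(sm[yj, xj]) + 1e-12
                    X = (vi * xi + vj * xj) / (vi + vj)
                    Y = (vi * yi + vj * yj) / (vi + vj)
                    focuses[i] = (int(round(X)), int(round(Y)))
                    focuses.pop(j)
                    merged = True
                    break
            if merged:
                break
    return focuses
```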
In another embodiment, an image retrieval system based on region of interest extraction for implementing the above method is disclosed, comprising:
a first module configured to construct a visual gaze model, and extract an area of interest in an original picture/original video based on the visual gaze model;
the second module is set to extract a characteristic value in the region of interest, and the characteristic value and the corresponding original picture/original video are subjected to associative storage according to a preset relation to obtain a retrieval database;
a third module, configured to establish a search interface and input a search instruction; and searching out and storing original pictures/original videos meeting the requirements in a retrieval database based on the retrieval instruction to obtain a picture/video library.
Wherein the first module further comprises a fourth module connected thereto, the fourth module being arranged to: processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model; obtaining at least one visual focus of attention in the saliency map using a competition mechanism; and taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
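A minimal structural sketch of how these modules could be composed; the class name, the extract_roi callable (standing in for the first and fourth modules) and the 64-bin histogram feature are all illustrative assumptions, not the embodiment's definitions:

```python
# Hypothetical composition of the four modules.
import numpy as np

def extract_feature(roi_pixels):
    # assumed feature: a normalized 64-bin gray-level histogram of the ROI
    hist, _ = np.histogram(roi_pixels, bins=64, range=(0, 256), density=True)
    return hist

class RoiRetrievalSystem:
    def __init__(self, extract_roi):
        self.extract_roi = extract_roi    # modules 1 and 4: ROI extraction
        self.database = []                # module 2: associative storage

    def index(self, image, media_path):
        roi = self.extract_roi(image)
        self.database.append((extract_feature(roi), media_path))

    def search(self, query_image, top_k=10):   # module 3: retrieval interface
        q = extract_feature(self.extract_roi(query_image))
        scored = []
        for f, path in self.database:
            sim = float(np.dot(q, f)) / (np.linalg.norm(q) * np.linalg.norm(f) + 1e-12)
            scored.append((sim, path))
        scored.sort(reverse=True)          # similarity from high to low
        return scored[:top_k]
```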

Claims (10)

1. An image retrieval method based on region of interest extraction is characterized by at least comprising the following steps:
constructing a visual fixation model, and extracting an interested area in an original picture/original video based on the visual fixation model;
extracting characteristic values in the region of interest, and performing relevance storage on the characteristic values and corresponding original pictures/original videos according to a preset relation to obtain a retrieval database;
establishing a retrieval interface and inputting a retrieval instruction; searching out original pictures/original videos meeting the requirements in the retrieval database based on the retrieval instruction and storing them to obtain a picture/video library; wherein the region of interest is extracted as follows:
firstly, processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model;
step two, acquiring at least one visual attention focus in the saliency map by utilizing a competition mechanism;
and step three, taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
2. The image retrieval method based on region of interest extraction according to claim 1, wherein the first step specifically comprises the following steps:
step 101, filtering an original image/original video by using a multi-scale multi-channel filter, extracting visual features, and obtaining a feature map about the visual features; the characteristic diagram at least comprises: a color feature map, a brightness feature map and a direction feature map;
102, selecting a scale according to requirements, defining the center c and the periphery s of each feature map, and respectively calculating the central periphery difference of the scale of the center c and the scale of the periphery s in the corresponding feature map; the result of the central peripheral difference is an attention map corresponding to the visual features;
step 103, normalizing the attention maps to obtain the normalized color attention map $\bar{C}$, the normalized luminance attention map $\bar{I}$ and the normalized direction attention map $\bar{O}$;
step 104, obtaining the saliency map SM by the following formula:

$SM = \frac{1}{3}(\bar{C} + \bar{I} + \bar{O})$
3. the image retrieval method based on region of interest extraction according to claim 1, wherein the second step specifically comprises the following steps:
step 201, obtaining a plurality of saliency degrees in the saliency map, sorting them from strong to weak, and selecting the points corresponding to the top 10 saliency degrees as candidate points;
step 202, comparing the saliency corresponding to the candidate points with a first threshold t in sequence, wherein the candidate points with the saliency greater than the first threshold t are the visual attention focus.
4. The image retrieval method based on region of interest extraction according to claim 1, wherein the third step specifically comprises the following steps:
calculating Euclidean distances between the visual attention focuses, and if the Euclidean distances are smaller than a second threshold value d, combining the two corresponding visual attention focuses by adopting the following formula to obtain a combined visual attention focus (X, Y):
$X = \frac{v_{s,i} x_i + v_{s,j} x_j}{v_{s,i} + v_{s,j}}, \quad Y = \frac{v_{s,i} y_i + v_{s,j} y_j}{v_{s,i} + v_{s,j}}$

where $(x_i, y_i)$ and $(x_j, y_j)$ respectively denote the coordinates of the $i$-th and $j$-th visual attention focuses, with $i \neq j$; $v_{s,i}$ and $v_{s,j}$ respectively denote the gray values of the $i$-th and $j$-th visual attention focuses in the saliency map.
5. The image retrieval method based on region of interest extraction according to claim 2, wherein the step 101 is further represented by:
first, filtering the image with a Gaussian weight matrix and down-sampling to obtain an n-level Gaussian pyramid, then extracting color, luminance and direction features at each scale $\sigma$ of the pyramid to form the corresponding RG($\sigma$), BY($\sigma$), I($\sigma$) and O($\sigma$) feature pyramids, where $\sigma \in [0, n-1]$;
wherein the color features in the color feature map are: red $R = r - (g + b)/2$, green $G = g - (r + b)/2$, blue $B = b - (r + g)/2$ and yellow $Y = (r + g)/2 - |r - g|/2 - b$;
the luminance feature in the luminance feature map is expressed as $I = (r + g + b)/3$;
the direction features are the four orientation features obtained by Gabor wavelet transforms of the luminance feature in the four directions $\theta \in \{0°, 45°, 90°, 135°\}$, where r, g and b are the red, green and blue components of the original image.
6. The image retrieval method based on region of interest extraction as claimed in claim 2, wherein the calculation method for calculating the central peripheral difference between the central c scale and the peripheral s scale in the corresponding feature map is as follows:
$RG(c, s) = |(R(c) - G(c)) \ominus (G(s) - R(s))|$
$BY(c, s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))|$
$I(c, s) = |I(c) \ominus I(s)|$
$O(c, s, \theta) = |O(c, \theta) \ominus O(s, \theta)|$

where $\ominus$ denotes the across-scale difference.
7. the image retrieval method based on region of interest extraction as claimed in claim 3, wherein the saliency is obtained as follows:
$v_s(x, y) = \left| \frac{1}{2\pi\sigma_c^2} e^{-\frac{x^2 + y^2}{2\sigma_c^2}} - \frac{1}{2\pi\sigma_s^2} e^{-\frac{x^2 + y^2}{2\sigma_s^2}} \right|$

where $\sigma_c$ and $\sigma_s$ denote the scale factors of the center c and the surround s respectively, and (x, y) are the coordinates of the pixel in the saliency map.
8. The image retrieval method based on region of interest extraction as claimed in claim 1, wherein the specific process of searching out the original image/original video meeting the requirement in the retrieval database is as follows:
comparing the similarity between the feature values in the retrieval database and the input retrieval instruction, sorting the similarities from high to low, screening out a preset number of top-ranked feature values, and matching the corresponding original images/original videos based on those feature values.
9. An image retrieval system based on region-of-interest extraction for implementing the image retrieval method according to any one of claims 1 to 8, comprising:
a first module configured to construct a visual gaze model, and extract an area of interest in an original picture/original video based on the visual gaze model;
the second module is set to extract a characteristic value in the region of interest, and the characteristic value and the corresponding original picture/original video are subjected to associative storage according to a preset relation to obtain a retrieval database;
a third module, configured to establish a search interface and input a search instruction; and searching out original pictures/videos meeting the requirements from a retrieval database based on the retrieval instruction, and storing to obtain a picture/video library.
10. An image retrieval system based on region of interest extraction as claimed in claim 9, wherein the first module further comprises a fourth module connected thereto, the fourth module configured to: processing data of an original picture/original video, and extracting a saliency map by using a visual fixation model; obtaining at least one visual focus of attention in the saliency map using a competition mechanism; and taking the visual attention focus as a seed point for region growth segmentation, and obtaining the region of interest by a region growth method.
CN202210575033.3A 2022-05-24 2022-05-24 Image retrieval method and system based on region of interest extraction Withdrawn CN114860979A (en)

Priority Applications (1)

Application Number: CN202210575033.3A · Priority/Filing Date: 2022-05-24 · Title: Image retrieval method and system based on region of interest extraction

Applications Claiming Priority (1)

Application Number: CN202210575033.3A · Priority/Filing Date: 2022-05-24 · Title: Image retrieval method and system based on region of interest extraction

Publications (1)

Publication Number Publication Date
CN114860979A · 2022-08-05

Family

ID=82638754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210575033.3A Withdrawn CN114860979A (en) 2022-05-24 2022-05-24 Image retrieval method and system based on region of interest extraction

Country Status (1)

Country Link
CN (1) CN114860979A (en)


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WW01: Invention patent application withdrawn after publication (application publication date: 2022-08-05)