CN104636745B - Scale invariant feature extracting method and device, object identifying method and device - Google Patents

Scale invariant feature extracting method and device, object identifying method and device

Info

Publication number
CN104636745B
CN104636745B (application CN201310553060.1A)
Authority
CN
China
Prior art keywords: key point, image, scale invariant feature, region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310553060.1A
Other languages
Chinese (zh)
Other versions
CN104636745A (en)
Inventor
贺娜
刘媛
师忠超
王刚
鲁耀杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201310553060.1A
Publication of CN104636745A
Application granted
Publication of CN104636745B


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 — Arrangements for image or video recognition or understanding
    • G06V 10/40 — Extraction of image or video features
    • G06V 10/46 — Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462 — Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science
  • Computer Vision & Pattern Recognition
  • Physics & Mathematics
  • General Physics & Mathematics
  • Multimedia
  • Theoretical Computer Science
  • Image Analysis

Abstract

Provided are a scale invariant feature extraction method and device for extracting scale invariant features of an image in a video stream, and an object identification method and device based on this feature extraction method. The image comprises a grayscale image and a corresponding disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. The feature extraction method comprises: locating key points in the image; generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and, for each key point, generating a descriptor based on its description region, as the scale invariant feature of that key point. The present invention can closely combine temporal, depth and image-plane information to extract four-dimensional scale invariant features that are well suited to machine learning.

Description

Scale invariant feature extracting method and device, object identifying method and device
Technical field
The present invention relates generally to image processing, and in particular to a feature extraction method, an object identification method, and corresponding devices.
Background Art
Target recognition in real scenes places high demands on local features: the features should not be affected by interference from nearby objects, partial occlusion, and the like. A major difficulty of target recognition is the choice of image features.
Features used for target recognition are typically affected by factors such as scale variation, image rotation, image blur, image compression, and brightness changes.
Many local features exist, including GLOH (Gradient Location Orientation Histogram), SIFT (Scale Invariant Feature Transform), steerable filters, and so on. These features are all based on two-dimensional image-domain information and have difficulty with some complex scenes, such as occlusion.
Depth information is now widely used in the field of image processing, and temporal-domain characteristics, or temporal features, are also widely applied to video and image sequences.
SIFT (Scale Invariant Feature Transform) is a scale-invariant and rotation-invariant feature descriptor in image processing. It was proposed by David G. Lowe in 1999, and there exists a patent, US 6,711,293, with Lowe as inventor, entitled "Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image". For an introduction to SIFT features and their applications, see the articles by David G. Lowe: "Object recognition from local scale-invariant features", International Conference on Computer Vision, Corfu, Greece (September 1999), pp. 1150-1157, and "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
There exist techniques that use SIFT features or other local features for object recognition or tracking. For example, P. Scovanner, S. Ali and M. Shah (Computer Vision Lab, University of Central Florida, Orlando, FL), "A 3-Dimensional SIFT Descriptor and its Application to Action Recognition", Proceedings of the 15th International Conference on Multimedia, 2007; and F. Tsalakanidou and S. Malassiotis (Informatics and Telematics Institute, Centre for Research and Technology Hellas, Greece), "Real-time facial feature tracking from 2D+3D video streams", 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video, 2010.
Summary of the Invention
It is an object of the present invention to provide a technique that combines temporal information and depth information more closely in scale invariant feature extraction, and that is better suited to direct use in machine learning.
According to one aspect of the invention, there is provided a scale invariant feature extraction method for extracting scale invariant features of a first image in a video stream. The image comprises a first grayscale image and a corresponding first disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. The method comprises: locating key points in the image and obtaining the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image; generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and, for each key point, generating a descriptor based on its description region, as the scale invariant feature of that key point.
According to another aspect of the invention, there is provided a scale invariant feature extraction device for extracting scale invariant features of a first image in a video stream. The image comprises a first grayscale image and a corresponding first disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. The device comprises: a key point locating unit for locating key points in the image and obtaining the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image; a description region generating unit for generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and a descriptor generating unit for generating, for each key point, a descriptor based on its description region.
According to a further aspect of the invention, there is provided an object identification method for identifying an object in an image of interest. The image of interest comprises a grayscale image and a corresponding disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. The object identification method comprises: extracting the scale invariant features of the image of interest; and identifying the object in the image using a classifier based on the extracted scale invariant features of the image, where the classifier is trained with the scale invariant features of a plurality of reference objects, and where the scale invariant features of a reference object are extracted from a reference image containing the reference object using the same method as that used to extract the scale invariant features of the image of interest. Extracting the scale invariant features of the image of interest comprises: locating key points in the image and obtaining the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image; generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and, for each key point, generating a descriptor based on its description region.
According to a further aspect of the invention, there is provided an object identification device for identifying an object in an image of interest. The image of interest comprises a grayscale image and a corresponding disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. The object identification device comprises: a scale invariant feature extractor for extracting the scale invariant features of the image of interest; and a classifier for identifying the object in the image based on the extracted scale invariant features, where the classifier is trained with the scale invariant features of a plurality of reference objects, and where the scale invariant features of a reference object are extracted from a reference image containing the reference object using the same method as that used by the scale invariant feature extractor on the image of interest. The scale invariant feature extractor comprises: a key point locating unit for locating key points in the image and obtaining the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image; a description region generating unit for generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and a descriptor generating unit for generating, for each key point, a descriptor based on its description region.
With the scale invariant feature extraction method and device and the object identification method and device of the embodiments of the present invention, depth information, temporal information and the two-dimensional information of the image plane can be considered together more closely, and scale invariant features in four-dimensional space-time can be extracted. Such four-dimensional scale invariant features can be applied directly to machine learning techniques such as classifier-based classification, so that the temporal and depth information of images can be used more effectively, facilitating target recognition and/or tracking.
Brief description of the drawings
These and/or other aspects and advantages of the present invention will become clearer and easier to understand from the following detailed description of embodiments of the present invention in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the input and output of a feature extraction and object recognition system according to an embodiment of the invention.
Fig. 2 schematically shows an example of the object recognition or classification results of the feature extraction and object recognition system 1.
Fig. 3 is a block diagram of an exemplary configuration of a feature extraction and object recognition system according to an embodiment of the invention.
Fig. 4 is an overview flowchart of an exemplary key point locating method that may be performed by the key point locating unit 111, according to an embodiment of the invention.
Fig. 5 schematically shows an example of the computation of difference-of-Gaussian images.
Fig. 6 is a schematic diagram of the operation of determining local extrema in difference-of-Gaussian images.
Fig. 7 is a flowchart of an exemplary method for generating a description region around a key point that may be performed by the description region generating unit 112, according to an embodiment of the invention.
Fig. 8 is a flowchart of an exemplary descriptor generation method that may be performed by the descriptor generating unit 113, according to an embodiment of the invention.
Fig. 9 is a flowchart of a direction assignment method that may be used to implement step S1131, according to an embodiment of the invention.
Fig. 10 is a flowchart of a method for assigning the principal direction of a key point that takes the weight of each pixel into account, according to an embodiment of the invention.
Fig. 11 is a flowchart of an exemplary method for generating the descriptor of a key point that may be used to implement step S1132 of Fig. 8, according to an embodiment of the invention.
Fig. 12 is a flowchart of a scale invariant feature extraction method for extracting scale invariant features of a first image in a video stream, according to an embodiment of the invention.
Fig. 13 is a flowchart of an object identification method for identifying an object in an image of interest, according to an embodiment of the invention.
Fig. 14 is a block diagram of an exemplary computer system suitable for implementing embodiments of the present invention.
Detailed Description
To help those skilled in the art better understand the present invention, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Before going into detail, the core idea of the present invention is introduced first for ease of understanding. The inventors found that in object recognition or tracking, depth information and temporal information are convenient to use in the pre-processing or post-processing stages, for example using depth information to preliminarily reject noise. However, such depth or temporal information is not well suited to direct use in machine learning, such as a classifier. With a classifier, one usually trains on samples represented as feature vectors, and then, when a new image to be classified arrives, extracts the feature vector of that image and feeds it to the classifier to obtain the classification result. At present there is no mature technique for blending depth or temporal information into the feature vector, and the inventors wish to provide a technique that combines depth information, temporal information and the existing two-dimensional SIFT features of the grayscale image more closely to extract feature vectors suitable for use by a classifier.
Fig. 1 is a schematic diagram of the input and output of a feature extraction and object recognition system according to an embodiment of the invention.
As shown in Fig. 1, the feature extraction and object recognition system 1 of the embodiment of the present invention receives grayscale images and depth maps from a binocular camera, and outputs object recognition results. Fig. 2 schematically shows an example of the object recognition or classification results of the feature extraction and object recognition system 1, where label 1 denotes a vehicle, label 2 a pedestrian, and label 3 other objects.
Fig. 3 is a block diagram of an exemplary configuration of a feature extraction and object recognition system according to an embodiment of the invention.
As shown in Fig. 3, the feature extraction and object recognition system 100 comprises a scale invariant feature extractor 110 and a classifier 120. The scale invariant feature extractor 110 extracts the scale invariant features described below from an image. The image may be a reference image serving as a training sample, for example a reference image containing a vehicle, a reference image containing a pedestrian, a reference image containing other objects, and so on. The scale invariant features extracted from these reference images are associated with corresponding object identifiers (in other words, category identifiers) and can be used as training sample data to train the classifier 120. The image may also be an image of interest as a processing target, for example a binocular image captured in real time by a binocular camera mounted on a vehicle, from which vehicles and/or pedestrians and the like are to be identified in order to provide assistance information, such as reminders or alarms, to the driver. The grayscale image and disparity image sequence obtained by the binocular camera may be fed as input to the scale invariant feature extractor 110, which then extracts scale invariant features from the images and outputs the extracted scale invariant features to the classifier 120, so that the classifier 120 provides the object recognition result.
It should be noted here that, as described later, the scale invariant features of embodiments of the present invention are features extracted from images in the four-dimensional x, y, z, t space: in addition to the x and y dimensions embodied in an ordinary planar grayscale image, they also include the depth dimension z and the time dimension t. The image as a processing target is therefore an image sequence, e.g. multiple consecutive frames, and includes disparity images in addition to grayscale images. Incidentally, the depth dimension is denoted z herein, but it could equally be denoted d, as is also common; the two notations mean the same thing and are interchangeable.
The classifier 120 may be trained with the training sample data of a plurality of reference objects. The training sample data may take the form of vectors in which scale invariant features are associated with object identifiers. It should be noted that one object image may have multiple key points, each with a corresponding scale invariant feature, so one object image may have multiple sets of scale invariant features. The scale invariant features in the training sample data may be obtained by applying the above scale invariant feature extractor 110 to reference images containing reference objects, or may be extracted externally by another scale invariant feature extractor identical in function and operation to the scale invariant feature extractor 110. The classifier 120 may be trained locally, cooperating with the scale invariant feature extractor 110; or it may be trained externally and then embedded locally to cooperate with the scale invariant feature extractor 110; or it may be connected remotely to an operable scale invariant feature extractor 110 and cooperate with it. In addition, the classifier 120 may be trained online or offline.
The specific composition and the corresponding functions and operations of the scale invariant feature extractor 110 are described in detail below.
The scale invariant feature extractor 110 comprises a key point locating unit 111, a description region generating unit 112, and a descriptor generating unit 113.
The key point locating unit 111 locates key points in the image and obtains the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image.
A key point is a concept from the scale invariant feature transform (SIFT); in some applications it is also called a local extremum of pixel amplitude.
Key points can be located by computing the difference-of-Gaussian images of the current grayscale image at different scales, and finding the local extrema of the Gaussian differences. In some examples, a key point can be described by its scale (i.e. the scale of the difference-of-Gaussian image where it lies), its position, its direction, and so on. For the determination of key points in two-dimensional grayscale images, see David G. Lowe, "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
An exemplary key point locating method according to an embodiment of the invention, which may be performed by the key point locating unit 111, is described below with reference to Fig. 4.
As shown in Fig. 4, in step S1111, difference-of-Gaussian images are computed.
The difference-of-Gaussian images can be computed based on the following formulas (1), (2), (3):

L(x, y, σ) = G(x, y, σ) * I(x, y)    (1)

G(x, y, σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))    (2)

D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)    (3)

In the above formulas, (x, y) are the spatial-domain coordinates of the image, I(x, y) is the original grayscale image, and σ is the scale coordinate; the value of σ affects the smoothness of the processed image. G(x, y, σ) is the variable-scale Gaussian function, L(x, y, σ) is the image obtained by convolving the grayscale image with the variable-scale Gaussian function, and D(x, y, σ) is the difference-of-Gaussian image, generated as the difference of Gaussian kernels of different scales convolved with the original image.
Fig. 5 schematically shows an example of the computation of difference-of-Gaussian images. The difference-of-Gaussian images at different scales form an image pyramid, which can be built, for example, as follows: for an image, images at different scales, also called sub-octaves, are established, so as to achieve scale invariance, that is, so that corresponding feature points exist at any scale. The scale of the first sub-octave is the original image size, and each subsequent sub-octave is the result of downsampling the previous sub-octave, e.g. to 1/4 of the original (length and width each halved), forming the next sub-octave (i.e. one level higher in the pyramid).

The following formula (4) gives all the values of the scale space:

2^(i−1) (σ, kσ, k²σ, ..., k^(n−1)σ), where k = 2^(1/S)    (4)

In formula (4), i is the index of the sub-octave tower (which tower), and S is the number of layers per tower. The number of towers to build can be determined from the image size, with several layers per tower (S is typically 3 to 5). Layer 0 of tower 0 is the original image, and each layer above is the result of applying a Gaussian convolution with gradually increasing σ (e.g. σ, k·σ, k·k·σ, ...) to the layer below; intuitively, the higher up the pyramid, the blurrier the image. The images between towers are related by downsampling; for example, layer 0 of tower 1 can be obtained by downsampling layer 3 of tower 0, after which Gaussian convolution operations similar to those of tower 0 are performed.
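As a concrete illustration of formulas (1)-(4), the following Python sketch builds such a difference-of-Gaussian pyramid; it assumes NumPy and SciPy, and the octave count, S and initial σ are illustrative choices, not values fixed by the patent:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(gray, num_octaves=4, S=3, sigma0=1.6):
    """Build the DoG pyramid of formulas (1)-(3): per octave, a stack of
    Gaussian-blurred images L and their adjacent differences D."""
    k = 2.0 ** (1.0 / S)
    pyramid = []
    base = gray.astype(np.float64)
    for _ in range(num_octaves):
        # L(x, y, sigma) at the scales sigma0, k*sigma0, k^2*sigma0, ...
        blurred = [gaussian_filter(base, sigma0 * k ** i) for i in range(S + 3)]
        # D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)
        dogs = [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]
        pyramid.append(dogs)
        base = blurred[S][::2, ::2]  # downsample to start the next tower
    return pyramid
```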
In step S1112, local extrema are computed.
For each pixel in a difference-of-Gaussian image, its gray value is compared with those of its eight neighbors and of the nine corresponding pixels in each of the layers above and below. If the gray value of the point is the maximum or minimum among these 27 points, as shown in Fig. 6, the pixel is a local extremum point, that is, a potential key point.

This local extremum determination can be performed for every pixel in all the difference-of-Gaussian images.
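A minimal sketch of this 27-neighbour test, assuming the DoG layers of one octave are NumPy arrays of equal size:

```python
import numpy as np

def is_local_extremum(dogs, s, y, x):
    """Step S1112: the value of DoG layer s at (y, x) must be the maximum
    or minimum of the 3x3x3 block spanning layers s-1, s and s+1."""
    block = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[s - 1:s + 2]])
    centre = dogs[s][y, x]
    others = np.delete(block.ravel(), 13)  # the 26 neighbours, centre removed
    return centre > others.max() or centre < others.min()
```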
Next, in step S1113, key points are filtered based on temporal-domain information.

This filtering verifies key points using information from the previous and/or following frames, aiming to obtain stable key points.

Specifically, in one example, for each key point determined in the grayscale image, it is determined whether a key point exists within a predetermined range around the corresponding point in the temporally preceding grayscale image and within a predetermined range around the corresponding point in the temporally succeeding grayscale image. Only when key points exist both within the predetermined range around the corresponding point in the temporally preceding grayscale image and within the predetermined range around the corresponding point in the temporally succeeding grayscale image is the key point in the current grayscale image confirmed; otherwise the key point is removed from the key point candidate set.

This step can remove some of the noise among the potential key points, because noise is random and unstable and can hardly persist across consecutive frames, whereas key points of a target are usually stable and will therefore survive the filtering step.
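The following sketch illustrates this temporal filtering; the search radius is an assumed value, since the patent only speaks of a predetermined range:

```python
def filter_keypoints_temporally(kps_prev, kps_curr, kps_next, radius=3):
    """Step S1113: keep a key point of the current frame only if some key
    point lies within `radius` pixels of the corresponding position in
    both the previous and the next frame."""
    def has_near(kps, x, y):
        return any(abs(kx - x) <= radius and abs(ky - y) <= radius
                   for kx, ky in kps)
    return [(x, y) for x, y in kps_curr
            if has_near(kps_prev, x, y) and has_near(kps_next, x, y)]
```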
In the above example, key points are selected directly from the local extremum points found. In some examples, however, better key points may be obtained by fitting based on these local extremum points.
Returning to Fig. 3, the description region generating unit 112 generates a description region around each key point; the description region is centered on the key point and is described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions.
Fig. 7 is a flowchart of an exemplary method for generating a description region around a key point that may be performed by the description region generating unit 112, according to an embodiment of the invention.
As shown in Fig. 7, in step S1121, the located key point is obtained.
In step S1122, the extent of the region around the key point in the image plane is specified. In one example, the size of this extent can be a value related to the disparity of the key point, such as k × disparity, where k is a predetermined constant and disparity is the disparity of the key point. Thus the larger the disparity value of the key point (i.e. the closer the corresponding object is to the camera), the larger the radius in the image plane.

It should be noted that the region in the image plane (i.e. the x, y plane) can be a rectangle, in which case the data defining the region size can be the length and width of the rectangle; if the rectangle is a square, the data can be the side length of the square.
In step S1123, the temporal-domain extent is specified; it can be set to 1 or 2 frames, i.e. the 1 or 2 frames before and after the current image belong to the extent of the description region.
In step S1124, the extent in the disparity domain, in other words the depth domain, is specified. For example, the extent in the depth dimension can be set to 1 or 2 disparity values.
What characterizes the description region of embodiments of the present invention is that it is a description region in space-time. In addition, the extent values of the description region illustrated above with reference to Fig. 7 are merely examples; any other suitable extent values can be chosen as needed.
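A sketch of steps S1122-S1124 under the extent choices named above; k, the frame range and the disparity range are illustrative constants:

```python
def description_region(keypoint, k=2.0, t_range=1, z_range=1):
    """Extents of the 4-D description region around a key point
    (x1, y1, z1, t1); the image-plane half-size grows with the
    key point's disparity, i.e. with proximity to the camera."""
    x1, y1, z1, t1 = keypoint
    r = k * z1  # image-plane half-size = k * disparity
    return {
        "x": (x1 - r, x1 + r),
        "y": (y1 - r, y1 + r),
        "z": (z1 - z_range, z1 + z_range),  # 1-2 disparity values
        "t": (t1 - t_range, t1 + t_range),  # 1-2 frames before and after
    }
```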
Returning to Fig. 3, the descriptor generating unit 113 generates, for each key point, a descriptor based on its description region.
The descriptor is likewise a concept from scale invariant features; the key point descriptor is in some documents also called a feature descriptor. It describes a key point by extracting certain characteristics of the region near the key point, and yields the scale invariant feature of that region (or, equivalently, of that key point).
Before generating the descriptor of a key point, the direction of the key point (in other words, of the description region) needs to be assigned. The subsequent feature extraction is carried out relative to this direction, which ensures the rotational invariance of the feature to a certain extent.
An exemplary descriptor generation method that may be performed by the descriptor generating unit 113 according to an embodiment of the present invention is described below with reference to Fig. 8.
As shown in Fig. 8, the input of the descriptor generation method is the previously located key points, the generated description regions around the key points, and the grayscale and disparity image sequences.
In step S1131, a direction is assigned, that is, the direction of the key point, or of the description region, is specified.
In step S1132, the descriptor of each sub-region of the description region relative to the assigned direction is generated.

The output of the descriptor generation method of this embodiment is the generated descriptors.
A direction assignment method according to an embodiment of the invention that can be used to implement the above step S1131 is first described below with reference to Fig. 9. This method S1131 can take the located key points, the generated description regions, and the grayscale and disparity image sequences as input.
As shown in Fig. 9, in step S11311, the gradient direction of each pixel in the description region in x, y, z, t space-time, represented by three angles, is computed.

Let L(x, y, z, t) denote the value at the pixel's coordinates in the four dimensions x, y, z, t. The gradient direction in space-time can then be computed using the following formulas (5)-(11):

θ(x, y, z, t) = tan⁻¹(Ly / Lx)    (5)

α(x, y, z, t) = tan⁻¹(Lz / √(Lx² + Ly²))    (6)

β(x, y, z, t) = tan⁻¹(Lt / √(Lx² + Ly² + Lz²))    (7)

Lx = L(x+1, y, z, t) − L(x−1, y, z, t)    (8)

Ly = L(x, y+1, z, t) − L(x, y−1, z, t)    (9)

Lz = L(x, y, z+1, t) − L(x, y, z−1, t)    (10)

Lt = L(x, y, z, t+1) − L(x, y, z, t−1)    (11)

In the above formulas, Lx is the gradient of the pixel in the x direction, Ly the gradient in the y direction, Lz the gradient in the z direction of the disparity domain, and Lt the gradient in the time-domain direction, that is, the gradient between two frames. The three angles θ(x, y, z, t), α(x, y, z, t) and β(x, y, z, t) are the final gradient direction of the pixel in the four-dimensional space, and the combination of these three angles represents a unique direction in four-dimensional space.
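A sketch of formulas (5)-(11); the α and β expressions follow the spherical-angle convention of the 3-D SIFT descriptor cited in the background, extended by one dimension, which is assumed here to match the patent's intent:

```python
import numpy as np

def gradient_angles(L, x, y, z, t):
    """Central differences (8)-(11) and the three direction angles of the
    4-D gradient. arctan2 is used as a quadrant-safe form of tan^-1."""
    Lx = L[x + 1, y, z, t] - L[x - 1, y, z, t]
    Ly = L[x, y + 1, z, t] - L[x, y - 1, z, t]
    Lz = L[x, y, z + 1, t] - L[x, y, z - 1, t]
    Lt = L[x, y, z, t + 1] - L[x, y, z, t - 1]
    theta = np.arctan2(Ly, Lx)                             # formula (5)
    alpha = np.arctan2(Lz, np.hypot(Lx, Ly))               # formula (6)
    beta = np.arctan2(Lt, np.sqrt(Lx**2 + Ly**2 + Lz**2))  # formula (7)
    return theta, alpha, beta
```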
In step S11312, a three-dimensional histogram is generated based on the gradient directions, represented by three angles, of the pixels in the description region.

This histogram can be built in several ways, the simplest of which is to divide the value ranges of the three angles θ, α and β into the same number of sub-ranges. For example, with each angle ranging over 360 degrees divided into 36 angular sub-ranges, one counts, for each pixel in the description region, which angular sub-range each of its three angles θ, α and β falls into, accumulating the number of pixels falling into each combination of angular sub-ranges, thereby generating the three-dimensional histogram.
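A minimal sketch of step S11312, assuming the angles are given in radians and using the 36-sub-range example of the patent:

```python
import numpy as np

def orientation_histogram(angles, bins=36):
    """Count, over all pixels of the description region, how many fall
    into each (theta, alpha, beta) combination of angular sub-ranges."""
    hist = np.zeros((bins, bins, bins))
    two_pi = 2 * np.pi
    for theta, alpha, beta in angles:
        i, j, k = (int(((a % two_pi) / two_pi) * bins) % bins
                   for a in (theta, alpha, beta))
        hist[i, j, k] += 1
    return hist
```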
In step S11313, the peak of the three-dimensional histogram is determined, and the principal direction of the key point in space-time is determined based on the determined peak.

In one example, the highest peak in the three-dimensional histogram is determined, and the direction represented by that highest peak is taken as the principal direction of the key point in space-time.

In another example, the highest peak in the three-dimensional histogram is determined, then other local peaks whose values exceed 80% of the value of that highest peak are determined, and the principal direction of the key point in space-time is computed based on the highest peak and those local peaks (for example, using the weighted values of the peaks).
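A sketch of step S11313 covering both examples; returning the bin centres of the peak and the bin indices of the secondary peaks is an illustrative choice:

```python
import numpy as np

def principal_direction(hist, bins=36):
    """Take the highest peak of the 3-D histogram as the principal
    direction; also report local values above 80% of that peak, which
    the second example combines with the peak (e.g. by weighting)."""
    i, j, k = np.unravel_index(hist.argmax(), hist.shape)
    bin_width = 2 * np.pi / bins
    main = tuple((idx + 0.5) * bin_width for idx in (i, j, k))
    secondary = np.argwhere(hist > 0.8 * hist.max())  # includes the peak bin
    return main, secondary
```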
An exemplary method for determining the principal direction of a key point has been described above with reference to Fig. 9.
In another example, in the step of computing the peaks of the histogram, the distance of each pixel to the key point and/or its gradient magnitude may also be taken into account, so as to give each pixel a different weight in its contribution to the histogram.
Fig. 10 shows a direction assignment method according to an embodiment of the invention that takes the weight of each pixel into account and can be used to implement the above step S1131; this method is denoted S1131'.
As shown in Fig. 10, step S11311' differs from step S11311 in Fig. 9 in that, in addition to the direction of the gradient of each pixel, the magnitude of the gradient of each pixel is also computed.
The magnitude of the gradient of each pixel can be computed according to the following formula (12):

m(x, y, z, t) = √(Lx² + Ly² + Lz² + Lt²)    (12)

In step S113121, which is additional compared to Fig. 9, the weight of each pixel is computed.

In one example, the weight can be computed from the distance of the pixel to the key point and the magnitude of the gradient of the pixel, for example as in the following formula (13):

w(x, y, z, t) = m(x, y, z, t) / Δd    (13)

where m(x, y, z, t) is the gradient magnitude of the pixel and Δd is the distance of the pixel to the key point in the four-dimensional x, y, z, t space.

It should be noted that the weight formula (13) is merely illustrative; weight formulas of other forms can be adopted, as long as the weight increases with the gradient magnitude of the pixel and increases as the distance of the pixel to the key point decreases.
In step S11312', the three-dimensional histogram is computed. Compared with step S11312 of Fig. 9, when counting the pixels falling into an angular sub-range, this step S11312' accumulates the weighted count according to the weights of the pixels falling into that angular sub-range.

Step S11313 of Fig. 10 is similar to step S11313 of Fig. 9: it computes the peaks of the histogram and then determines the principal direction of the key point.
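A sketch of the weighted variant of steps S11311'-S11312', using the illustrative m/Δd weight of formula (13); the small epsilon guarding the pixel at the key point itself is an added assumption:

```python
import numpy as np

def weighted_orientation_histogram(pixels, bins=36, eps=1e-6):
    """Each pixel contributes its weight m / delta_d, instead of a count
    of 1, to the bin of its (theta, alpha, beta) direction. `pixels` is
    an iterable of (theta, alpha, beta, magnitude, delta_d) tuples."""
    hist = np.zeros((bins, bins, bins))
    two_pi = 2 * np.pi
    for theta, alpha, beta, m, delta_d in pixels:
        i, j, k = (int(((a % two_pi) / two_pi) * bins) % bins
                   for a in (theta, alpha, beta))
        hist[i, j, k] += m / (delta_d + eps)
    return hist
```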
Exemplary methods for determining the principal direction of a key point have been described above with reference to Figs. 9 and 10. Returning to Fig. 8, after the principal direction of the key point is determined, the method proceeds to step S1132.

In step S1132, the descriptors of the sub-regions of the description region relative to the assigned principal direction are generated.
An exemplary method for generating the descriptor of a key point according to an embodiment of the invention, which can be used to implement step S1132 of Fig. 8, is described below with reference to Fig. 11.
As shown in Fig. 11, in step S11321, the description region is rotated about the key point so that the principal direction of the key point is aligned with a reference direction.

The rotation of this step ensures that the rotated principal direction of the key point always coincides with the reference direction, so that descriptor generation is rotation-invariant in various situations.
In step S11322, the description region is divided into multiple sub-regions, and for each sub-region, a three-dimensional histogram is generated based on the gradient directions of the pixels in that sub-region.

For example, in one example the description region can be divided into 4 sub-regions, e.g. a square region divided into 4 square sub-regions. Then, for example in left-to-right, top-to-bottom order, for each sub-region, one counts the number of pixels whose gradient directions fall into each angular sub-range in each of the three angular directions (θ, α, β) (for example, with 360 degrees divided into 36 parts, the sub-ranges 0-10, 10-20, ..., 350-360 degrees).
In another example, when generating the three-dimensional histogram of a sub-region, a weight can likewise be computed from the gradient magnitude of each pixel in the sub-region and its distance to the key point, and the weighted count accumulated.
In step S11323, the descriptor of the key point is produced based on the three-dimensional histograms of the sub-regions.

In one example, the feature descriptor of each sub-region can consist of 36 components in each of the three angular directions (θ, α, β) (i.e. each angular direction has 36 components, representing e.g. the number of pixels falling into the angular sub-ranges 0-10, 10-20, ..., 350-360 degrees in that direction), i.e. 108 elements in total. The whole region made up of 4 sub-regions then has 108 × 4 = 432 feature components in total, which constitute the scale invariant feature of the key point (or of the description region), also called the descriptor of the key point.
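A sketch of the 432-component assembly of steps S11322-S11323, counting the 36 sub-ranges separately per angular direction as the example above describes:

```python
import numpy as np

def build_descriptor(subregion_angle_lists, bins=36):
    """One (3 x 36)-component histogram per sub-region (108 components),
    concatenated over 4 sub-regions into a 108 * 4 = 432-D descriptor."""
    two_pi = 2 * np.pi
    parts = []
    for angles in subregion_angle_lists:  # expected: 4 sub-regions
        per_axis = np.zeros((3, bins))
        for triple in angles:             # (theta, alpha, beta) per pixel
            for axis, a in enumerate(triple):
                b = int(((a % two_pi) / two_pi) * bins) % bins
                per_axis[axis, b] += 1
        parts.append(per_axis.ravel())    # 108 components per sub-region
    return np.concatenate(parts)          # 432 components in total
```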
A method of generating the descriptor of a key point based on sub-region division, given the principal direction of the key point, has been described above with reference to Fig. 11. This is only an example; the descriptor of a key point can be generated using other methods.
For example, in another example, sub-region division may be omitted, and the descriptor of the key point obtained directly from the three-dimensional orientation histogram of the whole region. For example, still taking the division of the full 360-degree angular range into 36 angular sub-ranges as an example, 36 × 3 = 108 feature components would then be obtained, constituting the scale invariant feature of the key point (or of the description region), also called the descriptor of the key point.
It should be noted that dividing the 360-degree angular range into 36 angular sub-ranges is merely illustrative; it can be divided into another number of angular sub-ranges, e.g. 8. Moreover, in some cases the angular range need not be 360 degrees but may have other spans.
Operation examples of the descriptor generating unit 113 generating the descriptors of key points have been illustrated above with reference to Figs. 8-11.
Returning to Fig. 3: with the scale invariant feature extractor 110 shown in Fig. 3, depth information, temporal information and the two-dimensional information of the image plane can be considered together more closely, and scale invariant features in space-time can be extracted. Such four-dimensional scale invariant features can be applied directly to machine learning techniques such as classifier-based classification, so that the temporal and depth information of images can be used more effectively, facilitating target recognition and/or tracking.
It should be noted that the scale invariant feature extractor 110 shown in Fig. 3 can perform four-dimensional scale invariant feature extraction as a standalone device, and need not work together with the classifier 120.
A scale invariant feature extraction method for extracting the scale invariant features of a first image in a video stream according to an embodiment of the present invention is described below with reference to Fig. 12.

The image comprises a first grayscale image and a corresponding first disparity image, corresponds to a time t1, and has a temporally preceding image and a temporally succeeding image in the video stream. An example of the scale invariant feature extraction method is described below with reference to Fig. 12.
As shown in Fig. 12, in step S111, key points in the image are located, and the coordinates (x1, y1, z1, t1) of each key point are obtained, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image.
In one example, locating the key points in the image can include: computing multiple difference-of-Gaussian images for the first grayscale image; locating the pixel-amplitude extrema in the multiple difference-of-Gaussian images; and determining the key points based on the pixels having pixel-amplitude extrema.
In one example, locating the key points in the image can further include: for each key point determined in the grayscale image, determining whether a key point exists within a predetermined range around the corresponding point in the temporally preceding grayscale image and within a predetermined range around the corresponding point in the temporally succeeding grayscale image; and confirming the key point in the grayscale image only when key points exist both within the predetermined range around the corresponding point in the temporally preceding grayscale image and within the predetermined range around the corresponding point in the temporally succeeding grayscale image, otherwise removing the key point from the grayscale image.
In step S112, a description region around each key point is generated; the description region is centered on the key point and is described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions.
In one example, the extent of the generated description region in the x and y dimensions varies with the disparity value of the key point: the larger the disparity value of the key point, the larger the extent of the description region in the x and y dimensions.
In step S113, for each key point, a descriptor is generated based on its description region, as the scale invariant feature of that key point.
In one example, generating the descriptor of each key point based on its description region can include: computing the gradient direction, represented by three angles, of each pixel in the description region in x, y, z, t space-time; generating a three-dimensional histogram based on the gradient directions, represented by three angles, of the pixels in the description region; determining the peak of the three-dimensional histogram and determining the principal direction of the key point in space-time based on the determined peak; and generating the descriptor based on the determined principal direction of the key point in space-time.
In one example, generating the descriptor based on the determined principal direction of the key point in space-time can include: rotating the description region about the key point so that the principal direction of the key point is aligned with a reference direction; dividing the description region into multiple sub-regions and, for each sub-region, generating a three-dimensional histogram based on the gradient directions of the pixels in that sub-region; and producing the descriptor of the key point based on the three-dimensional histograms of the sub-regions.
In another example, generating the descriptor of each key point based on its description region can further include: computing the magnitude of the gradient of each pixel in the description region in x, y, z, t space-time; and computing the weight of each pixel in the description region based on its distance to the key point and the magnitude of its gradient; where generating the three-dimensional histogram includes determining the weighted count of pixels within each predetermined direction range, based on the weight of each pixel and the gradient direction of each pixel represented by three angles.
An object identification method for identifying an object in an image of interest according to an embodiment of the present invention is described below with reference to Fig. 13.
As shown in Fig. 13, in step S110, the scale invariant features of the image of interest are extracted. Step S110 can be implemented with the method shown in Fig. 12.
In step S120, the object in the image is identified using a classifier based on the extracted scale invariant features of the image, where the classifier is trained with the scale invariant features of a plurality of reference objects, and where the scale invariant features of a reference object are extracted from a reference image containing the reference object using the same method as that used to extract the scale invariant features of the image of interest.
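A minimal sketch of this training/recognition step, assuming scikit-learn is available; a support vector machine is an illustrative classifier choice, not one mandated by the patent:

```python
from sklearn.svm import SVC

def train_classifier(reference_descriptors, object_ids):
    """Train on 432-D descriptors of reference objects, each associated
    with an object (category) identifier."""
    clf = SVC()
    clf.fit(reference_descriptors, object_ids)
    return clf

def recognize(clf, descriptors_of_interest):
    # e.g. label 1 = vehicle, 2 = pedestrian, 3 = other, as in Fig. 2
    return clf.predict(descriptors_of_interest)
```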
The present invention can also be implemented as a computing system for extracting the scale invariant features of a first image in a video stream and/or identifying an object in an image of interest. Fig. 14 shows a block diagram of an exemplary computer system 400 suitable for implementing embodiments of the present invention. As shown in Fig. 14, the computer system 400 can include: a CPU (central processing unit) 401, a RAM (random access memory) 402, a ROM (read-only memory) 403, a system bus 404, a hard disk controller 405, a keyboard controller 406, a serial interface controller 407, a parallel interface controller 408, a display controller 409, a hard disk 410, a keyboard 411, a serial peripheral device 412, a parallel peripheral device 413, and a display 414. Among these devices, the CPU 401, RAM 402, ROM 403, hard disk controller 405, keyboard controller 406, serial interface controller 407, parallel interface controller 408 and display controller 409 are coupled with the system bus 404. The hard disk 410 is coupled with the hard disk controller 405, the keyboard 411 with the keyboard controller 406, the serial peripheral device 412 with the serial interface controller 407, the parallel peripheral device 413 with the parallel interface controller 408, and the display 414 with the display controller 409. It should be understood that the block diagram of Fig. 14 is shown only for purposes of example and is not intended to limit the scope of the present invention; in some cases, some devices may be added or removed as the case may be.
Those of ordinary skill in the art will appreciate that the present invention may be implemented as a system, device, method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining hardware and software, which may all generally be referred to herein as a "circuit", "module", "device" or "system". Furthermore, in some embodiments the present invention may also take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied therein.
Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus or device.
A computer-readable signal medium may include a propagated data signal, in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk or C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a standalone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The present invention is described above with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the present invention. It will be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus so as to produce a machine, such that the instructions, executed by the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in the flowchart and/or block diagram blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means that implement the functions/operations specified in the flowchart and/or block diagram blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus or other devices, causing a series of operational steps to be performed on the computer, other programmable data processing apparatus or other devices, so as to produce a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide processes for implementing the functions/operations specified in the flowchart and/or block diagram blocks.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.
Various embodiments of the present invention have been described above. The above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical applications or improvements over technologies on the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A scale invariant feature extraction method for extracting a four-dimensional scale invariant feature of a first image in a video stream, the image comprising a first grayscale image and a corresponding first disparity image, corresponding to a time t1, and having a temporally preceding image and a temporally succeeding image in the video stream, the method comprising:

locating key points in the image and obtaining the coordinates (x1, y1, z1, t1) of each key point, where x1, y1 are the coordinates of the key point on the grayscale image, z1 is the disparity of the key point, and t1 is the temporal dimension of the image;

generating a description region around each key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z and t dimensions; and

for each key point, generating a descriptor based on its description region, as the four-dimensional scale invariant feature of that key point.
2. The scale invariant feature extraction method according to claim 1, wherein generating, for each key point, a descriptor based on its description region comprises:
calculating, for each pixel in the description region, the gradient direction in x, y, z, t space-time, represented by three angles;
generating a three-dimensional histogram based on the three-angle gradient directions of the pixels in the description region;
determining the peak of the three-dimensional histogram, and determining the principal direction of the key point in space-time based on the determined peak; and
generating the descriptor based on the determined principal direction of the key point in space-time.
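One plausible reading of the three-angle representation is a hyperspherical parameterisation of the 4-D gradient; the claim does not fix the parameterisation, so the one below, along with the bin count and the function names, is an assumption. The sketch builds the three-dimensional orientation histogram and reads the principal direction off the centre of its peak bin.

```python
import numpy as np

def gradient_angles(gx, gy, gz, gt):
    """Magnitude plus three hyperspherical angles of a 4-D gradient."""
    magnitude = np.sqrt(gx**2 + gy**2 + gz**2 + gt**2)
    a1 = np.arctan2(gy, gx)                              # angle in the image plane
    a2 = np.arctan2(gz, np.hypot(gx, gy))                # elevation toward z
    a3 = np.arctan2(gt, np.sqrt(gx**2 + gy**2 + gz**2))  # elevation toward t
    return magnitude, a1, a2, a3

def principal_direction(a1, a2, a3, bins=8):
    """3-D histogram over the three angles; the peak bin's centre is taken
    as the principal direction of the key point in space-time."""
    hist, edges = np.histogramdd(
        np.stack([a1, a2, a3], axis=-1), bins=bins,
        range=[(-np.pi, np.pi), (-np.pi / 2, np.pi / 2), (-np.pi / 2, np.pi / 2)])
    peak = np.unravel_index(np.argmax(hist), hist.shape)
    return tuple((edges[d][i] + edges[d][i + 1]) / 2 for d, i in enumerate(peak))
```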
3. The scale invariant feature extraction method according to claim 2, wherein generating the descriptor based on the determined principal direction of the key point in space-time comprises:
rotating the description region about the key point so that the principal direction of the key point is aligned with a reference direction;
dividing the description region into a plurality of subregions and, for each subregion, generating a three-dimensional histogram of the subregion based on the gradient directions of the pixels in that subregion; and
producing the descriptor of the key point based on the three-dimensional histograms of the subregions.
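The sketch below approximates the alignment step: rather than geometrically rotating the 4-D region as the claim literally reads, it subtracts the key point's principal angles from each pixel's gradient angles, a common simplification in SIFT-style descriptors. The two-halves-per-axis split (16 subregions), the bin count, and all names are assumptions.

```python
import numpy as np

def build_descriptor(rel_coords, angles, weights, principal, bins=4):
    """rel_coords : (N, 4) pixel coordinates relative to the key point.
    angles       : (N, 3) per-pixel gradient angles.
    weights      : (N,) histogram weights (see claim 4).
    principal    : the three principal angles of the key point.
    """
    # Align with the reference direction by normalising the gradient angles,
    # then wrap the result back into (-pi, pi].
    rel = (angles - np.asarray(principal) + np.pi) % (2 * np.pi) - np.pi

    # Split each of the four axes into two halves: 2**4 = 16 subregions.
    sub_index = (rel_coords > 0).astype(int) @ (2 ** np.arange(4))

    descriptor = []
    for s in range(16):
        sel = sub_index == s
        hist, _ = np.histogramdd(rel[sel], bins=bins,
                                 range=[(-np.pi, np.pi)] * 3,
                                 weights=weights[sel])
        descriptor.append(hist.ravel())
    return np.concatenate(descriptor)  # 16 * bins**3 values per key point
```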
4. The scale invariant feature extraction method according to claim 2, wherein generating, for each key point, a descriptor based on its description region further comprises:
calculating, for each pixel in the description region, the magnitude of the gradient in x, y, z, t space-time;
calculating a weight for each pixel based on the distance of the pixel to the key point in x, y, z, t space-time and the gradient magnitude of the pixel;
wherein generating the three-dimensional histogram comprises determining, based on the weight of each pixel and its three-angle gradient direction, the weighted count of pixels falling within each predetermined direction range.
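A Gaussian attenuation is one natural choice for the distance term; the claim only requires the weight to depend on the pixel's distance to the key point and on its gradient magnitude, so the form and the sigma value below are assumptions. These weights are what the weighted histogram in the previous sketch consumes.

```python
import numpy as np

def pixel_weights(rel_coords, magnitudes, sigma=4.0):
    """Weight per pixel: gradient magnitude attenuated by a Gaussian of the
    pixel's (x, y, z, t) distance to the key point.
    rel_coords : (N, 4) coordinates relative to the key point.
    magnitudes : (N,) gradient magnitudes.
    """
    dist2 = np.sum(rel_coords.astype(float) ** 2, axis=1)
    return magnitudes * np.exp(-dist2 / (2.0 * sigma ** 2))
```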
5. The scale invariant feature extraction method according to claim 1, wherein the extent of the generated description region in the x and y dimensions varies with the disparity value of the key point: the larger the disparity value of the key point, the larger the extent of the description region in the x and y dimensions.
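A simple proportional law satisfies this claim (closer objects have larger disparity and occupy more pixels, so the region grows with z1); the constant of proportionality and the floor value below are assumptions, as the claim only requires monotonic growth.

```python
def region_extent_xy(z1, pixels_per_disparity=2.0, minimum=4):
    """Half-extent of the description region along x and y, grown with the
    key point's disparity z1 (larger disparity -> larger region)."""
    return max(minimum, int(round(pixels_per_disparity * z1)))
```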
6. The scale invariant feature extraction method according to claim 1, wherein locating the key point in the image comprises:
calculating a plurality of difference-of-Gaussian images for the first grayscale image;
locating pixel-amplitude extrema in the plurality of difference-of-Gaussian images; and
determining the key point based on the pixels having the pixel-amplitude extrema.
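This is the classic difference-of-Gaussian extremum test. A slow but straightforward sketch follows, over an assumed set of sigma values and an assumed contrast threshold; sub-pixel refinement and the pairing of each (x, y) extremum with its disparity z and time t are omitted.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(gray, sigmas=(1.0, 1.6, 2.6, 4.1), threshold=0.03):
    """Difference-of-Gaussian pyramid for one grayscale image; a pixel is a
    candidate key point when its DoG amplitude is an extremum of its
    3 x 3 x 3 scale-space neighbourhood."""
    levels = [gaussian_filter(gray.astype(float) / 255.0, s) for s in sigmas]
    dog = np.stack([levels[i + 1] - levels[i] for i in range(len(levels) - 1)])
    keypoints = []
    for l in range(1, dog.shape[0] - 1):
        for y in range(1, dog.shape[1] - 1):
            for x in range(1, dog.shape[2] - 1):
                v = dog[l, y, x]
                if abs(v) < threshold:
                    continue  # too low-contrast to be a stable key point
                cube = dog[l - 1:l + 2, y - 1:y + 2, x - 1:x + 2]
                if v == cube.max() or v == cube.min():
                    keypoints.append((x, y, l))
    return keypoints
```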
7. The scale invariant feature extraction method according to claim 6, wherein locating the key point in the image further comprises:
for each key point determined in the grayscale image, determining whether a key point exists within a predetermined range around the point corresponding to that key point in the temporally preceding grayscale image, and within a predetermined range around the point corresponding to that key point in the temporally succeeding grayscale image; and
confirming the key point in the grayscale image only when key points exist both within the predetermined range around the corresponding point in the temporally preceding grayscale image and within the predetermined range around the corresponding point in the temporally succeeding grayscale image, and otherwise removing the key point from the grayscale image.
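A minimal sketch of this temporal consistency check, assuming key points are given as (x, y) pairs per frame and a hypothetical search radius standing in for the claim's predetermined range:

```python
def temporally_confirmed(kp_xy, prev_kps, next_kps, radius=3):
    """Confirm a key point only if both the preceding and the succeeding
    grayscale image contain a key point within `radius` pixels of the
    corresponding position; otherwise it is discarded."""
    def has_nearby(candidates):
        return any(abs(kp_xy[0] - cx) <= radius and abs(kp_xy[1] - cy) <= radius
                   for cx, cy in candidates)
    return has_nearby(prev_kps) and has_nearby(next_kps)
```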
8. An object recognition method for recognizing an object in an image of interest, the image of interest comprising a grayscale image and a corresponding disparity image, the image of interest corresponding in time to a moment t1 and having, in the video stream, a temporally preceding image and a temporally succeeding image, the object recognition method comprising:
extracting the four-dimensional scale invariant feature of the image of interest; and
recognizing the object in the image using a classifier, based on the extracted four-dimensional scale invariant feature of the image, wherein the classifier is trained using the four-dimensional scale invariant features of a plurality of reference objects, and wherein the four-dimensional scale invariant features of the reference objects are extracted from reference images containing the reference objects by the same method as that used to extract the four-dimensional scale invariant feature of the image of interest,
wherein extracting the four-dimensional scale invariant feature of the image of interest comprises:
locating a key point in the image and obtaining the coordinates (x1, y1, z1, t1) of the key point, wherein x1 and y1 denote the coordinates of the key point on the grayscale image, z1 denotes the disparity of the key point, and t1 denotes the position of the image in the time dimension;
generating a description region surrounding the key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z, t dimensions; and
for each key point, generating a descriptor based on its description region.
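The claim names neither the classifier nor how the per-key-point descriptors are pooled into one image-level feature; the sketch below assumes mean pooling and a support vector machine purely for illustration, with extract_descriptors standing in for the extraction method of claim 1.

```python
import numpy as np
from sklearn.svm import SVC

def image_feature(descriptors):
    """Collapse the per-key-point 4-D scale invariant descriptors of one
    image into a fixed-length vector by mean pooling (an assumption)."""
    return np.mean(np.stack(descriptors), axis=0)

# Training on reference images, then recognising the image of interest:
# X = np.stack([image_feature(extract_descriptors(r)) for r in reference_images])
# classifier = SVC().fit(X, reference_labels)
# label = classifier.predict(image_feature(extract_descriptors(query))[None, :])
```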
9. A scale invariant feature extraction device for extracting a four-dimensional scale invariant feature of a first image in a video stream, the image comprising a first grayscale image and a corresponding first disparity image, the image corresponding in time to a moment t1 and having, in the video stream, a temporally preceding image and a temporally succeeding image, the device comprising:
a key point positioning unit for locating a key point in the image and obtaining the coordinates (x1, y1, z1, t1) of the key point, wherein x1 and y1 denote the coordinates of the key point on the grayscale image, z1 denotes the disparity of the key point, and t1 denotes the position of the image in the time dimension;
a description region generation unit for generating a description region surrounding the key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z, t dimensions; and
a descriptor generation unit for generating, for each key point, a descriptor based on its description region.
10. An object recognition device for recognizing an object in an image of interest, the image of interest comprising a grayscale image and a corresponding disparity image, the image of interest corresponding in time to a moment t1 and having, in the video stream, a temporally preceding image and a temporally succeeding image, the object recognition device comprising:
a scale invariant feature extractor for extracting the four-dimensional scale invariant feature of the image of interest; and
a classifier for recognizing the object in the image based on the extracted four-dimensional scale invariant feature of the image, wherein the classifier is trained using the four-dimensional scale invariant features of a plurality of reference objects, and wherein the four-dimensional scale invariant features of the reference objects are extracted from reference images containing the reference objects by the same method as that used by the scale invariant feature extractor to extract the four-dimensional scale invariant feature of the image of interest,
wherein the scale invariant feature extractor comprises:
a key point positioning unit for locating a key point in the image and obtaining the coordinates (x1, y1, z1, t1) of the key point, wherein x1 and y1 denote the coordinates of the key point on the grayscale image, z1 denotes the disparity of the key point, and t1 denotes the position of the image in the time dimension;
a description region generation unit for generating a description region surrounding the key point, the description region being centered on the key point and described in the four dimensions x, y, z, t, with a non-zero extent in each of the x, y, z, t dimensions; and
a descriptor generation unit for generating, for each key point, a descriptor based on its description region.
CN201310553060.1A 2013-11-08 2013-11-08 Scale invariant feature extracting method and device, object identifying method and device Expired - Fee Related CN104636745B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310553060.1A CN104636745B (en) 2013-11-08 2013-11-08 Scale invariant feature extracting method and device, object identifying method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310553060.1A CN104636745B (en) 2013-11-08 2013-11-08 Scale invariant feature extracting method and device, object identifying method and device

Publications (2)

Publication Number Publication Date
CN104636745A CN104636745A (en) 2015-05-20
CN104636745B (en) 2018-04-10

Family

ID=53215473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310553060.1A Expired - Fee Related CN104636745B (en) 2013-11-08 2013-11-08 Scale invariant feature extracting method and device, object identifying method and device

Country Status (1)

Country Link
CN (1) CN104636745B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107507203B (en) * 2017-05-02 2020-04-14 大连理工大学 Method for automatically extracting boundary straight line angle of server equipment based on thermal infrared image
CN111414925B (en) * 2020-03-18 2023-04-18 北京百度网讯科技有限公司 Image processing method, apparatus, computing device and medium
CN112002039A (en) * 2020-08-22 2020-11-27 王冬井 Automatic control method for file cabinet door based on artificial intelligence and human body perception

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101009021A (en) * 2007-01-25 2007-08-01 复旦大学 Video stabilizing method based on matching and tracking of characteristic
CN102890791A (en) * 2012-08-31 2013-01-23 浙江捷尚视觉科技有限公司 Depth information clustering-based complex scene people counting method
CN103034860A (en) * 2012-12-14 2013-04-10 南京思创信息技术有限公司 Scale-invariant feature transform (SIFT) based illegal building detection method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8155451B2 (en) * 2004-11-12 2012-04-10 Kitakyushu Foundation For The Advancement Of Industry, Science And Technology Matching apparatus, image search system, and histogram approximate restoring unit, and matching method, image search method, and histogram approximate restoring method


Also Published As

Publication number Publication date
CN104636745A (en) 2015-05-20

Similar Documents

Publication Publication Date Title
CN109753885B (en) Target detection method and device and pedestrian detection method and system
JP7490141B2 (en) IMAGE DETECTION METHOD, MODEL TRAINING METHOD, IMAGE DETECTION APPARATUS, TRAINING APPARATUS, DEVICE, AND PROGRAM
CN110378838B (en) Variable-view-angle image generation method and device, storage medium and electronic equipment
CN111860138B (en) Three-dimensional point cloud semantic segmentation method and system based on full fusion network
CN104115074B (en) hologram processing method and system
CN105956560A (en) Vehicle model identification method based on pooling multi-scale depth convolution characteristics
CN104021381B (en) Human movement recognition method based on multistage characteristics
CN107886110A (en) Method for detecting human face, device and electronic equipment
CN104636745B (en) Scale invariant feature extracting method and device, object identifying method and device
CN114972016A (en) Image processing method, image processing apparatus, computer device, storage medium, and program product
Zong et al. A cascaded refined rgb-d salient object detection network based on the attention mechanism
Li et al. Self-supervised social relation representation for human group detection
Zhang et al. Urformer: Unified representation lidar-camera 3d object detection with transformer
CN117094895B (en) Image panorama stitching method and system
WO2018143277A1 (en) Image feature value output device, image recognition device, image feature value output program, and image recognition program
CN112509129A (en) Spatial view field image generation method based on improved GAN network
CN116958873A (en) Pedestrian tracking method, device, electronic equipment and readable storage medium
CN105374043B (en) Visual odometry filtering background method and device
Wang et al. GA-STIP: Action recognition in multi-channel videos with geometric algebra based spatio-temporal interest points
Cai et al. Deep representation and stereo vision based vehicle detection
KR101937585B1 (en) Cost Aggregation Apparatus and Method for Depth Image Generation, and Recording Medium thereof
Zhang et al. A novel 2D-to-3D scheme by visual attention and occlusion analysis
CN113628349B (en) AR navigation method, device and readable storage medium based on scene content adaptation
CN106056599B (en) A kind of object recognition algorithm and device based on Object Depth data
Agushinta et al. A method of cloud and image-based tracking for Indonesia fruit recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180410

Termination date: 20201108

CF01 Termination of patent right due to non-payment of annual fee