WO2019001481A1 - Vehicle appearance feature recognition and vehicle retrieval method and device, storage medium, and electronic device - Google Patents

Vehicle appearance feature recognition and vehicle retrieval method and device, storage medium, and electronic device

Info

Publication number
WO2019001481A1
Authority
WO
WIPO (PCT)
Prior art keywords
vehicle
feature
image
region
target vehicle
Prior art date
Application number
PCT/CN2018/093165
Other languages
English (en)
French (fr)
Inventor
伊帅
王重道
唐路明
闫俊杰
王晓刚
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司 (Beijing SenseTime Technology Development Co., Ltd.)
Priority to JP2019562381A (patent JP7058669B2)
Publication of WO2019001481A1
Priority to US16/678,870 (patent US11232318B2)
Priority to US17/533,484 (publication US20220083802A1)
Priority to US17/533,469 (publication US20220083801A1)


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components, by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters, with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454 Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor, of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion, by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54 Surveillance or monitoring of activities, e.g. for recognising suspicious objects, of traffic, e.g. cars on the road, trains or boats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles

Definitions

  • the embodiments of the present application relate to artificial intelligence technologies, and in particular, to a vehicle appearance feature recognition method, device, storage medium, and electronic device, and a vehicle retrieval method, device, storage medium, and electronic device.
  • the vehicle retrieval task refers to, given an image of a query vehicle, finding all images containing that vehicle in a large-scale vehicle image database.
  • the purpose of the embodiment of the present application is to provide a technical solution for recognizing the appearance of a vehicle and a technical solution for vehicle retrieval.
  • a vehicle appearance feature recognition method includes: acquiring a plurality of region segmentation results of a target vehicle from an image to be identified; extracting global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and fusing the global feature data and the plurality of region feature data to obtain appearance feature data of the target vehicle.
  • the plurality of region segmentation results respectively correspond to regions of different orientations of the target vehicle.
  • the plurality of region segmentation results include segmentation results of the front, back, left, and right sides of the target vehicle.
  • acquiring the plurality of region segmentation results of the target vehicle from the image to be identified includes: acquiring the plurality of region segmentation results of the target vehicle from the image to be identified through a first neural network for region extraction.
  • the first neural network has a first feature extraction layer and a first computing layer connected to the end of the first feature extraction layer. Acquiring the plurality of region segmentation results of the target vehicle from the image to be identified through the first neural network includes: performing feature extraction on the image to be identified through the first feature extraction layer to obtain a plurality of key points of the target vehicle; classifying the plurality of key points through the first computing layer to obtain a plurality of key point clusters; and fusing the feature maps of the key points in each of the key point clusters to obtain the region segmentation results corresponding to the key point clusters.
  • extracting the global feature data and the plurality of region feature data from the image to be identified based on the plurality of region segmentation results includes: extracting, based on the plurality of region segmentation results, the global feature data and the plurality of region feature data of the target vehicle from the image to be identified through a second neural network for feature extraction.
  • the second neural network has a first processing subnet and a plurality of second processing subnets respectively connected to the output end of the first processing subnet, where the first processing subnet has a second feature extraction layer, a first inception module, and a first pooling layer, and each second processing subnet has a second computing layer, a second inception module, and a second pooling layer connected to the output end of the first processing subnet.
  • extracting the global feature data and the plurality of region feature data of the target vehicle from the image to be identified through the second neural network includes: performing a convolution operation and a pooling operation on the image to be identified through the second feature extraction layer to obtain a global feature map of the target vehicle; performing a convolution operation and a pooling operation on the global feature map through the first inception module to obtain a first feature map set of the target vehicle; and performing a pooling operation on the feature maps in the first feature map set through the first pooling layer to obtain a global feature vector of the target vehicle.
  • extracting the global feature data and the plurality of region feature data of the target vehicle from the image to be identified through the second neural network further includes: dot-multiplying the plurality of region segmentation results with the global feature map through the second computing layer to obtain local feature maps respectively corresponding to the region segmentation results; performing a convolution operation and a pooling operation on the local feature maps through the second inception module to obtain second feature map sets respectively corresponding to the region segmentation results; and performing a pooling operation on the second feature map sets through the second pooling layer to obtain first region feature vectors respectively corresponding to the region segmentation results.
  • before the dot-multiplication, the method further includes: scaling, through the second computing layer, the plurality of region segmentation results to the same size as the global feature map.
  • fusing the global feature data and the plurality of region feature data includes: fusing the global feature data of the target vehicle and the plurality of region feature data through a third neural network for feature fusion.
  • the third neural network has a first fully connected layer, a third computing layer, and a second fully connected layer connected to the output end of the second neural network. Fusing the global feature data of the target vehicle and the plurality of region feature data through the third neural network includes: obtaining weight values of the first region feature vectors through the first fully connected layer; weighting the first region feature vectors respectively according to the weight values through the third computing layer to obtain a corresponding plurality of second region feature vectors; and performing a mapping operation on the plurality of second region feature vectors and the global feature vector through the second fully connected layer to obtain an appearance feature vector of the target vehicle.
  • obtaining the weight values of the first region feature vectors through the first fully connected layer includes: performing a splicing operation on the plurality of first region feature vectors to obtain a spliced first region feature vector; performing a mapping operation on the spliced first region feature vector through the first fully connected layer to obtain a set of scalars corresponding to the first region feature vectors; and performing a normalization operation on the scalars in the set to obtain the weight values of the plurality of first region feature vectors.
  • the first feature extraction layer is an hourglass network structure.
  • a vehicle retrieval method includes: obtaining appearance feature data of a target vehicle in an image to be retrieved according to the method of the first aspect of the embodiments of the present application; and searching a to-be-selected vehicle image library for a target to-be-selected vehicle image that matches the appearance feature data.
  • searching the to-be-selected vehicle image library for a target to-be-selected vehicle image that matches the appearance feature data includes: determining the cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the to-be-selected vehicle images in the library; and determining, based on the cosine distances, a target to-be-selected vehicle image that matches the target vehicle.
  • the method further includes: acquiring the shooting time and/or shooting position of the image to be retrieved and of the plurality of to-be-selected vehicle images; determining spatiotemporal distances between the target vehicle and the vehicles in the to-be-selected vehicle images according to the shooting times and/or shooting positions; and determining, in the to-be-selected vehicle image library, a target to-be-selected vehicle image that matches the target vehicle according to the cosine distances and the spatiotemporal distances.
  • determining, according to the cosine distances and the spatiotemporal distances, a target to-be-selected vehicle image in the library that matches the target vehicle includes: obtaining a plurality of to-be-selected vehicle images from the library according to the cosine distances; determining spatiotemporal matching probabilities of the to-be-selected vehicle images and the target vehicle based on their shooting times and shooting positions; and determining a target to-be-selected vehicle image that matches the target vehicle based on the cosine distances and the spatiotemporal matching probabilities.
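  • As a minimal Python sketch of how the cosine distance and a spatiotemporal matching probability might be combined to rank candidate images: the probability model (exponential decay with time gap and camera distance), its scale constants, and the fusion rule itself are illustrative assumptions, since the text above does not fix them.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance between two appearance feature vectors."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def spatiotemporal_probability(dt, dx, tau=600.0, sigma=2000.0):
    """Assumed matching probability: decays with the time gap dt (seconds)
    and the distance dx (meters) between shooting positions. tau and sigma
    are illustrative scale constants, not values from the application."""
    return np.exp(-abs(dt) / tau) * np.exp(-abs(dx) / sigma)

def rank_candidates(query_vec, query_time, query_pos, candidates):
    """candidates: list of (feature_vector, shooting_time, shooting_position).
    Ranks candidates so that a small cosine distance combined with a high
    spatiotemporal matching probability comes first; the fusion rule
    (distance divided by probability) is an assumption."""
    scores = []
    for vec, t, pos in candidates:
        d_cos = cosine_distance(query_vec, vec)
        dx = np.linalg.norm(np.asarray(pos) - np.asarray(query_pos))
        p_st = spatiotemporal_probability(t - query_time, dx)
        scores.append(d_cos / (p_st + 1e-6))
    return np.argsort(scores)
```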
  • a vehicle appearance feature recognition device includes: a first acquiring module configured to acquire a plurality of region segmentation results of the target vehicle from the image to be identified; an extracting module configured to extract global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and a fusion module configured to fuse the global feature data and the plurality of region feature data to obtain appearance feature data of the target vehicle.
  • the plurality of region segmentation results respectively correspond to regions of different orientations of the target vehicle.
  • the plurality of region segmentation results include segmentation results of the front, back, left, and right sides of the target vehicle.
  • the first acquiring module includes: an obtaining submodule configured to acquire, through a first neural network for region extraction, a plurality of region segmentation results of the target vehicle from the image to be identified.
  • the first neural network has a first feature extraction layer and a first computing layer connected to the end of the first feature extraction layer, and the obtaining submodule is configured to: perform feature extraction on the image to be identified through the first feature extraction layer to obtain a plurality of key points of the target vehicle; classify the plurality of key points through the first computing layer to obtain a plurality of key point clusters; and fuse the feature maps of the key points in each key point cluster to obtain the region segmentation results corresponding to the key point clusters.
  • the extracting module includes: an extracting submodule configured to extract, based on the plurality of region segmentation results, the global feature data and the plurality of region feature data of the target vehicle from the image to be identified through a second neural network for feature extraction.
  • the second neural network has a first processing subnet and a plurality of second processing subnets respectively connected to the output end of the first processing subnet, where the first processing subnet has a second feature extraction layer, a first inception module, and a first pooling layer, and each second processing subnet has a second computing layer, a second inception module, and a second pooling layer connected to the output end of the first processing subnet.
  • the extracting submodule includes: a first feature extraction unit configured to perform a convolution operation and a pooling operation on the image to be identified through the second feature extraction layer to obtain a global feature map of the target vehicle; a second feature extraction unit configured to perform a convolution operation and a pooling operation on the global feature map through the first inception module to obtain a first feature map set of the target vehicle; and a first pooling unit configured to perform a pooling operation on the feature maps in the first feature map set through the first pooling layer to obtain a global feature vector of the target vehicle.
  • the extracting submodule further includes: a first calculating unit configured to dot-multiply the plurality of region segmentation results with the global feature map through the second computing layer to obtain local feature maps respectively corresponding to the region segmentation results; a third feature extraction unit configured to perform a convolution operation and a pooling operation on the local feature maps through the second inception module to obtain second feature map sets respectively corresponding to the region segmentation results; and a second pooling unit configured to perform a pooling operation on the second feature map sets through the second pooling layer to obtain first region feature vectors respectively corresponding to the region segmentation results.
  • the extracting submodule further includes: a second calculating unit configured to scale, through the second computing layer, the plurality of region segmentation results to the same size as the global feature map.
  • the fusion module includes: a fusion submodule configured to fuse the global feature data of the target vehicle and the plurality of region feature data through a third neural network for feature fusion.
  • the third neural network has a first fully connected layer, a third computing layer, and a second fully connected layer connected to the output end of the second neural network. The fusion submodule includes: a first acquiring unit configured to acquire the weight values of the first region feature vectors through the first fully connected layer; a third calculating unit configured to weight the first region feature vectors respectively according to the weight values through the third computing layer to obtain a corresponding plurality of second region feature vectors; and a mapping unit configured to perform a mapping operation on the plurality of second region feature vectors and the global feature vector through the second fully connected layer to obtain an appearance feature vector of the target vehicle.
  • the first acquiring unit is configured to: perform a splicing operation on the plurality of first region feature vectors to obtain a spliced first region feature vector; perform a mapping operation on the spliced first region feature vector through the first fully connected layer to obtain a set of scalars corresponding to the first region feature vectors; and perform a normalization operation on the scalars in the set to obtain the weight values of the plurality of first region feature vectors.
  • the first feature extraction layer is an hourglass network structure.
  • a vehicle retrieval device includes: a second acquiring module configured to acquire appearance feature data of a target vehicle in an image to be retrieved by using the device according to the third aspect of the embodiments of the present application; and a searching module configured to search a to-be-selected vehicle image library for a target to-be-selected vehicle image that matches the appearance feature data.
  • the searching module is configured to: determine the cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the to-be-selected vehicle images in the library; and determine, according to the cosine distances, a target to-be-selected vehicle image that matches the target vehicle.
  • the device further includes: a third acquiring module configured to acquire the shooting time and/or shooting position of the image to be retrieved and of the plurality of to-be-selected vehicle images; a first determining module configured to determine spatiotemporal distances between the target vehicle and the vehicles in the to-be-selected vehicle images according to the shooting times and/or shooting positions; and a second determining module configured to determine, in the to-be-selected vehicle image library, a target to-be-selected vehicle image that matches the target vehicle according to the cosine distances and the spatiotemporal distances.
  • the second determining module is configured to: obtain a plurality of to-be-selected vehicle images from the library according to the cosine distances; determine spatiotemporal matching probabilities of the to-be-selected vehicle images and the target vehicle based on their shooting times and shooting positions; and determine a target to-be-selected vehicle image that matches the target vehicle based on the cosine distances and the spatiotemporal matching probabilities.
  • a computer readable storage medium has computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of the vehicle appearance feature recognition method of the first aspect of the embodiments of the present application.
  • a computer readable storage medium has computer program instructions stored thereon, where the program instructions, when executed by a processor, implement the steps of the vehicle retrieval method of the second aspect of the embodiments of the present application.
  • an electronic device includes: a first processor, a first memory, a first communication element, and a first communication bus, where the first processor, the first memory, and the first communication element communicate with each other through the first communication bus; the first memory is configured to store at least one executable instruction, and the executable instruction causes the first processor to execute the steps of the vehicle appearance feature recognition method of the first aspect of the embodiments of the present application.
  • an electronic device includes: a second processor, a second memory, a second communication element, and a second communication bus, where the second processor, the second memory, and the second communication element communicate with each other through the second communication bus; the second memory is configured to store at least one executable instruction, and the executable instruction causes the second processor to execute the steps of the vehicle retrieval method of the second aspect of the embodiments of the present application.
  • according to the vehicle appearance feature recognition method of the embodiments of the present application, a plurality of region segmentation results of the target vehicle are acquired from the image to be identified; global feature data and a plurality of region feature data are then extracted from the image to be identified based on the plurality of region segmentation results; and the global feature data and the plurality of region feature data are fused to obtain the appearance feature data of the target vehicle. Compared with prior-art methods for obtaining vehicle appearance features, the vehicle appearance feature identified by the embodiments of the present application includes, in addition to the global feature, the features of local regions of the vehicle appearance, and can reflect detailed information of the target vehicle through the local region features, thereby describing the vehicle appearance more accurately. In addition, the vehicle appearance features identified by the embodiments of the present application allow the appearance features of vehicles in different vehicle images to be compared directly, solving the problem that different regions of different vehicle images cannot be compared.
  • FIG. 1 is a flow chart of an embodiment of a vehicle appearance feature recognition method according to the present application.
  • FIG. 2 is a flow chart of another embodiment of a vehicle appearance feature recognition method according to the present application.
  • FIG. 3 is a schematic diagram of a distribution of key points of a vehicle implementing the method embodiment of FIG. 2.
  • FIG. 4 is a schematic diagram of a network framework implementing the method embodiment of FIG. 2.
  • FIG. 5 is a schematic illustration of the vehicle region segmentation results of the method embodiment of FIG. 2.
  • FIG. 6 is a schematic illustration of the weight values of the vehicle regions of the method embodiment of FIG. 2.
  • FIG. 7 is a flow chart of one embodiment of a vehicle retrieval method in accordance with the present application.
  • FIG. 8 is a flow chart of another embodiment of a vehicle retrieval method in accordance with the present application.
  • FIG. 9 is a schematic illustration of similarity distances of vehicles in the method embodiment of FIG. 8.
  • FIG. 10 is a schematic structural diagram of an embodiment of a vehicle appearance feature recognition device according to the present application.
  • FIG. 11 is a block diagram of another embodiment of a vehicle appearance feature recognition device according to the present application.
  • FIG. 12 is a block diagram of an embodiment of a vehicle retrieval device according to the present application.
  • FIG. 13 is a block diagram of another embodiment of a vehicle retrieval device according to the present application.
  • FIG. 14 is a schematic structural diagram of an embodiment of an electronic device suitable for implementing a terminal device or a server according to an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of another embodiment of an electronic device suitable for implementing a terminal device or a server of an embodiment of the present application.
  • Embodiments of the present application can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general purpose or special purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments including any of the above, and the like.
  • Electronic devices such as terminal devices, computer systems, servers, etc., can be described in the general context of computer system executable instructions (such as program modules) being executed by a computer system.
  • program modules may include routines, programs, target programs, components, logic, data structures, and the like that perform particular tasks or implement particular abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment where tasks are performed by remote processing devices that are linked through a communication network.
  • program modules may be located on a local or remote computing system storage medium including storage devices.
  • FIG. 1 is a flow chart of an embodiment of a vehicle appearance feature recognition method according to the present application.
  • In step S101, a plurality of region segmentation results of the target vehicle are acquired from the image to be identified.
  • in terms of the content contained in the image, the image to be recognized may be an image including part of the target vehicle, an image including the entire target vehicle, or the like.
  • the image to be identified may be a captured still image, a video image in a sequence of video frames, a synthesized image, or the like.
  • the plurality of region segmentation results respectively correspond to regions of different orientations of the target vehicle.
  • the plurality of region segmentation results may include, but are not limited to, segmentation results of the front, back, left, and right sides of the target vehicle.
  • the plurality of region segmentation results are not limited to the segmentation results of the four regions (front, back, left, and right) of the target vehicle; for example, they may instead include segmentation results of the front, back, left, right, top, and bottom regions of the target vehicle.
  • the region segmentation result is a single-channel weight map, and the magnitude of a value in the region segmentation result indicates the importance of the corresponding position in the image to be identified: the larger the value, the higher the importance of the corresponding position; the smaller the value, the lower the importance of the corresponding position.
  • the step S101 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by the first obtaining module 501 being executed by the processor.
  • In step S102, global feature data and a plurality of region feature data are extracted from the image to be identified based on the plurality of region segmentation results.
  • the global feature data and the plurality of region feature data are the global feature data and region feature data of the target vehicle; the global feature data may be a global feature represented by a vector, and the region feature data may be region features represented by vectors.
  • the step S102 may be performed by a processor invoking a corresponding instruction stored in a memory, or may be performed by an extraction module 502 being executed by the processor.
  • In step S103, the global feature data and the plurality of region feature data are fused to obtain the appearance feature data of the target vehicle.
  • the dimension of the global feature vector is the same as the dimension of the region feature vector.
  • the appearance feature data of the target vehicle includes features of a plurality of partial regions of the target vehicle and features of a global region of the target vehicle.
  • step S103 may be performed by the processor invoking a corresponding instruction stored in the memory or by the fusion module 503 being executed by the processor.
  • in this embodiment, a plurality of region segmentation results of the target vehicle are acquired from the image to be identified; global feature data and a plurality of region feature data are then extracted from the image to be identified based on the plurality of region segmentation results; and the global feature data and the plurality of region feature data are fused to obtain the appearance feature data of the target vehicle.
  • compared with prior-art methods for acquiring vehicle appearance features, the vehicle appearance feature obtained by the recognition method of this embodiment includes, in addition to the global feature, the features of local regions of the vehicle appearance, and can reflect detailed information of the target vehicle through the local region features, thereby describing the vehicle appearance more accurately.
  • the identified vehicle appearance features also allow the appearance features of vehicles in different vehicle images to be compared directly, solving the problem that different regions of different vehicle images cannot be compared.
  • the vehicle appearance feature recognition method of this embodiment may be performed by any suitable device having data processing capability, including but not limited to: a terminal device, a server, and the like.
  • FIG. 2 is a flow chart of another embodiment of a vehicle appearance feature recognition method according to the present application.
  • In step S201, a plurality of region segmentation results of the target vehicle are acquired from the image to be identified through the first neural network for region extraction.
  • step S201 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the acquisition sub-module 6011 executed by the processor.
  • the first neural network may be any suitable neural network that can implement region extraction or target object recognition, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generator network of a generative adversarial network, and the like. The optional structure of the neural network, such as the number of convolution layers, the size of the convolution kernels, and the number of channels, can be set by a person skilled in the art according to actual needs, which is not limited in the embodiments of the present application.
  • the first neural network has a first feature extraction layer and a first calculation layer connected at the end of the first feature extraction layer.
  • the step S201 includes: performing feature extraction on the image to be recognized through the first feature extraction layer to obtain a plurality of key points of the target vehicle; classifying the plurality of key points through the first computing layer to obtain a plurality of key point clusters; and fusing the feature maps of the key points in each of the key point clusters to obtain the region segmentation results corresponding to the key point clusters.
  • the vehicle key points in this embodiment are not the boundary points or corner points of the vehicle, but clearly distinguishable positions on the vehicle or main components of the vehicle, such as the wheels, lamps, logo, rear-view mirrors, license plate, and the like.
  • FIG. 3 is a schematic diagram of the distribution of the vehicle key points of the method embodiment of FIG. 2. As shown in FIG. 3, the vehicle key points in this embodiment comprise 20 annotated points, including the left front wheel (1), the left rear wheel (2), the right front wheel (3), the right rear wheel (4), the right fog lamp (5), the left fog lamp (6), and so on. The first feature extraction layer performs feature extraction for the 20 vehicle key points in the input vehicle image to obtain response feature maps of the vehicle key points.
  • the first feature extraction layer may be an hourglass network structure.
  • the first feature extraction layer needs to be trained before this step is performed. The training process of the first feature extraction layer may be: designing the target response feature map of an annotated vehicle key point as a Gaussian kernel centered at the annotated key point position; then inputting a vehicle image containing annotated vehicle key points into the first feature extraction layer, and determining whether the prediction result of the first feature extraction layer is close to the target Gaussian kernel. The prediction result of the first feature extraction layer for an annotated vehicle key point is the response feature map corresponding to that key point, and the difference between the prediction result and the target Gaussian kernel may be measured by a cross-entropy loss.
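  • A minimal sketch of constructing such a Gaussian-kernel target response map around an annotated key point is shown below; the map size and the Gaussian sigma are assumed values chosen only for illustration.

```python
import numpy as np

def gaussian_target_map(height, width, cx, cy, sigma=2.0):
    """Target response feature map for one annotated vehicle key point:
    a 2-D Gaussian kernel centered at the annotated position (cx, cy).
    The first feature extraction layer is trained so that its predicted
    response map approaches this target (e.g. under a cross-entropy loss,
    as described above)."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

# Example: a 48x48 target map for a key point annotated at (x=12, y=30).
target = gaussian_target_map(48, 48, cx=12, cy=30)
```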
  • FIG. 4 is a schematic diagram of a network framework of the method embodiment of FIG. 2. As shown in FIG. 4, the landmark regression part of the first neural network, shown in part (a), represents the first feature extraction layer.
  • FIG. 5 is a schematic illustration of the vehicle region segmentation results of the method embodiment of FIG. 2. As shown in FIG. 5, the left side shows three vehicle images in order, and the right side shows the front, back, left, and right segmentation results of each vehicle image. It can be observed from the figure that the segmentation results of the visible regions of a vehicle generally have a higher response than those of its invisible regions, which indicates that the first feature extraction layer can not only predict the vehicle key points but also separate visible vehicle key points from invisible ones.
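  • The clustering-and-fusion step can be made concrete with the following sketch; the grouping of the 20 key-point indices into the four orientation clusters and the use of summation as the fusion operation are illustrative assumptions, not details fixed by the text.

```python
import numpy as np

# Hypothetical grouping of the 20 key-point indices into four orientation
# clusters (front / back / left / right); the actual assignment used by
# the first computing layer is not spelled out here.
CLUSTERS = {
    "front": [4, 5, 6, 7, 8],
    "back":  [13, 14, 15, 16, 17],
    "left":  [0, 1, 9, 10, 18],
    "right": [2, 3, 11, 12, 19],
}

def region_segmentation_results(keypoint_maps):
    """keypoint_maps: array of shape (20, H, W) holding the response
    feature maps of the 20 vehicle key points. Fusing (here: summing)
    the maps of each cluster yields one single-channel weight map per
    orientation, i.e. one region segmentation result per cluster."""
    return {name: keypoint_maps[idx].sum(axis=0)
            for name, idx in CLUSTERS.items()}
```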
  • In step S202, the global feature data and the plurality of region feature data of the target vehicle are extracted from the image to be identified through the second neural network for feature extraction, based on the plurality of region segmentation results.
  • the step S202 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by an extraction sub-module 6021 that is executed by the processor.
  • the second neural network may be any suitable neural network that can implement feature extraction or target object recognition, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generator network of a generative adversarial network, and the like. The optional structure of the neural network, such as the number of convolution layers, the size of the convolution kernels, and the number of channels, can be set by a person skilled in the art according to actual needs, which is not limited in the embodiments of the present application.
  • the second neural network has a first processing subnet and a plurality of second processing subnets respectively connected to the output end of the first processing subnet, where the first processing subnet has a second feature extraction layer, a first inception module, and a first pooling layer, and each second processing subnet has a second computing layer, a second inception module, and a second pooling layer connected to the output end of the first processing subnet.
  • the second feature extraction layer includes three convolution layers and two inception modules; an inception module can perform convolution operations and pooling operations.
  • the step S202 includes: performing a convolution operation and a pooling operation on the image to be recognized through the second feature extraction layer to obtain a global feature map of the target vehicle; performing a convolution operation and a pooling operation on the global feature map through the first inception module to obtain a first feature map set of the target vehicle; and performing a pooling operation on the feature maps in the first feature map set through the first pooling layer to obtain a global feature vector of the target vehicle.
  • the image to be recognized is first scaled to a size of 192*192, and the scaled image is input to the second feature extraction layer, which consists of three convolution layers and two inception modules. The second feature extraction layer performs convolution and pooling operations on the scaled image to obtain a global feature map with a spatial size of 12*12. The first inception module then performs a convolution operation and a pooling operation on the global feature map to obtain a set of feature maps with a spatial size of 6*6. Finally, the first pooling layer performs a global average pooling operation on the feature maps in the set to obtain a 1536-dimensional global feature vector.
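  • The global branch described above might look as follows in a PyTorch sketch; the channel counts, strides, and the stand-in block used in place of a real Inception module are assumptions chosen only so that the spatial sizes in the text (192*192 input, 12*12 global feature map, 6*6 feature map set, 1536-dimensional vector) work out.

```python
import torch
import torch.nn as nn

class InceptionStandIn(nn.Module):
    """Placeholder for an Inception module: the real block mixes parallel
    convolutions and pooling; a single conv is used here only to keep the
    sketch short while controlling the spatial size."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, stride=stride, padding=1),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class GlobalBranch(nn.Module):
    """Sketch of the first processing subnet: second feature extraction
    layer (three convs + two inception stand-ins) -> first inception
    module -> global average pooling to a 1536-dim global feature vector."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(                  # 192x192 -> 12x12
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                        # 96 -> 48
            nn.Conv2d(64, 192, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(192, 512, 3, stride=1, padding=1), nn.ReLU(inplace=True),
            InceptionStandIn(512, 768, stride=2),   # 24 -> 12
            InceptionStandIn(768, 1024),
        )
        self.inception1 = InceptionStandIn(1024, 1536, stride=2)  # 12 -> 6
        self.pool = nn.AdaptiveAvgPool2d(1)         # first pooling layer

    def forward(self, image):
        fmap = self.stem(image)                     # global feature map, 12x12
        fset = self.inception1(fmap)                # first feature map set, 6x6
        gvec = self.pool(fset).flatten(1)           # 1536-dim global vector
        return fmap, gvec
```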
  • the step S202 may further include: dot-multiplying the plurality of region segmentation results with the global feature map through the second computing layer to obtain local feature maps respectively corresponding to the region segmentation results; performing a convolution operation and a pooling operation on the local feature maps through the second inception module to obtain second feature map sets corresponding to the region segmentation results; and performing a pooling operation on the second feature map sets through the second pooling layer to obtain first region feature vectors corresponding to the region segmentation results.
  • before the dot-multiplication, the method further includes: scaling, through the second computing layer, the plurality of region segmentation results to the same size as the global feature map. This ensures that the dimension of the finally obtained region feature vectors is the same as that of the global feature vector.
  • the front, back, left, and right segmentation results of the vehicle are first scaled to the same size as the global feature map, i.e., 12*12. The scaled segmentation results are then dot-multiplied with the global feature map to obtain the front, back, left, and right feature maps of the vehicle. The second inception module performs a convolution operation and a pooling operation on each of these local feature maps to obtain the corresponding feature map sets, in which the spatial size of the feature maps is 6*6.
  • the second pooling layer performs a global max pooling operation on the feature maps in the feature map sets corresponding to the local feature maps, obtaining the front, back, left, and right feature vectors of the vehicle, each of dimension 1536. Global max pooling is used here because the maximum response is better suited to extracting features from a local region.
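  • A corresponding sketch of one local region branch (second processing subnet) is given below, reusing the InceptionStandIn block from the previous sketch (e.g. InceptionStandIn(1024, 1536, stride=2) to match the channel counts assumed there); the bilinear interpolation mode used for scaling the mask is an assumption.

```python
import torch.nn.functional as F

def local_branch(global_fmap, region_mask, inception2, H=12, W=12):
    """global_fmap: (N, C, 12, 12) from the trunk above; region_mask:
    (N, 1, h, w) single-channel weight map for one orientation. The mask
    is scaled to the feature-map size (second computing layer),
    dot-multiplied with the global feature map, passed through the second
    inception module, then global-max-pooled (second pooling layer) into
    a 1536-dim first region feature vector."""
    mask = F.interpolate(region_mask, size=(H, W), mode="bilinear",
                         align_corners=False)
    local_fmap = global_fmap * mask                   # element-wise product
    fset = inception2(local_fmap)                     # second feature map set, 6x6
    return F.adaptive_max_pool2d(fset, 1).flatten(1)  # first region feature vector
```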
  • the second neural network can thus be divided into two stages, extracting the global features and the local features in a trunk-and-branch form. The first stage performs convolution and pooling operations on the image to be recognized to obtain its global feature map. The second stage consists of five branches: one global branch and four local region branches. The global branch applies further processing to the global feature map to obtain the global feature vector, and each local region branch, combined with the global feature map, applies similar processing to its assigned region segmentation result to obtain the corresponding local feature vector.
  • In step S203, the global feature data of the target vehicle and the plurality of region feature data are fused through the third neural network for feature fusion to obtain the appearance feature data of the target vehicle.
  • step S203 may be performed by the processor invoking a corresponding instruction stored in the memory, or may be performed by the fusion sub-module 6031 being executed by the processor.
  • the third neural network may be any suitable neural network capable of implementing feature fusion, including but not limited to a convolutional neural network, a reinforcement learning neural network, the generator network of a generative adversarial network, and the like. The optional structure of the neural network, such as the number of convolution layers, the size of the convolution kernels, and the number of channels, can be set by a person skilled in the art according to actual needs, which is not limited in the embodiments of the present application.
  • the third neural network has a first fully connected layer, a third computing layer, and a second fully connected layer connected to the outputs of the second neural network.
  • the step S203 includes: acquiring the weight values of the plurality of first region feature vectors through the first fully connected layer; weighting the plurality of first region feature vectors respectively according to the weight values through the third computing layer to obtain a corresponding plurality of second region feature vectors; and performing a mapping operation on the plurality of second region feature vectors and the global feature vector through the second fully connected layer to obtain an appearance feature vector of the target vehicle.
  • acquiring the weight values of the plurality of first region feature vectors through the first fully connected layer includes: performing a splicing operation on the plurality of first region feature vectors to obtain a spliced first region feature vector; performing a mapping operation on the spliced first region feature vector through the first fully connected layer to obtain a scalar set corresponding to the first region feature vectors; and normalizing the scalars in the set to obtain the weight values of the first region feature vectors.
  • for example, the first fully connected layer performs a mapping operation on the four spliced feature vectors to obtain a scalar set, and the scalars in the set are normalized by a Softmax function to obtain the weight values of the front feature vector, back feature vector, left feature vector, and right feature vector. The front, back, left, and right feature vectors are weighted by their corresponding weight values, yielding the weighted front, back, left, and right feature vectors, which are then spliced together with the global feature vector. Finally, the second fully connected layer performs a mapping operation on the spliced weighted local feature vectors and global feature vector to obtain a 256-dimensional vehicle appearance feature vector, as shown in part (c) of FIG. 4.
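  • The fusion network described in this step can be sketched as follows; the feature dimensions (four 1536-dimensional region vectors, one 1536-dimensional global vector, 256-dimensional output) follow the text, while everything else is an illustrative assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionNet(nn.Module):
    """Sketch of the third neural network: the four first region feature
    vectors are spliced and mapped by the first fully connected layer to
    four scalars, which are Softmax-normalized into weights; the third
    computing layer weights each region vector; the weighted region
    vectors plus the global vector are spliced and mapped by the second
    fully connected layer to the appearance feature vector."""
    def __init__(self, dim=1536, regions=4, out_dim=256):
        super().__init__()
        self.fc1 = nn.Linear(dim * regions, regions)        # first FC layer
        self.fc2 = nn.Linear(dim * (regions + 1), out_dim)  # second FC layer

    def forward(self, region_vecs, global_vec):
        # region_vecs: list of four (N, 1536) tensors; global_vec: (N, 1536)
        cat = torch.cat(region_vecs, dim=1)                 # splicing operation
        weights = F.softmax(self.fc1(cat), dim=1)           # normalized scalars
        weighted = [v * weights[:, i:i + 1] for i, v in enumerate(region_vecs)]
        fused = torch.cat(weighted + [global_vec], dim=1)
        return self.fc2(fused)                              # 256-dim appearance vector
```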
  • the third neural network learns the weight values of the feature vectors of the different vehicle regions, because the features of different vehicle regions may have different importance. In this competition process, the features of the visible regions of the vehicle in a vehicle image are retained or given greater weight, while the features of the invisible regions of the vehicle are suppressed or given smaller weight.
  • for example, if the orientation of the vehicle in a vehicle image is the left front, the left side and the front of the vehicle are visible; the features of these two faces are relatively important, so the weight values of the corresponding feature vectors are relatively large. The back and the right side of the vehicle are invisible; although the features of these two faces can still be extracted, the weights of their feature vectors are relatively small. That is, the vehicle key points in the visible regions of the vehicle image contribute more to the final vehicle appearance feature vector, while the influence of the vehicle key points in the invisible regions is weakened by relatively small weight values. The vehicle appearance can thereby be described more accurately.
  • FIG. 6 is a schematic illustration of the weight values of the vehicle regions of the method embodiment of FIG. 2.
  • part (a) shows input images of one vehicle from three different shooting angles, together with the weight values of the vehicle front, vehicle back, vehicle left side, and vehicle right side in the image of each shooting angle;
  • part (b) shows the projection of the vehicle appearance features of selected vehicle images in the test set into a two-dimensional space;
  • part (c) shows input images of another vehicle from three different shooting angles, together with the weight values of the vehicle front, vehicle back, vehicle left side, and vehicle right side in the image of each shooting angle.
  • as the projection shows, the appearance features of the same vehicle are aggregated regardless of the shooting angle of the vehicle image. The vehicle appearance features identified in this embodiment are therefore independent of the shooting angle of the image to be identified, so that the appearance features of vehicles in different vehicle images can be compared directly, solving the problem that different regions of different vehicle images cannot be compared.
  • parts (a) and (c) of the figure show the input vehicle images and the learned weights of the corresponding key point clusters; the local region features of the vehicle appearance are fused based on these learned weights, and it can be observed from the figure that the weight of a visible face of the vehicle is higher than that of an invisible face.
  • an alternative training strategy may be adopted to train the second neural network and the third neural network. The training strategy includes four steps: (i) the first-stage trunk of the second neural network and the global branch of its second stage are trained from random initialization, supervised by the global features of the entire image region; (ii) after the training of the first-stage trunk is completed, the four local branches of the second stage are initialized with the parameters of the trained global branch (the global branch and the local branches of the second stage have the same structure) and trained, with the training of the four local branches separately supervised by given classification labels; (iii) after the first-stage trunk and the second-stage branches are trained, the third neural network is trained; (iv) a neural network is initialized with the parameters learned in the above steps, and all parameters are jointly fine-tuned. Existing vehicle databases and a Softmax classification loss can be used when training the second neural network and the third neural network. Step (ii) is sketched in code below.
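  • The following sketch illustrates step (ii) of the strategy; the module names and the optimizer settings are placeholders, not the training code of the application.

```python
import itertools
import torch

def set_trainable(module, flag):
    """Freeze or unfreeze a sub-network during the staged training."""
    for p in module.parameters():
        p.requires_grad = flag

def start_stage_two(trunk, global_branch, local_branches):
    """Step (ii) sketched: each local branch is initialized from the
    trained global branch (the text notes they share the same structure),
    the first-stage trunk is frozen, and only the local branches are
    trained, each supervised by a classification (Softmax) loss."""
    for branch in local_branches:
        branch.load_state_dict(global_branch.state_dict())
    set_trainable(trunk, False)
    params = itertools.chain(*(b.parameters() for b in local_branches))
    return torch.optim.SGD(params, lr=0.01, momentum=0.9)  # assumed settings
```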
  • the vehicle appearance features identified in this embodiment can be used not only to describe a vehicle, but also to analyze vehicle attributes such as the coarse-grained vehicle model, the fine-grained vehicle model, and the vehicle color; in addition, the identified vehicle appearance features can be used for vehicle classification, recognition, and retrieval.
  • in this embodiment, a plurality of region segmentation results of the target vehicle are acquired from the image to be recognized through the first neural network for region extraction; then, based on the plurality of region segmentation results, the global feature data and the plurality of region feature data of the target vehicle are extracted from the image to be identified through the second neural network for feature extraction; and finally the global feature data and the plurality of region feature data are fused through the third neural network for feature fusion to obtain the vehicle appearance feature data of the target vehicle. Compared with prior-art methods for obtaining vehicle appearance features, the vehicle appearance feature identified by this embodiment includes, in addition to the global feature, the features of local regions of the vehicle appearance, and can reflect detailed information of the target vehicle through the local region features, thereby describing the vehicle appearance more accurately. Moreover, the identified vehicle appearance features allow the appearance features of vehicles in different vehicle images to be compared directly, solving the problem that different regions of different vehicle images cannot be compared.
  • the vehicle appearance feature recognition method of this embodiment may be performed by any suitable device having data processing capability, including but not limited to: a terminal device, a server, and the like.
  • FIG. 7 is a flow chart of one embodiment of a vehicle retrieval method in accordance with the present application.
• Referring to FIG. 7, in step S301, the appearance feature data of the target vehicle in the image to be retrieved is acquired by the vehicle appearance feature recognition method.
• In this embodiment, the appearance feature data of the target vehicle in the image to be retrieved may be acquired by the vehicle appearance feature recognition method provided in the first embodiment or the second embodiment above.
  • the appearance feature data may be data represented by a vector.
  • the image to be retrieved may be an image including a part of the target vehicle or an image including the entire target vehicle.
  • the image to be retrieved may be a still image captured, or a video image in a sequence of video frames, or a composite image or the like.
  • the step S301 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a second acquisition module 701 executed by the processor.
• In step S302, a target candidate vehicle image that matches the appearance feature data is searched for in the candidate vehicle image library.
  • step S302 may be performed by the processor invoking a corresponding instruction stored in the memory or by the lookup module 702 being executed by the processor.
• In an optional implementation, the vehicle appearance feature recognition method provided in the first or second embodiment above may be used to acquire the appearance feature data of the vehicles in the plurality of candidate vehicle images in the candidate vehicle image library, and the appearance feature data of the target vehicle is compared with the appearance feature data of the vehicles in the candidate vehicle images, respectively, to obtain a target candidate vehicle image that matches the appearance feature data of the target vehicle.
• An exemplary embodiment of the present application provides a vehicle retrieval method that acquires the appearance feature data of the target vehicle in the image to be retrieved by the vehicle appearance feature recognition method provided in the first or second embodiment above, and searches the candidate vehicle image library for a target candidate vehicle image matching the appearance feature data, thereby improving the accuracy of vehicle retrieval.
  • the vehicle retrieval method of this embodiment may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like.
  • FIG. 8 is a flow chart of another embodiment of a vehicle retrieval method in accordance with the present application.
• Referring to FIG. 8, in step S401, the appearance feature data of the target vehicle in the image to be retrieved is acquired by the vehicle appearance feature recognition method.
  • the step S401 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a second acquisition module 804 being executed by the processor.
• Step S401 is the same as step S301 above, and details are not described herein again.
• In step S402, the cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the candidate vehicle images in the candidate vehicle image library are determined.
  • step S402 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a lookup module 805 executed by the processor.
  • a person skilled in the art can calculate the cosine distance of the appearance feature vector of the target vehicle and the appearance feature vector of the vehicle in the image of the candidate vehicle according to the existing cosine distance calculation formula.
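• As an illustration of this comparison step, the following sketch computes cosine distances between a target appearance feature vector and a library of candidate vectors; the 256-dimensional feature size follows the embodiment described earlier, and the random vectors are hypothetical placeholders for real extracted features.

```python
import numpy as np

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity; smaller values mean more
    # similar appearance feature vectors (consistent with Formula 3 below).
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.random.rand(256)            # 256-d appearance feature vector of the target vehicle
gallery = np.random.rand(1000, 256)    # hypothetical candidate vehicle image library features
distances = np.array([cosine_distance(query, g) for g in gallery])
top5 = np.argsort(distances)[:5]       # indices of the five most similar candidate images
```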
• In step S403, a target candidate vehicle image that matches the target vehicle is determined according to the cosine distances.
  • the step S403 may be performed by a processor invoking a corresponding instruction stored in the memory, or may be performed by a lookup module 805 executed by the processor.
• In an optional implementation, when the cosine distance between the appearance feature vector of the target vehicle and the appearance feature vector of the vehicle in a candidate vehicle image is greater than or equal to a first preset threshold, the candidate vehicle image can be determined to be a target candidate vehicle image that matches the target vehicle.
  • a person skilled in the art can obtain a first preset threshold by testing.
  • embodiments of the present application are not limited thereto.
• Optionally, the method further includes: acquiring the shooting time and/or shooting position of the image to be retrieved and the shooting times and/or shooting positions of the plurality of candidate vehicle images; determining, according to the shooting times and/or shooting positions, the spatio-temporal distances between the target vehicle and the vehicles in the plurality of candidate vehicle images; and determining, according to the cosine distances and the spatio-temporal distances, a target candidate vehicle image in the candidate vehicle image library that matches the target vehicle. In this way, the accuracy of vehicle retrieval can be further improved.
• Optionally, determining, according to the cosine distances and the spatio-temporal distances, the target candidate vehicle image in the candidate vehicle image library that matches the target vehicle includes: acquiring a plurality of candidate vehicle images from the candidate vehicle image library according to the cosine distances; determining, based on the shooting time and shooting position of each candidate vehicle image, a spatio-temporal matching probability between the candidate vehicle image and the target vehicle; and determining, based on the cosine distances and the spatio-temporal matching probabilities, the target candidate vehicle image that matches the target vehicle.
  • the spatio-temporal information of the vehicle image can greatly enhance the recall rate of vehicle retrieval. If the shooting time and the shooting location of the vehicle image to be retrieved are known, the probability of occurrence of the vehicle in the vehicle image at another time and at another location can be obtained by statistical modeling. This is very effective for retrieval tasks.
• In simple terms, the spatio-temporal matching probability is jointly determined by the shooting times and shooting positions of the candidate vehicle image and the target vehicle image.
  • the space-time matching probability refers to the probability that the target vehicle appears in the shooting time and shooting position of the image of the vehicle to be selected, which is obtained by statistical modeling according to the shooting time and shooting location of the vehicle image.
  • the spatiotemporal matching probability refers to a conditional probability of a vehicle transition time interval between two cameras, which can be calculated by the following formula 1.
• In practical application scenarios, vehicle appearance features may not be sufficient to distinguish one vehicle from other vehicles, particularly when vehicles have the same exterior and no personalized decoration.
• However, in surveillance applications, the shooting time and shooting position of a vehicle image are easy to obtain.
• By analyzing the vehicle transfer time intervals between two cameras, it is found that, for at least one pair of cameras, the vehicle transfer time interval can be modeled as a random variable satisfying a probability distribution. Due to the Gaussian-like and long-tailed properties of the vehicle transfer time interval, a lognormal distribution can be used to model this random variable. Given that l denotes the camera that the vehicle leaves and e denotes the camera that the vehicle enters, the conditional probability of the vehicle transfer time interval τ between l and e can be calculated by the following Formula 1 (the lognormal density):

  p(τ | l, e) = 1 / (τ·σ_{l,e}·√(2π)) · exp(−(ln τ − μ_{l,e})² / (2σ_{l,e}²))   (Formula 1)

• where μ_{l,e} and σ_{l,e} denote the estimated parameters for each camera pair (l, e), and the vehicle transfer time interval τ is the absolute difference between the shooting times of the two vehicle images. The estimated parameters can be calculated by maximizing the following log-likelihood function:

  L(μ_{l,e}, σ_{l,e}) = Σ_{n=1}^{N} ln p(τ_n | l, e)

• where τ_n ∈ τ (n = 1, 2, ..., N) denotes a vehicle transfer time interval between the two cameras of a camera pair (l, e), sampled from the training set, and τ contains the samples of vehicle transfer time intervals between the two cameras in the training set.
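• The estimation and evaluation described above can be sketched as follows; maximizing the log-likelihood of a lognormal model has the closed-form solution used below, and the interval samples are hypothetical.

```python
import numpy as np

def fit_lognormal(intervals):
    # Maximizing the log-likelihood above has the closed-form solution:
    # mu and sigma are the mean and standard deviation of ln(tau).
    logs = np.log(np.asarray(intervals, dtype=np.float64))
    return logs.mean(), logs.std()

def transfer_probability(tau, mu, sigma):
    # Formula 1: lognormal density of the transfer time interval tau
    # for one camera pair (l, e).
    return np.exp(-((np.log(tau) - mu) ** 2) / (2.0 * sigma ** 2)) / (
        tau * sigma * np.sqrt(2.0 * np.pi)
    )

# Hypothetical transfer-interval samples (seconds) for one camera pair.
samples = [95.0, 120.0, 130.0, 150.0, 210.0]
mu, sigma = fit_lognormal(samples)
p = transfer_probability(140.0, mu, sigma)  # conditional probability density at tau = 140 s
```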
• After the conditional probability of the vehicle transfer time interval between l and e is obtained, the spatio-temporal distance of the vehicles between the two vehicle images can be calculated from this conditional probability according to Formula 2 (not reproduced here), which involves the parameter α given below: the higher the conditional probability, the smaller the spatio-temporal distance between the two vehicle images.
• Finally, the similarity distance of the vehicles between the two vehicle images can be calculated according to the following Formula 3:

  D = D_a + β·D_s   (Formula 3)

• where D_a denotes the cosine distance of the vehicle appearance feature vectors between the two vehicle images, D_s denotes the spatio-temporal distance between the two vehicle images, and D denotes the similarity distance of the vehicles between the two vehicle images; the value of α is 2, and the value of β is 0.1. The smaller the similarity distance of the vehicles between two vehicle images, the more similar the vehicles in the two vehicle images.
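• The following sketch combines the two distances according to Formula 3. Since Formula 2 is not reproduced in this text, the mapping from the conditional probability to the spatio-temporal distance below is an illustrative assumption (a reciprocal form using α) that merely satisfies the stated property that a higher conditional probability yields a smaller distance.

```python
ALPHA, BETA = 2.0, 0.1  # parameter values stated in the text

def spatiotemporal_distance(p):
    # Assumed stand-in for Formula 2: any mapping that is monotonically
    # decreasing in the conditional probability p has the stated property.
    return 1.0 / (1.0 + ALPHA * p)

def similarity_distance(d_a, p):
    # Formula 3: D = D_a + beta * D_s.
    return d_a + BETA * spatiotemporal_distance(p)

d = similarity_distance(0.3, 0.02)  # smaller D means more similar vehicles
```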
• When the similarity distance between the target vehicle and the vehicle in a candidate vehicle image is less than or equal to a second preset threshold, the candidate vehicle image can be determined to be a target candidate vehicle image that matches the target vehicle.
  • a person skilled in the art can obtain a second preset threshold by testing.
  • embodiments of the present application are not limited thereto.
• FIG. 9 is a schematic diagram of the similarity distance of vehicles in implementing the method embodiment of FIG. 8.
• As shown in FIG. 9, the images in the boxes of the upper row are the top-five candidate vehicle images ranked according to the cosine distance, the leftmost image of the upper row is the image of the target vehicle, and the lower row of images is the result of re-ranking based on the spatio-temporal distances between the candidate vehicle images and the target vehicle image.
• Optionally, the conditional probability of the vehicle transfer time interval is calculated by Formula 1 from the shooting time values of the target vehicle image and the candidate vehicle image and the identifiers of the cameras that captured the target vehicle image and the candidate vehicle image. Then, the spatio-temporal distance between the target vehicle image and the candidate vehicle image is calculated from this conditional probability by Formula 2, and the similarity distance of the vehicles between the target vehicle image and the candidate vehicle image is calculated from the known cosine distance and the calculated spatio-temporal distance by Formula 3. Finally, the candidate vehicle images are re-ranked according to their similarity distances to the target vehicle image to obtain the re-ranked result.
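• Continuing the example, the re-ranking step can be sketched as follows; the distance and probability values are made up, and the spatio-temporal term reuses the assumed Formula-2 mapping from the sketch above.

```python
import numpy as np

# Cosine distances of the top-5 candidates and the conditional probability
# densities (Formula 1) of their transfer intervals; values are made up.
d_a = np.array([0.30, 0.35, 0.40, 0.42, 0.45])
p = np.array([0.020, 0.001, 0.015, 0.018, 0.002])

d = d_a + 0.1 * (1.0 / (1.0 + 2.0 * p))  # Formula 3 with the assumed Formula-2 mapping
reranked = np.argsort(d)                 # new candidate order by similarity distance
```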
• An exemplary embodiment of the present application provides a vehicle retrieval method that acquires the appearance feature data of the target vehicle in the image to be retrieved by the vehicle appearance feature recognition method provided in the first or second embodiment above, and searches the candidate vehicle image library for a target candidate vehicle image matching the appearance feature data, thereby improving the accuracy of vehicle retrieval.
  • the vehicle retrieval method of this embodiment may be performed by any suitable device having data processing capabilities, including but not limited to: a terminal device, a server, and the like.
• Alternatively, any method provided by the embodiments of the present application may be executed by a processor; for example, the processor performs any method mentioned in the embodiments of the present application by invoking corresponding instructions stored in a memory. Details are not described below again.
• A person of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the foregoing method embodiments.
• The foregoing storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
  • FIG. 10 is a schematic structural view showing an embodiment of a vehicle appearance feature recognizing apparatus according to the present application.
• The apparatus can be used to perform the flow of the vehicle appearance feature recognition method of the first embodiment.
  • the vehicle appearance feature recognition apparatus includes a first acquisition module 501, an extraction module 502, and a fusion module 503.
  • a first obtaining module 501 configured to acquire, from the image to be identified, a plurality of region segmentation results of the target vehicle
  • the extracting module 502 is configured to extract global feature data and a plurality of regional feature data from the image to be identified based on the plurality of region segmentation results;
  • the fusion module 503 is configured to fuse the global feature data and the plurality of regional feature data to obtain appearance feature data of the target vehicle.
• With the vehicle appearance feature recognition apparatus provided by this embodiment, a plurality of region segmentation results of the target vehicle are acquired from the image to be identified that includes the target vehicle; global feature data and a plurality of region feature data are then extracted from the image to be identified based on the plurality of region segmentation results; and the global feature data and the plurality of region feature data are fused to obtain the appearance feature data of the target vehicle. The vehicle appearance features identified by this embodiment include features of local regions of the vehicle appearance, which makes it possible to describe the appearance of the vehicle more accurately.
• In addition, the vehicle appearance features identified by this embodiment allow the vehicle appearance features in different vehicle images to be compared directly, solving the problem that different regions of different vehicle images cannot be compared.
  • FIG. 11 is a schematic structural view showing another embodiment of the vehicle appearance feature recognizing apparatus according to the present application.
• The apparatus can be used to perform the flow of the vehicle appearance feature recognition method of the second embodiment.
  • the vehicle appearance feature recognition apparatus includes a first acquisition module 601, an extraction module 602, and a fusion module 603.
  • the first obtaining module 601 is configured to acquire a plurality of region segmentation results of the target vehicle from the to-be-identified image
• the extracting module 602 is configured to extract global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and the fusion module 603 is configured to fuse the global feature data and the plurality of region feature data to obtain the appearance feature data of the target vehicle.
  • the plurality of region segmentation results respectively correspond to regions of different orientations of the target vehicle.
  • the plurality of region segmentation results include segmentation results of the front, back, left, and right sides of the target vehicle.
  • the first obtaining module 601 includes: an obtaining submodule 6011, configured to acquire, by using the first neural network for region extraction, a plurality of region segmentation results of the target vehicle from the to-be-identified image.
• the first neural network has a first feature extraction layer and a first computing layer connected to the end of the first feature extraction layer, and the obtaining sub-module 6011 is configured to: perform feature extraction on the image to be identified by the first feature extraction layer to obtain a plurality of key points of the target vehicle; and classify the plurality of key points by the first computing layer to obtain a plurality of key point clusters, and fuse the feature maps of the key points in each of the key point clusters to obtain region segmentation results corresponding to the plurality of key point clusters.
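• As an illustration of this cluster-and-fuse step, the sketch below groups hypothetical key point response maps into the four orientation clusters given in the embodiment (converted here to 0-based indices) and fuses each cluster into a single-channel region segmentation result; the per-pixel maximum is one simple fusion choice, since the text does not fix the fusion operator.

```python
import numpy as np

heatmaps = np.random.rand(20, 48, 48)  # hypothetical response maps for the 20 vehicle key points

# Orientation clusters from the embodiment, converted to 0-based key-point indices.
clusters = {
    "front": [4, 5, 6, 7, 8, 9, 12, 13],
    "rear":  [14, 15, 16, 17, 18, 19],
    "left":  [0, 1, 5, 7, 10, 13, 14, 16],
    "right": [2, 3, 4, 6, 11, 12, 15, 17],
}

# Fuse the response maps of each cluster into one single-channel region
# segmentation result (a weight map over the image).
region_masks = {name: heatmaps[idx].max(axis=0) for name, idx in clusters.items()}
```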
• the extracting module 602 includes: an extracting sub-module 6021, configured to extract, based on the plurality of region segmentation results, global feature data and a plurality of region feature data of the target vehicle from the image to be identified by the second neural network for feature extraction.
• the second neural network has a first processing subnet and a plurality of second processing subnets respectively connected to the output end of the first processing subnet, wherein the first processing subnet has a second feature extraction layer, a first startup module, and a first pooling layer, and each second processing subnet has a second computing layer, a second startup module, and a second pooling layer connected to the output end of the first processing subnet.
• the extracting sub-module 6021 includes: a first feature extraction unit 6022, configured to perform a convolution operation and a pooling operation on the image to be identified by the second feature extraction layer, to obtain a global feature map of the target vehicle; a second feature extraction unit 6023, configured to perform a convolution operation and a pooling operation on the global feature map by the first startup module, to obtain a first feature map set of the target vehicle; and a first pooling unit 6024, configured to perform a pooling operation on the feature maps in the first feature map set by the first pooling layer, to obtain a global feature vector of the target vehicle.
• the extracting sub-module 6021 further includes: a first calculating unit 6026, configured to perform element-wise multiplication of the plurality of region segmentation results with the global feature map by the second computing layer, to obtain local feature maps respectively corresponding to the plurality of region segmentation results; a third feature extraction unit 6027, configured to perform a convolution operation and a pooling operation on the local feature maps of the plurality of region segmentation results by the second startup module, to obtain second feature map sets corresponding to the plurality of region segmentation results; and a second pooling unit 6028, configured to perform a pooling operation on the second feature map sets of the plurality of region segmentation results by the second pooling layer, to obtain first region feature vectors corresponding to the plurality of region segmentation results.
• the extracting sub-module 6021 further includes: a second calculating unit 6025, configured to scale the plurality of region segmentation results, by the second computing layer, to the same size as the global feature map.
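• To make the mask-and-pool pipeline of units 6025 to 6028 concrete, here is a minimal sketch; the 12x12 feature-map size and the 1536-dimensional region vector follow the embodiment described earlier, while the channel count, the nearest-neighbour resize, and the omission of the second startup module between the multiplication and the pooling are illustrative simplifications.

```python
import numpy as np

global_map = np.random.rand(1536, 12, 12)  # global feature map (channel count illustrative)
mask = np.random.rand(48, 48)              # one region segmentation result (single-channel weight map)

def resize_nearest(m, size):
    # Minimal nearest-neighbour resize, standing in for the scaling done
    # by the second computing layer.
    ys = np.arange(size) * m.shape[0] // size
    xs = np.arange(size) * m.shape[1] // size
    return m[np.ix_(ys, xs)]

scaled = resize_nearest(mask, 12)                        # scale the mask to the feature-map size
local_map = global_map * scaled                          # element-wise multiplication -> local feature map
region_vector = local_map.reshape(1536, -1).max(axis=1)  # global max pooling -> 1536-d region vector
```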
  • the fusion module 603 includes: a fusion submodule 6031, configured to fuse the global feature data of the target vehicle and the plurality of regional feature data by using a third neural network for feature fusion.
• the third neural network has a first fully connected layer, a third computing layer, and a second fully connected layer connected to the output end of the second neural network, and the fusion sub-module 6031 includes: a first obtaining unit 6032, configured to obtain weight values of the first region feature vectors by the first fully connected layer; a third calculating unit 6033, configured to weight the plurality of first region feature vectors respectively according to the weight values by the third computing layer, to obtain a corresponding plurality of second region feature vectors; and a mapping unit 6034, configured to perform a mapping operation on the plurality of second region feature vectors and the global feature vector by the second fully connected layer, to obtain the appearance feature vector of the target vehicle.
• the first obtaining unit 6032 is configured to: perform a concatenation operation on the plurality of first region feature vectors to obtain a concatenated first region feature vector; perform a mapping operation on the concatenated first region feature vector by the first fully connected layer to obtain a set of scalars corresponding to the plurality of first region feature vectors; and perform a normalization operation on the plurality of scalars in the set to obtain the weight values of the plurality of first region feature vectors.
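• The weighting scheme of the first obtaining unit 6032 can be sketched as follows; the random matrices are untrained stand-ins for the first and second fully connected layers, and the dimensionalities follow the four 1536-dimensional region vectors and the 256-dimensional appearance feature vector described in the embodiment.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

region_vecs = [np.random.rand(1536) for _ in range(4)]  # first region feature vectors
global_vec = np.random.rand(1536)                       # global feature vector

# First fully connected layer (untrained stand-in): concatenated region
# vectors -> one scalar per region, then normalization -> weight values.
W1 = np.random.rand(4, 4 * 1536)
weights = softmax(W1 @ np.concatenate(region_vecs))

# Third computing layer: weight each first region feature vector.
weighted = [w * v for w, v in zip(weights, region_vecs)]

# Second fully connected layer (untrained stand-in): map the weighted region
# vectors plus the global vector to the 256-d appearance feature vector.
W2 = np.random.rand(256, 5 * 1536)
appearance = W2 @ np.concatenate(weighted + [global_vec])
```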
  • the first feature extraction layer is an hourglass network structure.
  • FIG. 12 is a schematic structural view showing an embodiment of a vehicle retrieval device according to the present application.
• The device can be used to perform the flow of the vehicle retrieval method of the third embodiment.
  • the vehicle retrieval device includes a second acquisition module 701 and a lookup module 702.
  • a second obtaining module 701 configured to acquire appearance characteristic data of a target vehicle in an image to be retrieved by using the apparatus according to the fifth embodiment or the sixth embodiment;
  • the searching module 702 is configured to search for a target candidate vehicle image that matches the appearance feature data from the to-be-selected vehicle image library.
• An exemplary embodiment of the present application provides a vehicle retrieval device that acquires the appearance feature data of the target vehicle in the image to be retrieved by the vehicle appearance feature recognition apparatus provided in the fifth or sixth embodiment above, and searches the candidate vehicle image library for a target candidate vehicle image matching the appearance feature data, thereby improving the accuracy of vehicle retrieval.
  • FIG. 13 is a block diagram showing another embodiment of a vehicle retrieval device according to the present application.
• The device can be used to perform the flow of the vehicle retrieval method of the fourth embodiment.
  • the vehicle retrieval device includes a second acquisition module 804 and a lookup module 805.
  • the second obtaining module 804 is configured to obtain appearance feature data of the target vehicle in the image to be retrieved by using the device according to the fifth embodiment or the sixth embodiment;
• the searching module 805 is configured to search the candidate vehicle image library for a target candidate vehicle image that matches the appearance feature data.
• the searching module 805 is configured to: determine the cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the candidate vehicle images in the candidate vehicle image library; and determine, according to the cosine distances, a target candidate vehicle image that matches the target vehicle.
• the apparatus of this embodiment further includes: a third obtaining module 801, configured to acquire the shooting time and/or shooting position of the image to be retrieved and the shooting times and/or shooting positions of the plurality of candidate vehicle images; a first determining module 802, configured to determine, according to the shooting times and/or shooting positions, the spatio-temporal distances between the target vehicle and the vehicles in the plurality of candidate vehicle images; and a second determining module 803, configured to determine, according to the cosine distances and the spatio-temporal distances, a target candidate vehicle image in the candidate vehicle image library that matches the target vehicle.
• the second determining module 803 is configured to: acquire a plurality of candidate vehicle images from the candidate vehicle image library according to the cosine distances; determine, based on the shooting time and shooting position of each candidate vehicle image, a spatio-temporal matching probability between the candidate vehicle image and the target vehicle; and determine, based on the cosine distances and the spatio-temporal matching probabilities, the target candidate vehicle image that matches the target vehicle.
  • the embodiment of the present application further provides an electronic device, such as a mobile terminal, a personal computer (PC), a tablet computer, a server, and the like.
• Referring to FIG. 14, which shows a schematic structural diagram of an embodiment of an electronic device 900 suitable for implementing a terminal device or a server of the embodiments of the present application, the electronic device 900 includes one or more first processors, a first communication element, and the like. The one or more first processors are, for example, one or more central processing units (CPUs) 901 and/or one or more graphics processors (GPUs) 913, and the first processor may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 902 or executable instructions loaded from a storage portion 908 into a random access memory (RAM) 903.
  • the read only memory 902 and the random access memory 903 are collectively referred to as a first memory.
  • the first communication component includes a communication component 912 and/or a communication interface 909.
• the communication component 912 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 909 includes a communication interface of a network interface card such as a LAN card or a modem, and the communication interface 909 performs communication processing via a network such as the Internet.
• the first processor can communicate with the read-only memory 902 and/or the random access memory 903 to execute executable instructions, is connected to the communication component 912 via a first communication bus 904, and communicates with other target devices via the communication component 912, thereby completing the operations corresponding to any vehicle appearance feature recognition method provided by the embodiments of the present application, for example: acquiring a plurality of region segmentation results of the target vehicle from the image to be identified; extracting global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and fusing the global feature data and the plurality of region feature data to obtain the appearance feature data of the target vehicle.
• In addition, the RAM 903 may also store various programs and data required for the operation of the device.
  • the CPU 901 or the GPU 913, the ROM 902, and the RAM 903 are connected to each other through the first communication bus 904.
• When the RAM 903 is present, the ROM 902 is an optional module. The RAM 903 stores executable instructions, or executable instructions are written into the ROM 902 at runtime, and the executable instructions cause the first processor to perform the operations corresponding to the above communication method.
  • An input/output (I/O) interface 905 is also coupled to the first communication bus 904.
• the communication component 912 may be integrated, or may be configured to have a plurality of sub-modules (for example, a plurality of IB network cards) linked on the communication bus.
• the following components are connected to the I/O interface 905: an input portion 906 including a keyboard, a mouse, and the like; an output portion 907 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 908 including a hard disk and the like; and a communication interface 909 including a network interface card such as a LAN card or a modem.
  • Driver 910 is also connected to I/O interface 905 as needed.
  • a removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is mounted on the drive 910 as needed so that a computer program read therefrom is installed into the storage portion 908 as needed.
• It should be noted that the architecture shown in FIG. 14 is only an optional implementation, and in practice the number and types of the components in FIG. 14 may be selected, reduced, increased, or replaced according to actual needs. In the arrangement of different functional components, separate or integrated implementations may also be used; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated on the CPU, and the communication element may be arranged separately or integrated on the CPU or the GPU. All of these alternative implementations fall within the scope of protection of the present application.
• In particular, according to the embodiments of the present application, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present application include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: acquiring a plurality of region segmentation results of the target vehicle from the image to be identified; extracting global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and fusing the global feature data and the plurality of region feature data to obtain the appearance feature data of the target vehicle.
  • the computer program can be downloaded and installed from the network via a communication component, and/or installed from the removable media 911.
  • the above-described functions defined in the method of the embodiments of the present application are executed when the computer program is executed by the first processor.
• The embodiments of the present application further provide another electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server.
• FIG. 15 shows a schematic structural diagram of another embodiment of an electronic device 1000 suitable for implementing a terminal device or a server of the embodiments of the present application. As shown in FIG. 15, the electronic device 1000 includes one or more second processors, a second communication element, and the like. The one or more second processors are, for example, one or more central processing units (CPUs) 1001 and/or one or more graphics processors (GPUs) 1013, and the second processor may perform various appropriate actions and processes according to executable instructions stored in a read-only memory (ROM) 1002 or executable instructions loaded from a storage portion 1008 into a random access memory (RAM) 1003.
• In this embodiment, the read-only memory 1002 and the random access memory 1003 are collectively referred to as a second memory.
  • the second communication component includes a communication component 1012 and/or a communication interface 1009.
• the communication component 1012 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 1009 includes a communication interface of a network interface card such as a LAN card or a modem, and the communication interface 1009 performs communication processing via a network such as the Internet.
• the second processor can communicate with the read-only memory 1002 and/or the random access memory 1003 to execute executable instructions, is connected to the communication component 1012 via a second communication bus 1004, and communicates with other target devices via the communication component 1012, thereby completing the operations corresponding to any vehicle retrieval method provided by the embodiments of the present application, for example: acquiring the appearance feature data of the target vehicle in the image to be retrieved by the method according to the first or second embodiment above; and searching the candidate vehicle image library for a target candidate vehicle image that matches the appearance feature data.
• In addition, the RAM 1003 may also store various programs and data required for the operation of the device.
  • the CPU 1001 or the GPU 1013, the ROM 1002, and the RAM 1003 are connected to each other through the second communication bus 1004.
• When the RAM 1003 is present, the ROM 1002 is an optional module. The RAM 1003 stores executable instructions, or executable instructions are written into the ROM 1002 at runtime, and the executable instructions cause the second processor to perform the operations corresponding to the above communication method.
  • An input/output (I/O) interface 1005 is also coupled to the second communication bus 1004.
• the communication component 1012 may be integrated, or may be configured to have a plurality of sub-modules (for example, a plurality of IB network cards) linked on the communication bus.
• the following components are connected to the I/O interface 1005: an input portion 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 1008 including a hard disk and the like; and a communication interface 1009 including a network interface card such as a LAN card or a modem.
  • Driver 1010 is also coupled to I/O interface 1005 as needed.
  • a removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory or the like is mounted on the drive 1010 as needed so that a computer program read therefrom is installed into the storage portion 1008 as needed.
• It should be noted that the architecture shown in FIG. 15 is only an optional implementation, and in practice the number and types of the components in FIG. 15 may be selected, reduced, increased, or replaced according to actual needs. In the arrangement of different functional components, separate or integrated implementations may also be used; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated on the CPU, and the communication element may be arranged separately or integrated on the CPU or the GPU. All of these alternative implementations fall within the scope of protection of the present application.
• In particular, according to the embodiments of the present application, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present application include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the method steps provided by the embodiments of the present application, for example: acquiring the appearance feature data of the target vehicle in the image to be retrieved by the method according to the first or second embodiment above; and searching the candidate vehicle image library for a target candidate vehicle image that matches the appearance feature data.
  • the computer program can be downloaded and installed from the network via a communication component, and/or installed from the removable medium 1011.
  • the above-described functions defined in the method of the embodiments of the present application are executed when the computer program is executed by the second processor.
  • the methods and apparatus of the present application may be implemented in a number of ways.
  • the methods and apparatus of the present application can be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-described sequence of steps for the method is for illustrative purposes only, and the steps of the method of the present application are not limited to the order specifically described above unless otherwise specifically stated.
  • the present application can also be implemented as a program recorded in a recording medium, the programs including machine readable instructions for implementing the method according to the present application.
  • the present application also covers a recording medium storing a program for executing the method according to the present application.

Abstract

The embodiments of the present application provide a vehicle appearance feature recognition method and apparatus, a vehicle retrieval method and apparatus, a storage medium, and an electronic device. The vehicle appearance feature recognition method includes: acquiring a plurality of region segmentation results of a target vehicle from an image to be identified; extracting global feature data and a plurality of region feature data from the image to be identified based on the plurality of region segmentation results; and fusing the global feature data and the plurality of region feature data to obtain appearance feature data of the target vehicle. The embodiments of the present application enable a more accurate description of the appearance of a vehicle. In addition, the vehicle appearance features identified by the embodiments allow the vehicle appearance features in different vehicle images to be compared directly, solving the problem that different regions of different vehicle images cannot be compared.

Description

车辆外观特征识别及车辆检索方法、装置、存储介质、电子设备
本申请要求在2017年6月28日提交中国专利局、申请号为CN201710507778.5、发明名称为“车辆外观特征识别及车辆检索方法、装置、存储介质、电子设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及人工智能技术,尤其涉及一种车辆外观特征识别方法、装置、存储介质和电子设备,以及,一种车辆检索方法、装置、存储介质和电子设备。
背景技术
车辆的检索任务指的是给出一张待查询的车辆图片,在大规模的车辆图片数据库中检索车辆图片中车辆的所有图片。
发明内容
本申请实施例的目的在于,提供一种车辆外观特征识别的技术方案和车辆检索的技术方案。
根据本申请实施例的第一方面,提供了一种车辆外观特征识别方法,包括:从待识别图像中,获取目标车辆的多个区域分割结果;基于所述多个区域分割结果,从所述待识别图像中提取全局特征数据和多个区域特征数据;对所述全局特征数据和多个所述区域特征数据进行融合,得到所述目标车辆的外观特征数据。
可选地,所述多个区域分割结果分别对应所述目标车辆的不同方位的区域。
可选地,所述多个区域分割结果包括所述目标车辆的前面、后面、左面和右面的分割结果。
可选地,所述从待识别图像中,获取目标车辆的多个区域分割结果,包括:通过用于区域提取的第一神经网络从待识别图像中,获取所述目标车辆的多个区域分割结果。
可选地,所述第一神经网络具有第一特征提取层和连接在所述第一特征提取层末端的第一计算层,其中,通过用于区域提取的第一神经网络从待识别图像中,获取所述目标车辆的多个区域分割结果,包括:通过所述第一特征提取层对所述待识别图像进行特征提取,得到所述目标车辆的多个关键点;通过所述第一计算层对所述多个关键点进行分类,得到多个关键点集群,并且分别针对多个所述关键点集群中的关键点的特征图进行融合,获得多个所述关键点集群对应的区域分割结果。
可选地,所述基于所述多个区域分割结果,从所述待识别图像中提取全局特征数据和多个区域特征数据,包括:基于所述多个区域分割结果,通过用于特征提取的第二神经网络从所述待识别图像中提取所述目标车辆的全局特征数据和多个区域特征数据。
可选地,所述第二神经网络具有第一处理子网和分别与所述第一处理子网的输出端连接的多个第二处理子网,其中,所述第一处理子网具有第二特征提取层、第一启动模块和第一池化层,所述第二处理子网具有与所述第一处理子网的输出端连接的第二计算层、第二启动模块和第二池化层。
可选地,所述基于所述多个区域分割结果,通过用于特征提取的第二神经网络从所述待识别图像中提取所述目标车辆的全局特征数据和多个区域特征数据,包括:通过所述第二特征提取层对所述待识别图像进行卷积操作和池化操作,获得所述目标车辆的全局特征图;通过所述第一启动模块对所述全局特征图进行卷积操作和池化操作,获得所述目标车辆的第一特征图集合;通过所述第一池化层对所述第一特征图集合中的特征图进行池化操作,获得所述目标车辆的全局特征向量。
可选地,所述基于所述多个区域分割结果,通过用于特征提取的第二神经网络从所述待识别图像中提 取所述目标车辆的全局特征数据和多个区域特征数据,还包括:通过所述第二计算层将所述多个区域分割结果分别与所述全局特征图进行点乘,获得所述多个区域分割结果分别对应的局部特征图;通过所述第二启动模块对多个所述区域分割结果的局部特征图进行卷积操作和池化操作,获得多个所述区域分割结果对应的第二特征图集合;通过所述第二池化层对多个所述区域分割结果的第二特征图集合进行池化操作,获得多个所述区域分割结果对应的第一区域特征向量。
可选地,所述通过所述第二计算层将所述多个区域分割结果分别与所述全局特征图进行点乘之前,所述方法还包括:通过所述第二计算层将所述多个区域分割结果分别缩放到与所述全局特征图的尺寸相同的尺寸。
可选地,所述对所述全局特征数据和多个所述区域特征数据进行融合,包括:通过用于特征融合的第三神经网络对所述目标车辆的全局特征数据和多个所述区域特征数据进行融合。
可选地,所述第三神经网络具有与所述第二神经网络的输出端连接的第一全连接层、第三计算层和第二全连接层,其中,所述通过用于特征融合的第三神经网络对所述目标车辆的全局特征数据和多个所述区域特征数据进行融合,包括:通过所述第一全连接层获取第一区域特征向量的权重值;通过所述第三计算层根据所述权重值,对多个所述第一区域特征向量分别加权,获得相应的多个第二区域特征向量;通过所述第二全连接层对多个所述第二区域特征向量和全局特征向量进行映射操作,获得所述目标车辆的外观特征向量。
可选地,所述通过所述第一全连接层获取第一区域特征向量的权重值,包括:对多个所述第一区域特征向量进行拼接操作,获得拼接后的第一区域特征向量;通过所述第一全连接层对所述拼接后的第一区域特征向量进行映射操作,获得多个所述第一区域特征向量对应的标量的集合;对所述集合中的标量进行归一化操作,获得多个所述第一区域特征向量的权重值。
可选地,所述第一特征提取层为沙漏型网络结构。
根据本申请实施例的第二方面,提供了一种车辆检索方法。所述方法包括:通过根据本申请实施例的第一方面所述的方法获取待检索图像中目标车辆的外观特征数据;从待选车辆图像库中查找与所述外观特征数据匹配的目标待选车辆图像。
可选地,所述从待选车辆图像库中查找与所述外观特征数据匹配的目标待选车辆图像,包括:确定所述目标车辆的外观特征向量分别与所述待选车辆图像库中多个待选车辆图像的车辆的外观特征向量的余弦距离;根据所述余弦距离确定与所述目标车辆匹配的目标待选车辆图像。
可选地,所述方法还包括:获取所述待检索图像的拍摄时间和/或拍摄位置以及所述多个所述待选车辆图像的拍摄时间和/或拍摄位置;根据所述拍摄时间和/或所述拍摄位置确定所述目标车辆与所述多个所述待选车辆图像中的车辆的时空距离;根据所述余弦距离和所述时空距离确定所述待选车辆图像库中与所述目标车辆匹配的目标待选车辆图像。
可选地,所述根据所述余弦距离和所述时空距离确定所述待选车辆图像库中与所述目标车辆匹配的目标待选车辆图像,包括:根据所述余弦距离,在所述待选车辆图像库中获取多个所述待选车辆图像;分别基于所述待选车辆图像的拍摄时间及拍摄位置,确定所述待选车辆图像与所述目标车辆的时空匹配概率;基于所述余弦距离和所述时空匹配概率,确定与所述目标车辆匹配的目标待选车辆图像。
根据本申请实施例的第三方面,提供了一种车辆外观特征识别装置。所述装置包括:第一获取模块,用于从待识别图像中,获取目标车辆的多个区域分割结果;提取模块,用于基于所述多个区域分割结果,从所述待识别图像中提取全局特征数据和多个区域特征数据;融合模块,用于对所述全局特征数据和多个所述区域特征数据进行融合,得到所述目标车辆的外观特征数据。
可选地,所述多个区域分割结果分别对应所述目标车辆的不同方位的区域。
可选地,所述多个区域分割结果包括所述目标车辆的前面、后面、左面和右面的分割结果。
可选地,所述第一获取模块,包括:获取子模块,用于通过用于区域提取的第一神经网络从待识别图 像中,获取所述目标车辆的多个区域分割结果。
可选地,所述第一神经网络具有第一特征提取层和连接在所述第一特征提取层末端的第一计算层,其中,所述获取子模块,用于:通过所述第一特征提取层对所述待识别图像进行特征提取,得到所述目标车辆的多个关键点;通过所述第一计算层对所述多个关键点进行分类,得到多个关键点集群,并且分别针对多个所述关键点集群中的关键点的特征图进行融合,获得多个所述关键点集群对应的区域分割结果。
可选地,所述提取模块,包括:提取子模块,用于基于所述多个区域分割结果,通过用于特征提取的第二神经网络从所述待识别图像中提取所述目标车辆的全局特征数据和多个区域特征数据。
可选地,所述第二神经网络具有第一处理子网和分别与所述第一处理子网的输出端连接的多个第二处理子网,其中,所述第一处理子网具有第二特征提取层、第一启动模块和第一池化层,所述第二处理子网具有与所述第一处理子网的输出端连接的第二计算层、第二启动模块和第二池化层。
可选地,所述提取子模块,包括:第一特征提取单元,用于通过所述第二特征提取层对所述待识别图像进行卷积操作和池化操作,获得所述目标车辆的全局特征图;第二特征提取单元,用于通过所述第一启动模块对所述全局特征图进行卷积操作和池化操作,获得所述目标车辆的第一特征图集合;第一池化单元,用于通过所述第一池化层对所述第一特征图集合中的特征图进行池化操作,获得所述目标车辆的全局特征向量。
可选地,所述提取子模块,还包括:第一计算单元,用于通过所述第二计算层将所述多个区域分割结果分别与所述全局特征图进行点乘,获得所述多个区域分割结果分别对应的局部特征图;第三特征提取单元,用于通过所述第二启动模块对多个所述区域分割结果的局部特征图进行卷积操作和池化操作,获得多个所述区域分割结果对应的第二特征图集合;第二池化单元,用于通过所述第二池化层对多个所述区域分割结果的第二特征图集合进行池化操作,获得多个所述区域分割结果对应的第一区域特征向量。
可选地,所述提取子模块,还包括:第二计算单元,用于通过所述第二计算层将所述多个区域分割结果分别缩放到与所述全局特征图的尺寸相同的尺寸。
可选地,所述融合模块,包括:融合子模块,用于通过用于特征融合的第三神经网络对所述目标车辆的全局特征数据和多个所述区域特征数据进行融合。
可选地,所述第三神经网络具有与所述第二神经网络的输出端连接的第一全连接层、第三计算层和第二全连接层,其中,所述融合子模块,包括:第一获取单元,用于通过所述第一全连接层获取第一区域特征向量的权重值;第三计算单元,用于通过所述第三计算层根据所述权重值,对多个所述第一区域特征向量分别加权,获得相应的多个第二区域特征向量;映射单元,用于通过所述第二全连接层对多个所述第二区域特征向量和全局特征向量进行映射操作,获得所述目标车辆的外观特征向量。
可选地,所述第一获取单元,用于:对多个所述第一区域特征向量进行拼接操作,获得拼接后的第一区域特征向量;通过所述第一全连接层对所述拼接后的第一区域特征向量进行映射操作,获得多个所述第一区域特征向量对应的标量的集合;对所述集合中的多个标量进行归一化操作,获得多个所述第一区域特征向量的权重值。
可选地,所述第一特征提取层为沙漏型网络结构。
根据本申请实施例的第四方面,提供了一种车辆检索装置。所述装置包括:第二获取模块,用于通过根据本申请实施例第三方面所述的装置获取待检索图像中目标车辆的外观特征数据;查找模块,用于从待选车辆图像库中查找与所述外观特征数据匹配的目标待选车辆图像。
可选地,所述查找模块,用于:确定所述目标车辆的外观特征向量分别与所述待选车辆图像库中的待选车辆图像的车辆的外观特征向量的余弦距离;根据所述余弦距离确定与所述目标车辆匹配的目标待选车辆图像。
可选地,所述装置还包括:第三获取模块,用于获取所述待检索图像的拍摄时间和/或拍摄位置以及所述多个所述待选车辆图像的拍摄时间和/或拍摄位置;第一确定模块,用于根据所述拍摄时间和/或所述拍摄 位置确定所述目标车辆与所述多个所述待选车辆图像中的车辆的时空距离;第二确定模块,用于根据所述余弦距离和所述时空距离确定所述待选车辆图像库中与所述目标车辆匹配的目标待选车辆图像。
可选地,所述第二确定模块,用于:根据所述余弦距离,在所述待选车辆图像库中获取多个所述待选车辆图像;分别基于所述待选车辆图像的拍摄时间及拍摄位置,确定所述待选车辆图像与所述目标车辆的时空匹配概率;基于所述余弦距离和所述时空匹配概率,确定与所述目标车辆匹配的目标待选车辆图像。
根据本申请实施例的第五方面,提供了一种计算机可读存储介质,其上存储有计算机程序指令,其中,所述程序指令被处理器执行时实现本申请实施例的第一方面所述的车辆外观特征识别方法的步骤。
根据本申请实施例的第六方面,提供了一种计算机可读存储介质,其上存储有计算机程序指令,其中,所述程序指令被处理器执行时实现本申请实施例的第二方面所述的车辆检索方法的步骤。
根据本申请实施例的第七方面,提供了一种电子设备,包括:第一处理器、第一存储器、第一通信元件和第一通信总线,所述第一处理器、所述第一存储器和所述第一通信元件通过所述第一通信总线完成相互间的通信;所述第一存储器用于存放至少一可执行指令,所述可执行指令使所述第一处理器执行如本申请实施例的第一方面所述的车辆外观特征识别方法的步骤。
根据本申请实施例的第八方面,提供了一种电子设备,包括:第二处理器、第二存储器、第二通信元件和第二通信总线,所述第二处理器、所述第二存储器和所述第二通信元件通过所述第二通信总线完成相互间的通信;所述第二存储器用于存放至少一可执行指令,所述可执行指令使所述第二处理器执行如本申请实施例的第二方面所述的车辆检索方法的步骤。
根据本申请实施例的车辆外观特征识别方法,从待识别图像中,获取目标车辆的多个区域分割结果;然后基于所述多个区域分割结果,从所述待识别图像中提取全局特征数据和多个区域特征数据;并对所述全局特征数据和多个所述区域特征数据进行融合,得到所述目标车辆的外观特征数据,本申请实施例的车辆外观特征识别方法与现有技术中获取车辆外观特征的方法相比,本申请实施例识别得到的车辆外观特征除了全局特征,还包括车辆外观的局部区域的特征,能够通过局部区域特征体现目标车辆的细节信息,从而能够更加准确地描述车辆的外观。此外,本申请实施例识别得到的车辆外观特征能够使得不同车辆图像中的车辆外观特征可直接进行比对,解决了不同车辆图像之间的不同区域无法比对的问题。
下面通过附图和实施例,对本申请的技术方案做进一步的详细描述。
附图说明
构成说明书的一部分的附图描述了本申请的实施例,并且连同描述一起用于解释本申请的原理。
参照附图,根据下面的详细描述,可以更加清楚地理解本申请,其中:
图1是根据本申请车辆外观特征识别方法一个实施例的流程图。
图2是根据本申请车辆外观特征识别方法另一个实施例的流程图。
图3是实施图2的方法实施例的车辆关键点的分布的示意图。
图4是实施图2的方法实施例的网络框架的示意图。
图5是实施图2的方法实施例的车辆区域分割结果的示意图。
图6是实施图2的方法实施例的车辆区域的权重值的示意图。
图7是根据本申请车辆检索方法一个实施例的流程图。
图8是根据本申请车辆检索方法另一个实施例的流程图。
图9是实施图8的方法实施例的车辆的相似距离的示意图。
图10是根据本申请车辆外观特征识别装置一个实施例的结构示意图。
图11是根据本申请车辆外观特征识别装置另一个实施例的结构示意图。
图12是根据本申请车辆检索装置一个实施例的结构示意图。
图13是根据本申请车辆检索装置另一个实施例的结构示意图。
图14是适于用来实现本申请实施例的终端设备或服务器的电子设备一个实施例的结构示意图。
图15是适于用来实现本申请实施例的终端设备或服务器的电子设备另一个实施例的结构示意图。
具体实施方式
现在将参照附图来详细描述本申请的各种示例性实施例。应注意到:除非另外具体说明,否则在这些实施例中阐述的部件和步骤的相对布置、数字表达式和数值不限制本申请的范围。
同时,应当明白,为了便于描述,附图中所示出的各个部分的尺寸并不是按照实际的比例关系绘制的。
以下对至少一个示例性实施例的描述实际上仅仅是说明性的,决不作为对本申请及其应用或使用的任何限制。
对于相关领域普通技术人员已知的技术、方法和设备可能不作详细讨论,但在适当情况下,所述技术、方法和设备应当被视为说明书的一部分。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步讨论。
本申请实施例可以应用于终端设备、计算机系统、服务器等电子设备,其可与众多其它通用或专用计算系统环境或配置一起操作。适于与终端设备、计算机系统、服务器等电子设备一起使用的众所周知的终端设备、计算系统、环境和/或配置的例子包括但不限于:个人计算机系统、服务器计算机系统、瘦客户机、厚客户机、手持或膝上设备、基于微处理器的系统、机顶盒、可编程消费电子产品、网络个人电脑、小型计算机系统﹑大型计算机系统和包括上述任何系统的分布式云计算技术环境,等等。
终端设备、计算机系统、服务器等电子设备可以在由计算机系统执行的计算机系统可执行指令(诸如程序模块)的一般语境下描述。通常,程序模块可以包括例程、程序、目标程序、组件、逻辑、数据结构等等,它们执行特定的任务或者实现特定的抽象数据类型。计算机系统/服务器可以在分布式云计算环境中实施,分布式云计算环境中,任务是由通过通信网络链接的远程处理设备执行的。在分布式云计算环境中,程序模块可以位于包括存储设备的本地或远程计算系统存储介质上。
图1是根据本申请车辆外观特征识别方法一个实施例的流程图。
参照图1,步骤S101,从待识别图像中,获取目标车辆的多个区域分割结果。
在本实施例中,从图像包含的内容来讲,待识别图像可为包括目标车辆的一部分的图像或包括整个目标车辆的图像等。从图像的类别来讲,待识别图像可为拍摄的静态图像,或者为视频帧序列中的视频图像,也可以是合成的图像等。多个区域分割结果分别对应目标车辆的不同方位的区域。可选地,多个区域分割结果可以包括但不限于目标车辆的前面、后面、左面和右面的分割结果。当然,在本申请实施例中,多个区域分割结果不限制于包括目标车辆的前面、后面、左面和右面这四个区域的分割结果。例如,多个区域分割结果还可包括目标车辆的前面、后面、左面、右面、上面和下面这六个区域的分割结果,多个区域分割结果还可包括目标车辆的正前、正后、正左、正右、左前、右前、左后和右后这八个区域的分割结果。其中,区域分割结果是一张单通道的权重图,区域分割结果中数值的大小表示待识别图像中该对应位置的重要程度,即区域分割结果中数值越大表示待识别图像中该对应位置的重要程度越高,区域分割结果中数值越小表示待识别图像中该对应位置的重要程度越低。
在一个可选示例中,该步骤S101可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的第一获取模块501执行。
步骤S102,基于多个区域分割结果,从待识别图像中提取全局特征数据和多个区域特征数据。
其中,全局特征数据和多个区域特征数据为目标车辆的全局特征数据和多个区域特征数据,全局特征数据可为向量表示的全局特征,区域特征数据可为向量表示的区域特征。
在一个可选示例中,该步骤S102可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的提取模块502执行。
步骤S103,对全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据。
其中,在全局特征数据和区域特征数据均使用向量表示的情况下,全局特征向量的维数与区域特征向量的维数相同。目标车辆的外观特征数据包括目标车辆的多个局部区域的特征和目标车辆的全局区域的特征。
在一个可选示例中,该步骤S103可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的融合模块503执行。
根据本实施例的车辆外观特征识别方法,从待识别图像中,获取目标车辆的多个区域分割结果;然后基于多个区域分割结果,从待识别图像中提取全局特征数据和多个区域特征数据;并对全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据,本实施例的车辆外观特征识别方法与现有技术中获取车辆外观特征的方法相比,本实施例识别得到的车辆外观特征除了全局特征,还包括车辆外观的局部区域的特征,能够通过局部区域特征体现目标车辆的细节信息,从而能够更加准确地描述车辆的外观。此外,本实施例识别得到的车辆外观特征能够使得不同车辆图像中的车辆外观特征可直接进行比对,解决了不同车辆图像之间的不同区域无法比对的问题。
本实施例的车辆外观特征识别方法可以由任意适当的具有数据处理能力的设备执行,包括但不限于:终端设备和服务器等。
图2是根据本申请车辆外观特征识别方法另一个实施例的流程图。
参照图2,步骤S201,通过用于区域提取的第一神经网络从待识别图像中,获取目标车辆的多个区域分割结果。
在一个可选示例中,该步骤S201可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的获取子模块6011执行。
其中,第一神经网络可以是任意适当的可实现区域提取或目标对象识别的神经网络,可以包括但不限于卷积神经网络、增强学习神经网络、对抗神经网络中的生成网络等等。神经网络中结构的设置可以由本领域技术人员根据实际需求适当设定,如卷积层的层数、卷积核的大小、通道数等等,本申请实施例对此不作限制。本申请实施例中,第一神经网络具有第一特征提取层和连接在第一特征提取层末端的第一计算层。
可选地,该步骤S201包括:通过第一特征提取层对待识别图像进行特征提取,得到目标车辆的多个关键点;通过第一计算层对多个关键点进行分类,得到多个关键点集群,并且分别针对多个关键点集群中关键点的特征图进行融合,获得多个关键点集群对应的区域分割结果。
由于车辆是纯色的,并且一些车辆的色谱相当类似,因此,从车辆的颜色上很难将车辆区分开。本实施例基于车辆关键点提取车辆的区域特征。通过这种方式,车辆的很多细节特征便能够更好地从区域特征中体现出来。本实施例中的车辆关键点不是车辆的边界点或拐角点,而是车辆上具有明显区别的位置或者车辆的主要零部件,例如,车轮、灯具、标识、后视镜、车牌等等。图3是实施图2的方法实施例的车辆关键点的分布的示意图。如图3所示,本实施例中的车辆关键点包括车辆的左前轮1、左后轮2、右前轮3、右后轮4、右防雾灯5、左防雾灯6、右前头灯7、左前头灯8、正面汽车标识9、正面牌照10、左后视镜11、右后视镜12、车顶的右前角13、车顶的左前角14、车顶的左后角15、车顶的右后角16、左车尾灯17、右车尾灯18、背面汽车标识19和背面牌照20。籍此,车辆的细节特征能够从区域特征中反映出来,从而能够更加准确地描述车辆的外观。
在可选的实施方式中,第一特征提取层针对输入的车辆图像中20个车辆关键点中的车辆关键 点进行特征提取,得到多个车辆关键点的响应特征图。其中,第一特征提取层可以为沙漏型网络结构。在执行该步骤之前,需要对该第一特征提取层进行训练。第一特征提取层的训练过程可以为:将标记的车辆关键点的目标响应特征图设计为在标记的关键点位置周围的高斯内核,然后,将含有标记的车辆关键点的车辆图像输入第一特征提取层,确定第一特征提取层的预测结果是否接近目标高斯内核,如果不接近目标高斯内核,根据预测结果与目标高斯内核的差异调整第一特征提取层的参数,并进行反复迭代训练。其中,第一特征提取层针对标记的车辆关键点的预测结果为标记的车辆关键点的响应特征图对应的高斯内核,预测结果与目标高斯内核的差异可为交叉熵。图4是实施图2的方法实施例的网络框架的示意图。如图4所示,第一神经网络(a)中的标志回归器就是第一特征提取层的表现形式。
在本实施例中,特定角度拍摄的车辆图像中的车辆总会存在一些不可见的区域。为了处理存在不可见的车辆关键点的问题,可充分利用车辆关键点之间的几何关系,将20个车辆关键点分配为4个集群,例如,C1=[5,6,7,8,9,10,13,14];C2=[15,16,17,18,19,20];C3=[1,2,6,8,11,14,15,17]以及C4=[3,4,5,7,12,13,16,18],4个集群中的车辆关键点分别对应于车辆的正面、背面、左面和右面,然后,对多个集群中的关键点的特征图进行融合,获得车辆的正面分割结果、背面分割结果、左面分割结果和右面分割结果,如图4的(a)部分所示。
图5是实施图2的方法实施例的车辆区域分割结果的示意图。如图5所示,左侧依次是三张车辆图像,右侧依次是每张车辆图像的正面分割结果、背面分割结果、左面分割结果和右面分割结果,从图中可观察到车辆图像中车辆可见区域的分割结果一般比车辆不可见区域的分割结果具有更高的响应,这可以说明第一特征提取层不仅能够预测车辆关键点,而且还能够把可见的车辆关键点从不可见的车辆关键点中区分开来。
步骤S202,基于多个区域分割结果,通过用于特征提取的第二神经网络从待识别图像中提取目标车辆的全局特征数据和多个区域特征数据。
在一个可选示例中,该步骤S202可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的提取子模块6021执行。
其中,第二神经网络可以是任意适当的可实现特征提取或目标对象识别的神经网络,包括但不限于卷积神经网络、增强学习神经网络、对抗神经网络中的生成网络等等。神经网络中可选结构的设置可以由本领域技术人员根据实际需求适当设定,如卷积层的层数、卷积核的大小、通道数等等,本申请实施例对此不作限制。在本实施例中,第二神经网络具有第一处理子网和分别与第一处理子网的输出端连接的多个第二处理子网,其中,第一处理子网具有第二特征提取层、第一启动模块和第一池化层,第二处理子网具有与第一处理子网的输出端连接的第二计算层、第二启动模块和第二池化层。其中,第二特征提取层包括3层卷积层和2个启动模块(Inception Module),启动模块可进行卷积操作和池化操作。
可选地,该步骤S202包括:通过第二特征提取层对待识别图像进行卷积操作和池化操作,获得目标车辆的全局特征图;通过第一启动模块对全局特征图进行卷积操作和池化操作,获得目标车辆的第一特征图集合;通过第一池化层对第一特征图集合中的特征图进行池化操作,获得目标车辆的全局特征向量。
在可选的实施方式中,首先对待识别图像进行缩放,以使得待识别图像的尺寸大小为192*192,然后,将缩放后的图像输入到由3层卷积层和2个启动模块构成的第二特征提取层,第二特征提取层对缩放后的图像进行卷积操作和池化操作,得到空间大小为12*12的全局特征图。再然后,第一启动模块对全局特征图再进行卷积操作和池化操作,得到空间大小为6*6的特征图的集合。最后,第一池化层对集合中的特征图进行全局平均池化操作,获得1536维的全局特征向量。
可选地,该步骤S202还可以包括:通过第二计算层将多个区域分割结果分别与全局特征图进 行点乘,获得多个区域分割结果分别对应的局部特征图;通过第二启动模块对多个区域分割结果的局部特征图进行卷积操作和池化操作,获得多个区域分割结果对应的第二特征图集合;通过第二池化层对多个区域分割结果的第二特征图集合进行池化操作,获得多个区域分割结果对应的第一区域特征向量。
可选地,通过第二计算层将多个区域分割结果分别与全局特征图进行点乘之前,方法还包括:通过第二计算层将多个区域分割结果分别缩放到与全局特征图的尺寸相同的尺寸。籍此,可确保最终获得的区域特征向量的维度与全局特征向量的维度相同。
在可选的实施方式中,首先将车辆的正面分割结果、背面分割结果、左面分割结果和右面分割结果分别缩放至与全局特征图的尺寸相同的尺寸,也即是12*12的大小。然后,将缩放后的正面分割结果、背面分割结果、左面分割结果和右面分割结果分别与全局特征图进行点乘,获得车辆的正面特征图、背面特征图、左面特征图和右面特征图。再然后,第二启动模块对车辆的正面特征图、背面特征图、左面特征图和右面特征图分别进行卷积操作和池化操作,获得局部特征图分别对应的特征图集合,该特征图集合中特征图的空间大小为6*6。最后,通过第二池化层对多个局部特征图对应的特征图集合中的特征图分别进行全局最大池化操作,获得车辆的正面特征向量、背面特征向量、左面特征向量和右面特征向量,并且局部区域的特征向量的维度均为1536维。之所以对多个局部特征图对应的特征图集合中特征图分别进行全局最大池化操作,是因为最大的响应更适合于从一个局部区域提取特征。
如图4的(b)部分所示,可将第二神经网络划分为两个阶段,全局特征和局部特征均以主干和分支形式提取得到的。其中,第一阶段对待识别图像进行卷积操作和池化操作,获得待识别图像的全局特征图。第二阶段包括5个分支,一个全局分支和四个局部区域分支。其中,全局分支对全局特征图进行上述实施例类似的处理,得到全局特征向量,局部区域分支结合全局特征图分别针对指定的区域分割结果进行上述实施例类似地处理,获得相应的局部特征向量。
步骤S203,通过用于特征融合的第三神经网络对目标车辆的全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据。
在一个可选示例中,该步骤S203可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的融合子模块6031执行。
其中,第三神经网络可以是任意适当的可实现特征融合的神经网络,包括但不限于卷积神经网络、增强学习神经网络、对抗神经网络中的生成网络等等。神经网络中可选结构的设置可以由本领域技术人员根据实际需求适当设定,如卷积层的层数、卷积核的大小、通道数等等,本申请实施例对此不作限制。在本实施例中,第三神经网络具有与第二神经网络的多个输出端连接的第一全连接层、第三计算层和第二全连接层。
可选地,该步骤S203包括:通过第一全连接层获取多个第一区域特征向量的权重值;通过第三计算层根据权重值,对多个第一区域特征向量分别加权,获得相应的多个第二区域特征向量;通过第二全连接层对多个第二区域特征向量和全局特征向量进行映射操作,获得目标车辆的外观特征向量。
可选地,通过第一全连接层获取多个第一区域特征向量的权重值,包括:对多个第一区域特征向量进行拼接操作,获得拼接后的第一区域特征向量;通过第一全连接层对拼接后的第一区域特征向量进行映射操作,获得第一区域特征向量对应的标量的集合;对集合中的至少一个标量进行归一化操作,获得第一区域特征向量的权重值。
在可选的实施方式中,包括以下操作:
对车辆的正面特征向量、背面特征向量、左面特征向量和右面特征向量进行拼接操作,然后,将拼接后的正面特征向量、背面特征向量、左面特征向量和右面特征向量输入第一全连接层,第 一全连接层对这四个特征向量进行映射操作,获得标量集合。
通过Softmax函数对标量集合中的标量进行归一化操作,分别获得正面特征向量、背面特征向量、左面特征向量和右面特征向量的权重值。
根据相应的权重值分别对正面特征向量、背面特征向量、左面特征向量和右面特征向量进行加权,获得加权后的正面特征向量、背面特征向量、左面特征向量和右面特征向量。
对加权后的正面特征向量、背面特征向量、左面特征向量和右面特征向量与全局特征向量进行拼接操作。
第二全连接层对拼接后的加权局部特征向量和全局特征向量进行映射操作,获得256维的车辆外观特征向量,如图4的(c)部分所示。
在特征融合的过程中,第三神经网络会学习得到不同车辆区域的特征向量的权重值。不同车辆区域的特征会有不同的重要性,车辆图像中车辆可见区域的特征能够保留或给予更大权重,车辆图像中车辆不可见区域的特征会在竞争过程中被淘汰掉或给予很小权重。例如,车辆图像中的车辆的朝向是左前方,可以看到车辆的左面和前面,这两个面的特征就相对重要,相应的特征向量的权重值相对要大一些,而该车辆的后面和右面是不可见的,虽然也能提取出这两个面的特征,但这两个面的特征向量的权重值相对要小一些。通过这种方式,车辆图像中车辆可见区域的车辆关键点对最终的车辆外观特征向量的贡献要多一些,而车辆图像中车辆不可见区域的车辆关键点对最终的车辆外观特征向量的影响通过相对较小的权重值而被削弱。籍此,能够更加准确地描述车辆的外观。
图6是实施图2的方法实施例的车辆区域的权重值的示意图。如图6所示,(a)部分表示输入的一个车辆的3张不同拍摄角度的图像,以及每种拍摄角度的图像中车辆正面、车辆背面、车辆左面和车辆右面的权重值,(b)部分表示测试集合中选定的车辆图像的车辆外观特征在二维空间中的投射结果,(c)部分表示输入的另一个车辆的3张不同拍摄角度的图像,以及每种拍摄角度的图像中车辆正面、车辆背面、车辆左面和车辆右面的权重值。从图中可知,不管车辆图像的拍摄角度是怎样的,相同车辆的外观特征能够聚合在一起。由此得到,本实施例识别得到的车辆外观特征与待识别车辆图像的拍摄角度无关,不同车辆图像中的车辆外观特征可直接进行比对,解决了不同车辆图像之间的不同区域无法比对的问题。此外,图中(a)部分和(c)部分示出了输入的车辆图像和对应的两个集群的学习权重,车辆外观的局部区域特征基于这些学习权重进行融合,从中可观察到车辆图像中车辆可见的面的权重值比车辆不可见的面的权重值要高一些。
另外,可采纳一种可供选择的训练策略对第二神经网络和第三神经网络进行训练,该训练策略包括四个步骤,可以为:(i)第二神经网络的第一阶段的主干网和第二阶段的全局分支可从随机初始化进行训练,并通过整个图像区域的全局特征监督。(ii)第一阶段的主干网训练完成之后,可使用第二阶段的全局分支初始化的参数来训练第二阶段的四个局部分支,因为第二阶段的全局分支与局部分支具有相同的结构。此外,通过给定的分类标签对四个局部分支的训练分别进行监督。(iii)第一阶段的主干网和第二阶段的分支训练完成之后,训练第三神经网络。(iv)初始化具有通过上述步骤学习得到的参数的神经网络,并且将参数结合起来微调。当训练第二神经网络和第三神将网络时可使用现有的车辆数据库和Softmax分类损失。
在可选的应用中,本实施例识别得到的车辆外观特征不仅可用来描述车辆,而且还可用来进行车辆属性分析,例如,粗分车型,细分车型,车辆颜色等等,此外,还可利用本实施例识别得到的车辆外观特征进行车辆的分类、识别和检索。
根据本实施例的车辆外观特征识别方法,通过用于区域提取的第一神经网络从待识别图像中,获取目标车辆的多个区域分割结果,然后,基于多个区域分割结果,通过用于特征提取的第二神经网络从待识别图像中提取目标车辆的全局特征数据和多个区域特征数据,再通过用于特征融合 的第三神经网络对目标车辆的全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据,本实施例的车辆外观特征识别方法与现有技术中获取车辆外观特征的方法相比,本实施例识别得到的车辆外观特征除了全局特征,还可以包括车辆外观的局部区域的特征,能够通过局部区域特征体现目标车辆的细节信息,从而能够更加准确地描述车辆的外观。此外,本实施例识别得到的车辆外观特征能够使得不同车辆图像中的车辆外观特征可直接进行比对,解决了不同车辆图像之间的不同区域无法比对的问题。
本实施例的车辆外观特征识别方法可以由任意适当的具有数据处理能力的设备执行,包括但不限于:终端设备和服务器等。
图7是根据本申请车辆检索方法一个实施例的流程图。
参照图7,步骤S301,通过车辆外观特征识别方法获取待检索图像中目标车辆的外观特征数据。
在本实施例中,可通过上述实施例一或上述实施例二提供的车辆外观特征识别方法来获取待检索图像中目标车辆的外观特征数据。其中,外观特征数据可为使用向量表示的数据。从图像包含的内容来讲,待检索图像可为包括目标车辆的一部分的图像或包括整个目标车辆的图像等。从图像的类别来讲,待检索图像可为拍摄的静态图像,或者为视频帧序列中的视频图像,也可以是合成的图像等。
在一个可选示例中,该步骤S301可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的第二获取模块701执行。
步骤S302,从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像。
在一个可选示例中,该步骤S302可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的查找模块702执行。
在可选的实施方式中,可通过上述实施例一或上述实施例二提供的车辆外观特征识别方法来获取待选车辆图像库中多个待选车辆图像中车辆的外观特征数据,并将目标车辆的外观特征数据分别与待选车辆图像中车辆的外观特征数据进行比对,获得与目标车辆的外观特征数据匹配的目标待选车辆图像。
本申请的示例性实施例旨在提出一种车辆检索方法,通过上述实施一或上述实施例二提供的车辆外观特征识别方法来获取待检索图像中目标车辆的外观特征数据,并从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像,能够提高车辆检索的准确率。
本实施例的车辆检索方法可以由任意适当的具有数据处理能力的设备执行,包括但不限于:终端设备和服务器等。
图8是根据本申请车辆检索方法另一个实施例的流程图。
参考图8,步骤S401,通过车辆外观特征识别方法获取待检索图像中目标车辆的外观特征数据。
在一个可选示例中,该步骤S401可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的第二获取模块804执行。
由于该步骤S401与上述步骤S301相同,在此不再赘述。
步骤S402,确定目标车辆的外观特征向量分别与待选车辆图像库中的待选车辆图像的车辆的外观特征向量的余弦距离。
在一个可选示例中,该步骤S402可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的查找模块805执行。
在本实施例中,本领域技术人员可根据现有的余弦距离计算公式来计算得到目标车辆的外观特征向量分别与待选车辆图像中车辆的外观特征向量的余弦距离。
步骤S403,根据余弦距离确定与目标车辆匹配的目标待选车辆图像。
在一个可选示例中,该步骤S403可以由处理器调用存储器存储的相应指令执行,也可以由被处理器运行的查找模块805执行。
在可选的实施方式中,当目标车辆的外观特征向量与待选车辆图像中车辆的外观特征向量的余弦距离大于或等于第一预设阈值时,便可确定该待选车辆图像为与目标车辆匹配的目标待选车辆图像。其中,本领域技术人员可通过测试得到第一预设阈值。当然,本申请的实施例不限于此。
可选地,方法还包括:获取待检索图像的拍摄时间和/或拍摄位置以及多个待选车辆图像的拍摄时间和/或拍摄位置;根据拍摄时间和/或拍摄位置确定目标车辆与多个待选车辆图像中的车辆的时空距离;根据余弦距离和时空距离确定待选车辆图像库中与目标车辆匹配的目标待选车辆图像。籍此,可进一步提高车辆检索的准确率。
可选地,根据余弦距离和时空距离确定待选车辆图像库中与目标车辆匹配的目标待选车辆图像,包括:根据余弦距离,在待选车辆图像库中获取多个待选车辆图像;分别基于待选车辆图像的拍摄时间及拍摄位置,确定待选车辆图像与目标车辆的时空匹配概率;基于余弦距离和时空匹配概率,确定与目标车辆匹配的目标待选车辆图像。
其中,车辆图像的时空信息能大大加强车辆检索的召回率。如果已知要检索的车辆图像的拍摄时间和拍摄地点,可通过统计建模的方式得到车辆图像中的车辆在另一时间和另一地点出现的概率。这对检索任务非常有效。其中,时空匹配概率是由待选车辆图像与目标车辆图像的拍摄时间和拍摄位置共同决定的。简单来说,时空匹配概率指的是目标车辆在待选车辆图像的拍摄时间和拍摄位置出现的概率,它是根据车辆图像的拍摄时间和拍摄地点通过统计建模的方式获得的。可选地,时空匹配概率指的是两个摄像头之间的车辆转移时间间隔的条件概率,可通过以下公式一计算得到。
在实际的应用场景中,车辆外观特征可能不足以将一辆车从其它车辆中区分开,特别是在车辆具有相同外型而没有个性化装饰的情况下。但是,在监控应用中,车辆图像的拍摄时间和拍摄位置是容易获取得到的。通过分析两个摄像头之间的车辆转移时间间隔,本申请的发明人发现针对至少一对摄像头,车辆转移时间间隔可被模拟为一个满足概率分布的随机变量。由于车辆转移时间间隔的类高斯和长尾属性,对数正态分布可被用来模拟这个随机变量。给定l表示车辆离开的摄像头,e表示车辆进入的摄像头,l与e之间车辆转移时间间隔τ的条件概率可通过以下公式一计算得到:
Figure PCTCN2018093165-appb-000001
其中,μ l,e,σ l,e分别表示每对摄像头(l,e)的估计参数,车辆转移时间间隔τ为两张车辆图像的拍摄时间的绝对值,估计参数可通过最大化以下的对数似然函数计算得到:
Figure PCTCN2018093165-appb-000002
其中,τ n∈τ(n=1,2,3,...,N)表示从训练集合中采样得到的每对摄像头(l,e)的两个摄像头之间的车辆转移时间间隔,τ包含训练集合中两个摄像头之间的车辆转移时间间隔样本。
在得到l与e之间车辆转移时间间隔τ的条件概率之后,可根据以下公式二计算得到两张车辆图像之间的车辆的时空距离:
Figure PCTCN2018093165-appb-000003
其中,条件概率越高,两张车辆图像之间的车辆的时空距离越小。
最后,可根据以下公式三计算得到两张车辆图像之间的相似距离:
D=Da+βD s   公式三
其中,D a表示两张车辆图像之间的车辆外观特征向量的余弦距离,D s表示两张车辆图像之间的车辆的时空距离,D表示两张车辆图像之间的车辆的相似距离,α的大小为2,β的大小为0.1。其中,两张车辆图像之间的车辆的相似距离越小,则两张车辆图像之间的车辆越相似。
当目标车辆与待选车辆图像中车辆的相似距离小于或等于第二预设阈值时,便可确定该待选车辆图像为与目标车辆匹配的目标待选车辆图像。其中,本领域技术人员可通过测试得到第二预设阈值。当然,本申请的实施例不限于此。
图9是实施图8的方法实施例的车辆的相似距离的示意图。如图9所示,上面的第一排的方框中的图像是根据余弦距离获得的排序在前五的待选车辆图像,上面的第一排最左边的图像是目标车辆的图像,下面的一排图像是基于待选车辆图像与目标车辆图像的时空距离进行重新排序所得到的结果。可选地,通过公式一根据目标车辆图像和待选车辆图像的拍摄时间数值以及目标车辆图像和待选车辆图像的拍摄摄像头的编号计算得到车辆转移时间间隔的条件概率。然后,通过公式二根据车辆转移时间间隔的条件概率计算得到目标车辆图像与待选车辆图像之间的车辆的时空距离,再通过公式三根据已知的余弦距离和计算得到的时空距离计算得到目标车辆图像与待选车辆图像之间的车辆的相似距离。最后,根据目标车辆图像与待选车辆图像之间的车辆的相似距离对待选车辆图像的排序结果进行重新排序,获得待选车辆图像重新排序的结果。
本申请的示例性实施例旨在提出一种车辆检索方法,通过上述实施一或上述实施例二提供的车辆外观特征识别方法来获取待检索图像中目标车辆的外观特征数据,并从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像,能够提高车辆检索的准确率。
本实施例的车辆检索方法可以由任意适当的具有数据处理能力的设备执行,包括但不限于:终端设备和服务器等。或者,本申请实施例提供的任一方法可以由处理器执行,如处理器通过调用存储器存储的相应指令来执行本申请实施例提及的任一方法。下文不再赘述。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于一计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
基于相同的技术构思,图10是示出根据本申请车辆外观特征识别装置一个实施例的结构示意图。可用以执行如实施例一的车辆外观特征识别方法流程。
参照图10,该车辆外观特征识别装置包括第一获取模块501、提取模块502和融合模块503。
第一获取模块501,用于从待识别图像中,获取目标车辆的多个区域分割结果;
提取模块502,用于基于多个区域分割结果,从待识别图像中提取全局特征数据和多个区域特征数据;
融合模块503,用于对全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据。
通过本实施例提供的车辆外观特征识别装置,从包括有目标车辆的待识别图像中,获取目标车辆的多个区域分割结果;然后基于多个区域分割结果,从待识别图像中提取全局特征数据和多个区域特征数据;并对全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据,本实施例识别得到的车辆外观特征包括车辆外观的局部区域的特征,能够更加准确地描述车辆的外观。此外,本实施例识别得到的车辆外观特征能够使得不同车辆图像中的车辆外观特征可直接进行比对,解决了不同车辆图像之间的不同区域无法比对的问题。
基于相同的技术构思,图11是示出根据本申请车辆外观特征识别装置另一个实施例的结构示意图。可用以执行如实施例二的车辆外观特征识别方法流程。
参照图11,该车辆外观特征识别装置包括第一获取模块601、提取模块602和融合模块603。其中,第一获取模块601,用于从待识别图像中,获取目标车辆的多个区域分割结果;提取模块602,用于基于多个 区域分割结果,从待识别图像中提取全局特征数据和多个区域特征数据;融合模块603,用于对全局特征数据和多个区域特征数据进行融合,得到目标车辆的外观特征数据。
可选地,多个区域分割结果分别对应目标车辆的不同方位的区域。
可选地,多个区域分割结果包括目标车辆的前面、后面、左面和右面的分割结果。
可选地,第一获取模块601,包括:获取子模块6011,用于通过用于区域提取的第一神经网络从待识别图像中,获取目标车辆的多个区域分割结果。
可选地,第一神经网络具有第一特征提取层和连接在第一特征提取层末端的第一计算层,其中,获取子模块6011,用于:通过第一特征提取层对待识别图像进行特征提取,得到目标车辆的多个关键点;通过第一计算层对多个关键点进行分类,得到多个关键点集群,并且分别针对多个关键点集群中的关键点的特征图进行融合,获得多个关键点集群对应的区域分割结果。
可选地,提取模块602,包括:提取子模块6021,用于基于多个区域分割结果,通过用于特征提取的第二神经网络从待识别图像中提取目标车辆的全局特征数据和多个区域特征数据。
可选地,第二神经网络具有第一处理子网和分别与第一处理子网的输出端连接的多个第二处理子网,其中,第一处理子网具有第二特征提取层、第一启动模块和第一池化层,第二处理子网具有与第一处理子网的输出端连接的第二计算层、第二启动模块和第二池化层。
可选地,提取子模块6021,包括:第一特征提取单元6022,用于通过第二特征提取层对待识别图像进行卷积操作和池化操作,获得目标车辆的全局特征图;第二特征提取单元6023,用于通过第一启动模块对全局特征图进行卷积操作和池化操作,获得目标车辆的第一特征图集合;第一池化单元6024,用于通过第一池化层对第一特征图集合中的特征图进行池化操作,获得目标车辆的全局特征向量。
可选地,提取子模块6021,还包括:第一计算单元6026,用于通过第二计算层将多个区域分割结果分别与全局特征图进行点乘,获得多个区域分割结果分别对应的局部特征图;第三特征提取单元6027,用于通过第二启动模块对多个区域分割结果的局部特征图进行卷积操作和池化操作,获得多个区域分割结果对应的第二特征图集合;第二池化单元6028,用于通过第二池化层对多个区域分割结果的第二特征图集合进行池化操作,获得多个区域分割结果对应的第一区域特征向量。
可选地,提取子模块6021,还包括:第二计算单元6025,用于通过第二计算层将多个区域分割结果分别缩放到与全局特征图的尺寸相同的尺寸。
可选地,融合模块603,包括:融合子模块6031,用于通过用于特征融合的第三神经网络对目标车辆的全局特征数据和多个区域特征数据进行融合。
可选地,第三神经网络具有与第二神经网络的输出端连接的第一全连接层、第三计算层和第二全连接层,其中,融合子模块6031,包括:第一获取单元6032,用于通过第一全连接层获取第一区域特征向量的权重值;第三计算单元6033,用于通过第三计算层根据权重值,对多个第一区域特征向量分别加权,获得相应的多个第二区域特征向量;映射单元6034,用于通过第二全连接层对多个第二区域特征向量和全局特征向量进行映射操作,获得目标车辆的外观特征向量。
可选地,第一获取单元6032,用于:对多个第一区域特征向量进行拼接操作,获得拼接后的第一区域特征向量;通过第一全连接层对拼接后的第一区域特征向量进行映射操作,获得多个第一区域特征向量对应的标量的集合;对集合中的多个标量进行归一化操作,获得多个第一区域特征向量的权重值。
可选地,第一特征提取层为沙漏型网络结构。
需要说明的是,对于本申请实施例提供的车辆外观特征识别装置还涉及的具体细节已在本申请实施例提供的车辆外观特征识别方法中作了详细的说明,在此不在赘述。
基于相同的技术构思,图12是示出根据本申请车辆检索装置一个实施例的结构示意图。可用以执行如实施例三的车辆检索方法流程。
参照图12,该车辆检索装置包括第二获取模块701和查找模块702。
第二获取模块701,用于通过根据实施例五或实施例六的装置获取待检索图像中目标车辆的外观特征数据;
查找模块702,用于从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像。
本申请的示例性实施例旨在提出一种车辆检索装置,通过上述实施五或上述实施例六提供的车辆外观特征识别装置来获取待检索图像中目标车辆的外观特征数据,并从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像,能够提高车辆检索的准确率。
基于相同的技术构思,图13是示出根据本申请车辆检索装置另一个实施例的结构示意图。可用以执行如实施例四的车辆检索方法流程。
参照图13,该车辆检索装置包括第二获取模块804和查找模块805。第二获取模块804,用于通过根据实施例五或实施例六的装置获取待检索图像中目标车辆的外观特征数据;查找模块805,用于从待选车辆图像库中查找与外观特征数据匹配的目标待选车辆图像。
可选地,查找模块805,用于:确定目标车辆的外观特征向量分别与待选车辆图像库中的待选车辆图像的车辆的外观特征向量的余弦距离;根据余弦距离确定与目标车辆匹配的目标待选车辆图像。
Optionally, the apparatus of this embodiment further includes: a third obtaining module 801 configured to obtain the capture time and/or capture location of the image to be retrieved and the capture times and/or capture locations of the multiple candidate vehicle images; a first determination module 802 configured to determine the spatio-temporal distances between the target vehicle and the vehicles in the multiple candidate vehicle images according to the capture times and/or capture locations; and a second determination module 803 configured to determine, according to the cosine distances and the spatio-temporal distances, the target candidate vehicle image in the candidate vehicle image library that matches the target vehicle.
Optionally, the second determination module 803 is configured to: obtain multiple candidate vehicle images from the candidate vehicle image library according to the cosine distances; determine, based on the capture time and capture location of each candidate vehicle image, the spatio-temporal matching probability between the candidate vehicle image and the target vehicle; and determine the target candidate vehicle image matching the target vehicle based on the cosine distances and the spatio-temporal matching probabilities.
It should be noted that the specific details of the vehicle retrieval apparatus provided in the embodiments of the present application have already been described in detail in the vehicle retrieval method provided in the embodiments of the present application, and are not repeated here.
The embodiments of the present application further provide an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. Referring to FIG. 14, it shows a schematic structural diagram of an embodiment of an electronic device 900 suitable for implementing a terminal device or a server of the embodiments of the present application. As shown in FIG. 14, the electronic device 900 includes one or more first processors, a first communication element, and the like. The one or more first processors are, for example, one or more central processing units (CPUs) 901 and/or one or more graphics processing units (GPUs) 913, and may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 902 or loaded from a storage section 908 into a random access memory (RAM) 903. In this embodiment, the read-only memory 902 and the random access memory 903 are collectively referred to as the first memory. The first communication element includes a communication component 912 and/or a communication interface 909. The communication component 912 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 909 includes a communication interface of a network interface card such as a LAN card or a modem, and performs communication processing via a network such as the Internet.
The first processor may communicate with the read-only memory 902 and/or the random access memory 903 to execute executable instructions, is connected to the communication component 912 through a first communication bus 904, and communicates with other target devices via the communication component 912, thereby completing the operations corresponding to any vehicle appearance feature recognition method provided in the embodiments of the present application, for example: obtaining multiple region segmentation results of a target vehicle from an image to be recognized; extracting global feature data and multiple pieces of regional feature data from the image to be recognized based on the multiple region segmentation results; and fusing the global feature data and the multiple pieces of regional feature data to obtain appearance feature data of the target vehicle.
In addition, the RAM 903 may further store various programs and data required for the operation of the apparatus. The CPU 901 or GPU 913, the ROM 902, and the RAM 903 are connected to one another through the first communication bus 904. Where the RAM 903 is present, the ROM 902 is an optional module. The RAM 903 stores executable instructions, or writes executable instructions into the ROM 902 at runtime, and the executable instructions cause the first processor to perform the operations corresponding to the foregoing method. An input/output (I/O) interface 905 is also connected to the first communication bus 904. The communication component 912 may be integrated, or may be configured with multiple sub-modules (for example, multiple IB network cards) linked on the communication bus.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a cathode-ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; the storage section 908 including a hard disk and the like; and the communication interface 909 of a network interface card including a LAN card, a modem, and the like. A drive 910 is also connected to the I/O interface 905 as needed. A removable medium 911, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 910 as needed, so that a computer program read therefrom is installed into the storage section 908 as needed.
It should be noted that the architecture shown in FIG. 14 is only an optional implementation, and in practice the number and types of the components in FIG. 14 may be selected, reduced, increased, or replaced according to actual needs. Different functional components may be arranged separately or in an integrated manner; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated into the CPU, and the communication element may be arranged separately or integrated into the CPU or the GPU. All of these alternative implementations fall within the protection scope of the present application.
In particular, according to the embodiments of the present application, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present application include a computer program product that includes a computer program tangibly embodied on a machine-readable medium; the computer program contains program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the steps of the method provided in the embodiments of the present application, for example: obtaining multiple region segmentation results of a target vehicle from an image to be recognized; extracting global feature data and multiple pieces of regional feature data from the image to be recognized based on the multiple region segmentation results; and fusing the global feature data and the multiple pieces of regional feature data to obtain appearance feature data of the target vehicle. In such embodiments, the computer program may be downloaded from a network and installed via the communication element, and/or installed from the removable medium 911. When the computer program is executed by the first processor, the above functions defined in the method of the embodiments of the present application are performed.
The embodiments of the present application further provide an electronic device, which may be, for example, a mobile terminal, a personal computer (PC), a tablet computer, or a server. Referring to FIG. 15, it shows a schematic structural diagram of another embodiment of an electronic device 1000 suitable for implementing a terminal device or a server of the embodiments of the present application. As shown in FIG. 15, the electronic device 1000 includes one or more second processors, a second communication element, and the like. The one or more second processors are, for example, one or more central processing units (CPUs) 1001 and/or one or more graphics processing units (GPUs) 1013, and may perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 1002 or loaded from a storage section 1008 into a random access memory (RAM) 1003. In this embodiment, the read-only memory 1002 and the random access memory 1003 are collectively referred to as the second memory. The second communication element includes a communication component 1012 and/or a communication interface 1009. The communication component 1012 may include, but is not limited to, a network card, which may include, but is not limited to, an IB (InfiniBand) network card; the communication interface 1009 includes a communication interface of a network interface card such as a LAN card or a modem, and performs communication processing via a network such as the Internet.
The second processor may communicate with the read-only memory 1002 and/or the random access memory 1003 to execute executable instructions, is connected to the communication component 1012 through a second communication bus 1004, and communicates with other target devices via the communication component 1012, thereby completing the operations corresponding to any vehicle retrieval method provided in the embodiments of the present application, for example: obtaining the appearance feature data of a target vehicle in an image to be retrieved through the method according to Embodiment 1 or Embodiment 2 above; and searching a candidate vehicle image library for a target candidate vehicle image matching the appearance feature data.
In addition, the RAM 1003 may further store various programs and data required for the operation of the apparatus. The CPU 1001 or GPU 1013, the ROM 1002, and the RAM 1003 are connected to one another through the second communication bus 1004. Where the RAM 1003 is present, the ROM 1002 is an optional module. The RAM 1003 stores executable instructions, or writes executable instructions into the ROM 1002 at runtime, and the executable instructions cause the second processor to perform the operations corresponding to the foregoing method. An input/output (I/O) interface 1005 is also connected to the second communication bus 1004. The communication component 1012 may be integrated, or may be configured with multiple sub-modules (for example, multiple IB network cards) linked on the communication bus.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a cathode-ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; the storage section 1008 including a hard disk and the like; and the communication interface 1009 of a network interface card including a LAN card, a modem, and the like. A drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read therefrom is installed into the storage section 1008 as needed.
It should be noted that the architecture shown in FIG. 15 is only an optional implementation, and in practice the number and types of the components in FIG. 15 may be selected, reduced, increased, or replaced according to actual needs. Different functional components may be arranged separately or in an integrated manner; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated into the CPU, and the communication element may be arranged separately or integrated into the CPU or the GPU. All of these alternative implementations fall within the protection scope of the present application.
In particular, according to the embodiments of the present application, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present application include a computer program product that includes a computer program tangibly embodied on a machine-readable medium; the computer program contains program code for performing the method shown in the flowchart, and the program code may include instructions corresponding to the steps of the method provided in the embodiments of the present application, for example: obtaining the appearance feature data of a target vehicle in an image to be retrieved through the method according to Embodiment 1 or Embodiment 2 above; and searching a candidate vehicle image library for a target candidate vehicle image matching the appearance feature data. In such embodiments, the computer program may be downloaded from a network and installed via the communication element, and/or installed from the removable medium 1011. When the computer program is executed by the second processor, the above functions defined in the method of the embodiments of the present application are performed.
It should be pointed out that, depending on the needs of implementation, each component/step described in the present application may be split into more components/steps, and two or more components/steps or partial operations of components/steps may also be combined into new components/steps, so as to achieve the objectives of the embodiments of the present application.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts of the embodiments, reference may be made to one another. Since the system embodiments substantially correspond to the method embodiments, their description is relatively brief, and for related parts reference may be made to the description of the method embodiments.
The methods and apparatuses of the present application may be implemented in many ways, for example, by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the methods is for illustration only, and the steps of the methods of the present application are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present application may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present application. Accordingly, the present application also covers recording media storing programs for executing the methods according to the present application.
The description of the present application is given for the sake of example and description, and is not exhaustive or intended to limit the present application to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical applications of the present application and to enable those of ordinary skill in the art to understand the present application, so as to design various embodiments with various modifications suited to particular uses.

Claims (40)

  1. A vehicle appearance feature recognition method, characterized in that the method comprises:
    obtaining multiple region segmentation results of a target vehicle from an image to be recognized;
    extracting global feature data and multiple pieces of regional feature data from the image to be recognized based on the multiple region segmentation results; and
    fusing the global feature data and the multiple pieces of regional feature data to obtain appearance feature data of the target vehicle.
  2. The method according to claim 1, characterized in that the multiple region segmentation results respectively correspond to regions in different orientations of the target vehicle.
  3. The method according to claim 2, characterized in that the multiple region segmentation results comprise segmentation results of the front, rear, left side, and right side of the target vehicle.
  4. The method according to any one of claims 1 to 3, characterized in that obtaining the multiple region segmentation results of the target vehicle from the image to be recognized comprises:
    obtaining the multiple region segmentation results of the target vehicle from the image to be recognized through a first neural network used for region extraction.
  5. The method according to claim 4, characterized in that the first neural network has a first feature extraction layer and a first computation layer connected to the end of the first feature extraction layer,
    wherein obtaining the multiple region segmentation results of the target vehicle from the image to be recognized through the first neural network used for region extraction comprises:
    performing feature extraction on the image to be recognized through the first feature extraction layer to obtain multiple key points of the target vehicle; and
    classifying the multiple key points through the first computation layer to obtain multiple key point clusters, and fusing the feature maps of the key points in each of the multiple key point clusters to obtain the region segmentation results corresponding to the multiple key point clusters.
  6. The method according to any one of claims 1 to 5, characterized in that extracting the global feature data and the multiple pieces of regional feature data from the image to be recognized based on the multiple region segmentation results comprises:
    extracting the global feature data and the multiple pieces of regional feature data of the target vehicle from the image to be recognized, based on the multiple region segmentation results, through a second neural network used for feature extraction.
  7. The method according to claim 6, characterized in that the second neural network has a first processing subnet and multiple second processing subnets each connected to an output of the first processing subnet,
    wherein the first processing subnet has a second feature extraction layer, a first Inception module, and a first pooling layer, and each second processing subnet has a second computation layer connected to the output of the first processing subnet, a second Inception module, and a second pooling layer.
  8. The method according to claim 7, characterized in that extracting the global feature data and the multiple pieces of regional feature data of the target vehicle from the image to be recognized, based on the multiple region segmentation results, through the second neural network used for feature extraction comprises:
    performing convolution and pooling operations on the image to be recognized through the second feature extraction layer to obtain a global feature map of the target vehicle;
    performing convolution and pooling operations on the global feature map through the first Inception module to obtain a first feature map set of the target vehicle; and
    performing a pooling operation on the feature maps in the first feature map set through the first pooling layer to obtain a global feature vector of the target vehicle.
  9. The method according to claim 8, characterized in that extracting the global feature data and the multiple pieces of regional feature data of the target vehicle from the image to be recognized, based on the multiple region segmentation results, through the second neural network used for feature extraction further comprises:
    computing, through the second computation layer, the element-wise product of each of the multiple region segmentation results with the global feature map to obtain local feature maps respectively corresponding to the multiple region segmentation results;
    performing convolution and pooling operations on the local feature maps of the multiple region segmentation results through the second Inception module to obtain second feature map sets corresponding to the multiple region segmentation results; and
    performing pooling operations on the second feature map sets of the multiple region segmentation results through the second pooling layer to obtain first regional feature vectors corresponding to the multiple region segmentation results.
  10. The method according to claim 9, characterized in that before the element-wise product of each of the multiple region segmentation results with the global feature map is computed through the second computation layer, the method further comprises:
    scaling, through the second computation layer, each of the multiple region segmentation results to the same size as the global feature map.
  11. The method according to any one of claims 1 to 10, characterized in that fusing the global feature data and the multiple pieces of regional feature data comprises:
    fusing the global feature data and the multiple pieces of regional feature data of the target vehicle through a third neural network used for feature fusion.
  12. The method according to claim 11, characterized in that the third neural network has a first fully connected layer connected to an output of the second neural network, a third computation layer, and a second fully connected layer,
    wherein fusing the global feature data and the multiple pieces of regional feature data of the target vehicle through the third neural network used for feature fusion comprises:
    obtaining weight values of first regional feature vectors through the first fully connected layer;
    weighting, through the third computation layer, each of the multiple first regional feature vectors according to the weight values to obtain corresponding multiple second regional feature vectors; and
    performing a mapping operation on the multiple second regional feature vectors and the global feature vector through the second fully connected layer to obtain an appearance feature vector of the target vehicle.
  13. The method according to claim 12, characterized in that obtaining the weight values of the first regional feature vectors through the first fully connected layer comprises:
    concatenating the multiple first regional feature vectors to obtain a concatenated first regional feature vector;
    performing a mapping operation on the concatenated first regional feature vector through the first fully connected layer to obtain a set of scalars corresponding to the multiple first regional feature vectors; and
    normalizing the scalars in the set to obtain the weight values of the multiple first regional feature vectors.
  14. The method according to claim 5, characterized in that the first feature extraction layer is an hourglass network structure.
  15. A vehicle retrieval method, characterized in that the method comprises:
    obtaining appearance feature data of a target vehicle in an image to be retrieved through the method according to any one of claims 1 to 14; and
    searching a candidate vehicle image library for a target candidate vehicle image matching the appearance feature data.
  16. The method according to claim 15, characterized in that searching the candidate vehicle image library for the target candidate vehicle image matching the appearance feature data comprises:
    determining cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the candidate vehicle images in the candidate vehicle image library; and
    determining the target candidate vehicle image matching the target vehicle according to the cosine distances.
  17. The method according to claim 16, characterized in that the method further comprises:
    obtaining the capture time and/or capture location of the image to be retrieved and the capture times and/or capture locations of the multiple candidate vehicle images;
    determining spatio-temporal distances between the target vehicle and the vehicles in the multiple candidate vehicle images according to the capture times and/or the capture locations; and
    determining, according to the cosine distances and the spatio-temporal distances, the target candidate vehicle image in the candidate vehicle image library that matches the target vehicle.
  18. The method according to claim 17, characterized in that determining, according to the cosine distances and the spatio-temporal distances, the target candidate vehicle image in the candidate vehicle image library that matches the target vehicle comprises:
    obtaining multiple candidate vehicle images from the candidate vehicle image library according to the cosine distances;
    determining, based on the capture time and capture location of each candidate vehicle image, the spatio-temporal matching probability between the candidate vehicle image and the target vehicle; and
    determining the target candidate vehicle image matching the target vehicle based on the cosine distances and the spatio-temporal matching probabilities.
  19. A vehicle appearance feature recognition apparatus, characterized in that the apparatus comprises:
    a first obtaining module configured to obtain multiple region segmentation results of a target vehicle from an image to be recognized;
    an extraction module configured to extract global feature data and multiple pieces of regional feature data from the image to be recognized based on the multiple region segmentation results; and
    a fusion module configured to fuse the global feature data and the multiple pieces of regional feature data to obtain appearance feature data of the target vehicle.
  20. The apparatus according to claim 19, characterized in that the multiple region segmentation results respectively correspond to regions in different orientations of the target vehicle.
  21. The apparatus according to claim 20, characterized in that the multiple region segmentation results comprise segmentation results of the front, rear, left side, and right side of the target vehicle.
  22. The apparatus according to any one of claims 19 to 21, characterized in that the first obtaining module comprises:
    an obtaining sub-module configured to obtain the multiple region segmentation results of the target vehicle from the image to be recognized through a first neural network used for region extraction.
  23. The apparatus according to claim 22, characterized in that the first neural network has a first feature extraction layer and a first computation layer connected to the end of the first feature extraction layer,
    wherein the obtaining sub-module is configured to:
    perform feature extraction on the image to be recognized through the first feature extraction layer to obtain multiple key points of the target vehicle; and
    classify the multiple key points through the first computation layer to obtain multiple key point clusters, and fuse the feature maps of the key points in each of the multiple key point clusters to obtain the region segmentation results corresponding to the multiple key point clusters.
  24. The apparatus according to any one of claims 19 to 23, characterized in that the extraction module comprises:
    an extraction sub-module configured to extract the global feature data and the multiple pieces of regional feature data of the target vehicle from the image to be recognized, based on the multiple region segmentation results, through a second neural network used for feature extraction.
  25. The apparatus according to claim 24, characterized in that the second neural network has a first processing subnet and multiple second processing subnets each connected to an output of the first processing subnet,
    wherein the first processing subnet has a second feature extraction layer, a first Inception module, and a first pooling layer, and each second processing subnet has a second computation layer connected to the output of the first processing subnet, a second Inception module, and a second pooling layer.
  26. The apparatus according to claim 25, characterized in that the extraction sub-module comprises:
    a first feature extraction unit configured to perform convolution and pooling operations on the image to be recognized through the second feature extraction layer to obtain a global feature map of the target vehicle;
    a second feature extraction unit configured to perform convolution and pooling operations on the global feature map through the first Inception module to obtain a first feature map set of the target vehicle; and
    a first pooling unit configured to perform a pooling operation on the feature maps in the first feature map set through the first pooling layer to obtain a global feature vector of the target vehicle.
  27. The apparatus according to claim 26, characterized in that the extraction sub-module further comprises:
    a first computation unit configured to compute, through the second computation layer, the element-wise product of each of the multiple region segmentation results with the global feature map to obtain local feature maps respectively corresponding to the multiple region segmentation results;
    a third feature extraction unit configured to perform convolution and pooling operations on the local feature maps of the multiple region segmentation results through the second Inception module to obtain second feature map sets corresponding to the multiple region segmentation results; and
    a second pooling unit configured to perform pooling operations on the second feature map sets of the multiple region segmentation results through the second pooling layer to obtain first regional feature vectors corresponding to the multiple region segmentation results.
  28. The apparatus according to claim 27, characterized in that the extraction sub-module further comprises:
    a second computation unit configured to scale, through the second computation layer, each of the multiple region segmentation results to the same size as the global feature map.
  29. The apparatus according to any one of claims 19 to 28, characterized in that the fusion module comprises:
    a fusion sub-module configured to fuse the global feature data and the multiple pieces of regional feature data of the target vehicle through a third neural network used for feature fusion.
  30. The apparatus according to claim 29, characterized in that the third neural network has a first fully connected layer connected to an output of the second neural network, a third computation layer, and a second fully connected layer,
    wherein the fusion sub-module comprises:
    a first obtaining unit configured to obtain weight values of first regional feature vectors through the first fully connected layer;
    a third computation unit configured to weight, through the third computation layer, each of the multiple first regional feature vectors according to the weight values to obtain corresponding multiple second regional feature vectors; and
    a mapping unit configured to perform a mapping operation on the multiple second regional feature vectors and the global feature vector through the second fully connected layer to obtain an appearance feature vector of the target vehicle.
  31. The apparatus according to claim 30, characterized in that the first obtaining unit is configured to:
    concatenate the multiple first regional feature vectors to obtain a concatenated first regional feature vector;
    perform a mapping operation on the concatenated first regional feature vector through the first fully connected layer to obtain a set of scalars corresponding to the multiple first regional feature vectors; and
    normalize the multiple scalars in the set to obtain the weight values of the multiple first regional feature vectors.
  32. The apparatus according to claim 23, characterized in that the first feature extraction layer is an hourglass network structure.
  33. A vehicle retrieval apparatus, characterized in that the apparatus comprises:
    a second obtaining module configured to obtain appearance feature data of a target vehicle in an image to be retrieved through the apparatus according to any one of claims 19 to 32; and
    a search module configured to search a candidate vehicle image library for a target candidate vehicle image matching the appearance feature data.
  34. The apparatus according to claim 33, characterized in that the search module is configured to:
    determine cosine distances between the appearance feature vector of the target vehicle and the appearance feature vectors of the vehicles in the candidate vehicle images in the candidate vehicle image library; and
    determine the target candidate vehicle image matching the target vehicle according to the cosine distances.
  35. The apparatus according to claim 34, characterized in that the apparatus further comprises:
    a third obtaining module configured to obtain the capture time and/or capture location of the image to be retrieved and the capture times and/or capture locations of the multiple candidate vehicle images;
    a first determination module configured to determine spatio-temporal distances between the target vehicle and the vehicles in the multiple candidate vehicle images according to the capture times and/or the capture locations; and
    a second determination module configured to determine, according to the cosine distances and the spatio-temporal distances, the target candidate vehicle image in the candidate vehicle image library that matches the target vehicle.
  36. The apparatus according to claim 35, characterized in that the second determination module is configured to:
    obtain multiple candidate vehicle images from the candidate vehicle image library according to the cosine distances;
    determine, based on the capture time and capture location of each candidate vehicle image, the spatio-temporal matching probability between the candidate vehicle image and the target vehicle; and
    determine the target candidate vehicle image matching the target vehicle based on the cosine distances and the spatio-temporal matching probabilities.
  37. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the steps of the vehicle appearance feature recognition method according to any one of claims 1 to 14.
  38. A computer-readable storage medium having computer program instructions stored thereon, wherein the program instructions, when executed by a processor, implement the steps of the vehicle retrieval method according to any one of claims 15 to 18.
  39. An electronic device, comprising a first processor, a first memory, a first communication element, and a first communication bus, wherein the first processor, the first memory, and the first communication element communicate with one another through the first communication bus; and
    the first memory is configured to store at least one executable instruction, and the executable instruction causes the first processor to perform the steps of the vehicle appearance feature recognition method according to any one of claims 1 to 14.
  40. An electronic device, comprising a second processor, a second memory, a second communication element, and a second communication bus, wherein the second processor, the second memory, and the second communication element communicate with one another through the second communication bus; and
    the second memory is configured to store at least one executable instruction, and the executable instruction causes the second processor to perform the steps of the vehicle retrieval method according to any one of claims 15 to 18.
PCT/CN2018/093165 2017-06-28 2018-06-27 Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device WO2019001481A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2019562381A JP7058669B2 (ja) 2017-06-28 2018-06-27 Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device
US16/678,870 US11232318B2 (en) 2017-06-28 2019-11-08 Methods and apparatuses for vehicle appearance feature recognition, methods and apparatuses for vehicle retrieval, storage medium, and electronic devices
US17/533,484 US20220083802A1 (en) 2017-06-28 2021-11-23 Methods and apparatuses for vehicle appearance feature recognition, methods and apparatuses for vehicle retrieval, storage medium, and electronic devices
US17/533,469 US20220083801A1 (en) 2017-06-28 2021-11-23 Methods and apparatuses for vehicle appearance feature recognition, methods and apparatuses for vehicle retrieval, storage medium, and electronic devices

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710507778.5 2017-06-28
CN201710507778.5A CN108229468B (zh) 2017-06-28 2017-06-28 Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/678,870 Continuation US11232318B2 (en) 2017-06-28 2019-11-08 Methods and apparatuses for vehicle appearance feature recognition, methods and apparatuses for vehicle retrieval, storage medium, and electronic devices

Publications (1)

Publication Number Publication Date
WO2019001481A1 true WO2019001481A1 (zh) 2019-01-03

Family

ID=62658096

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/093165 WO2019001481A1 (zh) 2017-06-28 2018-06-27 Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device

Country Status (4)

Country Link
US (3) US11232318B2 (zh)
JP (1) JP7058669B2 (zh)
CN (1) CN108229468B (zh)
WO (1) WO2019001481A1 (zh)


Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018033137A1 * 2016-08-19 2018-02-22 北京市商汤科技开发有限公司 Method, apparatus, and electronic device for displaying a business object in a video image
WO2019045033A1 * 2017-09-04 2019-03-07 日本電気株式会社 Information processing system, information processing method, and storage medium
CN109086690B * 2018-07-13 2021-06-22 北京旷视科技有限公司 Image feature extraction method, target recognition method, and corresponding apparatuses
CN110851640B * 2018-07-24 2023-08-04 杭州海康威视数字技术股份有限公司 Image search method, apparatus, and system
CN109145777A * 2018-08-01 2019-01-04 北京旷视科技有限公司 Vehicle re-identification method, apparatus, and system
EP3623996A1 (en) * 2018-09-12 2020-03-18 Aptiv Technologies Limited Method for determining a coordinate of a feature point of an object in a 3d space
CN109583408A * 2018-12-07 2019-04-05 高新兴科技集团股份有限公司 Deep-learning-based vehicle key point alignment method
CN109800321B * 2018-12-24 2020-11-10 银江股份有限公司 Checkpoint-image-based vehicle retrieval method and system
CN109740541B * 2019-01-04 2020-08-04 重庆大学 Pedestrian re-identification system and method
CN109547843B * 2019-02-01 2022-05-17 腾讯音乐娱乐科技(深圳)有限公司 Method and apparatus for processing audio and video
US11003947B2 (en) * 2019-02-25 2021-05-11 Fair Isaac Corporation Density based confidence measures of neural networks for reliable predictions
CN110110718B * 2019-03-20 2022-11-22 安徽名德智能科技有限公司 Artificial intelligence image processing apparatus
CN111753601B * 2019-03-29 2024-04-12 华为技术有限公司 Image processing method, apparatus, and storage medium
CN110097108B * 2019-04-24 2021-03-02 佳都新太科技股份有限公司 Non-motor-vehicle recognition method, apparatus, device, and storage medium
CN110472656B * 2019-07-03 2023-09-05 平安科技(深圳)有限公司 Vehicle image classification method, apparatus, computer device, and storage medium
CN110348463B * 2019-07-16 2021-08-24 北京百度网讯科技有限公司 Method and apparatus for recognizing a vehicle
CN112307833A * 2019-07-31 2021-02-02 浙江商汤科技开发有限公司 Method, apparatus, and device for recognizing the driving state of an intelligent driving device
CN110458238A * 2019-08-02 2019-11-15 南通使爱智能科技有限公司 Method and system for detecting and locating arc points on certificates
CN110458086A * 2019-08-07 2019-11-15 北京百度网讯科技有限公司 Vehicle re-identification method and apparatus
CN110543841A * 2019-08-21 2019-12-06 中科视语(北京)科技有限公司 Pedestrian re-identification method, system, electronic device, and medium
CN110659374A * 2019-09-19 2020-01-07 江苏鸿信系统集成有限公司 Image-to-image search method extracting vehicle feature values and attributes based on a neural network
CN113129330A * 2020-01-14 2021-07-16 北京地平线机器人技术研发有限公司 Trajectory prediction method and apparatus for a movable device
CN111192461B * 2020-01-21 2022-06-28 北京筑梦园科技有限公司 License plate recognition method, server, and parking charging method and system
CN111368639B * 2020-02-10 2022-01-11 浙江大华技术股份有限公司 Vehicle line-crossing determination method, apparatus, computer device, and storage medium
CN111340882B * 2020-02-20 2024-02-20 盈嘉互联(北京)科技有限公司 Image-based indoor positioning method and apparatus
CN113807147A * 2020-06-15 2021-12-17 北京达佳互联信息技术有限公司 Target detection method and apparatus, and training method and apparatus for its network
CN111723768B * 2020-06-30 2023-08-11 北京百度网讯科技有限公司 Vehicle re-identification method, apparatus, device, and storage medium
CN111931768A * 2020-08-14 2020-11-13 中国科学院重庆绿色智能技术研究院 Vehicle recognition method and system with adaptive sample distribution
CN113780165A * 2020-09-10 2021-12-10 深圳市商汤科技有限公司 Vehicle recognition method and apparatus, electronic device, and storage medium
CN112731558B * 2020-12-16 2021-12-10 中国科学技术大学 Joint inversion method and apparatus for seismic surface waves and receiver functions
CN112541463A * 2020-12-21 2021-03-23 上海眼控科技股份有限公司 Model training method, appearance segmentation method, device, and storage medium
CN112766407B * 2021-01-29 2023-12-05 北京达佳互联信息技术有限公司 Image recognition method, apparatus, and storage medium
CN112905824A * 2021-02-08 2021-06-04 智慧眼科技股份有限公司 Target vehicle tracking method, apparatus, computer device, and storage medium
CN113205546A * 2021-04-30 2021-08-03 四川云从天府人工智能科技有限公司 Method, system, medium, and device for obtaining the motion trajectory of a target vehicle
CN113569912A * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle recognition method, apparatus, electronic device, and storage medium
CN113569911A * 2021-06-28 2021-10-29 北京百度网讯科技有限公司 Vehicle recognition method, apparatus, electronic device, and storage medium
CN113806576B * 2021-08-10 2023-08-18 深圳市广电信义科技有限公司 Image-based vehicle retrieval method, apparatus, electronic device, and storage medium
CN113537167B * 2021-09-15 2021-12-03 成都数联云算科技有限公司 Vehicle appearance recognition method, system, apparatus, and medium
CN114579805B * 2022-03-01 2023-03-28 北京赛思信安技术股份有限公司 Attention-mechanism-based convolutional neural network method for similar video retrieval
CN115471732B * 2022-09-19 2023-04-18 温州丹悦线缆科技有限公司 Intelligent cable manufacturing method and system
CN115512154A * 2022-09-21 2022-12-23 东南大学 Expressway vehicle image retrieval method based on a deep learning neural network


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6516832B2 (ja) * 2015-04-08 2019-05-22 株式会社日立製作所 Image retrieval apparatus, system, and method
US9767381B2 * 2015-09-22 2017-09-19 Xerox Corporation Similarity-based detection of prominent objects using deep CNN pooling layers as features
CN105160333B (zh) * 2015-09-30 2018-08-17 深圳市华尊科技股份有限公司 Vehicle model recognition method and recognition apparatus
CN106469299B (zh) * 2016-08-31 2019-07-19 北京邮电大学 Vehicle search method and apparatus

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106023220A * 2016-05-26 2016-10-12 史方 Deep-learning-based image segmentation method for vehicle appearance components
CN106384100A * 2016-09-28 2017-02-08 武汉大学 Component-based fine-grained vehicle model recognition method
CN106778867A * 2016-12-15 2017-05-31 北京旷视科技有限公司 Target detection method and apparatus, and neural network training method and apparatus

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611414A * 2019-02-22 2020-09-01 杭州海康威视数字技术股份有限公司 Vehicle retrieval method, apparatus, and storage medium
CN111611414B (zh) * 2019-02-22 2023-10-24 杭州海康威视数字技术股份有限公司 Vehicle retrieval method, apparatus, and storage medium
JP2020181268A (ja) * 2019-04-23 2020-11-05 エヌ・ティ・ティ・コミュニケーションズ株式会社 Object association apparatus, object association system, object association method, and computer program
JP7253967B2 (ja) 2019-04-23 2023-04-07 エヌ・ティ・ティ・コミュニケーションズ株式会社 Object association apparatus, object association system, object association method, and computer program
EP3968180A4 (en) * 2019-05-06 2022-07-06 Tencent Technology (Shenzhen) Company Limited METHOD AND APPARATUS FOR IMAGE PROCESSING, COMPUTER READABLE MEDIUM AND ELECTRONIC DEVICE
JP2022511221A (ja) * 2019-09-24 2022-01-31 北京市商汤科技开发有限公司 Image processing method, image processing apparatus, processor, electronic device, storage medium, and computer program
JP7108123B2 (ja) 2019-09-24 2022-07-27 北京市商汤科技开发有限公司 Image processing method, image processing apparatus, processor, electronic device, storage medium, and computer program
US11429809B2 (en) 2019-09-24 2022-08-30 Beijing Sensetime Technology Development Co., Ltd Image processing method, image processing device, and storage medium
JP2022503426A (ja) * 2019-09-27 2022-01-12 Beijing SenseTime Technology Development Co., Ltd. Human body detection method, apparatus, computer device, and storage medium
JP7101829B2 (ja) 2019-09-27 2022-07-15 Beijing SenseTime Technology Development Co., Ltd. Human body detection method, apparatus, computer device, and storage medium
CN111340515A (zh) * 2020-03-02 2020-06-26 北京京东振世信息技术有限公司 Feature information generation and article tracing method and apparatus
CN111340515B (zh) * 2020-03-02 2023-09-26 北京京东振世信息技术有限公司 Feature information generation and article tracing method and apparatus

Also Published As

Publication number Publication date
CN108229468B (zh) 2020-02-21
JP7058669B2 (ja) 2022-04-22
US20220083801A1 (en) 2022-03-17
US11232318B2 (en) 2022-01-25
US20220083802A1 (en) 2022-03-17
US20200074205A1 (en) 2020-03-05
CN108229468A (zh) 2018-06-29
JP2020520512A (ja) 2020-07-09

Similar Documents

Publication Publication Date Title
WO2019001481A1 (zh) 2019-01-03 Vehicle appearance feature recognition and vehicle retrieval method, apparatus, storage medium, and electronic device
CN108875522B (zh) 2020-10-30 Face clustering method, apparatus, system, and storage medium
Xie et al. Multilevel cloud detection in remote sensing images based on deep learning
CN111797893B (zh) 一种神经网络的训练方法、图像分类系统及相关设备
US9466013B2 (en) Computer vision as a service
US20220101644A1 (en) Pedestrian re-identification method, device, electronic device and computer-readable storage medium
CN108256479B (zh) 人脸跟踪方法和装置
CN108427927B (zh) 目标再识别方法和装置、电子设备、程序和存储介质
EP2774119B1 (en) Improving image matching using motion manifolds
US20150039583A1 (en) Method and system for searching images
Jerripothula et al. Cats: Co-saliency activated tracklet selection for video co-localization
WO2019080411A1 (zh) 电子装置、人脸图像聚类搜索方法和计算机可读存储介质
US9626585B2 (en) Composition modeling for photo retrieval through geometric image segmentation
US20090228510A1 (en) Generating congruous metadata for multimedia
US20220415023A1 (en) Model update method and related apparatus
Ma et al. Robust topological navigation via convolutional neural network feature and sharpness measure
Wang et al. Posediffusion: Solving pose estimation via diffusion-aided bundle adjustment
WO2019100348A1 (zh) 图像检索方法和装置以及图像库的生成方法和装置
CN112766284A (zh) 图像识别方法和装置、存储介质和电子设备
Gao et al. Occluded person re-identification based on feature fusion and sparse reconstruction
Carvalho et al. Analysis of object description methods in a video object tracking environment
AU2011265494A1 (en) Kernalized contextual feature
Gao et al. Data-driven lightweight interest point selection for large-scale visual search
Ewerth et al. Estimating relative depth in single images via rankboost
Sun et al. Convolutional neural network-based coarse initial position estimation of a monocular camera in large-scale 3D light detection and ranging maps

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18822700; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2019562381; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01/04/2020))
122 Ep: pct application non-entry in european phase (Ref document number: 18822700; Country of ref document: EP; Kind code of ref document: A1)