WO2023024726A1 - Security inspection CT target recognition method and device - Google Patents

Security inspection CT target recognition method and device

Info

Publication number
WO2023024726A1
PCT/CN2022/104606
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional
target
views
semantic description
description set
Prior art date
2021-08-27
Application number
PCT/CN2022/104606
Other languages
English (en)
French (fr)
Inventor
陈志强
张丽
孙运达
郑娟
王璐
杨涛
李栋
Original Assignee
同方威视技术股份有限公司
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 同方威视技术股份有限公司 (Nuctech Company Limited) and 清华大学 (Tsinghua University)
Priority to KR1020247003300A (published as KR20240025683A)
Publication of WO2023024726A1


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N23/00 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
    • G01N23/02 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material
    • G01N23/04 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material and forming images of the material
    • G01N23/046 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material and forming images of the material using tomography, e.g. computed tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N23/00 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
    • G01N23/02 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material
    • G01N23/04 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by transmitting the radiation through the material and forming images of the material
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01V GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
    • G01V5/00 Prospecting or detecting by the use of ionising radiation, e.g. of natural or induced radioactivity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/05 Recognition of patterns representing particular kinds of hidden objects, e.g. weapons, explosives, drugs

Definitions

  • This application relates to the field of security inspection computed tomography (CT), and in particular to a security inspection CT target recognition method and device.
  • In the security inspection field, CT equipment is often used to identify targets such as contraband.
  • When security CT equipment is used for target recognition, the traditional technique is mainly: use CT reconstruction to obtain a three-dimensional tomographic image containing material-attribute information, segment the three-dimensional image into several suspect objects, and perform statistics and classification on the material attributes of the suspect objects. Although this works well for contraband such as explosives and drugs that are highly separable by material attributes, it shows clear limitations for targets that have strong three-dimensional shape features but complex material composition and physical properties.
  • To address these limitations, Patent Document 1 proposes a CT detection method and device that recognizes the three-dimensional tomographic image and a two-dimensional image of the object separately, obtaining the recognition result for explosives from the former and the recognition results for other contraband from the latter.
  • Patent Document 1: CN109975335A
  • Although Patent Document 1 attempts to improve on the above limitations, the inventors of the present application found through research that it still has the following technical problems: (1) projection is performed only along the direction orthogonal to the traveling direction (Z direction) of the object during detection, so under certain postures the projected area is too small and the shape information is incompletely expressed, and the object may also be occluded by other objects and lose its shape information, preventing accurate identification of the target; (2) although the method can also use two-dimensional projection images obtained from the three-dimensional tomographic image to identify contraband other than explosives, the recognition is confined to the two-dimensional plane and no recognition result in three-dimensional space is obtained. Since the amount of information in two-dimensional images is significantly lower than in three-dimensional data, the advantages of security CT equipment cannot be fully exploited unless the two-dimensional recognition results are effectively integrated.
  • The present application therefore proposes a security inspection CT target recognition method and device capable of improving the recognition of targets with three-dimensional shape features.
  • An embodiment of the present application provides a security inspection CT target recognition method, including: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and elevating the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target.
  • Elevating the two-dimensional semantic description set to obtain the three-dimensional recognition result includes: mapping the two-dimensional semantic description set into three-dimensional space by back projection to obtain a three-dimensional probability map; and performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target.
  • The mapping into three-dimensional space can be performed by voxel driving or pixel driving to obtain a semantic feature matrix, which is then compressed into the three-dimensional probability map.
  • Voxel driving includes: mapping each voxel of the three-dimensional CT data to a pixel in each two-dimensional view, and querying and accumulating the two-dimensional semantic description information corresponding to that pixel to generate the semantic feature matrix. Pixel driving includes: each pixel of a two-dimensional view corresponds to a straight line in the three-dimensional CT data; each pixel of each two-dimensional view, or each pixel of a region of interest, is traversed, and its two-dimensional semantic description information is propagated into three-dimensional space along the line to generate the semantic feature matrix, where the region of interest is given by the two-dimensional semantic description set.
  • Performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result includes: applying at least one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods to the probability map, so as to obtain a three-dimensional image semantic description set as the three-dimensional recognition result.
  • Performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result may include: binarizing the probability map to obtain a three-dimensional binary map; performing connected-region analysis on the binary map to obtain connected regions; and generating a three-dimensional image semantic description set for the connected regions.
  • The connected-region analysis includes: marking the connected components of the three-dimensional binary map, and performing a mask operation for each marked region to obtain the connected regions.
  • Generating the three-dimensional image semantic description set for the connected regions includes: extracting all probability values within a connected region, performing principal component analysis to obtain an analysis set, taking the analysis set as the effective voxel region of the object, and computing the three-dimensional image semantic description set over that region.
  • The three-dimensional image semantic description set takes one or more of voxels, three-dimensional regions of interest, and three-dimensional CT images as its unit and contains category information and/or confidence; alternatively, it takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and contains at least one of category information, target position information, and confidence. The position information may include a three-dimensional bounding box.
  • Likewise, the two-dimensional semantic description set takes one or more of pixels, regions of interest, and two-dimensional images as its unit and contains category information and/or confidence; alternatively, it takes a region of interest and/or a two-dimensional image as its unit and contains at least one of category information, confidence, and target position information.
  • Performing target recognition on each of the multiple two-dimensional views includes: using at least one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods.
  • Performing dimensionality reduction on the three-dimensional CT data to generate the multiple two-dimensional dimensionality-reduction views includes: setting multiple directions for the three-dimensional CT data, and performing projection or rendering along those directions. The multiple directions are arbitrary and are not limited to the direction orthogonal to the traveling direction of the object during detection.
  • The multiple two-dimensional views may further include a two-dimensional DR image obtained by a DR imaging device, and the three-dimensional recognition result may be projected onto the two-dimensional DR image and then output as the recognition result of the DR image.
  • An embodiment of the present application also provides a security inspection CT target recognition device, including: a dimensionality-reduction module that performs dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; a two-dimensional recognition module that performs target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and a dimension-elevation module that elevates the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target.
  • An embodiment of the present application also provides a machine-readable storage medium storing a program that enables a computer to execute: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and elevating the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target.
  • In this way, multiple two-dimensional dimensionality-reduction views are generated from the three-dimensional CT data, target recognition is performed on multiple two-dimensional views that include them to obtain a two-dimensional semantic description set, and the set is then elevated to obtain the three-dimensional recognition result; that is, the data is first reduced from three dimensions to two for recognition and then elevated back to generate a three-dimensional result.
  • Recognition based on two dimensions can thus effectively identify targets whose material composition and physical properties are complex but which have distinctive shape features, while the elevation step effectively integrates the two-dimensional recognition results into an information-rich three-dimensional result. The recognition of targets is thereby improved, and the real-time requirements of security inspection can also be met.
  • FIG. 1 is a flowchart showing the security inspection CT target recognition method according to the first embodiment;
  • FIG. 2 is a flowchart illustrating an example of the dimensionality-reduction processing;
  • FIG. 3 is a flowchart illustrating an example of the dimension-elevation processing;
  • FIG. 4 is a flowchart illustrating an example of three-dimensional feature extraction;
  • FIG. 5 is a flowchart showing the security inspection CT target recognition method according to the second embodiment;
  • FIG. 6 is a schematic diagram illustrating an example of the security inspection CT target recognition device according to the third embodiment;
  • FIG. 7 is a schematic diagram showing another example of the security inspection CT target recognition device according to the third embodiment.
  • FIG. 1 is a flowchart illustrating the security inspection CT target recognition method according to the first embodiment. The method is applied to a security inspection CT system and may be executed, for example, in the security CT equipment or in a server connected to it.
  • In step S10, dimensionality-reduction processing is performed; that is, the three-dimensional CT data is reduced in dimension to generate multiple two-dimensional dimensionality-reduction views. As shown in FIG. 2, step S10 may include steps S11 and S12.
  • In step S11, multiple directions are set for the three-dimensional CT data. These directions are arbitrary and are not limited to specific directions such as the one orthogonal to the traveling direction of the object during detection. Optionally, while setting the directions, or before or after doing so, the three-dimensional volume data may be preprocessed, for example by filtering out invalid voxels and precomputing the geometric parameters needed for projection or rendering, which speeds up subsequent processing.
  • In step S12, projection or rendering is performed along the multiple directions to obtain the multiple two-dimensional dimensionality-reduction views.
  • As one example, ray casting can be performed over the CT image slice sequence: from each pixel of the output image a ray is cast in the chosen direction through the entire image sequence; along the way the sequence is sampled for attribute or color information, and the sampled values are accumulated according to a chosen model until the ray has traversed the whole sequence. The accumulated attribute or color values form the dimensionality-reduced two-dimensional view, as sketched below.
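  • As an illustration only, not part of the original disclosure, the following Python sketch implements one such accumulation model, a maximum-intensity projection along an arbitrary direction; rotating the volume with SciPy and reducing along an axis stands in for explicit ray traversal, and the function and parameter names are assumptions of this sketch:

    import numpy as np
    from scipy import ndimage

    def dimensionality_reduction_view(volume, angle_deg, axes=(0, 2)):
        """Rotate the CT volume so the chosen viewing direction aligns with
        axis 0, then accumulate along that axis. Maximum-intensity projection
        is used as the accumulation model; sums or alpha blending also work."""
        rotated = ndimage.rotate(volume, angle_deg, axes=axes,
                                 reshape=False, order=1, mode="constant")
        return rotated.max(axis=0)  # one 2D dimensionality-reduction view

    # Multiple arbitrary directions, not only the one orthogonal to travel:
    # views = [dimensionality_reduction_view(ct, a) for a in (0, 30, 60, 90)]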
  • By projecting along arbitrary directions to obtain the dimensionality-reduction views, the present application avoids reducing dimension only along a specific direction, such as the one orthogonal to the object's traveling direction during detection, and thereby resolves the problems that arise in that case: (1) under certain postures of the object, its area after dimensionality reduction is too small and its shape information is incompletely expressed, so it cannot be accurately recognized; and (2) the object's shape information is lost because it is occluded by other objects, so the target cannot be accurately recognized.
  • In step S20, two-dimensional recognition processing is performed; that is, target recognition is performed on multiple two-dimensional views to obtain the two-dimensional semantic description set of the target. Here the multiple two-dimensional views include the dimensionality-reduction views obtained in step S10.
  • As the target recognition method for a two-dimensional view, at least one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods may be used. For example, a two-dimensional view is fed as input into a neural network model, and a two-dimensional semantic description set is obtained as output.
  • For example, a deep-learning-based object detection network can be used to detect the two-dimensional position of the target. The convolutional neural networks used in detection tasks are a typical deep-learning structure in computer vision; their local connectivity, weight sharing, and spatial subsampling give them a degree of invariance to translation and scaling.
  • Here, the two-dimensional semantic description set takes one or more of pixels, regions of interest, and two-dimensional images as its unit and contains category information and/or confidence; alternatively, it takes a region of interest and/or a two-dimensional image as its unit and contains at least one of category information, confidence, and target position information.
  • The category information indicates the class the target belongs to, such as gun or knife; the position information may include center coordinates, a bounding box, and the like; and the confidence indicates how likely the target is to be present, expressed as a normalized scalar or vector.
  • In other words, the two-dimensional semantic description set contains at least one of: the category information and confidence of a pixel belonging to a target; the category information, position information, and confidence of a target contained in a region of interest; and the category information, position information, and confidence of a target contained in a two-dimensional image. The information may be contained in one group or distributed across different groups.
  • Besides category information, confidence, and position information, the set may also contain other semantic information such as the posture of the target and the number of targets. An illustrative data structure for one entry is sketched below.
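  • For illustration, one possible in-memory form of a single entry of such a set (the field names here are assumptions of this sketch, not terminology fixed by the application) is:

    from dataclasses import dataclass
    from typing import Optional, Tuple

    @dataclass
    class SemanticDescription2D:
        """One entry of a two-dimensional semantic description set."""
        unit: str                # "pixel", "roi", or "image"
        category: str            # class the target belongs to, e.g. "gun", "knife"
        confidence: float        # normalized scalar in [0, 1]
        position: Optional[Tuple[float, float, float, float]] = None  # bounding box
        view_index: int = 0      # which 2D view produced this entry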
  • The target recognition method for two-dimensional views is not particularly limited in the present application, as long as it can obtain the above two-dimensional semantic description set from a two-dimensional view.
  • As described above, the two-dimensional recognition result of the target is expressed as a two-dimensional semantic description set, which is fed as input into step S30 and elevated to three dimensions, making it possible to integrate the two-dimensional recognition results into a three-dimensional one. Moreover, because the semantic description set is a flexible representation, the information it carries can be rich.
  • In step S30, dimension-elevation processing is performed; that is, the two-dimensional semantic description set is elevated to obtain the three-dimensional recognition result of the target. As shown in FIG. 3, step S30 may include steps S31 and S32.
  • In step S31, the two-dimensional semantic description set is mapped into three-dimensional space by back projection to obtain a three-dimensional probability map. Back projection can be regarded as the inverse of projection.
  • Optionally, the back-projection process can be implemented by voxel driving or pixel driving: either yields a semantic feature matrix, which is then compressed into the three-dimensional probability map.
  • Voxel driving works as follows: each voxel of the three-dimensional CT data is mapped to a pixel in each two-dimensional view, and the two-dimensional semantic description information corresponding to that pixel is queried and accumulated to generate the semantic feature matrix.
  • The voxel-to-pixel correspondence can be established as a mapping function or a lookup table to increase computation speed. Because every voxel can be computed in parallel, voxel driving is fast and improves the real-time performance of security inspection. A minimal sketch follows.
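  • A minimal voxel-driven sketch, assuming each view is given as a per-pixel score map together with a vectorized voxel-to-pixel mapping function (standing in for the mapping function or lookup table mentioned above; none of this code is from the application itself):

    import numpy as np

    def voxel_driven_feature_matrix(vol_shape, views):
        """views: list of (score_map, voxel_to_pixel) pairs, where
        voxel_to_pixel maps voxel index arrays (z, y, x) to pixel coordinates
        (row, col). Returns a matrix of shape vol_shape + (n_views,)."""
        feat = np.zeros(vol_shape + (len(views),), dtype=np.float32)
        zz, yy, xx = np.indices(vol_shape)            # every voxel, vectorized
        for k, (score_map, voxel_to_pixel) in enumerate(views):
            rows, cols = voxel_to_pixel(zz, yy, xx)   # mapping fn / lookup table
            rows = np.clip(np.rint(rows).astype(int), 0, score_map.shape[0] - 1)
            cols = np.clip(np.rint(cols).astype(int), 0, score_map.shape[1] - 1)
            feat[..., k] = score_map[rows, cols]      # query + accumulate semantics
        return feat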
  • Pixel driving works as follows: each pixel of a two-dimensional view corresponds to a straight line in the three-dimensional CT data; each pixel of each two-dimensional view, or each pixel of a region of interest given by the two-dimensional semantic description set, is traversed, and its two-dimensional semantic description information is propagated into three-dimensional space along the line to generate the semantic feature matrix. Here, too, the voxel-pixel correspondence may be obtained through a mapping function or a lookup table.
  • Under pixel driving, each pixel of the multiple two-dimensional views contributes its semantic information in turn, and the resulting semantic feature matrix is finally compressed into the three-dimensional probability map. Pixel driving can likewise be parallelized over pixels, which also benefits computation speed and the real-time performance of security inspection. A minimal sketch follows.
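  • The pixel-driven variant can be sketched as below, again under assumed geometry: pixel_ray is a hypothetical callback returning sample points of the three-dimensional line that corresponds to a pixel:

    import numpy as np

    def pixel_driven_feature_matrix(vol_shape, score_map, pixel_ray, n_samples=64):
        """For each pixel with a nonzero score (i.e. inside a region of
        interest), propagate its 2D semantic value along its 3D line."""
        feat = np.zeros(vol_shape, dtype=np.float32)
        for row, col in zip(*np.nonzero(score_map)):   # parallelizable per pixel
            for t in np.linspace(0.0, 1.0, n_samples):
                z, y, x = (int(c) for c in pixel_ray(row, col, t))
                if (0 <= z < vol_shape[0] and 0 <= y < vol_shape[1]
                        and 0 <= x < vol_shape[2]):
                    feat[z, y, x] += score_map[row, col]  # propagate along line
        return feat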
  • As the descriptions of voxel driving and pixel driving show, the semantic feature matrix is generated from the two-dimensional semantic description information according to its spatial correspondence; it is a matrix obtained by digitizing and consolidating that information. For example, for the category information in the two-dimensional semantic description set, a semantic feature matrix can be obtained per target category; for instance, the corresponding value in the matrix may be taken as 1 where the category applies and 0 where it does not. Semantic feature matrices for the other semantic information in the set can be obtained in a similar way.
  • Typical methods for compressing the semantic feature matrix include weighted averaging and principal component analysis; the input is the semantic feature matrix and the output is the probability map.
  • As an example, suppose there are two two-dimensional views and hence two two-dimensional semantic description sets, in which the semantic information at a given pixel (or region of interest, or image) is represented numerically as 1 or 0. Back projection maps these into three-dimensional space and generates a semantic feature matrix whose values are vectors composed of 0s and 1s; a weighted average then yields the probability value of each voxel. For instance, a voxel whose semantic feature value is v = [0, 1] receives, under equal weights, the probability value 0.5. When the semantic feature matrices corresponding to all target categories are compressed, the dimension of the output probability-map value is determined by the number of target categories. The way of obtaining probability values described here is only an example; other methods may be used, for example weighting the semantic feature values with unequal weights.
  • As another example, one or more vectors of the three-dimensional semantic feature matrix can be used as the input variables of a principal component analysis; the resulting output variables (the principal components) are normalized and used as the probability-map values of the corresponding voxels. These operations preserve real-time performance while effectively integrating the two-dimensional recognition results, improving the final recognition. Both compression schemes are sketched below.
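  • A minimal sketch of both compression schemes; the equal-weight case reproduces the v = [0, 1] example above, and the normalization details are assumptions of this sketch:

    import numpy as np

    def compress_weighted_average(feat, weights=None):
        """feat has shape (..., n_views); returns the probability map obtained
        by (weighted) averaging over the view axis."""
        n = feat.shape[-1]
        w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
        return (feat * (w / w.sum())).sum(axis=-1)

    def compress_pca(feat):
        """First principal component over the view axis, min-max normalized."""
        flat = feat.reshape(-1, feat.shape[-1])
        centered = flat - flat.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        pc = centered @ vt[0]
        pc = (pc - pc.min()) / (np.ptp(pc) + 1e-12)   # normalize to [0, 1]
        return pc.reshape(feat.shape[:-1])

    # Equal weights: a voxel with semantic feature value [0, 1] gets 0.5.
    assert compress_weighted_average(np.array([[0.0, 1.0]]))[0] == 0.5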
  • In step S32, feature extraction is performed on the three-dimensional probability map to obtain the three-dimensional recognition result of the target.
  • For example, at least one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods is applied to the three-dimensional probability map to extract features, yielding a three-dimensional image semantic description set as the three-dimensional recognition result.
  • As one example, the three-dimensional probability map is fed as input into a deep learning model, whose outputs are three-dimensional recognition results such as confidences and three-dimensional bounding boxes. The model used here can be a classification or object detection network with relatively few layers: the preceding steps have already condensed and abstracted the information in the raw three-dimensional CT data close to the final goal of contraband recognition, so a simple feature extractor suffices. An illustrative shallow network is sketched below.
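  • As a sketch only (PyTorch is an assumption here; the application does not prescribe a framework), a classification network with few layers over the probability map could look like:

    import torch
    import torch.nn as nn

    class Tiny3DClassifier(nn.Module):
        """Deliberately shallow: the probability map is already condensed, so
        a small 3D network suffices for per-category confidences."""
        def __init__(self, n_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool3d(2),
                nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1),
            )
            self.head = nn.Linear(16, n_classes)

        def forward(self, prob_map):
            # prob_map: (batch, 1, D, H, W) -> per-class confidence logits
            return self.head(self.features(prob_map).flatten(1))

    # confidences = Tiny3DClassifier(n_classes=3)(prob_map).softmax(dim=-1)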
  • Here, the three-dimensional image semantic description set takes one or more of voxels, three-dimensional regions of interest, and three-dimensional CT images as its unit and contains category information and/or confidence; alternatively, it takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and contains at least one of category information, target position information, and confidence. The position information of a target in a three-dimensional CT image may include a three-dimensional bounding box.
  • In other words, the three-dimensional image semantic description set contains at least one of: the category information and confidence of a voxel belonging to a target; the category information, position information, and confidence of a target contained in a three-dimensional region of interest (VOI); and the category information, position information, and confidence of a target contained in a three-dimensional CT image. The information may be contained in one group or distributed across different groups.
  • Since the three-dimensional image semantic description set is generated from the three-dimensional probability map, which is built from the two-dimensional semantic description set, the kinds of semantic information the two sets contain are consistent or mutually convertible.
  • Through this dimension-elevation processing, the two-dimensional semantic description set is elevated to obtain the three-dimensional recognition result of the target, solving the problem that the amount of information drops significantly when only two-dimensional recognition after dimensionality reduction is performed; information loss is reduced while two-dimensional recognition is still used, balancing the real-time performance and accuracy of security inspection.
  • As another example of step S32, an image processing method may be used; as shown in FIG. 4, step S32 may then include steps S321 to S323.
  • In step S321, the three-dimensional probability map is binarized to obtain a three-dimensional binary map.
  • In step S322, connected-region analysis is performed on the three-dimensional binary map to obtain connected regions. As one example, connected components can be marked on the three-dimensional binary map, and a mask operation performed for each marked region to obtain the connected regions.
  • In step S323, a three-dimensional image semantic description set is generated for the connected regions. The set can include a three-dimensional bounding box, which gives the spatial boundary of the target in the three-dimensional image and shows its position, extent, posture, and shape more intuitively, helping the security inspector judge accurately whether the target is a dangerous article. Steps S321 to S323 are sketched below.
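  • Steps S321 to S323 can be sketched with SciPy as follows (the threshold value and the output fields are assumptions of this illustration):

    import numpy as np
    from scipy import ndimage

    def extract_3d_semantic_descriptions(prob_map, threshold=0.5):
        """S321: binarize; S322: label connected components and mask each
        region; S323: emit a 3D bounding box and mean confidence per region."""
        binary = prob_map >= threshold                 # 3D binary map
        labels, n_regions = ndimage.label(binary)      # connected-component marking
        descriptions = []
        for i, slc in enumerate(ndimage.find_objects(labels), start=1):
            if slc is None:
                continue
            mask = labels[slc] == i                    # mask operation per region
            descriptions.append({
                "bbox_3d": [(s.start, s.stop) for s in slc],  # (z, y, x) extents
                "confidence": float(prob_map[slc][mask].mean()),
            })
        return descriptions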
  • In the first embodiment, multiple two-dimensional dimensionality-reduction views are generated from the three-dimensional CT data, target recognition is performed on multiple two-dimensional views that include them to obtain a two-dimensional semantic description set, and that set is then elevated to obtain the three-dimensional recognition result; that is, recognition first reduces from three dimensions to two and then elevates back to generate a three-dimensional result.
  • FIG. 5 is a flowchart showing the security inspection CT target recognition method according to the second embodiment.
  • The second embodiment differs from the first in that it uses not only the two-dimensional dimensionality-reduction images generated from the three-dimensional CT data but also two-dimensional digital radiography (DR) data for target recognition.
  • For example, in step S20 the multiple two-dimensional views further include a two-dimensional DR image, on which target recognition is also performed to obtain a two-dimensional semantic description set of the target. The DR image is acquired by a DR imaging device configured independently of the security CT equipment and depicts the same inspected object as the three-dimensional CT data.
  • In the second embodiment, a step S40 may precede step S20: a two-dimensional DR image is obtained from the DR imaging device and used as one of the multiple two-dimensional views. Step S40 can be performed in parallel with step S10.
  • In step S30, the dimension is then elevated not only for the two-dimensional semantic description set of the dimensionality-reduction images but also for that of the two-dimensional DR image, thereby obtaining the three-dimensional recognition result.
  • The two-dimensional DR image differs in imaging principle and character from the dimensionality-reduction images generated from the three-dimensional CT data; using it for target recognition as well increases the information available for recognition and thereby improves accuracy.
  • Optionally, a step S50 may follow step S30: the three-dimensional recognition result generated in step S30 is projected onto the two-dimensional DR image and output as the recognition result of the DR image. Some inspectors prefer, by working habit, to confirm results on the DR image; a result computed directly on the DR image degrades when the target is heavily occluded or oddly posed there, whereas the three-dimensional result integrates the semantic information of several views and is more reliable, so projecting it onto the DR image serves that habit while improving precision. A sketch of the projection follows.
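  • A sketch of step S50 under an assumed DR imaging geometry (project_point, which maps a 3D point to DR-image pixel coordinates, is hypothetical):

    import numpy as np

    def project_bbox_to_dr(bbox_3d, project_point):
        """Project the 8 corners of a 3D bounding box into the DR image and
        take the enclosing 2D rectangle as the result to display."""
        (z0, z1), (y0, y1), (x0, x1) = bbox_3d
        corners = [(z, y, x) for z in (z0, z1) for y in (y0, y1) for x in (x0, x1)]
        uv = np.array([project_point(p) for p in corners], dtype=float)
        (u0, v0), (u1, v1) = uv.min(axis=0), uv.max(axis=0)
        return (u0, v0, u1, v1)   # 2D box in DR-image coordinates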
  • The results of step S30 and step S50 may also be output simultaneously, so that the three-dimensional recognition result and the recognition result on the DR image can be compared and verified against each other, helping security personnel judge more accurately whether the target is a dangerous article.
  • FIG. 6 is a schematic diagram showing the security inspection CT target recognition device according to the third embodiment.
  • As shown in FIG. 6, the security inspection CT target recognition device 100 of this embodiment includes a dimensionality-reduction module 10, a two-dimensional recognition module 20, and a dimension-elevation module 30.
  • The dimensionality-reduction module 10 reduces the dimensionality of the three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; that is, it can perform the processing of step S10 in the first and second embodiments.
  • The two-dimensional recognition module 20 performs target recognition on multiple two-dimensional views, which include the dimensionality-reduction views, and obtains the two-dimensional semantic description set of the target; that is, it can perform the processing of step S20 in the first and second embodiments.
  • The dimension-elevation module 30 elevates the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target; that is, it can perform the processing of step S30 in the first and second embodiments. For the processing of these three modules, reference may be made to the first and second embodiments, so it is not repeated here.
  • As shown in FIG. 7, the device 100 may further include a DR image acquisition module 40, which obtains a two-dimensional DR image from the DR imaging device and uses it as one of the multiple two-dimensional views; that is, it can execute step S40 of the second embodiment.
  • The device 100 may further include a DR output module 50, which projects the three-dimensional recognition result generated by the dimension-elevation module 30 onto the two-dimensional DR image and outputs it as the recognition result of the DR image; that is, it can execute step S50 of the second embodiment.
  • The security inspection CT target recognition device 100 may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two.
  • For example, the device 100 may be implemented as a combination of software and hardware on any suitable electronic equipment provided with a processor, such as a desktop computer, tablet computer, smartphone, or server; it may be the control computer of a security CT system, or a server connected to the security CT scanning equipment in such a system.
  • Alternatively, the device 100 may be implemented as software modules on any suitable electronic equipment such as a desktop computer, tablet computer, smartphone, or server, for example software modules installed on the control computer of the security CT system or on a server connected to the security CT scanning equipment in the system.
  • The processor of the device 100 can execute the security inspection CT target recognition method described above.
  • The device 100 may further include a memory (not shown), a communication module (not shown), and the like.
  • The memory can store the steps for executing the security inspection CT target recognition method as well as data related to the recognition; it may be, for example, ROM (read-only memory) or RAM (random access memory). It provides storage space for program code that executes any step of the above method; when this code is read and executed by the processor, the method is carried out.
  • Such program code can be read from or written into one or more computer program products, which comprise program-code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks and are typically portable or fixed storage units. The program code for executing any step of the above method may also be downloaded over a network and may, for example, be compressed in a suitable form.
  • The communication module of the device 100 may support establishing a direct (e.g., wired) or wireless communication channel between the device 100 and external electronic equipment and communicating over the established channel; for example, it receives the three-dimensional CT data from the CT scanning equipment via a network.
  • In addition, the device 100 may further include output units such as a display, a microphone, and a speaker to output the target recognition results.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Pulmonology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Radiology & Medical Imaging (AREA)
  • Immunology (AREA)
  • General Life Sciences & Earth Sciences (AREA)
  • High Energy & Nuclear Physics (AREA)
  • Pathology (AREA)
  • Geophysics (AREA)
  • Analysing Materials By The Use Of Radiation (AREA)
  • Image Analysis (AREA)

Abstract

A security inspection CT target recognition method and device. The method includes: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views (S10); performing target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of the target, the multiple two-dimensional views including the multiple two-dimensional dimensionality-reduction views (S20); and elevating the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target (S30).

Description

Security inspection CT target recognition method and device
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese patent application No. 202110998653.3, filed in China on August 27, 2021, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
This application relates to the field of security inspection computed tomography (CT), and in particular to a security inspection CT target recognition method and device.
BACKGROUND
At present, in the field of security inspection, CT equipment is often used to identify targets such as contraband. When security CT equipment is used for target recognition, the traditional technique is mainly: use CT reconstruction to obtain a three-dimensional tomographic image containing material-attribute information, segment the three-dimensional image into several suspect objects, and perform statistics and classification on the material attributes of the suspect objects.
However, although the traditional technique described above performs well in identifying contraband such as explosives and drugs that are highly separable by material attributes, it shows clear limitations in identifying targets that have strong three-dimensional shape features but relatively complex material composition and physical properties.
To address these limitations, Patent Document 1 proposes a CT detection method and device that recognizes the three-dimensional tomographic image and a two-dimensional image of the object separately, obtaining the recognition result for explosives from the former and the recognition results for other contraband from the latter.
Patent Document 1: CN109975335A
SUMMARY
Although Patent Document 1 attempts to improve on the above limitations, the inventors of the present application found through research that it still has the following technical problems:
(1) Projection is performed only along the direction orthogonal to the traveling direction (Z direction) of the object during detection. Under certain postures of the object, this makes the projected area too small and the shape information incompletely expressed, so the target cannot be accurately identified. Projecting in this way may also leave the object occluded by other objects, again losing its shape information and preventing accurate identification.
(2) Although the method can also use two-dimensional projection images obtained from the three-dimensional tomographic image to identify contraband other than explosives, the recognition is confined to the two-dimensional plane and no recognition result in three-dimensional space is obtained. Since the amount of information in two-dimensional images is significantly lower than in three-dimensional data, the advantages of security CT equipment cannot be fully exploited unless the two-dimensional recognition results are effectively integrated.
The present application proposes a security inspection CT target recognition method and device capable of improving the recognition of targets with three-dimensional shape features.
An embodiment of the present application provides a security inspection CT target recognition method, including: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and elevating the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target.
In the above method, elevating the two-dimensional semantic description set to obtain the three-dimensional recognition result includes: mapping the set into three-dimensional space by back projection to obtain a three-dimensional probability map; and performing feature extraction on the probability map to obtain the three-dimensional recognition result.
In the above method, mapping the two-dimensional semantic description set into three-dimensional space by back projection to obtain the three-dimensional probability map includes: performing the mapping by voxel driving or pixel driving to obtain a semantic feature matrix, and compressing the semantic feature matrix into the three-dimensional probability map.
In the above method, voxel driving includes: mapping each voxel of the three-dimensional CT data to a pixel in each two-dimensional view, and querying and accumulating the two-dimensional semantic description information corresponding to that pixel to generate the semantic feature matrix; pixel driving includes: each pixel of a two-dimensional view corresponds to a straight line in the three-dimensional CT data, and each pixel of each two-dimensional view, or each pixel of a region of interest, is traversed, its two-dimensional semantic description information being propagated into three-dimensional space along the line to generate the semantic feature matrix, where the region of interest is given by the two-dimensional semantic description set.
In the above method, in voxel driving or pixel driving, the correspondence between voxels and pixels is obtained through a mapping function or a lookup table.
In the above method, performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result includes: applying at least one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods to the probability map, so as to obtain a three-dimensional image semantic description set as the three-dimensional recognition result.
In the above method, performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result includes: binarizing the probability map to obtain a three-dimensional binary map; performing connected-region analysis on the binary map to obtain connected regions; and generating a three-dimensional image semantic description set for the connected regions.
In the above method, the connected-region analysis includes: marking the connected components of the three-dimensional binary map, and performing a mask operation for each marked region to obtain the connected regions.
In the above method, generating the three-dimensional image semantic description set for the connected regions includes: extracting all probability values within a connected region, performing principal component analysis to obtain an analysis set, taking the analysis set as the effective voxel region of the object, and computing the three-dimensional image semantic description set from it.
In the above method, the three-dimensional image semantic description set takes one or more of voxels, three-dimensional regions of interest, and three-dimensional CT images as its unit and contains category information and/or confidence; alternatively, it takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and contains at least one of category information, target position information, and confidence.
In the above method, the position information contains a three-dimensional bounding box.
In the above method, the two-dimensional semantic description set takes one or more of pixels, regions of interest, and two-dimensional images as its unit and contains category information and/or confidence; alternatively, it takes a region of interest and/or a two-dimensional image as its unit and contains at least one of category information, confidence, and target position information.
In the above method, performing target recognition on each of the multiple two-dimensional views includes: using at least one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods.
In the above method, performing dimensionality reduction on the three-dimensional CT data to generate the multiple two-dimensional dimensionality-reduction views includes: setting multiple directions for the three-dimensional CT data; and performing projection or rendering along those directions.
In the above method, the multiple directions are arbitrary and are not limited to the direction orthogonal to the traveling direction of the object during detection.
In the above method, the multiple two-dimensional views further contain a two-dimensional DR image obtained by a DR imaging device.
In the above method, the three-dimensional recognition result is projected onto the two-dimensional DR image and then output as the recognition result of the DR image.
An embodiment of the present application also provides a security inspection CT target recognition device, including: a dimensionality-reduction module that performs dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; a two-dimensional recognition module that performs target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and a dimension-elevation module that elevates the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target.
An embodiment of the present application also provides a machine-readable storage medium storing a program that enables a computer to execute: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, to obtain a two-dimensional semantic description set of the target; and elevating the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target.
As described above, in the present application multiple two-dimensional dimensionality-reduction views are generated from the three-dimensional CT data, target recognition is performed on multiple two-dimensional views that include them to obtain a two-dimensional semantic description set, and the set is then elevated to obtain the three-dimensional recognition result; that is, the data is first reduced from three dimensions to two for recognition and then elevated back to generate a three-dimensional result. Recognition based on two dimensions can thus effectively identify targets whose material composition and physical properties are complex but which have distinctive shape features, while the elevation step effectively integrates the two-dimensional recognition results into an information-rich three-dimensional result. The recognition of targets is thereby improved, and the real-time requirements of security inspection can also be met.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart showing the security inspection CT target recognition method according to the first embodiment;
FIG. 2 is a flowchart showing an example of the dimensionality-reduction processing;
FIG. 3 is a flowchart showing an example of the dimension-elevation processing;
FIG. 4 is a flowchart showing an example of three-dimensional feature extraction;
FIG. 5 is a flowchart showing the security inspection CT target recognition method according to the second embodiment;
FIG. 6 is a schematic diagram showing an example of the security inspection CT target recognition device according to the third embodiment;
FIG. 7 is a schematic diagram showing another example of the security inspection CT target recognition device according to the third embodiment.
DETAILED DESCRIPTION
Exemplary embodiments and examples of the present application are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the application, it should be understood that the application can be implemented in various forms and should not be limited by the embodiments or examples set forth here; rather, they are provided so that the application can be understood more clearly.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments or examples described here can be implemented in orders other than those illustrated or described. Moreover, the terms "include" and "have", and any variants of them, are intended to cover non-exclusive inclusion: a process, method, system, product, or device that includes a series of steps or units is not limited to the steps or units expressly listed, but may include other steps or units not expressly listed. Identical or similar reference signs in the text denote elements with identical or similar functions.
<First embodiment>
As the first embodiment of the present application, a security inspection CT target recognition method is provided. FIG. 1 is a flowchart showing this method. The method is applied to a security inspection CT system and may be executed, for example, in the security CT equipment or in a server connected to it.
As shown in FIG. 1, in step S10, dimensionality-reduction processing is performed; that is, the three-dimensional CT data is reduced in dimension to generate multiple two-dimensional dimensionality-reduction views.
For example, as shown in FIG. 2, step S10 may include steps S11 and S12.
In step S11, multiple directions are set for the three-dimensional CT data. These directions are arbitrary and are not limited to specific directions such as the one orthogonal to the traveling direction of the object during detection.
In addition, while setting the multiple directions, or before or after doing so, preprocessing of the three-dimensional volume data may optionally be performed, for example filtering out invalid voxels and precomputing the geometric parameters needed for projection or rendering, thereby speeding up subsequent processing.
In step S12, projection or rendering is performed along the multiple directions to obtain the multiple two-dimensional dimensionality-reduction views.
As one example, ray casting can be performed over the CT image slice sequence: from each pixel of the image a ray is cast in a specific direction through the entire image sequence; along the way the sequence is sampled for attribute or color information, and the values are accumulated according to a chosen model until the ray has traversed the whole sequence; the accumulated attribute or color values form the dimensionality-reduced two-dimensional view.
In the present application, obtaining the dimensionality-reduction views by projecting along arbitrary directions avoids reducing dimension only along a specific direction, such as the one orthogonal to the object's traveling direction during detection, and thus resolves the problems that arise in that case: (1) under certain postures of the object, its area after dimensionality reduction is too small and its shape information is incompletely expressed, so it cannot be accurately recognized; and (2) the object's shape information is lost because it is occluded by other objects, so the target cannot be accurately recognized.
In step S20, two-dimensional recognition processing is performed; that is, target recognition is performed on multiple two-dimensional views to obtain the two-dimensional semantic description set of the target. Here the multiple two-dimensional views include the multiple dimensionality-reduction views obtained in step S10.
For example, as the target recognition method for a two-dimensional view, at least one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods may be used.
For example, a two-dimensional view is fed as input into a neural network model, and a two-dimensional semantic description set is obtained as output.
For example, a deep-learning-based object detection network can be used to detect the two-dimensional position of the target. The convolutional neural networks used in detection tasks are a typical deep-learning structure in computer vision tasks; they have local connectivity, weight sharing, and spatial subsampling, which give them a degree of invariance to translation and scaling. Here, the two-dimensional semantic description set takes one or more of pixels, regions of interest, and two-dimensional images as its unit and contains category information and/or confidence; alternatively, it takes a region of interest and/or a two-dimensional image as its unit and contains at least one of category information, confidence, and target position information. The category information indicates the class to which the target belongs, such as gun or knife; the position information may include center coordinates, a bounding box, and the like; and the confidence indicates how likely the target is to be present and may be a normalized scalar or vector.
In other words, the two-dimensional semantic description set contains at least one of: the category information and confidence of a pixel belonging to a target; the category information, position information, and confidence of a target contained in a region of interest; and the category information, position information, and confidence of a target contained in a two-dimensional image. The information may be contained in one group or distributed across different groups.
Besides category information, confidence, and position information, the description set of two-dimensional semantic information may also contain other semantic information such as the posture of the target and the number of targets.
The target recognition method for two-dimensional views is not particularly limited in the present application, as long as it can obtain the above two-dimensional semantic description set from a two-dimensional view.
As described above, in the present application the two-dimensional recognition result of the target is expressed as a two-dimensional semantic description set, which is fed as input into step S30 and elevated to three dimensions, making it possible to integrate the two-dimensional recognition results into a three-dimensional one. Moreover, because the semantic description set is a flexible representation, the information it carries can be rich.
In step S30, dimension-elevation processing is performed; that is, the two-dimensional semantic description set is elevated to obtain the three-dimensional recognition result of the target.
For example, as shown in FIG. 3, step S30 may include steps S31 and S32.
In step S31, the two-dimensional semantic description set is mapped into three-dimensional space by back projection to obtain a three-dimensional probability map. Back projection can be regarded as the inverse of projection.
Optionally, the back-projection process can be implemented by voxel driving, pixel driving, or the like. For example, a semantic feature matrix can be obtained through voxel driving or pixel driving and then compressed into the three-dimensional probability map.
Here, voxel driving includes: mapping each voxel of the three-dimensional CT data to a pixel in each two-dimensional view, and querying and accumulating the two-dimensional semantic description information corresponding to that pixel to generate the semantic feature matrix.
The voxel-to-pixel correspondence can be established as a mapping function or a lookup table to increase computation speed.
As described above, under voxel driving each voxel of the three-dimensional CT data is traversed and its semantic feature matrix obtained in turn; finally the semantic feature matrix is compressed into the three-dimensional probability map.
Voxel driving can compute each voxel in parallel, so it is fast and improves the real-time performance of security inspection.
Pixel driving includes: each pixel of a two-dimensional view corresponds to a straight line in the three-dimensional CT data; each pixel of each two-dimensional view, or each pixel of a region of interest, is traversed, and the pixel's two-dimensional semantic description information is propagated into three-dimensional space along the line to generate the semantic feature matrix, where the region of interest is given by the two-dimensional semantic description set. Here, too, the voxel-pixel correspondence can be obtained through a mapping function or a lookup table.
As described above, under pixel driving each pixel of the multiple two-dimensional views yields its semantic feature matrix in turn; finally the semantic feature matrix is compressed into the three-dimensional probability map.
Pixel driving can likewise compute in parallel over pixels, which also improves computation speed and the real-time performance of security inspection.
As the above descriptions of voxel driving and pixel driving show, the semantic feature matrix is generated from the two-dimensional semantic description information according to its spatial correspondence; it is a matrix obtained by digitizing and consolidating that information. For example, for the category information in the two-dimensional semantic description set, a semantic feature matrix can be obtained per target category; for instance, the corresponding value in the matrix may be taken as 1 where the category applies and 0 where it does not. Semantic feature matrices for the other semantic information in the set can be obtained in a similar way.
Typical methods for compressing the semantic feature matrix include weighted averaging and principal component analysis. The input is the semantic feature matrix and the output is the probability map.
As an example, suppose there are two two-dimensional views and therefore two two-dimensional semantic description sets, in which the semantic information at a pixel (or region of interest, or image) is represented numerically as 1 or 0. Back projection maps these into three-dimensional space to generate a semantic feature matrix whose values are vectors composed of 0s and 1s, and the weighted-average method computes the probability value of each voxel: for a voxel whose semantic feature value is v = [0, 1], with equal weights the probability value is 0.5. When the semantic feature matrices corresponding to all target categories are compressed, the dimension of the output probability-map value is determined by the number of target categories. The way of obtaining probability values described here is only an example; they can also be obtained in other ways, for example with unequal weights applied to the semantic feature values.
As another example, one or more vectors of the three-dimensional semantic feature matrix can be used as input variables of a principal component analysis; the analysis yields output variables as principal components, which are normalized and used as the probability values of the corresponding voxels.
Using such computations not only guarantees real-time operation but also effectively integrates the two-dimensional recognition results and improves the final recognition.
In step S32, feature extraction is performed on the three-dimensional probability map to obtain the three-dimensional recognition result of the target.
For example, at least one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods is applied to the three-dimensional probability map to extract features, yielding a three-dimensional image semantic description set as the three-dimensional recognition result.
As one example, the three-dimensional probability map is fed as input into a deep learning model, whose outputs are three-dimensional recognition results such as confidences and three-dimensional bounding boxes. The model used here can be a classification or object detection network with relatively few layers: after the preceding steps, the information contained in the raw three-dimensional CT data has been effectively condensed and abstracted and is close to the final goal of contraband recognition, so a simple feature extraction method can extract the three-dimensional semantic description set quickly and accurately.
Here, the three-dimensional image semantic description set takes one or more of voxels, three-dimensional regions of interest, and three-dimensional CT images as its unit and contains category information and/or confidence; alternatively, it takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and contains at least one of category information, target position information, and confidence. The position information of a target in a three-dimensional CT image may include a three-dimensional bounding box.
In other words, the three-dimensional image semantic description set contains at least one of: the category information and confidence of a voxel belonging to a target; the category information, position information, and confidence of a target contained in a three-dimensional region of interest (VOI); and the category information, position information, and confidence of a target contained in a three-dimensional CT image. The information may be contained in one group or distributed across different groups.
Since the three-dimensional image semantic description set is generated from the three-dimensional probability map, which is built from the two-dimensional semantic description set, the kinds of semantic information the two sets contain are consistent or mutually convertible.
In the present application, through the dimension-elevation processing described above, the two-dimensional semantic description set is elevated to obtain the three-dimensional recognition result of the target, which solves the problem that the amount of information drops significantly when two-dimensional recognition alone is performed after dimensionality reduction; information loss is reduced while two-dimensional recognition is still used, balancing the real-time performance and accuracy of security inspection.
As another example of step S32, an image processing method may be used; as shown in FIG. 4, step S32 may then include steps S321 to S323.
In step S321, the three-dimensional probability map is binarized to obtain a three-dimensional binary map.
In step S322, connected-region analysis is performed on the three-dimensional binary map to obtain connected regions.
As one example, connected components can be marked on the three-dimensional binary map, and a mask operation performed for each marked region to obtain the connected regions.
In step S323, a three-dimensional image semantic description set is generated for the connected regions.
In this case, the three-dimensional image semantic description set can include a three-dimensional bounding box; including one gives the spatial boundary of the target in the three-dimensional image and shows the target's position, extent, posture, shape, and so on more intuitively, which helps the security inspector judge accurately whether the target is a dangerous article.
As one example, all probability values within a connected region can be extracted and subjected to principal component analysis to obtain an analysis set, which is taken as the effective voxel region of the object; the three-dimensional image semantic description set is computed over this effective voxel region, further improving the accuracy of three-dimensional recognition.
In the first embodiment, multiple two-dimensional dimensionality-reduction views are generated from the three-dimensional CT data, target recognition is performed on multiple two-dimensional views that include them to obtain a two-dimensional semantic description set, and the set is then elevated to obtain the three-dimensional recognition result; that is, the data is first reduced from three dimensions to two for recognition and then elevated back to generate a three-dimensional result. Recognition based on two dimensions can thus effectively identify targets whose material composition and physical properties are complex but which have shape features, while the elevation step effectively integrates the two-dimensional recognition results into an information-rich three-dimensional result. The recognition of targets is thereby improved, and the real-time requirements of security inspection can also be met.
<Second embodiment>
As the second embodiment of the present application, another security inspection CT target recognition method is provided. FIG. 5 is a flowchart showing the security inspection CT target recognition method according to the second embodiment.
The second embodiment differs from the first in that it uses not only the two-dimensional dimensionality-reduction images generated from the three-dimensional CT data, but also two-dimensional digital radiography (DR) data for target recognition.
For example, in step S20 the multiple two-dimensional views further include a two-dimensional DR image, on which target recognition is also performed to obtain a two-dimensional semantic description set of the target. Here, the DR image is obtained by a DR imaging device configured separately and independently of the security CT equipment, and it is an image of the same inspected object as the three-dimensional CT data.
In the second embodiment, as shown in FIG. 5, a step S40 may be provided before step S20. In step S40, a two-dimensional DR image is obtained from the DR imaging device and used as one of the multiple two-dimensional views. Step S40 can be performed in parallel with step S10.
In this case, in step S30 the dimension is elevated not only for the two-dimensional semantic description set of the dimensionality-reduction images but also for that of the two-dimensional DR image, thereby obtaining the three-dimensional recognition result.
The two-dimensional DR image is a two-dimensional image whose principle and properties differ from those of the dimensionality-reduction images generated from the three-dimensional CT data; using such a DR image for target recognition as well increases the amount of information available for recognition and thereby improves its accuracy.
In the second embodiment, as shown in FIG. 5, an optional step S50 may follow step S30. In step S50, the three-dimensional recognition result generated in step S30 is projected onto the two-dimensional DR image and then output as the recognition result of the DR image.
Owing to working habits and requirements, some security inspectors want to confirm the recognition result on the two-dimensional DR image. However, if the recognition result of the DR image is used directly, the target's information is incomplete when it is heavily occluded or specially posed in the DR image, degrading recognition accuracy, whereas the three-dimensional recognition result effectively integrates the semantic information of several two-dimensional views and is therefore more precise and reliable. Projecting the three-dimensional recognition result onto the DR image and outputting it as the recognition result thus meets the inspectors' need to confirm results on the DR image while improving the precision of the result.
Alternatively, the results of step S30 and step S50 may be output at the same time.
The three-dimensional recognition result and the recognition result on the two-dimensional DR image can then be compared and verified against each other, helping security personnel judge more accurately whether the target is a dangerous article.
<Third embodiment>
As the third embodiment of the present application, a security inspection CT target recognition device is provided. FIG. 6 is a schematic diagram showing the security inspection CT target recognition device according to the third embodiment.
As shown in FIG. 6, the security inspection CT target recognition device 100 of this embodiment includes a dimensionality-reduction module 10, a two-dimensional recognition module 20, and a dimension-elevation module 30.
The dimensionality-reduction module 10 performs dimensionality reduction on the three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; that is, it can perform the processing of step S10 in the first and second embodiments.
The two-dimensional recognition module 20 performs target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduction views, and obtains the two-dimensional semantic description set of the target; that is, it can perform the processing of step S20 in the first and second embodiments.
The dimension-elevation module 30 elevates the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target; that is, it can perform the processing of step S30 in the first and second embodiments.
For the processing of the dimensionality-reduction module 10, the two-dimensional recognition module 20, and the dimension-elevation module 30, reference may be made to the first and second embodiments; it is therefore not repeated here.
In addition, as shown in FIG. 7, the device 100 may further include a DR image acquisition module 40, which obtains a two-dimensional DR image from the DR imaging device and uses it as one of the multiple two-dimensional views; that is, the module 40 can perform step S40 of the second embodiment.
The device 100 may further include a DR output module 50, which projects the three-dimensional recognition result generated by the dimension-elevation module 30 onto the two-dimensional DR image and then outputs it as the recognition result of the DR image; that is, the module 50 can perform step S50 of the second embodiment.
In the present application, the security inspection CT target recognition device 100 may be implemented in hardware, in software modules running on one or more processors, or in a combination of the two.
For example, the device 100 may be implemented as a combination of software and hardware on any suitable electronic equipment provided with a processor, such as a desktop computer, tablet computer, smartphone, or server; for example, it may be the control computer of a security CT system or a server connected to the security CT scanning equipment in such a system.
Alternatively, the device 100 may be implemented as software modules on any suitable electronic equipment such as a desktop computer, tablet computer, smartphone, or server, for example software modules installed on the control computer of the security CT system or on a server connected to the security CT scanning equipment in the system.
The processor of the device 100 can execute the security inspection CT target recognition method described above.
The device 100 may further include a memory (not shown), a communication module (not shown), and the like.
The memory of the device 100 can store the steps for executing the security inspection CT target recognition method as well as data related to the recognition. The memory may be, for example, ROM (read-only memory) or RAM (random access memory). It provides storage space for program code that executes any step of the above method; when this code is read and executed by the processor, the method is carried out. The program code can be read from or written into one or more computer program products, which include program-code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks and are typically portable or fixed storage units. The program code for executing any step of the above method may also be downloaded over a network and may, for example, be compressed in a suitable form.
The communication module of the device 100 may support establishing a direct (e.g., wired) or wireless communication channel between the device 100 and external electronic equipment and communicating over the established channel; for example, it receives the three-dimensional CT data from the CT scanning equipment via a network.
In addition, the device 100 may further include output units such as a display, a microphone, and a speaker to output the target recognition results.
The security inspection CT target recognition device 100 described above achieves the same effects as the first and second embodiments.
Although embodiments and specific examples of the present application have been described above with reference to the drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present application, and all such modifications and variations fall within the scope defined by the claims.

Claims (19)

  1. A security inspection CT target recognition method, comprising:
    performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views;
    performing target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target, the multiple two-dimensional views comprising the multiple two-dimensional dimensionality-reduction views; and
    elevating the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target.
  2. The security inspection CT target recognition method of claim 1, wherein
    elevating the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target comprises:
    mapping the two-dimensional semantic description set into three-dimensional space by back projection to obtain a three-dimensional probability map; and
    performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target.
  3. The security inspection CT target recognition method of claim 2, wherein
    mapping the two-dimensional semantic description set into three-dimensional space by back projection to obtain the three-dimensional probability map comprises:
    performing the mapping from the two-dimensional semantic description set to three-dimensional space by voxel driving or pixel driving to obtain a semantic feature matrix, and compressing the semantic feature matrix into the three-dimensional probability map.
  4. The security inspection CT target recognition method of claim 3, wherein
    the voxel driving comprises:
    mapping each voxel of the three-dimensional CT data to a pixel in each of the two-dimensional views, and querying and accumulating the two-dimensional semantic description information corresponding to the pixel to generate the semantic feature matrix; and
    the pixel driving comprises:
    each pixel of a two-dimensional view corresponding to a straight line in the three-dimensional CT data, traversing each pixel of each two-dimensional view or each pixel of a region of interest, and propagating the two-dimensional semantic description information corresponding to the pixel into three-dimensional space along the straight line to generate the semantic feature matrix, wherein the region of interest is given by the two-dimensional semantic description set.
  5. The security inspection CT target recognition method of claim 4, wherein
    in the voxel driving or the pixel driving, the correspondence between the voxels and the pixels is obtained through a mapping function or a lookup table.
  6. The security inspection CT target recognition method of claim 2, wherein
    performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target comprises:
    applying at least one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods to the three-dimensional probability map to extract features, so as to obtain a three-dimensional image semantic description set as the three-dimensional recognition result.
  7. The security inspection CT target recognition method of claim 6, wherein:
    the three-dimensional probability map is binarized to obtain a three-dimensional binary map;
    connected-region analysis is performed on the three-dimensional binary map to obtain connected regions; and
    a three-dimensional image semantic description set is generated for the connected regions.
  8. The security inspection CT target recognition method of claim 7, wherein
    the connected-region analysis comprises:
    marking connected components of the three-dimensional binary map, and performing a mask operation for each marked region to obtain the connected regions.
  9. The security inspection CT target recognition method of claim 7, wherein
    generating the three-dimensional image semantic description set for the connected regions comprises:
    extracting all probability values within the connected regions, performing principal component analysis to obtain an analysis set, taking the analysis set as the effective voxel region of the object, and computing the three-dimensional image semantic description set.
  10. The security inspection CT target recognition method of claim 6, wherein
    the three-dimensional image semantic description set takes one or more of voxels, three-dimensional regions of interest, and three-dimensional CT images as its unit and contains category information and/or confidence;
    or the three-dimensional image semantic description set takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and contains at least one of category information, position information of the target, and confidence.
  11. The security inspection CT target recognition method of claim 10, wherein
    the position information contains a three-dimensional bounding box.
  12. The security inspection CT target recognition method of claim 1, wherein
    the two-dimensional semantic description set takes one or more of pixels, regions of interest, and two-dimensional images as its unit and contains category information and/or confidence,
    or the two-dimensional semantic description set takes a region of interest and/or a two-dimensional image as its unit and contains at least one of category information, confidence, and position information of the target.
  13. The security inspection CT target recognition method of claim 1, wherein
    performing target recognition on each of the multiple two-dimensional views comprises:
    using at least one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods to perform the target recognition.
  14. The security inspection CT target recognition method of claim 1, wherein
    performing dimensionality reduction on the three-dimensional CT data to generate the multiple two-dimensional dimensionality-reduction views comprises:
    setting multiple directions for the three-dimensional CT data; and
    performing projection or rendering along the multiple directions.
  15. The security inspection CT target recognition method of claim 14, wherein
    the multiple directions are arbitrary directions and are not limited to the direction orthogonal to the traveling direction of the object during detection.
  16. The security inspection CT target recognition method of any one of claims 1 to 15, wherein
    the multiple two-dimensional views further comprise a two-dimensional DR image,
    the two-dimensional DR image being obtained by a DR imaging device.
  17. The security inspection CT target recognition method of claim 16, wherein
    the three-dimensional recognition result is projected onto the two-dimensional DR image and then output as the recognition result of the two-dimensional DR image.
  18. A security inspection CT target recognition device, comprising:
    a dimensionality-reduction module that performs dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views;
    a two-dimensional recognition module that performs target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target, the multiple two-dimensional views comprising the multiple two-dimensional dimensionality-reduction views; and
    a dimension-elevation module that elevates the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target.
  19. A machine-readable storage medium storing a program that enables a computer to execute:
    performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views;
    performing target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target, the multiple two-dimensional views comprising the multiple two-dimensional dimensionality-reduction views; and
    elevating the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target.
PCT/CN2022/104606 2021-08-27 2022-07-08 Security inspection CT target recognition method and device WO2023024726A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020247003300A 2021-08-27 2022-07-08 Security inspection CT target identification method and device (KR20240025683A)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110998653.3A 2021-08-27 Security inspection CT target recognition method and device
CN202110998653.3 2021-08-27

Publications (1)

Publication Number Publication Date
WO2023024726A1 (zh)

Family

ID=79182374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/104606 WO2023024726A1 (zh) Security inspection CT target recognition method and device

Country Status (3)

Country Link
KR (1) KR20240025683A (zh)
CN (2) CN113792623B (zh) Security inspection CT target recognition method and device
WO (1) WO2023024726A1 (zh) Security inspection CT target recognition method and device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453063A * 2023-06-12 2023-07-18 中广核贝谷科技有限公司 Target detection and recognition method and system based on fusion of DR images and projection images

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113792623B (zh) * 2021-08-27 2022-12-13 同方威视技术股份有限公司 Security inspection CT target recognition method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • US20150332448A1 * 2012-12-27 2015-11-19 Nuctech Company Limited Object detection methods, display methods and apparatuses
  • CN108882897A * 2015-10-16 2018-11-23 瓦里安医疗系统公司 Iterative image reconstruction in image-guided radiation therapy
  • CN109493417A * 2018-10-31 2019-03-19 深圳大学 Three-dimensional object reconstruction method, apparatus, device, and storage medium
  • CN109975335A 2019-03-07 2019-07-05 北京航星机器制造有限公司 CT detection method and device
  • CN111968240A * 2020-09-04 2020-11-20 中国科学院自动化研究所 Three-dimensional semantic annotation method for photogrammetric meshes based on active learning
  • US20210049397A1 * 2018-10-16 2021-02-18 Tencent Technology (Shenzhen) Company Limited Semantic segmentation method and apparatus for three-dimensional image, terminal, and storage medium
  • CN112598619A * 2020-11-23 2021-04-02 西安科锐盛创新科技有限公司 Method for building a simulated three-dimensional stenosis model of intracranial vessels based on transfer learning
  • CN113792623A 2021-08-27 2021-12-14 同方威视技术股份有限公司 Security inspection CT target recognition method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN104658028B (zh) * 2013-11-18 2019-01-22 清华大学 Method and device for quickly marking a target in a three-dimensional image
  • JP7005622B2 (ja) * 2017-07-12 2022-01-21 株式会社ソニー・インタラクティブエンタテインメント Recognition processing device, recognition processing method, and program
  • CN109166183B (zh) * 2018-07-16 2023-04-07 中南大学 Anatomical landmark recognition method and recognition device
  • CN112444784B (zh) * 2019-08-29 2023-11-28 北京市商汤科技开发有限公司 Three-dimensional target detection and neural network training methods, apparatuses, and devices
  • CN111652966B (zh) * 2020-05-11 2021-06-04 北京航空航天大学 Three-dimensional reconstruction method and device based on multiple UAV views

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • US20150332448A1 * 2012-12-27 2015-11-19 Nuctech Company Limited Object detection methods, display methods and apparatuses
  • CN108882897A * 2015-10-16 2018-11-23 瓦里安医疗系统公司 Iterative image reconstruction in image-guided radiation therapy
  • US20210049397A1 * 2018-10-16 2021-02-18 Tencent Technology (Shenzhen) Company Limited Semantic segmentation method and apparatus for three-dimensional image, terminal, and storage medium
  • CN109493417A * 2018-10-31 2019-03-19 深圳大学 Three-dimensional object reconstruction method, apparatus, device, and storage medium
  • CN109975335A 2019-03-07 2019-07-05 北京航星机器制造有限公司 CT detection method and device
  • CN111968240A * 2020-09-04 2020-11-20 中国科学院自动化研究所 Three-dimensional semantic annotation method for photogrammetric meshes based on active learning
  • CN112598619A * 2020-11-23 2021-04-02 西安科锐盛创新科技有限公司 Method for building a simulated three-dimensional stenosis model of intracranial vessels based on transfer learning
  • CN113792623A * 2021-08-27 2021-12-14 同方威视技术股份有限公司 Security inspection CT target recognition method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453063A * 2023-06-12 2023-07-18 中广核贝谷科技有限公司 Target detection and recognition method and system based on fusion of DR images and projection images
CN116453063B * 2023-06-12 2023-09-05 中广核贝谷科技有限公司 Target detection and recognition method and system based on fusion of DR images and projection images

Also Published As

Publication number Publication date
CN115661810A (zh) 2023-01-31
CN113792623A (zh) 2021-12-14
CN113792623B (zh) 2022-12-13
KR20240025683A (ko) 2024-02-27

Similar Documents

Publication Publication Date Title
Tong et al. Improved U-NET network for pulmonary nodules segmentation
WO2023024726A1 (zh) Security inspection CT target recognition method and device
WO2021139324A1 (zh) Image recognition method and apparatus, computer-readable storage medium, and electronic device
Li et al. Saliency detection via dense and sparse reconstruction
Lin et al. Line segment extraction for large scale unorganized point clouds
Cheng et al. Global contrast based salient region detection
CN108615237A (zh) Lung image processing method and image processing device
CN110956632B (zh) Method and device for automatic detection of the pectoralis major region in mammography images
CN109363697B (zh) Method and device for identifying lesions in breast images
Cheng et al. Fabric defect detection based on separate convolutional UNet
CN104281856B (zh) Image preprocessing method and system for classification of brain medical images
US9508120B2 (en) System and method for computer vision item recognition and target tracking
Liu et al. Extracting lungs from CT images via deep convolutional neural network based segmentation and two-pass contour refinement
Zhao et al. Region-based saliency estimation for 3D shape analysis and understanding
Manh et al. Small object segmentation based on visual saliency in natural images
Ren et al. How important is location information in saliency detection of natural images
CN111310531A (zh) Image classification method and apparatus, computer device, and storage medium
Wang et al. Accurate saliency detection based on depth feature of 3D images
Chagnon-Forget et al. Enhanced visual-attention model for perceptually improved 3D object modeling in virtual environments
US20230196748A1 (en) Method and system for training neural network for entity detection
CN116420176A (zh) Method and device for distinguishing different configuration states of an object on the basis of an object-based image representation
US20240212336A1 (en) Security check CT object recognition method and apparatus
JP2022067086A (ja) Processing of digitized handwriting
CN113592807A (zh) Training method, image quality determination method and device, and electronic device
Wang An algorithm for ATM recognition of spliced money based on image features

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: P6000126/2024

Country of ref document: AE

ENP Entry into the national phase

Ref document number: 2024504228

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20247003300

Country of ref document: KR

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 1020247003300

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 18293704

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2022860072

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022860072

Country of ref document: EP

Effective date: 20240327