WO2023024726A1 - Security inspection CT target object recognition method and device - Google Patents
Security inspection CT target object recognition method and device
- Publication number: WO2023024726A1 (PCT/CN2022/104606)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dimensional
- target
- views
- semantic description
- description set
- Prior art date
Classifications
- G01N23/046: Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, by transmitting the radiation through the material and forming images of the material using tomography, e.g. computed tomography [CT]
- G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06V20/64: Three-dimensional objects (scenes; scene-specific elements; type of objects)
- G01V5/00: Prospecting or detecting by the use of ionising radiation, e.g. of natural or induced radioactivity
- G06T7/00: Image analysis
- G06V2201/05: Recognition of patterns representing particular kinds of hidden objects, e.g. weapons, explosives, drugs
Definitions
- This application relates to the field of security inspection computed tomography (CT), and in particular to a security inspection CT target object recognition method and device.
- CT equipment is widely used to identify target objects such as contraband.
- In the traditional technology, CT reconstruction is used to obtain a three-dimensional tomographic image containing material attribute information, the three-dimensional image is divided into several suspected objects, and the material attributes of the suspected objects are statistically analyzed and classified.
- Patent Document 1 proposes a CT detection method and device that recognize the three-dimensional tomographic image and a two-dimensional image of the object separately: the former yields the recognition result for explosives, and the latter yields the recognition results for other contraband.
- Patent Document 1: 109975335A
- Although Patent Document 1 attempts to address the above-mentioned limitations, the inventors of the present application have found through research that it still has the following technical problem:
- although the method can use a two-dimensional projection image obtained from the three-dimensional tomographic image to identify contraband other than explosives, the recognition operation is limited to the two-dimensional plane, and no recognition result in three-dimensional space can be obtained. Since the amount of information in two-dimensional images is significantly lower than that in three-dimensional data, the advantages of security inspection CT equipment cannot be fully exploited unless the two-dimensional recognition results are effectively integrated.
- In view of this, the present application proposes a security inspection CT target object recognition method and device capable of improving the recognition effect on targets with three-dimensional shape characteristics.
- An embodiment of the present application provides a security inspection CT target object recognition method, including: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduced views; performing target recognition on multiple two-dimensional views, the multiple two-dimensional views including the multiple dimensionality-reduced views, to obtain a two-dimensional semantic description set of the target object; and raising the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
- Raising the dimension of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target object includes: mapping the two-dimensional semantic description set into three-dimensional space by a back-projection method to obtain a three-dimensional probability map; and performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target object.
- Mapping the two-dimensional semantic description set into three-dimensional space by the back-projection method to obtain the three-dimensional probability map includes: mapping the two-dimensional semantic description set into three-dimensional space by voxel driving or pixel driving to obtain a semantic feature matrix, and compressing the semantic feature matrix into a three-dimensional probability map.
- Voxel driving includes: mapping each voxel in the three-dimensional CT data to a pixel in each two-dimensional view, and querying and accumulating the two-dimensional semantic description information corresponding to that pixel to generate a semantic feature matrix. Pixel driving includes: each pixel in a two-dimensional view corresponds to a straight line in the three-dimensional CT data; each pixel in each two-dimensional view, or each pixel in a region of interest, is traversed, and the two-dimensional semantic description information corresponding to the pixel is propagated along the straight line into three-dimensional space to generate a semantic feature matrix, where the region of interest is given by the two-dimensional semantic description set.
- Performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target object includes: performing feature extraction on the three-dimensional probability map using one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods, so as to obtain a three-dimensional image semantic description set as the three-dimensional recognition result.
- Performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target object includes: binarizing the three-dimensional probability map to obtain a three-dimensional binary map; performing connected-region analysis on the three-dimensional binary map to obtain connected regions; and generating a three-dimensional image semantic description set for the connected regions.
- The connected-region analysis includes: labeling the connected components of the three-dimensional binary map, and performing a mask operation on each labeled region to obtain the connected regions.
- Generating a three-dimensional image semantic description set for the connected regions includes: extracting all probability values in a connected region and performing principal component analysis to obtain an analysis set; the analysis set is taken as the effective voxel region of the target object, from which the three-dimensional image semantic description set is calculated.
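The principal-component step above is only loosely specified in the translation. One common reading, PCA over the voxel coordinates of a connected region to obtain its principal axes (from which orientation and extent descriptors can be derived), can be sketched as follows; the function name and the coordinate-based interpretation are assumptions, not the patent's definitive algorithm:

```python
import numpy as np

def principal_axes(coords):
    """PCA over the voxel coordinates of one connected region.

    The eigenvectors of the coordinate covariance matrix give the region's
    principal axes; the eigenvalues give the variance (spread) along them.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    cov = centered.T @ centered / len(coords)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
    return eigvals[::-1], eigvecs[:, ::-1]   # largest component first

# voxels of a region elongated purely along the first (z) axis
coords = [(z, 1, 1) for z in range(10)]
eigvals, eigvecs = principal_axes(coords)
print(eigvals[1], abs(eigvecs[0, 0]))  # no variance off-axis; first axis is z
```

Descriptors such as a posture estimate or an oriented bounding box could then be read off the returned axes.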
- The three-dimensional image semantic description set takes one or more of a voxel, a three-dimensional region of interest, and a three-dimensional CT image as a unit and includes category information and/or confidence; or, the three-dimensional image semantic description set takes a three-dimensional region of interest and/or a three-dimensional CT image as a unit and includes at least one of category information, target object position information, and confidence.
- the location information includes a three-dimensional bounding box.
- The two-dimensional semantic description set takes one or more of a pixel, a region of interest, and a two-dimensional image as a unit and contains category information and/or confidence; or, the two-dimensional semantic description set takes a region of interest and/or a two-dimensional image as a unit and includes at least one of category information, confidence, and position information of the target object.
- Performing target recognition on each of the multiple two-dimensional views includes: identifying the target object using one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods.
- performing dimensionality reduction on the 3D CT data to generate multiple 2D dimensionality reduction views includes: setting multiple directions for the 3D CT data; and performing projection or rendering according to the multiple directions.
- the multiple directions are arbitrary directions, and are not limited to the orthogonal direction of the traveling direction of the object during the detection process.
- the multiple two-dimensional views further include a two-dimensional DR image, and the two-dimensional DR image is obtained by a DR imaging device.
- the three-dimensional recognition result is projected onto the two-dimensional DR map, and then output as the recognition result of the two-dimensional DR map.
- An embodiment of the present application also provides a security inspection CT target object recognition device, including: a dimensionality reduction module, which performs dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduced views; a two-dimensional recognition module, which performs target recognition on multiple two-dimensional views, including the multiple dimensionality-reduced views, to obtain a two-dimensional semantic description set of the target object; and a dimension-raising module, which raises the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
- An embodiment of the present application also provides a machine-readable storage medium storing a program that causes a computer to execute: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduced views; performing target recognition on multiple two-dimensional views, which include the multiple dimensionality-reduced views, to obtain a two-dimensional semantic description set of the target object; and raising the dimension of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
- In this way, multiple two-dimensional dimensionality-reduced views are generated by reducing the dimensionality of the three-dimensional CT data, target recognition is performed on multiple two-dimensional views that include these dimensionality-reduced views to obtain a two-dimensional semantic description set, and the dimension of that set is then raised to obtain a three-dimensional recognition result; that is, the data is first reduced from three dimensions to two for recognition, and the dimension is then raised to generate a three-dimensional result.
- This approach can effectively identify target objects with complex material composition, physical properties, and shape characteristics, and can effectively integrate the two-dimensional recognition results into an information-rich three-dimensional recognition result. The recognition effect on the target object is therefore improved, and the real-time requirements of security inspection can also be met.
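The reduce-recognize-raise pipeline described above can be sketched end to end in a few lines. This is a minimal illustration only: it assumes orthographic max-intensity projections for the dimensionality reduction and uses a trivial thresholding stand-in for the real two-dimensional detector:

```python
import numpy as np

def reduce_dimension(volume, axes=(0, 1, 2)):
    # Step S10: project the 3D volume along several directions (max-intensity projection)
    return {ax: volume.max(axis=ax) for ax in axes}

def recognize_2d(view, threshold=0.5):
    # Stand-in 2D recognizer: per-pixel target "probability" by thresholding
    return (view > threshold).astype(float)

def raise_dimension(volume_shape, semantic_2d):
    # Step S30: back-project each 2D map along its projection axis and average
    acc = np.zeros(volume_shape)
    for ax, prob in semantic_2d.items():
        acc += np.expand_dims(prob, axis=ax).repeat(volume_shape[ax], axis=ax)
    return acc / len(semantic_2d)

volume = np.zeros((4, 4, 4))
volume[1:3, 1:3, 1:3] = 1.0                                   # a small cube-shaped "target"
views = reduce_dimension(volume)                              # S10: dimensionality reduction
semantic = {ax: recognize_2d(v) for ax, v in views.items()}   # S20: 2D recognition
prob3d = raise_dimension(volume.shape, semantic)              # S30: dimension raising
print(prob3d[2, 2, 2], prob3d[0, 0, 0])                       # inside vs. outside the target
```

Voxels inside the target receive agreement from all views and score high; voxels outside score zero, which is the integration effect the method relies on.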
- FIG. 1 is a flow chart showing a security inspection CT target recognition method related to the first embodiment
- Figure 2 is a flowchart illustrating an example of dimensionality reduction processing
- FIG. 3 is a flow chart illustrating an example of dimension-up processing
- FIG. 4 is a flowchart illustrating an example of three-dimensional feature extraction
- FIG. 5 is a flow chart showing a security inspection CT target recognition method related to the second embodiment
- FIG. 6 is a schematic diagram illustrating an example of a security inspection CT target object recognition device according to a third embodiment
- FIG. 7 is a schematic diagram showing another example of the security inspection CT target object recognition device according to the third embodiment.
- FIG. 1 is a flowchart illustrating a security CT target recognition method according to a first embodiment.
- the security inspection CT target object recognition method is applied to a security inspection CT system, for example, may be executed in a security inspection CT device or a server connected to the security inspection CT device.
- step S10 dimensionality reduction processing is performed. That is, dimensionality reduction is performed on 3D CT data to generate multiple 2D dimensionality reduction views.
- step S10 may include: step S11 and step S12.
- a plurality of directions are set for the three-dimensional CT data.
- the plurality of directions are arbitrary directions, and are not limited to specific directions such as the orthogonal direction to the traveling direction of the object during detection.
- step S12 projection or rendering is performed according to multiple directions to obtain multiple two-dimensional reduced-dimensional views.
- For example, ray casting can be performed based on a sequence of CT image slices: a ray is emitted in a specific direction from each pixel of the output image and traverses the entire image sequence; along the way the image sequence is sampled to obtain attribute or color information, and the attribute or color values are accumulated according to a chosen model until the ray has passed through the entire sequence. The finally accumulated attribute or color value forms the dimensionality-reduced two-dimensional view.
- In this way, dimensionality reduction is not restricted to a single specific direction, such as the direction orthogonal to the traveling direction of the object during detection. This avoids two problems that arise when dimensionality reduction is performed only along one specific direction: (1) under certain postures of the object, the area of the object after dimensionality reduction is too small and its shape information is incompletely expressed, so the target object cannot be accurately identified; (2) the object is blocked by other objects and its shape information is lost, so the target object cannot be accurately identified.
- step S20 two-dimensional recognition processing is performed. That is, object recognition is performed for multiple two-dimensional views, and a two-dimensional semantic description set of the object is obtained.
- the plurality of two-dimensional views includes the plurality of two-dimensional dimensionality reduction views obtained in the above step S10.
- As the target recognition method for a two-dimensional view, one of, or a combination of, image processing methods for two-dimensional images, classical machine learning methods, and deep learning methods may be used.
- a two-dimensional view is fed into a neural network model as an input, and a two-dimensional semantic description set is obtained as an output.
- a target detection neural network based on deep learning can be used to detect the two-dimensional position of the target.
- the convolutional neural network used in the target detection task is a typical structure of deep learning in computer vision tasks.
- Such a convolutional neural network has the characteristics of local connection, weight sharing, and spatial resampling. These characteristics make the convolutional neural network have a certain degree of translation and scaling invariance.
- the two-dimensional semantic description set uses one or more of pixels, regions of interest, and two-dimensional images as units, and contains category information and/or confidence, or, the two-dimensional semantic description set uses regions of interest and /or a two-dimensional image as a unit, including at least one of category information, confidence, and position information of the target.
- the category information indicates the category to which the target object belongs, for example, a gun, a knife, and the like.
- the location information can include center coordinates, bounding boxes, etc. Confidence indicates the possibility of the existence of the target object, which can be a normalized scalar or vector.
- the two-dimensional semantic description set includes: the category information and confidence level of a certain pixel belonging to the target object; the category information of the target object contained in a certain region of interest, the position information of the target object, confidence level, etc.; At least one of the information contained in the category information of the target object, the position information of the target object, and the confidence level.
- at least one piece of information may be information contained in one group, or information contained in different groups respectively.
- the description set of two-dimensional semantic information may also include other semantic information such as the posture of the target object, the number of the target object, etc., in addition to the category information, confidence degree, and position information.
- the object recognition method of the two-dimensional view in the present application is not particularly limited, as long as it is a method capable of obtaining the above-mentioned two-dimensional semantic description set based on the two-dimensional view.
- In this embodiment, the two-dimensional recognition result of the target object is expressed in the form of a two-dimensional semantic description set, and this set is fed into step S30 as input and raised to three dimensions, thereby integrating the two-dimensional recognition results into a three-dimensional result.
- Because the two-dimensional semantic description set is flexible, the information it contains can be made richer.
- step S30 dimension-up processing is performed. That is, the two-dimensional semantic description set is upgraded to obtain the three-dimensional recognition result of the target object.
- step S30 may include step S31 and step S32.
- step S31 the two-dimensional semantic description set is mapped to the three-dimensional space by using a back-projection method to obtain a three-dimensional probability map.
- Backprojection can be thought of as the inverse of projection.
- the back-projection process may be implemented by means of voxel driving or pixel driving.
- the semantic feature matrix can be obtained through voxel-driven or pixel-driven, and the semantic feature matrix can be compressed into a three-dimensional probability map.
- voxel driving includes: mapping each voxel in the 3D CT data to a pixel in each 2D view, querying and accumulating the 2D semantic description information corresponding to the pixel, and generating a semantic feature matrix.
- Voxel-to-pixel correspondences can be established as mapping functions or look-up tables to increase computational speed.
- the voxel driver can perform parallel calculations for each voxel, so that the calculation speed is fast and the real-time performance of the security check can be improved.
- Pixel driving includes: each pixel in the two-dimensional view corresponds to a straight line in the three-dimensional CT data, traverses each pixel in each two-dimensional view or each pixel of the region of interest, and propagates along the straight line to the three-dimensional space Two-dimensional semantic description information corresponding to the pixel, and a semantic feature matrix is generated, wherein the region of interest is given by a two-dimensional semantic description set.
- the corresponding relationship between voxels and pixels may also be obtained through a mapping function or a lookup table.
- each pixel in multiple two-dimensional views obtains its semantic feature matrix in turn, and finally compresses the semantic feature matrix to obtain a three-dimensional probability map.
- the pixel driver can also perform parallel calculations according to the pixels, which is also conducive to improving the calculation speed and improving the real-time performance of security checks.
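The voxel-driven accumulation described above can be sketched for orthographic views, where each voxel's footprint in a view is found by dropping the coordinate along that view's projection axis. The orthographic mapping and the per-view channel layout are simplifying assumptions; the patent allows a general mapping function or lookup table:

```python
import numpy as np

def voxel_driven_backprojection(shape, views):
    """Voxel-driven accumulation of 2D semantic values into a feature matrix.

    `views` maps a projection-axis index to the 2D semantic map produced for
    that view. Each voxel queries its pixel in every view; the per-view
    responses are stored as channels of the semantic feature matrix.
    """
    feat = np.zeros(shape + (len(views),))
    for c, (ax, sem) in enumerate(sorted(views.items())):
        for idx in np.ndindex(shape):
            pixel = tuple(v for i, v in enumerate(idx) if i != ax)  # drop ray axis
            feat[idx + (c,)] = sem[pixel]
    return feat

sem0 = np.array([[0.0, 1.0],
                 [0.0, 0.0]])       # hypothetical semantics of the axis-0 view
feat = voxel_driven_backprojection((2, 2, 2), {0: sem0})
print(feat[0, 0, 1, 0], feat[1, 0, 1, 0])   # voxels on the same ray share the value
```

A production version would precompute the voxel-to-pixel lookup table and vectorize over all voxels at once, which is what makes the per-voxel parallelism mentioned above practical.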
- The semantic feature matrix is generated from the two-dimensional semantic description information according to its spatial correspondence; it is obtained by digitizing the two-dimensional semantic description information into matrix form.
- For example, for the category information in a two-dimensional semantic description set, the semantic feature matrix may be obtained from the category of each object: the corresponding value in the matrix may be set to 1 when the element belongs to the category and 0 when it does not.
- For other kinds of semantic information, the semantic feature matrix can be obtained in a similar manner.
- Typical methods for compressing semantic feature matrix include weighted average, principal component analysis, etc.
- For the compression, the input is the semantic feature matrix and the output is a probability map.
- For example, the back-projection method maps the two-dimensional semantic description set into three-dimensional space to generate a semantic feature matrix corresponding to that space, in which each value is a vector composed of 0s and 1s; the dimension of each output probability map value is then determined by the number of target object categories.
- The method of obtaining the probability map values described here is only an example; the values may also be obtained in other ways.
- For instance, the weights may differ: the values of the semantic feature matrix are weighted with different weights to obtain the values of the probability map.
- Alternatively, one or more vectors in the three-dimensional semantic feature matrix can be used as input variables for principal component analysis; the principal components obtained as output variables are normalized and used as the probability map values of the corresponding voxels.
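The weighted-average compression named above can be sketched directly. The uniform default weights and the trailing-axis channel layout are assumptions for illustration:

```python
import numpy as np

def compress_to_probability(feat, weights=None):
    """Compress a semantic feature matrix of shape (..., n_views) into a
    probability map by a weighted average over the view channels."""
    n = feat.shape[-1]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()          # normalize so 0/1 inputs stay within [0, 1]
    return feat @ w          # contract the view axis

feat = np.zeros((2, 2, 2, 3))
feat[1, 1, 1] = [1.0, 1.0, 0.0]            # voxel flagged as target in 2 of 3 views
prob = compress_to_probability(feat)
print(round(prob[1, 1, 1], 3))             # two of three views agree
```

Views known to be more reliable (e.g. less occluded directions) could be given larger weights through the `weights` argument.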
- step S32 feature extraction is performed on the three-dimensional probability map to obtain a three-dimensional recognition result of the target object.
- For example, for the three-dimensional probability map, one of, or a combination of, image processing methods, classical machine learning methods, and deep learning methods is used for feature extraction, so as to obtain a three-dimensional image semantic description set as the three-dimensional recognition result.
- a 3D probability map is input into a deep learning model, and 3D recognition results such as confidence and 3D bounding boxes are obtained as output.
- the deep learning models used here can employ techniques such as classification neural networks or object detection networks with fewer layers.
- The three-dimensional image semantic description set takes one or more of a voxel, a three-dimensional region of interest, and a three-dimensional CT image as a unit and includes category information and/or confidence; or, it takes a three-dimensional region of interest and/or a three-dimensional CT image as a unit and includes at least one of category information, target object position information, and confidence.
- the position information of the object in the 3D CT image may include a 3D bounding box.
- For example, the three-dimensional image semantic description set includes: the category information, confidence, etc. of a voxel belonging to the target object; the category information, position information, confidence, etc. of the target object contained in a three-dimensional region of interest (VOI); and at least one of the category information of the target object, the position information of the target object, and the confidence contained in the three-dimensional CT image. The at least one piece of information may be contained in one group, or in different groups respectively.
- the 3D image semantic description set is generated from the 3D probability map generated based on the 2D semantic description set, the type of semantic information contained in the 3D image semantic description set and the type of semantic information contained in the 2D semantic description set Consistent or interchangeable.
- In this way, the two-dimensional semantic description set is raised in dimension to obtain the three-dimensional recognition result of the target object. This solves the problem that the amount of information is significantly reduced when recognition is performed only in two dimensions after dimensionality reduction: information loss is reduced while two-dimensional recognition is still employed, taking into account both the real-time performance and the accuracy of security inspection.
- step S32 for example, an image processing method may be used. As shown in FIG. 4, step S32 may include steps S321-S323.
- step S321 the three-dimensional probability map is binarized to obtain a three-dimensional binary map.
- step S322 the connected region analysis is performed on the three-dimensional binary image to obtain connected regions.
- the connected components may be marked on the 3D binary image, and a mask operation may be performed on each marked region to obtain connected regions.
- step S323 a 3D image semantic description set is generated for connected regions.
- For example, the three-dimensional image semantic description set can include a three-dimensional bounding box.
- By including a three-dimensional bounding box, the spatial boundary of the target object in the three-dimensional image can be given, and the position, range, posture, shape, etc. of the target object can be shown more intuitively, which helps the security inspector judge accurately whether the target object is a dangerous article.
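Steps S321-S323 can be sketched with plain numpy: threshold the probability map, label 6-connected components with a flood fill, and take the axis-aligned extent of each region as its 3D bounding box. The threshold value and 6-connectivity are illustrative choices, not fixed by the patent:

```python
import numpy as np
from collections import deque

def label_3d(binary):
    """6-connected component labeling of a 3D binary map (BFS flood fill)."""
    labels = np.zeros(binary.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(binary)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                n = (z + dz, y + dy, x + dx)
                if all(0 <= n[i] < binary.shape[i] for i in range(3)) \
                        and binary[n] and not labels[n]:
                    labels[n] = count
                    queue.append(n)
    return labels, count

def bounding_boxes(labels, count):
    """Axis-aligned 3D bounding box (min corner, max corner) per region."""
    return {lab: (np.argwhere(labels == lab).min(axis=0),
                  np.argwhere(labels == lab).max(axis=0))
            for lab in range(1, count + 1)}

prob = np.zeros((4, 4, 4))
prob[1:3, 1:3, 1:3] = 0.9                 # one high-probability blob
binary = prob > 0.5                       # S321: binarize
labels, n = label_3d(binary)              # S322: connected-region analysis
boxes = bounding_boxes(labels, n)         # S323: semantic description (bounding box)
print(n, boxes[1][0].tolist(), boxes[1][1].tolist())
```

In practice a library routine such as `scipy.ndimage.label` would replace the hand-rolled flood fill.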
- As described above, multiple two-dimensional dimensionality-reduced views are generated by reducing the dimensionality of the three-dimensional CT data, target recognition is performed on multiple two-dimensional views including these dimensionality-reduced views to obtain a two-dimensional semantic description set, and the dimension of that set is then raised to obtain the three-dimensional recognition result; that is, the data is first reduced from three dimensions to two for recognition, and the dimension is then raised to generate a three-dimensional result.
- FIG. 5 is a flowchart showing a security inspection CT target object recognition method according to the second embodiment.
- The difference between the second embodiment and the first embodiment is that the second embodiment performs target recognition not only on the two-dimensional dimensionality-reduced images generated from the three-dimensional CT data, but also on two-dimensional digital radiography (DR) data.
- the plurality of two-dimensional views further include a two-dimensional DR image, and object recognition is also performed on the two-dimensional DR image to obtain a two-dimensional semantic description set of the object.
- the two-dimensional DR map is obtained by a DR imaging device configured independently of the security inspection CT equipment.
- the two-dimensional DR image is an image of the same security inspection object as the three-dimensional CT data.
- In step S40, a two-dimensional DR image is obtained from the DR imaging device and used as one of the multiple two-dimensional views.
- Step S40 can be performed in parallel with step S10.
- In step S30, the dimensionality is increased not only for the two-dimensional semantic description set of the two-dimensional dimensionality-reduction views, but also for the two-dimensional semantic description set of the two-dimensional DR image, thereby obtaining the three-dimensional recognition result.
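The dimension-increasing step, back-projecting 2D scores into a 3D probability map, can be sketched in a voxel-driven manner for axis-aligned views. This is a simplified sketch; the general mapping function or lookup table of the claims may handle arbitrary geometries, and averaging is only one assumed way to compress the semantic feature matrix.

```python
import numpy as np

def back_project(score_views, shape):
    """Voxel-driven back-projection sketch: each voxel accumulates the 2D
    score of its projected pixel in every axis-aligned view, producing a
    semantic feature matrix that is compressed to probabilities by averaging."""
    acc = np.zeros(shape)
    for axis, view in enumerate(score_views):
        # broadcast the 2D view back along its projection axis
        acc += np.expand_dims(view, axis=axis)
    return acc / len(score_views)
```

Voxels whose projections score highly in every view end up with high probability, which is what makes the subsequent 3D feature extraction possible.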
- The two-dimensional DR image is a two-dimensional image whose imaging principle and properties differ from those of the two-dimensional dimensionality-reduction views generated by reducing the dimensionality of the three-dimensional CT data.
- Therefore, the information available for recognition can be increased, so as to improve the accuracy of recognition.
- Optionally, there may also be a step S50.
- In step S50, the three-dimensional recognition result generated in step S30 is projected onto the two-dimensional DR image and then output as the recognition result of the two-dimensional DR image.
- The result of step S30 and the result of step S50 may also be output at the same time.
- In this way, the three-dimensional recognition result and the recognition result on the two-dimensional DR image can be compared and verified against each other, so that security inspectors can judge more accurately whether the target object is a dangerous article.
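The projection of step S50 can be sketched for the simplest case, a parallel projection of a 3D bounding box onto the plane perpendicular to one axis, by dropping the coordinate along the projection axis. The axis choice and parallel-projection geometry are assumptions for illustration; a real DR geometry may require a perspective mapping.

```python
def project_bbox(bbox3d, axis=0):
    """Project a 3D bounding box [(z0, z1), (y0, y1), (x0, x1)] onto the
    2D plane perpendicular to `axis` (parallel-projection assumption)."""
    return [extent for i, extent in enumerate(bbox3d) if i != axis]
```

The projected 2D box can then be drawn on the DR image so that the two results can be compared side by side.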
- FIG. 6 is a schematic diagram showing a security inspection CT target object recognition device according to the first embodiment.
- the security inspection CT object recognition device 100 of this embodiment includes: a dimensionality reduction module 10 , a two-dimensional recognition module 20 , and a dimensionality enhancement module 30 .
- The dimensionality reduction module 10 performs dimensionality reduction on the three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; that is, it can perform the processing of step S10 in the first and second embodiments described above.
- The two-dimensional recognition module 20 performs target recognition on multiple two-dimensional views to obtain the two-dimensional semantic description set of the target object.
- The plurality of two-dimensional views includes the multiple two-dimensional dimensionality-reduction views; that is, the module can perform the processing of step S20 in the first and second embodiments described above.
- The dimensionality enhancement module 30 increases the dimensionality of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target object; that is, it can perform the processing of step S30 in the first and second embodiments described above.
- For details of the processing of the dimensionality reduction module 10, the two-dimensional recognition module 20, and the dimensionality enhancement module 30, reference may be made to the first and second embodiments described above, and it will not be repeated here.
- The security inspection CT target recognition device 100 may further include a DR map acquisition module 40, which obtains a two-dimensional DR image from the DR imaging device and uses it as one of the multiple two-dimensional views. That is, the DR map acquisition module 40 can execute the process of step S40 in the second embodiment.
- The security inspection CT target recognition device 100 may further include a DR output module 50, which projects the three-dimensional recognition result generated by the dimensionality enhancement module 30 onto the two-dimensional DR image and then outputs it as the recognition result of the two-dimensional DR image. That is, the DR output module 50 can execute the process of step S50 in the second embodiment.
- The security inspection CT target recognition device 100 may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof.
- For example, the device 100 may be implemented by a combination of software and hardware on any suitable electronic device equipped with a processor, such as a desktop computer, a tablet computer, a smartphone, or a server.
- For example, the device 100 may be the control computer of a security inspection CT system, or a server connected to the security inspection CT scanning device in the security inspection CT system.
- Alternatively, the device 100 may be implemented by software modules on any suitable electronic device such as a desktop computer, a tablet computer, a smartphone, or a server.
- For example, it may be a software module installed on the control computer of the security inspection CT system, or a software module installed on a server connected to the security inspection CT scanning device in the security inspection CT system.
- The processor of the security inspection CT target recognition device 100 can execute the security inspection CT target recognition method described above.
- The security inspection CT target recognition device 100 may further include a memory (not shown), a communication module (not shown), and the like.
- The memory of the device 100 can store program code for executing the steps of the security inspection CT target recognition method, as well as data related to security inspection CT target recognition.
- The memory may be, for example, a ROM (Read-Only Memory), a RAM (Random Access Memory), or the like.
- The memory has storage space for program code that executes any step of the above security inspection CT target recognition method. When the program code is read and executed by the processor, the above security inspection CT target recognition method is executed.
- The program code can be read from or written into one or more computer program products.
- These computer program products comprise program code carriers such as hard disks, compact discs (CDs), memory cards, or floppy disks. Such computer program products are typically portable or fixed storage units.
- The program code for executing any step of the above method can also be downloaded via a network.
- The program code can, for example, be compressed in a suitable form.
- The communication module in the security inspection CT target recognition device 100 may support establishing a direct (e.g., wired) communication channel or a wireless communication channel between the device 100 and an external electronic device, and performing communication via the established channel.
- For example, the communication module receives three-dimensional CT data and the like from a CT scanning device via a network.
- the security inspection CT target recognition device 100 may further include output units such as a display, a microphone, and a speaker, so as to output the target recognition result.
Claims (19)
- A security inspection CT target recognition method, comprising: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target object, the multiple two-dimensional views including the multiple two-dimensional dimensionality-reduction views; and increasing the dimensionality of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
- The security inspection CT target recognition method according to claim 1, wherein increasing the dimensionality of the two-dimensional semantic description set to obtain the three-dimensional recognition result of the target object comprises: mapping the two-dimensional semantic description set to three-dimensional space using a back-projection method to obtain a three-dimensional probability map; and performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target object.
- The security inspection CT target recognition method according to claim 2, wherein mapping the two-dimensional semantic description set to three-dimensional space using a back-projection method to obtain a three-dimensional probability map comprises: performing the mapping of the two-dimensional semantic description set to three-dimensional space in a voxel-driven or pixel-driven manner to obtain a semantic feature matrix, and compressing the semantic feature matrix into the three-dimensional probability map.
- The security inspection CT target recognition method according to claim 3, wherein the voxel-driven manner comprises: mapping each voxel in the three-dimensional CT data to a pixel in each of the two-dimensional views, and querying and accumulating the two-dimensional semantic description information corresponding to the pixels to generate the semantic feature matrix; and the pixel-driven manner comprises: with each pixel in a two-dimensional view corresponding to a straight line in the three-dimensional CT data, traversing every pixel of each two-dimensional view or every pixel of a region of interest, and propagating the two-dimensional semantic description information corresponding to that pixel into three-dimensional space along the straight line to generate the semantic feature matrix, wherein the region of interest is given by the two-dimensional semantic description set.
- The security inspection CT target recognition method according to claim 4, wherein, in the voxel-driven or pixel-driven manner, the correspondence between the voxels and the pixels is obtained through a mapping function or a lookup table.
- The security inspection CT target recognition method according to claim 2, wherein performing feature extraction on the three-dimensional probability map to obtain the three-dimensional recognition result of the target object comprises: performing feature extraction on the three-dimensional probability map using at least one of, or a combination of, an image processing method, a classical machine learning method, and a deep learning method, thereby obtaining a three-dimensional image semantic description set as the three-dimensional recognition result.
- The security inspection CT target recognition method according to claim 6, wherein the three-dimensional probability map is binarized to obtain a three-dimensional binary map; connected-region analysis is performed on the three-dimensional binary map to obtain connected regions; and the three-dimensional image semantic description set is generated for the connected regions.
- The security inspection CT target recognition method according to claim 7, wherein the connected-region analysis comprises: performing connected-component labeling on the three-dimensional binary map, and performing a mask operation on each labeled region to obtain the connected regions.
- The security inspection CT target recognition method according to claim 7, wherein generating the three-dimensional image semantic description set for the connected regions comprises: extracting all probability values within the connected regions, performing principal component analysis to obtain an analysis set, taking the analysis set as the effective voxel region of the object, and computing statistics over it to produce the three-dimensional image semantic description set.
- The security inspection CT target recognition method according to claim 6, wherein the three-dimensional image semantic description set takes one or more of a voxel, a three-dimensional region of interest, and a three-dimensional CT image as its unit and includes category information and/or a confidence level; or the three-dimensional image semantic description set takes a three-dimensional region of interest and/or a three-dimensional CT image as its unit and includes at least one of category information, position information of the target object, and a confidence level.
- The security inspection CT target recognition method according to claim 10, wherein the position information includes a three-dimensional bounding box.
- The security inspection CT target recognition method according to claim 1, wherein the two-dimensional semantic description set takes one or more of a pixel, a region of interest, and a two-dimensional image as its unit and includes category information and/or a confidence level; or the two-dimensional semantic description set takes a region of interest and/or a two-dimensional image as its unit and includes at least one of category information, a confidence level, and position information of the target object.
- The security inspection CT target recognition method according to claim 1, wherein performing target recognition on each of the multiple two-dimensional views comprises: performing target recognition using at least one of, or a combination of, an image processing method for two-dimensional images, a classical machine learning method, and a deep learning method.
- The security inspection CT target recognition method according to claim 1, wherein performing dimensionality reduction on the three-dimensional CT data to generate the multiple two-dimensional dimensionality-reduction views comprises: setting multiple directions for the three-dimensional CT data; and performing projection or rendering along the multiple directions.
- The security inspection CT target recognition method according to claim 14, wherein the multiple directions are arbitrary directions and are not limited to directions orthogonal to the traveling direction of the object during detection.
- The security inspection CT target recognition method according to any one of claims 1 to 15, wherein the multiple two-dimensional views further include a two-dimensional DR image obtained by a DR imaging device.
- The security inspection CT target recognition method according to claim 16, wherein the three-dimensional recognition result is projected onto the two-dimensional DR image and then output as the recognition result of the two-dimensional DR image.
- A security inspection CT target recognition device, comprising: a dimensionality reduction module that performs dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; a two-dimensional recognition module that performs target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target object, the multiple two-dimensional views including the multiple two-dimensional dimensionality-reduction views; and a dimensionality enhancement module that increases the dimensionality of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
- A machine-readable storage medium storing a program that enables a computer to execute: performing dimensionality reduction on three-dimensional CT data to generate multiple two-dimensional dimensionality-reduction views; performing target recognition on multiple two-dimensional views to obtain a two-dimensional semantic description set of a target object, the multiple two-dimensional views including the multiple two-dimensional dimensionality-reduction views; and increasing the dimensionality of the two-dimensional semantic description set to obtain a three-dimensional recognition result of the target object.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020247003300A KR20240025683A (ko) | 2021-08-27 | 2022-07-08 | 보안검사 ct 목표물 식별 방법 및 장치 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110998653.3A CN113792623B (zh) | 2021-08-27 | 2021-08-27 | 安检ct目标物识别方法和装置 |
CN202110998653.3 | 2021-08-27 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023024726A1 true WO2023024726A1 (zh) | 2023-03-02 |
Family
ID=79182374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/104606 WO2023024726A1 (zh) | 2021-08-27 | 2022-07-08 | 安检ct目标物识别方法和装置 |
Country Status (3)
Country | Link |
---|---|
KR (1) | KR20240025683A (zh) |
CN (2) | CN113792623B (zh) |
WO (1) | WO2023024726A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113792623B (zh) * | 2021-08-27 | 2022-12-13 | 同方威视技术股份有限公司 | 安检ct目标物识别方法和装置 |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150332448A1 (en) * | 2012-12-27 | 2015-11-19 | Nuctech Company Limited | Object detection methods, display methods and apparatuses |
CN108882897A (zh) * | 2015-10-16 | 2018-11-23 | 瓦里安医疗系统公司 | 图像引导放射疗法中的迭代图像重建 |
CN109493417A (zh) * | 2018-10-31 | 2019-03-19 | 深圳大学 | 三维物体重建方法、装置、设备和存储介质 |
CN109975335A (zh) | 2019-03-07 | 2019-07-05 | 北京航星机器制造有限公司 | 一种ct检测方法及装置 |
CN111968240A (zh) * | 2020-09-04 | 2020-11-20 | 中国科学院自动化研究所 | 基于主动学习的摄影测量网格的三维语义标注方法 |
US20210049397A1 (en) * | 2018-10-16 | 2021-02-18 | Tencent Technology (Shenzhen) Company Limited | Semantic segmentation method and apparatus for three-dimensional image, terminal, and storage medium |
CN112598619A (zh) * | 2020-11-23 | 2021-04-02 | 西安科锐盛创新科技有限公司 | 基于迁移学习的颅内血管模拟三维狭窄化模型的建立方法 |
CN113792623A (zh) * | 2021-08-27 | 2021-12-14 | 同方威视技术股份有限公司 | 安检ct目标物识别方法和装置 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104658028B (zh) * | 2013-11-18 | 2019-01-22 | 清华大学 | 在三维图像中快速标记目标物的方法和装置 |
JP7005622B2 (ja) * | 2017-07-12 | 2022-01-21 | 株式会社ソニー・インタラクティブエンタテインメント | 認識処理装置、認識処理方法及びプログラム |
CN109166183B (zh) * | 2018-07-16 | 2023-04-07 | 中南大学 | 一种解剖标志点识别方法及识别设备 |
CN112444784B (zh) * | 2019-08-29 | 2023-11-28 | 北京市商汤科技开发有限公司 | 三维目标检测及神经网络的训练方法、装置及设备 |
CN111652966B (zh) * | 2020-05-11 | 2021-06-04 | 北京航空航天大学 | 一种基于无人机多视角的三维重建方法及装置 |
- 2021
  - 2021-08-27 CN CN202110998653.3A patent/CN113792623B/zh active Active
  - 2021-08-27 CN CN202211300926.3A patent/CN115661810A/zh active Pending
- 2022
  - 2022-07-08 WO PCT/CN2022/104606 patent/WO2023024726A1/zh active Application Filing
  - 2022-07-08 KR KR1020247003300A patent/KR20240025683A/ko unknown
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116453063A (zh) * | 2023-06-12 | 2023-07-18 | 中广核贝谷科技有限公司 | 基于dr图像与投影图融合的目标检测识别方法及系统 |
CN116453063B (zh) * | 2023-06-12 | 2023-09-05 | 中广核贝谷科技有限公司 | 基于dr图像与投影图融合的目标检测识别方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
CN115661810A (zh) | 2023-01-31 |
CN113792623A (zh) | 2021-12-14 |
CN113792623B (zh) | 2022-12-13 |
KR20240025683A (ko) | 2024-02-27 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| WWE | Wipo information: entry into national phase | Ref document number: P6000126/2024; Country of ref document: AE |
| ENP | Entry into the national phase | Ref document number: 2024504228; Country of ref document: JP; Kind code of ref document: A |
| ENP | Entry into the national phase | Ref document number: 20247003300; Country of ref document: KR; Kind code of ref document: A |
| WWE | Wipo information: entry into national phase | Ref document number: 1020247003300; Country of ref document: KR |
| WWE | Wipo information: entry into national phase | Ref document number: 18293704; Country of ref document: US |
| WWE | Wipo information: entry into national phase | Ref document number: 2022860072; Country of ref document: EP |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2022860072; Country of ref document: EP; Effective date: 20240327 |