CN112561933A - Image segmentation method and device - Google Patents

Image segmentation method and device

Info

Publication number
CN112561933A
CN112561933A (application number CN202011480979.9A)
Authority
CN
China
Prior art keywords
input data
preset
image segmentation
shift
item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011480979.9A
Other languages
Chinese (zh)
Inventor
陈海波 (Chen Haibo)
翟云龙 (Zhai Yunlong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenlan Artificial Intelligence Shenzhen Co Ltd
Original Assignee
Shenlan Artificial Intelligence Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenlan Artificial Intelligence Shenzhen Co Ltd filed Critical Shenlan Artificial Intelligence Shenzhen Co Ltd
Priority to CN202011480979.9A priority Critical patent/CN112561933A/en
Publication of CN112561933A publication Critical patent/CN112561933A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Abstract

The embodiments of the application relate to the technical field of image processing and provide an image segmentation method and an image segmentation device. The method comprises the following steps: acquiring a target image containing a target object; and inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model. The image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer splices at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information. Because the input data and the output data of the splicing layer are fixed-point data, the image segmentation model can realize image segmentation while occupying fewer computer resources and taking less time, thereby improving the image segmentation speed.

Description

Image segmentation method and device
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image segmentation method and an image segmentation device.
Background
Image segmentation is a very important technique in computer vision tasks such as image recognition, and generally operates at the pixel level; for example, image segmentation classifies the pixels in an image. The accuracy of image segmentation can have a large impact on the results of computer vision tasks.
At present, in order to improve the accuracy of image segmentation, a neural network model is generally adopted. However, segmenting an image with a neural network model involves a large number of complex data operations, such as operations between the floating-point model parameters of each network layer and image data that may be in decimal form, so that segmentation occupies a large amount of computer resources and takes a long time. Moreover, when a neural network model is used for processing tasks, it often needs to be reconstructed, and the complexity of processing the parameters of each network layer makes model construction inefficient, which greatly reduces the efficiency of image segmentation.
In summary, when a neural network model is used for image segmentation, building the model from parameters of complex data types is itself complicated and resource-intensive, and the built model occupies considerable computing resources, takes a long time, and processes tasks inefficiently.
Disclosure of Invention
The application provides an image segmentation method and device which can realize image segmentation while occupying fewer computer resources and taking less time, thereby improving the image segmentation speed.
The application provides an image segmentation method, which comprises the following steps:
acquiring a target image containing a target object;
inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shift information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
According to the image segmentation method provided by the application, the number of the input data items is equal to the number of the preset quantization coefficients, and the input data items correspond to the preset quantization coefficients one to one.
According to the image segmentation method provided by the application, the splicing layer is specifically used for:
calculating the product of each item of input data and a preset quantization coefficient corresponding to each item of input data to obtain a first product result;
shifting the first product result based on the preset shift information;
and splicing the shift processing results corresponding to all the input data.
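The three steps above can be sketched in pure Python; the function name, list-based data layout, and the use of Python integers as fixed-point values are illustrative assumptions, since the patent does not specify an implementation:

```python
def fixed_point_concat(inputs, coeffs, shifts):
    """Splice fixed-point inputs: multiply each item of input data by its
    preset quantization coefficient (step 1), right-shift the first product
    result by the preset shift bit number (step 2), then concatenate all
    shift processing results (step 3)."""
    assert len(inputs) == len(coeffs) == len(shifts)
    out = []
    for data, s, n in zip(inputs, coeffs, shifts):
        # (x * s) >> n keeps every intermediate value an integer
        out.extend((x * s) >> n for x in data)
    return out
```

For example, `fixed_point_concat([[10, -4], [7]], [3, 2], [1, 2])` yields `[15, -6, 3]`, and the spliced output stays fixed-point throughout.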
According to the image segmentation method provided by the application, the preset quantization coefficient is determined off-line based on the following method:
determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
According to the image segmentation method provided by the application, each item of input data of the splicing layer corresponds to respective preset shift information, the preset shift information comprises a shift direction and a shift bit number, and the shift direction is right shift; accordingly,
the determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient, and the preset shift information specifically includes:
calculating the ratio of the first fixed point coefficient corresponding to the output data to the second fixed point coefficient corresponding to each item of input data;
calculating the product of the ratio and a power of 2 to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent of the power of 2 is the shift bit number corresponding to each item of input data.
According to the image segmentation method provided by the application, the shift bit number in the preset shift information is 8-16 bits.
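Taken together, the claims above fix the offline derivation: the preset quantization coefficient for each item of input data is the rounded product of the coefficient ratio and a power of 2 whose exponent is that item's shift bit number. A minimal sketch, assuming the fixed point coefficients are simple scale factors (the function and variable names are not from the patent):

```python
def preset_quant_coeff(out_coeff, in_coeff, n_bits):
    """round((first-type coefficient / second-type coefficient) * 2**n_bits),
    where n_bits is the shift bit number (8-16 per the claim above)."""
    ratio = out_coeff / in_coeff        # ratio of the two fixed point coefficients
    return round(ratio * 2 ** n_bits)   # second product result, rounded
```

For example, an output coefficient of 0.05, an input coefficient of 0.1, and an 8-bit shift give `preset_quant_coeff(0.05, 0.1, 8) == 128`.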
The present application also provides an image segmentation apparatus, including an image acquisition module and a segmentation module, wherein:
the image acquisition module is used for acquiring a target image containing a target object;
the segmentation module is used for inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shift information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
According to the image segmentation device provided by the application, the number of the input data is equal to the number of the preset quantization coefficients, and the input data corresponds to the preset quantization coefficients one to one.
According to the image segmentation device provided by the application, the splicing layer is specifically used for:
calculating the product of each item of input data and a preset quantization coefficient corresponding to each item of input data to obtain a first product result;
shifting the first product result based on the preset shift information;
and splicing the shift processing results corresponding to all the input data.
The image segmentation device provided by the present application further includes a preset quantization coefficient offline determination module, configured to:
determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
According to the image segmentation device provided by the application, each item of input data of the splicing layer corresponds to respective preset shift information, the preset shift information comprises a shift direction and a shift bit number, and the shift direction is right shift; accordingly,
the preset quantization coefficient offline determination module is specifically configured to:
calculating the ratio of the first fixed point coefficient corresponding to the output data to the second fixed point coefficient corresponding to each item of input data;
calculating the product of the ratio and a power of 2 to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent of the power of 2 is the shift bit number corresponding to each item of input data.
The present application further provides an electronic device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of any of the image segmentation methods described above when executing the computer program.
The present application also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the image segmentation method as described in any of the above.
In the image segmentation method and the image segmentation device provided by the application, a target image containing a target object is obtained first; the target image is then input into the image segmentation model to obtain a target object region in the target image output by the image segmentation model. The image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer splices at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information. With this image segmentation model, image segmentation can be realized while occupying fewer computer resources and taking less time, so the image segmentation speed can be improved. Moreover, because the convolutional neural network comprises a splicing layer that splices at least two items of input data using the preset quantization coefficient and the preset shift information, the spliced data is also fixed-point data, and so is the input data of the layers after the splicing layer in the convolutional neural network.
Drawings
In order to more clearly illustrate the technical solutions in the present application or in the prior art, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present application, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of an image segmentation method provided herein;
FIG. 2 is a schematic view illustrating an operation flow of a stitching layer of an image segmentation model in the image segmentation method provided by the present application;
fig. 3 is a schematic flowchart of an embodiment of determining a preset quantization coefficient in the image segmentation method provided in the present application;
fig. 4 is a schematic flowchart of an embodiment of determining a preset quantization coefficient in the image segmentation method provided in the present application;
FIG. 5 is a schematic structural diagram of an image segmentation apparatus provided in the present application;
FIG. 6 is a schematic structural diagram of an image segmentation apparatus provided in the present application;
fig. 7 is a schematic structural diagram of an electronic device provided in the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions in the present application will be described clearly and completely below with reference to the drawings. It is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
At present, in order to improve the accuracy of image segmentation, the neural network models adopted are floating-point neural network models; that is, the model parameters and the processed data are floating-point data. This causes a large number of complex data operations during image segmentation, for example operations between the floating-point model parameters of each network layer and image data that may be in decimal form, so that segmentation occupies a large amount of computer resources and takes a long time. Moreover, when a neural network model is used for processing tasks, it often needs to be reconstructed, and the complexity of processing the parameters of each network layer makes model construction inefficient, which greatly reduces the efficiency of image segmentation. Therefore, the embodiments of the application provide an image segmentation method to solve these problems in the prior art.
Fig. 1 is a schematic flowchart of an image segmentation method provided in an embodiment of the present application, and as shown in fig. 1, the method includes:
s11, acquiring a target image containing a target object;
s12, inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
Specifically, the execution subject of the image segmentation method provided in the embodiment of the present application is a server, which may be, for example, a cloud server or a local server; the local server may specifically be a computer, a tablet computer, a smartphone, or the like, which is not specifically limited in the embodiment of the present application.
In the image segmentation, step S11 is first executed to acquire a target image containing a target object, such as a vehicle, a person, an article, or a certain marker. The target image is the image to be segmented, and the target object region in it needs to be determined by segmentation for subsequent use. For example, if the target object is a vehicle, the target object region is the position of the vehicle in the target image. The target image can be represented by a three-dimensional matrix formed by the two-dimensional pixel points in the plane and the RGB channels; the values of the elements in the matrix are integers, and the value range may be [0, 255].
Then, step S12 is executed. Before the target image obtained in step S11 is input to the image segmentation model, it may be preprocessed: the preprocessing may include adjusting the size of the target image and normalizing it; for example, the image size may be adjusted to 64 × 256, and the pixel values in the target image may be scaled to between -1 and 1. The preprocessed target image is then input to the image segmentation model, which performs segmentation processing on the target image and outputs the target object region in it. The image segmentation model is specifically constructed based on a Convolutional Neural Network (CNN), a class of feedforward neural networks containing convolution calculations and having a deep structure, and one of the representative algorithms of deep learning. Convolutional neural networks have a representation learning capability and can perform shift-invariant classification of input information according to their hierarchical structure, so they are also called "Shift-Invariant Artificial Neural Networks (SIANN)".
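The preprocessing mentioned above (resizing to 64 × 256 and scaling pixel values into [-1, 1]) can be sketched as follows. Nearest-neighbour sampling and pure-Python nested lists are assumptions made for illustration; a real pipeline would typically use an image-processing library:

```python
def preprocess(image, out_h=64, out_w=256):
    """Resize an H x W x 3 image (nested lists, channel values in [0, 255])
    by nearest-neighbour sampling and map each value into [-1, 1]."""
    in_h, in_w = len(image), len(image[0])
    resized = []
    for i in range(out_h):
        src_i = i * in_h // out_h            # nearest source row
        row = []
        for j in range(out_w):
            src_j = j * in_w // out_w        # nearest source column
            # map [0, 255] linearly onto [-1, 1]
            row.append([p / 127.5 - 1.0 for p in image[src_i][src_j]])
        resized.append(row)
    return resized
```

After this step, a pixel value of 0 becomes -1.0, 255 becomes 1.0, and the output always has the fixed spatial size expected by the model's input layer.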
The convolutional neural network can comprise an input layer, a hidden layer and an output layer, wherein the input layer is used for receiving a target image, the hidden layer is used for segmenting the target image, and the output layer is used for outputting a segmentation result. The hidden layer can comprise a convolution layer, a pooling layer and a full-connection layer, wherein the convolution layer is used for carrying out feature extraction to obtain a feature map, the obtained feature map is input into the pooling layer to carry out feature selection and information filtering, and the full-connection layer is positioned at the last part of the hidden layer and used for carrying out nonlinear combination on features. The feature map loses the spatial topology in the fully connected layer and is expanded into vectors. The convolution layer comprises a splicing layer, namely a Concat layer, so as to realize characteristic splicing, namely input data of the splicing layer are all characteristic matrixes, and output data are characteristic matrixes obtained after splicing.
The model parameters and the processed data in the convolutional neural network adopted in the embodiment of the application are fixed-point data, so the input data and the output data of the splicing layer are fixed-point data. However, existing splicing layers splice floating-point input data and do not handle fixed-point input data. Therefore, the embodiment of the application introduces the preset quantization coefficient and the preset shift information: the preset quantization coefficient is multiplied by the fixed-point input data, and the multiplication result is shifted according to the preset shift information. Once each item of input data has been multiplied by its preset quantization coefficient and shifted, the items can be spliced directly, and the spliced output data is also fixed-point data. It should be noted that, in the embodiment of the present application, the process of splicing at least two items of input data of the splicing layer based on the preset quantization coefficient and the preset shift information may be understood as fixed-point calculation of the splicing layer. Together, the preset quantization coefficient and the preset shift information rescale each item of input data.
The preset quantization coefficients may be the same or different for different items of input data, and may be fixed values or may be calculated from the fixed point coefficients used in the conversion between the fixed-point and floating-point forms of the input data. When the preset quantization coefficient is obtained through calculation, it may be calculated online, or calculated offline and then transmitted to the convolutional neural network, which is not specifically limited in the embodiment of the present application. It should be noted that the preset quantization coefficient is a positive integer greater than 0, that is, an integer greater than or equal to 1, so as to ensure that the splicing layer in the convolutional neural network operates normally.
The preset shift information may include a shift direction and a shift bit number; the shift direction may be leftward or rightward, and the shift bit number may be set as required, which is not specifically limited in this embodiment of the application. The preset shift information may be the same or different for different items of input data: the shift direction may be fixed as leftward or rightward, or determined according to the actual value of the input data; the shift bit number may be a fixed value, or determined according to the absolute value of the input data; neither is specifically limited here. The shift bit number is a positive integer greater than 0. When the multiplication result of the preset quantization coefficient and the fixed-point input data is shifted according to the preset shift information, the shift can be represented as the product or the ratio of the multiplication result and a power of 2, where the exponent is the shift bit number: the product represents a left shift, and the ratio represents a right shift.
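The equivalence stated above, namely that a left shift is multiplication by a power of 2 and a right shift is division by it, can be checked directly; the concrete values below are arbitrary:

```python
x, s, n = 37, 5, 3

# right-shifting the product by n bits equals floored division by 2**n
assert (x * s) >> n == (x * s) // 2 ** n

# left-shifting equals multiplication by 2**n
assert x << n == x * 2 ** n

# Python's arithmetic right shift also floors for negative fixed-point values
assert (-7) >> 1 == -4
```

This floor behaviour for negative values is what makes the shift a well-defined fixed-point rescaling regardless of the sign of the input data.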
In the embodiment of the application, the image segmentation model is constructed from the convolutional neural network, which is trained on sample images carrying object region labels to obtain the image segmentation model. An object region label is the identification information of the object region in a sample image. When the convolutional neural network is trained, a sample image is used as input to obtain the prediction region output by the network; a loss function is then calculated from the prediction region and the object region label, and training ends when the loss function is minimized. The sample images may also be preprocessed before being input to the convolutional neural network.
When the target image is segmented by the image segmentation model, because the model is constructed based on the convolutional neural network and all data used in the operations are fixed-point data, image segmentation can be realized while occupying fewer computer resources and taking less time, so the image segmentation speed can be improved. Moreover, because the convolutional neural network comprises a splicing layer that splices at least two items of input data using the preset quantization coefficient and the preset shift information, the spliced data is also fixed-point data, and so is the input data of the layers after the splicing layer in the convolutional neural network.
The image segmentation method provided in the embodiment of the application thus first obtains a target image containing a target object, and then inputs the target image into the image segmentation model to obtain the target object region in the target image output by the model. With this image segmentation model, image segmentation can be realized while occupying fewer computer resources and taking less time, and the splicing layer keeps the data fixed-point throughout the network.
On the basis of the above embodiments, in the image segmentation method provided in this embodiment of the present application, the number of terms of the input data is equal to the number of the preset quantization coefficients, and the input data corresponds to the preset quantization coefficients one to one.
Specifically, in the embodiment of the application, the splicing layer may have multiple items of input data, that is, at least two items. The number of items of input data is equal to the number of preset quantization coefficients, and each item of input data corresponds to one preset quantization coefficient. This ensures that each item of input data is multiplied independently by its own coefficient, which reflects the differences between items, improves the accuracy of the adjustment of the input data and hence of the output data of the splicing layer, and ensures the accuracy of the image segmentation result.
As shown in fig. 2, based on the above embodiment, in the image segmentation method provided in the embodiment of the present application, the stitching layer is specifically configured to:
s21, calculating the product of each item of input data and the preset quantization coefficient corresponding to each item of input data to obtain a first product result;
s22, shifting the first product result based on the preset shifting information;
and S23, splicing the shift processing results corresponding to all the input data.
Specifically, when the splicing layer splices the items of input data, it may first calculate the product of each item of input data and its corresponding preset quantization coefficient to obtain a first product result; then shift the first product result according to the preset shift information; and finally splice all the shift processing results. For example, suppose the splicing layer has two items of fixed-point input data, y1 and y2; y1 corresponds to the preset quantization coefficient s1 and y2 to s2; the shift bit number of y1 is n1 and its shift direction is right, and the shift bit number of y2 is n2 and its shift direction is right.
The shift processing result of y1 can be expressed as:
(s1 × y1) / 2^n1
the shift processing result of y2 can be expressed as:
(s2 × y2) / 2^n2
the output data y3 obtained by splicing all the shift processing results can be represented as:
y3 = ((s1 × y1) / 2^n1) || ((s2 × y2) / 2^n2)
where || represents the splice operator.
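A worked instance of the three expressions above, with illustrative values (the patent gives no concrete numbers; the division by a power of 2 is realized as an arithmetic right shift):

```python
# two fixed-point inputs with their preset quantization coefficients
# (s1, s2) and right-shift bit numbers (n1, n2)
y1, s1, n1 = [10, -4], 3, 1
y2, s2, n2 = [7], 2, 2

shifted1 = [(v * s1) >> n1 for v in y1]   # (s1 * y1) / 2^n1, floored
shifted2 = [(v * s2) >> n2 for v in y2]   # (s2 * y2) / 2^n2, floored
y3 = shifted1 + shifted2                  # "||", the splice operator
```

Here y3 comes out as [15, -6, 3], and every intermediate and final value remains a fixed-point integer.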
In the embodiment of the application, the splicing layer calculates the product of each item of input data and its corresponding preset quantization coefficient to obtain a first product result, shifts the first product result according to the preset shift information, and can then splice the shift processing results corresponding to all the input data. Moreover, the whole process operates on fixed-point data, which improves the splicing efficiency.
As shown in fig. 3, based on the above embodiment, in the image segmentation method provided in the embodiment of the present application, the preset quantization coefficient is determined offline based on the following method:
S31, determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and S32, determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
Specifically, the preset quantization coefficient in the embodiment of the present application may be determined offline and then used by the splicing layer. When the preset quantization coefficient is determined offline, a first type of fixed point coefficient corresponding to output data of a splicing layer and a second type of fixed point coefficient corresponding to each input data are determined. The first type of fixed point coefficient corresponding to the output data is a conversion coefficient between fixed point type data and floating point type data of the output data, and the second type of fixed point coefficient corresponding to each item of input data is a conversion coefficient between the fixed point type data and the floating point type data of each item of input data. Both the first type fixed point coefficient and the second type fixed point coefficient can be obtained through statistics, and the determination method in the embodiment of the present application is not specifically limited herein.
After the first type of fixed point coefficient and the second type of fixed point coefficient are determined, the preset quantization coefficient corresponding to each item of input data is determined according to the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
In the embodiment of the application, the preset quantization coefficient is determined in an off-line mode, so that the computing resources for image segmentation can be saved, the splicing speed can be increased, and the image segmentation efficiency can be improved.
As shown in fig. 4, on the basis of the above embodiment, in the image segmentation method provided in this embodiment of the present application, each item of input data of the splicing layer corresponds to one piece of preset shift information, where the preset shift information includes a shift direction and a shift bit number, and the shift direction is a right shift;
accordingly,
the determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient, and the preset shift information specifically includes:
S41, calculating the ratio of the first type of fixed point coefficient corresponding to the output data to the second type of fixed point coefficient corresponding to each item of input data;
S42, calculating the product of the ratio and 2 raised to an exponent to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent is the shift bit number corresponding to each item of input data.
Specifically, each item of input data corresponds to its own piece of preset shift information, so that the shift operations on the items of input data are mutually independent and can be tailored to each item. The preset shift information includes a shift direction and a shift bit number; the shift direction is a right shift, and the shift bit number can be set as required, which is not specifically limited in the embodiment of the present application.
The rounding operation may be rounding down, rounding up, or rounding to the nearest integer, which is not specifically limited in the embodiments of the present application. For example, if the first type of fixed point coefficient corresponding to y3 is q3, the second type of fixed point coefficient corresponding to y1 is q1 with a corresponding shift bit number n1, and the second type of fixed point coefficient corresponding to y2 is q2 with a corresponding shift bit number n2, then the preset quantization coefficient s1 corresponding to y1 may be obtained by first determining s1′ through the following formula and performing a rounding operation on s1′:
s1′ = (q3 / q1) × 2^n1
Similarly, for y2, s2′ may be determined by the following formula, and a rounding operation performed on s2′ to obtain the preset quantization coefficient s2:
s2′ = (q3 / q2) × 2^n2
in the embodiment of the application, the specific method for determining the preset quantization coefficient in an off-line manner is simple and feasible, the computing resource of image segmentation can be saved, the splicing speed can be increased, and the image segmentation efficiency can be further improved.
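Steps S41 and S42 can be sketched as follows (the fixed point coefficient values and shift bit numbers are illustrative assumptions, not values from the application):

```python
def preset_quant_coeff(q_out, q_in, n):
    """S41: ratio of the first type of fixed point coefficient (output)
    to the second type of fixed point coefficient (this input);
    S42: multiply by 2**n and round to the nearest integer."""
    return round((q_out / q_in) * 2 ** n)

# Illustrative fixed point coefficients and shift bit numbers.
q1, q2, q3 = 64.0, 32.0, 16.0
n1 = n2 = 8
s1 = preset_quant_coeff(q3, q1, n1)  # round((16/64) * 256) = 64
s2 = preset_quant_coeff(q3, q2, n2)  # round((16/32) * 256) = 128
```

Since this computation runs offline, the splicing layer at inference time only sees the integer coefficients s1 and s2.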
If the floating-point data corresponding to y1 is y1′ and the floating-point data corresponding to y2 is y2′, the splicing layer could splice y1′ and y2′ directly, and the obtained splicing result would be floating-point data y3′, that is:
y3′=y1′||y2′
Transforming the above equation, and noting that each fixed point coefficient converts between fixed-point and floating-point data (y1′ = y1 / q1, y2′ = y2 / q2, y3′ = y3 / q3), can result in:
y3 / q3 = (y1 / q1) || (y2 / q2)
y3 = ((q3 / q1) × y1) || ((q3 / q2) × y2)
y3 = ((((q3 / q1) × 2^n1) × y1) >> n1) || ((((q3 / q2) × 2^n2) × y2) >> n2)
order: s1 ═ (q3/q1) × 2n1,s2′=(q3/q2)*2n2Rounding s1 'to obtain s1 and rounding s 2' to obtain s 2.
Then there are:
y3 ≈ ((y1 × s1) >> n1) || ((y2 × s2) >> n2)
This is consistent with the calculation formula for the output data y3 given above, which demonstrates that the function of the splicing layer in the present application is realizable.
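The equivalence argued above can be checked numerically. The following sketch uses assumed coefficients (all values illustrative) and confirms that the fixed-point splice, converted back through q3, reproduces the direct floating-point splice:

```python
q1, q2, q3 = 64.0, 32.0, 16.0  # illustrative fixed point coefficients
n1 = n2 = 8                    # illustrative shift bit numbers
s1 = round((q3 / q1) * 2 ** n1)
s2 = round((q3 / q2) * 2 ** n2)

y1, y2 = [100, 200], [50]      # fixed-point input data
# Floating-point splice: y3' = y1' || y2', with y' = y / q.
y3_float = [v / q1 for v in y1] + [v / q2 for v in y2]
# Fixed-point splice: (y * s) >> n per item, then convert back via q3.
y3_fixed = [(v * s1) >> n1 for v in y1] + [(v * s2) >> n2 for v in y2]
recovered = [v / q3 for v in y3_fixed]
# With these power-of-two coefficients the two results match exactly;
# in general the rounding of s1', s2' introduces a small error.
```

The larger the shift bit number, the closer the rounded s1, s2 are to s1′, s2′, and the smaller this error.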
On the basis of the above embodiments, in the image segmentation method provided in the embodiments of the present application, the shift bit number is 8 to 16 bits. The larger the shift bit number, the smaller the precision loss; the smaller the shift bit number, the less storage space is occupied. A suitable shift bit number can therefore be selected according to the precision loss requirement and the storage space requirement, which is not specifically limited in the embodiment of the application.
On the basis of the above embodiments, in the image segmentation method provided in this embodiment of the present application, the shift direction in the preset shift information is determined based on the true value of the corresponding data, and the shift bit number in the preset shift information is determined based on the absolute value of the corresponding data.
Specifically, in the embodiment of the present application, when determining the shift direction in the preset shift information, the real value of the corresponding data may be compared with the preset value, and if the real value is greater than or equal to the preset value, the shift direction may be rightward, otherwise, the shift direction may be leftward. The preset value may be set as needed, which is not specifically limited in the embodiment of the present application.
When the shift bit number in the preset shift information is determined, the exponent n for which 2^n is closest to the absolute value of the corresponding data may be determined, and that exponent is then taken as the shift bit number.
In the embodiment of the application, a specific method for determining the shift direction and the shift digit in the preset shift information is provided, the method is simple and easy to implement, the computing resource of image segmentation can be saved, the splicing speed can be increased, and the image segmentation efficiency can be further improved.
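The rule described above can be sketched as follows; this is one assumed reading of the description, and both the preset threshold value and the use of round(log2(...)) to approximate the nearest power of two are illustrative choices, not specified by the application:

```python
import math

def preset_shift_info(true_value, preset_value=1.0):
    """Shift direction: right if the true value is >= the preset value,
    otherwise left. Shift bit number: the exponent n for which 2**n is
    close to |true_value| (approximated here via round(log2))."""
    direction = "right" if true_value >= preset_value else "left"
    n = round(math.log2(abs(true_value))) if true_value != 0 else 0
    return direction, n

# e.g. preset_shift_info(1000.0) -> ("right", 10), since 2**10 = 1024
```

In practice the resulting n would additionally be clamped to the 8 to 16 bit range discussed above.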
As shown in fig. 5, on the basis of the above embodiment, an image segmentation apparatus provided in the embodiment of the present application includes: an image acquisition module 51 and a segmentation module 52. Wherein:
the image acquiring module 51 is used for acquiring a target image containing a target object;
the segmentation module 52 is configured to input the target image into an image segmentation model, and obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
On the basis of the foregoing embodiments, an embodiment of the present application provides an image segmentation apparatus, where the number of terms of the input data is equal to the number of preset quantization coefficients, and the input data corresponds to the preset quantization coefficients one to one.
On the basis of the foregoing embodiments, an image segmentation apparatus is provided in an embodiment of the present application, where the splicing layer is specifically configured to:
calculating the product of each item of input data and a preset quantization coefficient corresponding to each item of input data to obtain a first product result;
shifting the first product result based on the preset shift information;
and splicing the shift processing results corresponding to all the input data.
As shown in fig. 6, on the basis of the foregoing embodiment, an image segmentation apparatus provided in the embodiment of the present application further includes a preset quantization coefficient offline determining module 53, configured to:
determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
On the basis of the foregoing embodiments, an image segmentation apparatus is provided in this embodiment of the present application, where each item of input data of the splicing layer corresponds to one piece of preset shift information, where the preset shift information includes a shift direction and a shift bit number, and the shift direction is a right shift; accordingly,
the preset quantization coefficient offline determination module is specifically configured to:
calculating the ratio of the first type of fixed point coefficient corresponding to the output data to the second type of fixed point coefficient corresponding to each item of input data;
calculating the product of the ratio and 2 raised to an exponent to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent is the shift bit number corresponding to each item of input data.
On the basis of the above embodiments, an image segmentation apparatus is provided in the embodiments of the present application, where the shift bit number in the preset shift information is 8 to 16 bits.
On the basis of the foregoing embodiments, an image segmentation apparatus is provided in an embodiment of the present application, where a shift direction in the preset shift information is determined based on a true value of corresponding data, and a shift bit number in the preset shift information is determined based on an absolute value of the corresponding data.
The image segmentation apparatus provided in the embodiment of the present application is configured to execute the image segmentation method, and a specific implementation manner of the image segmentation apparatus is consistent with that of the image segmentation method provided in the embodiment of the present application, and the same beneficial effects can be achieved, and details are not repeated here.
Fig. 7 illustrates a physical structure diagram of an electronic device, and as shown in fig. 7, the electronic device may include: a processor (processor)710, a communication Interface (Communications Interface)720, a memory (memory)730, and a communication bus 740, wherein the processor 710, the communication Interface 720, and the memory 730 communicate with each other via the communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform an image segmentation method comprising: acquiring a target image containing a target object; inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model; the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data; the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
In addition, the logic instructions in the memory 730 can be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The processor 710 in the electronic device provided in the embodiment of the present application may call a logic instruction in the memory 730 to implement the image segmentation method, and a specific implementation manner of the method is consistent with that of the image segmentation method provided in the embodiment of the present application, and the same beneficial effects may be achieved, which is not described herein again.
On the other hand, the present application further provides a computer program product, which is described below, and the computer program product described below and the image segmentation method described above may be referred to in correspondence with each other.
The computer program product comprises a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the image segmentation method provided by the above methods, the method comprising: acquiring a target image containing a target object; inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model; the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data; the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
When the computer program product provided in the embodiment of the present application is executed, the image segmentation method is implemented, and the specific implementation manner of the image segmentation method is consistent with the implementation manner of the image segmentation method provided in the embodiment of the present application, and the same beneficial effects can be achieved, and details are not described here.
In yet another aspect, the present application further provides a non-transitory computer-readable storage medium, which is described below, and the non-transitory computer-readable storage medium described below and the image segmentation method described above are referred to in correspondence with each other.
The present application also provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, is implemented to perform the image segmentation method provided above, the method comprising: acquiring a target image containing a target object; inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model; the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data; the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
When the computer program stored on the non-transitory computer readable storage medium provided in the embodiment of the present application is executed, the image segmentation method is implemented, and a specific implementation manner of the image segmentation method is consistent with that of the image segmentation method provided in the embodiment of the present application, and the same beneficial effects can be achieved, and details are not repeated here.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (13)

1. An image segmentation method, comprising:
acquiring a target image containing a target object;
inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
2. The image segmentation method according to claim 1, wherein the number of terms of the input data is equal to the number of the preset quantized coefficients, and the input data corresponds to the preset quantized coefficients one to one.
3. The image segmentation method according to claim 2, wherein the splicing layer is specifically configured to:
calculating the product of each item of input data and a preset quantization coefficient corresponding to each item of input data to obtain a first product result;
shifting the first product result based on the preset shift information;
and splicing the shift processing results corresponding to all the input data.
4. The image segmentation method according to claim 2, wherein the preset quantization coefficient is determined offline based on:
determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
5. The image segmentation method according to claim 4, wherein each item of input data of the splicing layer corresponds to one piece of preset shift information, the preset shift information includes a shift direction and a shift bit number, and the shift direction is a right shift; accordingly,
the determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient, and the preset shift information specifically includes:
calculating the ratio of the first type of fixed point coefficient corresponding to the output data to the second type of fixed point coefficient corresponding to each item of input data;
calculating the product of the ratio and 2 raised to an exponent to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent is the shift bit number corresponding to each item of input data.
6. The image segmentation method according to claim 5, wherein the shift bit number in the preset shift information is 8 to 16 bits.
7. An image segmentation apparatus, comprising:
the image acquisition module is used for acquiring a target image containing a target object;
the segmentation module is used for inputting the target image into an image segmentation model to obtain a target object region in the target image output by the image segmentation model;
the image segmentation model is constructed based on a convolutional neural network, the convolutional neural network comprises a splicing layer, and the splicing layer is used for splicing at least two items of input data of the splicing layer based on a preset quantization coefficient and preset shift information; the preset quantization coefficient is used for multiplying the input data, and the preset shifting information is used for shifting the multiplication result of the preset quantization coefficient and the input data;
the input data and the output data are fixed-point data, and the image segmentation model is obtained based on sample image training with object region labels.
8. The image segmentation apparatus according to claim 7, wherein the number of terms of the input data is equal to the number of the preset quantized coefficients, and the input data corresponds to the preset quantized coefficients one to one.
9. The image segmentation apparatus as set forth in claim 8, wherein the splicing layer is specifically configured to:
calculating the product of each item of input data and a preset quantization coefficient corresponding to each item of input data to obtain a first product result;
shifting the first product result based on the preset shift information;
and splicing the shift processing results corresponding to all the input data.
10. The image segmentation apparatus as claimed in claim 9, further comprising a pre-set quantization coefficient offline determination module configured to:
determining a first type of fixed point coefficient corresponding to the output data of the splicing layer and a second type of fixed point coefficient corresponding to each item of input data;
and determining a preset quantization coefficient corresponding to each item of input data based on the first type of fixed point coefficient, the second type of fixed point coefficient and the preset shift information.
11. The image segmentation apparatus according to claim 10, wherein each item of input data of the splicing layer corresponds to one piece of preset shift information, the preset shift information includes a shift direction and a shift bit number, and the shift direction is a right shift; accordingly,
the preset quantization coefficient offline determination module is specifically configured to:
calculating the ratio of the first type of fixed point coefficient corresponding to the output data to the second type of fixed point coefficient corresponding to each item of input data;
calculating the product of the ratio and 2 raised to an exponent to obtain a second product result, and performing a rounding operation on the second product result to obtain the preset quantization coefficient corresponding to each item of input data;
wherein the exponent is the shift bit number corresponding to each item of input data.
12. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the image segmentation method according to any one of claims 1 to 6 are implemented when the program is executed by the processor.
13. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the image segmentation method according to any one of claims 1 to 6.
CN202011480979.9A 2020-12-15 2020-12-15 Image segmentation method and device Pending CN112561933A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011480979.9A CN112561933A (en) 2020-12-15 2020-12-15 Image segmentation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011480979.9A CN112561933A (en) 2020-12-15 2020-12-15 Image segmentation method and device

Publications (1)

Publication Number Publication Date
CN112561933A true CN112561933A (en) 2021-03-26

Family

ID=75063825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011480979.9A Pending CN112561933A (en) 2020-12-15 2020-12-15 Image segmentation method and device

Country Status (1)

Country Link
CN (1) CN112561933A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023000898A1 (en) * 2021-07-20 2023-01-26 腾讯科技(深圳)有限公司 Image segmentation model quantization method and apparatus, computer device, and storage medium


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107688855A (en) * 2016-08-12 2018-02-13 北京深鉴科技有限公司 It is directed to the layered quantization method and apparatus of Complex Neural Network
CN108009625A (en) * 2016-11-01 2018-05-08 北京深鉴科技有限公司 Method for trimming and device after artificial neural network fixed point
CN111656315A (en) * 2019-05-05 2020-09-11 深圳市大疆创新科技有限公司 Data processing method and device based on convolutional neural network architecture
WO2020223856A1 (en) * 2019-05-05 2020-11-12 深圳市大疆创新科技有限公司 Data processing method and device based on convolutional neural network architecture
CN111612008A (en) * 2020-05-21 2020-09-01 苏州大学 Image segmentation method based on convolution network
CN111931917A (en) * 2020-08-20 2020-11-13 浙江大华技术股份有限公司 Forward computing implementation method and device, storage medium and electronic device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BENOIT JACOB ET.AL: "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference", 《ARXIV:1712.05877V1 [CS.LG] 》, pages 1 - 14 *
JERMMY: "神经网络量化入门--Add和Concat", pages 1 - 6, Retrieved from the Internet <URL:https://mp.weixin.qq.com/s/Qt3Su8M9ntHoY_DzK78I6g> *
RAGHURAMAN KRISHNAMOORTHI: "Quantizing deep convolutional networks for efficient inference: A whitepaper", 《ARXIV:1806.08342V1 [CS.LG]》, pages 1 - 36 *


Similar Documents

Publication Publication Date Title
CN108229479B (en) Training method and device of semantic segmentation model, electronic equipment and storage medium
CN112233038B (en) True image denoising method based on multi-scale fusion and edge enhancement
CN109035319B (en) Monocular image depth estimation method, device, apparatus, program, and storage medium
Li et al. Dynamic scene deblurring by depth guided model
CN111488985B (en) Deep neural network model compression training method, device, equipment and medium
CN110610526B (en) Method for segmenting monocular image and rendering depth of field based on WNET
CN112308866B (en) Image processing method, device, electronic equipment and storage medium
EP4318313A1 (en) Data processing method, training method for neural network model, and apparatus
CN112767279A (en) Underwater image enhancement method based on discrete wavelet integration generative adversarial network
CN111833360A (en) Image processing method, device, equipment and computer readable storage medium
CN114170290A (en) Image processing method and related equipment
CN114792355A (en) Virtual image generation method and device, electronic equipment and storage medium
CN112561933A (en) Image segmentation method and device
CN110110775A (en) Matching cost calculation method based on hyperlink network
CN112541438A (en) Text recognition method and device
CN112580492A (en) Vehicle detection method and device
CN112509144A (en) Face image processing method and device, electronic equipment and storage medium
EP4345771A1 (en) Information processing method and apparatus, and computer device and storage medium
US20230143985A1 (en) Data feature extraction method and related apparatus
CN116309158A (en) Training method, three-dimensional reconstruction method, device, equipment and medium of network model
CN112532251A (en) Data processing method and device
CN112949504B (en) Stereo matching method, device, equipment and storage medium
CN115578624A (en) Agricultural disease and pest model construction method, detection method and device
CN115409159A (en) Object operation method and device, computer equipment and computer storage medium
CN114998172A (en) Image processing method and related system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination