CN110222726A - Image processing method, device and electronic equipment
- Publication number: CN110222726A (application number CN201910403859.XA)
- Authority: CN (China)
- Prior art keywords: image, layer, layers, convolution, sampling
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Abstract
Embodiments of the present disclosure provide an image processing method, an image processing apparatus and an electronic device, belonging to the technical field of data processing. The method comprises: acquiring a first image containing a target object; setting a segmentation network for performing image processing on the first image; after a second down-sampling layer in the segmentation network, setting a plurality of parallel convolution layers with different sampling rates, wherein the parallel convolution layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolution layer form a second image by fusion; and obtaining a third image containing the target object by performing target recognition on the second image. Through the scheme of the present disclosure, the accuracy of target recognition is improved. Through the processing scheme of the present disclosure, a stylized effect of an image can also be set in real time.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to an image processing method and apparatus, and an electronic device.
Background
With the development of artificial intelligence technology, more and more image processing work can be completed by artificial intelligence, and neural networks, as an implementation of artificial intelligence, are widely applied in the field of computer image recognition. For example, different people are identified in an image, or different objects on the road are automatically identified in unmanned driving; these are concrete instances of image semantic recognition. Image semantic segmentation is involved in the process of image semantic recognition and is generally modeled as a multi-classification problem at the pixel level, with the goal of assigning each pixel of an image to one of a plurality of predefined classes.
Most existing image semantic segmentation methods are based on encoder-decoder convolutional neural networks. Although such a network structure can produce good semantic segmentation results, an encoding-decoding structure inevitably reduces the spatial resolution of the feature map during encoding, and although the original resolution of the image is restored during up-sampling, spatial detail information is inevitably lost, which reduces the accuracy of target recognition.
Disclosure of Invention
In view of the above, embodiments of the present disclosure provide an image processing method, an image processing apparatus, and an electronic device, which at least partially solve the problems in the prior art.
In a first aspect, an embodiment of the present disclosure provides an image processing method, including:
acquiring a first image containing a target object;
setting a segmentation network for processing a first image, wherein the segmentation network comprises a plurality of convolution layers and downsampling layers, the convolution layers and the downsampling layers are distributed at intervals, the convolution layers perform feature extraction on a target object in the first image, and the downsampling layers perform downsampling operation on the image output by the convolution layers;
after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates are arranged, the parallel convolution layers are used for processing images output by the second down-sampling layer, and image features extracted from each parallel convolution layer form a second image in a fusion mode;
and acquiring a third image containing the target object by carrying out target identification on the second image.
According to a specific implementation manner of the embodiment of the present disclosure, the performing target recognition on the second image includes:
after the parallel convolution layer, a third downsampling layer is provided, which performs a downsampling operation on the second image.
According to a specific implementation manner of the embodiment of the present disclosure, the performing target identification on the second image further includes:
after the third down-sampling layer, a plurality of up-sampling layers are set, the up-sampling layers performing an up-sampling operation on the image output from the third down-sampling layer.
According to a specific implementation manner of the embodiment of the present disclosure, the performing target identification on the second image further includes:
setting a fully connected layer in the segmentation network;
in the fully connected layer, for the images output by different nodes of the parallel convolution layers, setting different weight values and bias values at all nodes of the up-sampling layer;
and performing target recognition on the image output by the up-sampling layer based on the weight values and the bias values.
According to a specific implementation manner of the embodiment of the present disclosure, the method further includes:
acquiring all convolution layers in the segmentation network;
acquiring the image size of a characteristic image output by each convolution layer in all convolution layers;
performing convolution layer connection between convolution layers that output the same image size.
According to a specific implementation manner of the embodiment of the present disclosure, the performing convolution layer connection between convolution layers that output the same image size includes:
acquiring the input xi and the output H(xi) of the i-th convolution layer among the convolution layers that output the same image size;
constructing a residual function F(xi) = H(xi) - xi of the i-th convolution layer based on xi and H(xi);
and performing convolution layer connection based on the residual function.
According to a specific implementation manner of the embodiment of the present disclosure, the performing convolution layer connection based on the residual function includes:
setting a mapping function W(xi) for the i-th convolution layer;
acquiring the input xi of the i-th convolution layer and the output F(xi) of the i-th convolution layer;
and using F(xi) + W(xi) as the input of the (i+2)-th convolution layer.
According to a specific implementation manner of the embodiment of the present disclosure, the forming a second image by fusing the image features extracted from each parallel convolution layer includes:
setting convolution kernels of the same size in the plurality of parallel convolution layers;
performing feature extraction on the images input into the plurality of parallel convolution layers based on the convolution kernels to form a plurality of feature vector matrices;
and allocating different weight values to the plurality of feature vector matrices, and taking the weighted sum of the feature vector matrices as a representation matrix of the second image.
In a second aspect, an embodiment of the present disclosure discloses an image processing apparatus, including:
an acquisition module for acquiring a first image containing a target object;
the image processing device comprises a setting module, a processing module and a processing module, wherein the setting module is used for setting a segmentation network for processing a first image, the segmentation network comprises a plurality of convolution layers and downsampling layers, the convolution layers and the downsampling layers are distributed at intervals, the convolution layers are used for carrying out feature extraction on a target object in the first image, and the downsampling layers are used for carrying out downsampling operation on an image output by the convolution layers;
the processing module is used for setting a plurality of parallel convolution layers with different sampling rates after a second down-sampling layer in the segmentation network, the parallel convolution layers are used for processing images output by the second down-sampling layer, and image features extracted from each parallel convolution layer form a second image in a fusion mode;
and the execution module is used for acquiring a third image containing the target object by carrying out target identification on the second image.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of the first aspects or any implementation manner of the first aspect.
In a fourth aspect, the disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the image processing method of the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present disclosure also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, the computer program including program instructions that, when executed by a computer, cause the computer to perform the image processing method in the foregoing first aspect or any implementation manner of the first aspect.
The image processing scheme in the embodiments of the present disclosure comprises: acquiring a first image containing a target object; setting a segmentation network for processing the first image, wherein the segmentation network comprises a plurality of convolution layers and down-sampling layers, the convolution layers and the down-sampling layers are distributed at intervals, the convolution layers perform feature extraction on the target object in the first image, and the down-sampling layers perform a down-sampling operation on the image output by the convolution layers; after a second down-sampling layer in the segmentation network, setting a plurality of parallel convolution layers with different sampling rates, wherein the parallel convolution layers are used to process the image output by the second down-sampling layer, and the image features extracted on each parallel convolution layer form a second image by fusion; and acquiring a third image containing the target object by performing target recognition on the second image. Through the scheme of the present disclosure, the accuracy of target recognition is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram of an image processing flow provided in an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a neural network model provided in an embodiment of the present disclosure;
FIG. 3 is a schematic view of another image processing flow provided by the embodiments of the present disclosure;
FIG. 4 is a schematic view of another image processing flow provided by the embodiments of the present disclosure;
fig. 5 is a schematic structural diagram of an image processing apparatus according to an embodiment of the disclosure;
fig. 6 is a schematic diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
The embodiment of the disclosure provides an image processing method. The image processing method provided by the present embodiment may be executed by a computing apparatus, which may be implemented as software, or implemented as a combination of software and hardware, and which may be integrally provided in a server, a terminal device, or the like.
Referring to fig. 1, an image processing method provided in an embodiment of the present disclosure includes the following steps:
s101, a first image containing a target object is acquired.
The target object is the content to be acquired by the scheme of the present disclosure, and as an example, the target object may be a person with various actions, an animal with behavioral characteristics, or a stationary object.
The target object is usually contained in a certain scene, for example, a photo containing a portrait of a person usually also contains a background, which may include trees, mountains, rivers, other persons, and the like. At this time, if the target object is to be extracted from the image separately, the target object needs to be identified and processed separately. Based on the extracted target object, various behaviors of the target object may be analyzed.
The first image is an image containing a target object, and the first image may be one of a series of photos stored in advance, a video frame extracted from a piece of video stored in advance, or one or more pictures extracted from real-time live video. The first image may contain a plurality of objects, for example, a photograph used to describe the actions of a person may contain the target person, other persons with the target person, trees, buildings, etc. The target person constitutes a target object of the first image, and other persons, trees, buildings, and the like together with the target person constitute a background image. One or more objects may be selected as target objects in the first image based on actual needs.
As an example, the target object may be obtained from a video file, a video captured for the target object includes a plurality of frame images, and a plurality of images including one or more continuous motions of the target object may be selected from the frame images of the video to form an image set. By selecting images in the image set, a first image containing the target object can be acquired.
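As a minimal illustration of assembling such an image set (a sketch only; the patent prescribes no tool, and OpenCV with a fixed sampling step is an assumption here):

```python
import cv2

def sample_frames(video_path: str, step: int = 10):
    """Collect every `step`-th frame of a video into an image set from which
    a first image containing the target object can be selected."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    index = 0
    ok, frame = cap.read()
    while ok:
        if index % step == 0:
            frames.append(frame)
        index += 1
        ok, frame = cap.read()
    cap.release()
    return frames
```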
S102, a segmentation network for processing the first image is arranged, the segmentation network comprises a plurality of convolution layers and downsampling layers, the convolution layers and the downsampling layers are distributed at intervals, the convolution layers extract the characteristics of the target object in the first image, and the downsampling layers perform downsampling operation on the image output by the convolution layers.
In order to enable image processing of the first image, a segmentation network based on a neural network model is constructed, which comprises a convolutional layer, a sampling layer and a fully connected layer, see fig. 2.
The convolution layers are mainly characterized by the size of their convolution kernels and the number of input feature maps. Each convolution layer may contain several feature maps of the same size, weights are shared within a layer, and the convolution kernels within each layer have a consistent size. A convolution layer performs convolution calculations on the input image and extracts its layout features.
A sampling layer can be connected after the feature extraction layer of a convolution layer; the sampling layer computes local averages of the input image and performs secondary feature extraction. Connecting sampling layers to convolution layers helps ensure that the neural network model is more robust to the input image.
The sampling layers may include up-sampling layers and down-sampling layers. An up-sampling layer increases the pixel information in the image, for example by interpolating the input image, while a down-sampling layer reduces the input image while extracting its features.
In order to accelerate the training of the segmentation network, a pooling layer (not shown in the figure) can further be arranged behind the convolution layer. The pooling layer processes the output of the convolution layer by max pooling, so that the invariant characteristics of the input image can be better extracted.
The fully connected layer integrates the features of the image feature maps that have passed through the plurality of convolution layers and pooling layers, and obtains classification features of the input image for image classification. In the neural network model of the segmentation network, the fully connected layer maps the feature maps generated by the convolution layers into a feature vector of fixed length. This feature vector contains the combined information of all features of the input image and retains the most distinctive image features to complete the image classification task. In this way, a prediction map corresponding to the input image is calculated, and the target object contained in the first image is determined.
In order to improve the calculation speed of the segmentation network, a down-sampling layer is arranged in the segmentation network, the down-sampling layer and the convolution layer are distributed at intervals, the convolution layer performs feature extraction on a target object in the first image, and the down-sampling layer performs down-sampling operation on an image output by the convolution layer. By the arrangement, the calculation speed of the segmentation network for the first image is improved.
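A minimal sketch of this interleaved arrangement, assuming a PyTorch-style implementation (the patent does not specify a framework, and all channel counts and kernel sizes below are illustrative):

```python
import torch.nn as nn

class EncoderStem(nn.Module):
    """Convolution layers distributed at intervals with down-sampling layers,
    as described for the front end of the segmentation network."""
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1),  # feature extraction
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # first down-sampling layer
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),  # second down-sampling layer; its output feeds the
                              # parallel convolution layers described below
        )

    def forward(self, x):
        return self.features(x)
```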
And S103, after a second down-sampling layer in the segmentation network, setting a plurality of parallel convolution layers with different sampling rates, wherein the parallel convolution layers are used for processing the image output by the second down-sampling layer, and the image features extracted from each parallel convolution layer form a second image in a fusion mode.
Conventional neural networks have the disadvantage of requiring input images of a fixed size. In practice, depending on earlier processing, the images fed into a neural network may have been cropped or warped, and the content lost in cropping or warping lowers the accuracy with which the network identifies the objects in the input image. In addition, when the size of the same target object varies across images, the recognition accuracy of the neural network for that target object also decreases.
In order to further improve the adaptability of the segmentation network to the first image, referring to fig. 2, parallel convolution layers are arranged in the segmentation network, specifically, after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates are arranged, the parallel convolution layers are used for processing the image output by the second down-sampling layer, and the image features extracted from each parallel convolution layer are fused to form a second image.
With parallel convolution layers used for image processing, the input image, or the target object within it, may have any aspect ratio or size. When input images arrive at different scales, the segmentation network can extract features at different scales. For example, the parallel convolution layers may perform feature calculations on the input image using 4 × 4, 2 × 2, and 1 × 1 convolution kernels respectively, obtaining 3 independently processed paths, and the second image may be formed by fusing these 3 paths. Because the second image is formed independently of the size or scale of the input image, the robustness of the segmentation network is further improved.
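A sketch of such parallel branches with fused outputs, again assuming PyTorch; the learnable scalar fusion weights and equal channel counts are assumptions, since the patent only states that the branch features are fused:

```python
import torch
import torch.nn as nn

class ParallelConvFusion(nn.Module):
    """Three parallel convolution branches (4x4, 2x2 and 1x1 kernels, as in
    the example above) whose feature maps are fused by a weighted sum."""
    def __init__(self, channels: int = 128):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=k, padding='same')
            for k in (4, 2, 1)
        ])
        # one fusion weight per branch, to be learned while training the network
        self.fusion_weights = nn.Parameter(torch.ones(3) / 3)

    def forward(self, x):
        outs = [branch(x) for branch in self.branches]
        # the weighted sum of the branch features forms the "second image"
        return sum(w * o for w, o in zip(self.fusion_weights, outs))
```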
With this implementation, embodiments are not limited to detection of objects of a particular size, shape, or type, nor to detection of images of a particular size, type, or content. A system for image processing using parallel convolutional layer pooling according to embodiments may work on images of any size, type, or content.
The parallel convolution layers improve data robustness but increase the computational burden on the system. For this reason, the parallel convolution layers are placed after the second down-sampling layer in the segmentation network: at that point, the image output by the second down-sampling layer still has enough features to satisfy the parallel convolution layers, while the amount of data has been greatly reduced by passing the first image through two sampling layers. The computational cost of evaluating the convolution layers is thereby reduced while the robustness of the parallel convolution layers is preserved. If the parallel convolution layers were instead placed after the third sampling layer, too many features would be lost after the first image passed through three sampling layers; the features available to the parallel convolution layers would then be insufficient, impairing their recognition of the target object.
And S104, performing target identification on the second image to obtain a third image containing the target object.
The size of the second image may be adjusted; for example, a minimization function min(a, b) = c may be constructed, where a is the width of the second image, b is its height, and c represents a predefined scale (e.g., 256), and a feature map may be extracted from the entire image. For example, taking 3 parallel convolution layers (1 × 1, 3 × 3, and 6 × 6, for a total of 46 feature vectors) as an example, these 3 parallel convolution layers may be used to pool the features of each candidate window, generating an 11776-dimensional (256 × 46) representation for each window. These representations may be provided to a fully connected layer of the segmentation network, through which target recognition is performed based on the representations. The identified target object is saved in the form of a separate image, forming a third image.
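The pooling arithmetic above can be checked with a short sketch (PyTorch assumed; adaptive max pooling is one plausible way to realize the 1 × 1 / 3 × 3 / 6 × 6 grids, not necessarily the patent's exact operation):

```python
import torch
import torch.nn as nn

def fixed_length_representation(feature_map: torch.Tensor) -> torch.Tensor:
    """Pool a (N, 256, H, W) feature map over 1x1, 3x3 and 6x6 grids and
    concatenate the results: 256 * (1 + 9 + 36) = 11776 values per window."""
    pooled = [
        nn.AdaptiveMaxPool2d(bins)(feature_map).flatten(start_dim=1)
        for bins in (1, 3, 6)
    ]
    return torch.cat(pooled, dim=1)

rep = fixed_length_representation(torch.randn(1, 256, 24, 24))
print(rep.shape)  # torch.Size([1, 11776])
```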
In order to further improve the processing efficiency of the segmentation network, according to a specific implementation manner of the embodiment of the present disclosure, in the process of performing target recognition on the second image, a third down-sampling layer may be provided after the parallel convolution layers, and the third down-sampling layer performs a down-sampling operation on the second image. By providing the third down-sampling layer, the number of pixels in the second image can be further reduced, reducing the amount of calculation of the segmentation network.
For a scenario that employs a high-speed computing device such as a GPU, the feature information contained in the image may be increased by increasing the number of pixels in the image. In this case, a plurality of (e.g., 3) up-sampling layers may be set after the third down-sampling layer, and the up-sampling layers perform an up-sampling operation on the image output by the third down-sampling layer. By providing multiple up-sampling layers, more image detail can be added to the second image by interpolation or similar means.
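A sketch of such an up-sampling stage (bilinear interpolation and a doubling factor per layer are assumptions; the patent only says the layers up-sample by interpolation or the like):

```python
import torch
import torch.nn.functional as F

def upsample_stage(x: torch.Tensor, num_layers: int = 3) -> torch.Tensor:
    """Apply several up-sampling layers, each doubling spatial resolution
    by bilinear interpolation to add image detail."""
    for _ in range(num_layers):
        x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
    return x

out = upsample_stage(torch.randn(1, 128, 16, 16))
print(out.shape)  # torch.Size([1, 128, 128, 128])
```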
Referring to fig. 3, according to a specific implementation manner of the embodiment of the present disclosure, the performing target recognition on the second image may include:
S301, setting a fully connected layer in the segmentation network.
And S302, in the fully connected layer, for the images output by different nodes of the parallel convolution layers, setting different weight values and bias values at all nodes of the up-sampling layer.
Taking x1, x2, and x3 as the outputs of the parallel convolution layers, the outputs a1, a2, and a3 of the fully connected layer can be expressed by the following equations:
aj = Wj1·x1 + Wj2·x2 + Wj3·x3 + bj, for j = 1, 2, 3,
where W is the weight matrix and b is the bias vector. The weight matrix contains different weight values, which are obtained by, for example, training the segmentation network; the bias vector contains different bias values, which can likewise be obtained by training the segmentation network.
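A minimal numeric sketch of this mapping (PyTorch assumed; in practice W and b come from training rather than random initialization):

```python
import torch

x = torch.tensor([0.2, 0.5, 0.1])  # x1, x2, x3 from the parallel convolution layers
W = torch.randn(3, 3)              # weight matrix, learned by training the network
b = torch.zeros(3)                 # bias vector, likewise learned
a = W @ x + b                      # a1, a2, a3 of the fully connected layer
print(a)
```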
And S303, performing target recognition on the image output by the up-sampling layer based on the weight values and the bias values.
Using the approach of steps S301-S303, the target object contained in the second image can be quickly identified.
Referring to fig. 4, according to a specific implementation manner of the embodiment of the present disclosure, in the process of constructing a split network, the following steps may be further included:
S401, all the convolution layers in the segmentation network are obtained.
According to different requirements, a plurality of convolution layers can be arranged in the segmentation network, and corresponding processing can be carried out on the image needing to be processed by arranging different convolution kernels on different convolution layers.
S402, acquiring the image size of the characteristic image output by each convolution layer in all the convolution layers.
Depending on the convolution kernels and the input image, the sizes of the feature images output by different convolution layers differ; the size of each convolution layer's output image can be obtained by calculation from the input parameters and convolution kernels of all the convolution layers.
S403, performing convolution layer connection between convolution layers outputting the same image size.
In a deep learning network, shallow-layer features carry more image detail while deep-layer features carry more semantic information. In order to combine shallow and deep features, connections are added between convolution layers whose outputs have the same size, which reduces edge aliasing problems in the image.
In the process of implementing step S403, according to a specific implementation manner of the embodiment of the present disclosure, the method may further include the following steps:
S4031, the input xi and the output H(xi) of the i-th convolution layer among the N convolution layers outputting the same image size are acquired.
S4032, based on xi and H(xi), the residual function F(xi) = H(xi) - xi of the i-th convolution layer is constructed.
S4033, convolution layer connection is performed based on the residual function.
Specifically, the convolution layers may be connected by setting a mapping function W(xi) for the i-th convolution layer, acquiring the input xi and the output F(xi) of the i-th convolution layer, and then using F(xi) + W(xi) as the input of the (i+2)-th convolution layer.
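A sketch of this residual connection between same-size convolution layers (PyTorch assumed; taking the identity as the mapping W(xi) is a simplifying assumption, as the patent does not define W further):

```python
import torch.nn as nn

class ResidualLink(nn.Module):
    """Two stacked convolution layers whose output F(xi) is added to W(xi)
    before being passed on to the (i+2)-th convolution layer."""
    def __init__(self, channels: int = 128):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # layer i
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),  # layer i+1
        )

    def forward(self, x):
        return self.block(x) + x  # F(xi) + W(xi), with W taken as the identity
```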
In order to extract the features of the second image quickly during its formation, convolution kernels of the same size may be provided in the plurality of parallel convolution layers, and the features of the images input to the parallel convolution layers may be extracted by these convolution kernels to form a plurality of feature vector matrices. Based on the training of the segmentation network, different weight values are allocated to the feature vector matrices, the weighted sum of the feature vector matrices is taken as a representation matrix of the second image, and the second image is finally formed.
Corresponding to the above method embodiment, referring to fig. 5, the disclosed embodiment further discloses an image processing apparatus 50, comprising:
an obtaining module 501 is configured to obtain a first image including a target object.
The target object is the content to be acquired by the scheme of the present disclosure, and as an example, the target object may be a person with various actions, an animal with behavioral characteristics, or a stationary object.
The target object is usually contained in a certain scene, for example, a photo containing a portrait of a person usually also contains a background, which may include trees, mountains, rivers, other persons, and the like. At this time, if the target object is to be extracted from the image separately, the target object needs to be identified and processed separately. Based on the extracted target object, various behaviors of the target object may be analyzed.
The first image is an image containing a target object, and the first image may be one of a series of photos stored in advance, a video frame extracted from a piece of video stored in advance, or one or more pictures extracted from real-time live video. The first image may contain a plurality of objects, for example, a photograph used to describe the actions of a person may contain the target person, other persons with the target person, trees, buildings, etc. The target person constitutes a target object of the first image, and other persons, trees, buildings, and the like together with the target person constitute a background image. One or more objects may be selected as target objects in the first image based on actual needs.
As an example, the target object may be obtained from a video file, a video captured for the target object includes a plurality of frame images, and a plurality of images including one or more continuous motions of the target object may be selected from the frame images of the video to form an image set. By selecting images in the image set, a first image containing the target object can be acquired.
A setting module 502, configured to set a segmentation network for performing image processing on a first image, where the segmentation network includes a plurality of convolution layers and downsampling layers, the convolution layers and the downsampling layers are distributed at intervals, the convolution layers perform feature extraction on a target object in the first image, and the downsampling layers perform downsampling on an image output by the convolution layers.
In order to enable image processing of the first image, a segmentation network based on a neural network model is constructed, which comprises a convolutional layer, a sampling layer and a fully connected layer, see fig. 2.
The convolution layers are mainly characterized by the size of their convolution kernels and the number of input feature maps. Each convolution layer may contain several feature maps of the same size, weights are shared within a layer, and the convolution kernels within each layer have a consistent size. A convolution layer performs convolution calculations on the input image and extracts its layout features.
A sampling layer can be connected after the feature extraction layer of a convolution layer; the sampling layer computes local averages of the input image and performs secondary feature extraction. Connecting sampling layers to convolution layers helps ensure that the neural network model is more robust to the input image.
The sampling layers may include up-sampling layers and down-sampling layers. An up-sampling layer increases the pixel information in the image, for example by interpolating the input image, while a down-sampling layer reduces the input image while extracting its features.
In order to accelerate the training of the segmentation network, a pooling layer (not shown in the figure) can further be arranged behind the convolution layer. The pooling layer processes the output of the convolution layer by max pooling, so that the invariant characteristics of the input image can be better extracted.
The fully connected layer integrates the features of the image feature maps that have passed through the plurality of convolution layers and pooling layers, and obtains classification features of the input image for image classification. In the neural network model of the segmentation network, the fully connected layer maps the feature maps generated by the convolution layers into a feature vector of fixed length. This feature vector contains the combined information of all features of the input image and retains the most distinctive image features to complete the image classification task. In this way, a prediction map corresponding to the input image is calculated, and the target object contained in the first image is determined.
In order to improve the calculation speed of the segmentation network, a down-sampling layer is arranged in the segmentation network, the down-sampling layer and the convolution layer are distributed at intervals, the convolution layer performs feature extraction on a target object in the first image, and the down-sampling layer performs down-sampling operation on an image output by the convolution layer. By the arrangement, the calculation speed of the segmentation network for the first image is improved.
A processing module 503, configured to set, after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates, where the parallel convolution layers are configured to process an image output by the second down-sampling layer, and an image feature extracted on each parallel convolution layer forms a second image by means of fusion.
Conventional neural networks have the disadvantage of requiring input images of a fixed size. In practice, depending on earlier processing, the images fed into a neural network may have been cropped or warped, and the content lost in cropping or warping lowers the accuracy with which the network identifies the objects in the input image. In addition, when the size of the same target object varies across images, the recognition accuracy of the neural network for that target object also decreases.
In order to further improve the adaptability of the segmentation network to the first image, referring to fig. 2, parallel convolution layers are arranged in the segmentation network, specifically, after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates are arranged, the parallel convolution layers are used for processing the image output by the second down-sampling layer, and the image features extracted from each parallel convolution layer are fused to form a second image.
With parallel convolution layers used for image processing, the input image, or the target object within it, may have any aspect ratio or size. When input images arrive at different scales, the segmentation network can extract features at different scales. For example, the parallel convolution layers may perform feature calculations on the input image using 4 × 4, 2 × 2, and 1 × 1 convolution kernels respectively, obtaining 3 independently processed paths, and the second image may be formed by fusing these 3 paths. Because the second image is formed independently of the size or scale of the input image, the robustness of the segmentation network is further improved.
With this implementation, embodiments are not limited to detection of objects of a particular size, shape, or type, nor to detection of images of a particular size, type, or content. A system for image processing using parallel convolutional layer pooling according to embodiments may work on images of any size, type, or content.
The parallel convolution layers improve data robustness but increase the computational burden on the system. For this reason, the parallel convolution layers are placed after the second down-sampling layer in the segmentation network: at that point, the image output by the second down-sampling layer still has enough features to satisfy the parallel convolution layers, while the amount of data has been greatly reduced by passing the first image through two sampling layers. The computational cost of evaluating the convolution layers is thereby reduced while the robustness of the parallel convolution layers is preserved. If the parallel convolution layers were instead placed after the third sampling layer, too many features would be lost after the first image passed through three sampling layers; the features available to the parallel convolution layers would then be insufficient, impairing their recognition of the target object.
An executing module 504, configured to obtain a third image including the target object by performing target recognition on the second image.
The size of the second image may be adjusted; for example, a minimization function min(a, b) = c may be constructed, where a is the width of the second image, b is its height, and c represents a predefined scale (e.g., 256), and a feature map may be extracted from the entire image. For example, taking 3 parallel convolution layers (1 × 1, 3 × 3, and 6 × 6, for a total of 46 feature vectors) as an example, these 3 parallel convolution layers may be used to pool the features of each candidate window, generating an 11776-dimensional (256 × 46) representation for each window. These representations may be provided to a fully connected layer of the segmentation network, through which target recognition is performed based on the representations. The identified target object is saved in the form of a separate image, forming a third image.
The apparatus shown in fig. 5 may correspondingly execute the content in the above method embodiment, and details of the part not described in detail in this embodiment refer to the content described in the above method embodiment, which is not described again here.
Referring to fig. 6, an embodiment of the present disclosure also provides an electronic device 60, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image processing method of the preceding method embodiment.
The disclosed embodiments also provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the foregoing method embodiments.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the image processing method in the aforementioned method embodiments.
Referring now to FIG. 6, a schematic diagram of an electronic device 60 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device 60 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus 60 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 60 to communicate with other devices wirelessly or by wire to exchange data. While the figures illustrate an electronic device 60 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising the at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from the at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Where the name of a unit does not in some cases constitute a limitation of the unit itself, for example, the first retrieving unit may also be described as a "unit for retrieving at least two internet protocol addresses".
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
The above description is only for the specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present disclosure should be covered within the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims (11)
1. An image processing method, comprising:
acquiring a first image containing a target object;
setting a segmentation network for processing the first image, wherein the segmentation network comprises a plurality of convolution layers and down-sampling layers arranged alternately, the convolution layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolution layers;
setting, after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates, the parallel convolution layers processing the image output by the second down-sampling layer, wherein the image features extracted on each parallel convolution layer are fused to form a second image; and
acquiring a third image containing the target object by performing target recognition on the second image.
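Read as an architecture, claim 1 describes an encoder whose convolution and down-sampling layers alternate, capped by parallel atrous (dilated) convolutions whose outputs are fused. The following is a minimal PyTorch sketch of that structure; the layer counts, channel widths, and dilation rates are illustrative assumptions, since the claim fixes only the arrangement:

```python
import torch
import torch.nn as nn

class SegmentationFrontEnd(nn.Module):
    def __init__(self, rates=(1, 2, 4)):
        super().__init__()
        # convolution and down-sampling layers arranged alternately
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # first down-sampling layer
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                 # second down-sampling layer
        )
        # parallel convolution layers with different sampling (dilation) rates
        self.branches = nn.ModuleList(
            nn.Conv2d(64, 64, 3, padding=r, dilation=r) for r in rates)

    def forward(self, first_image):
        x = self.encoder(first_image)
        # fuse the features extracted by each parallel branch into the second image
        return sum(b(x) for b in self.branches)

second_image = SegmentationFrontEnd()(torch.randn(1, 3, 128, 128))
print(second_image.shape)  # torch.Size([1, 64, 32, 32])
```

Because dilation changes each branch's effective sampling rate without changing its output size, the branch outputs can be fused directly.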
2. The method of claim 1, wherein performing target recognition on the second image comprises:
setting a third down-sampling layer after the parallel convolution layers, the third down-sampling layer performing a down-sampling operation on the second image.
3. The method of claim 2, wherein performing target recognition on the second image further comprises:
setting a plurality of up-sampling layers after the third down-sampling layer, the up-sampling layers performing an up-sampling operation on the image output by the third down-sampling layer.
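Claims 2 and 3 extend the sketch above with a third down-sampling layer on the fused second image and a plurality of up-sampling layers on its output. A hedged continuation, with the feature size, pooling, and scale factors assumed:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)  # assumed size of the fused second image
down3 = nn.MaxPool2d(2)         # third down-sampling layer (claim 2)
up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
y = up(up(down3(x)))            # two up-sampling layers (claim 3): 16x16 -> 64x64
print(y.shape)                  # torch.Size([1, 64, 64, 64])
```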
4. The method of claim 3, wherein performing target recognition on the second image further comprises:
setting a fully connected layer in the segmentation network;
in the fully connected layer, setting, for the images output by different nodes of the parallel convolution layers, different weight values and bias values for all nodes of the up-sampling layer; and
performing target recognition on the image output by the up-sampling layer based on the weight values and the bias values.
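One plausible reading of claim 4 is a standard fully connected head: every output node carries its own weight and bias values, and recognition scores are computed from the up-sampled features. A sketch under that reading, with the feature size and class count assumed:

```python
import torch
import torch.nn as nn

up_out = torch.randn(1, 64, 64, 64)   # assumed up-sampling layer output
fc = nn.Linear(64 * 64 * 64, 10)      # distinct weights/bias per node; 10 classes assumed
scores = fc(up_out.flatten(1))        # target-recognition scores
```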
5. The method of claim 1, further comprising:
acquiring all convolution layers in the segmentation network;
acquiring the image size of the feature image output by each of the convolution layers; and
performing convolution layer connection between convolution layers that output the same image size.
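Claim 5 amounts to grouping the network's convolution layers by the spatial size of their output feature images so that connections can be added between matching layers. The sketch below probes those sizes with forward hooks and a dummy pass; the input shape is an assumption:

```python
import torch
import torch.nn as nn
from collections import defaultdict

def group_convs_by_output_size(model, input_shape=(1, 3, 128, 128)):
    """Map each output (H, W) to the names of the conv layers producing it."""
    groups, hooks = defaultdict(list), []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(
                lambda mod, inp, out, n=name: groups[tuple(out.shape[-2:])].append(n)))
    model(torch.randn(*input_shape))   # dummy pass triggers the hooks
    for h in hooks:
        h.remove()
    return dict(groups)                # same-size layers are connection candidates
```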
6. The method of claim 5, wherein performing convolution layer connection between convolution layers that output the same image size comprises:
acquiring the input xi and the output H(xi) of the i-th convolution layer among the convolution layers that output the same image size;
constructing a residual function F(xi) = H(xi) - xi for the i-th convolution layer based on xi and H(xi); and
performing convolution layer connection based on the residual function.
7. The method of claim 6, wherein performing convolution layer connection based on the residual function comprises:
setting a mapping function W(xi) for the i-th convolution layer;
acquiring the input xi of the i-th convolution layer and the output F(xi) of the i-th convolution layer; and
taking F(xi) + W(xi) as the input of the (i+2)-th convolution layer.
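Claims 6 and 7 read together as a residual connection: the stack of layers i and i+1 learns F(xi) = H(xi) - xi, a mapping W(xi) is applied to the input, and F(xi) + W(xi) feeds layer i+2. A sketch with W(xi) assumed to be a 1x1 projection (the claims do not fix its form):

```python
import torch
import torch.nn as nn

class ResidualPair(nn.Module):
    """Layers i and i+1 form the residual branch F; W projects the input xi."""
    def __init__(self, channels):
        super().__init__()
        self.conv_i = nn.Conv2d(channels, channels, 3, padding=1)    # layer i
        self.conv_i1 = nn.Conv2d(channels, channels, 3, padding=1)   # layer i+1
        self.proj = nn.Conv2d(channels, channels, 1)                 # W(xi), assumed 1x1

    def forward(self, xi):
        f_xi = self.conv_i1(torch.relu(self.conv_i(xi)))  # F(xi)
        return f_xi + self.proj(xi)                       # input to layer i+2
```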
8. The method of claim 1, wherein fusing the image features extracted on each parallel convolution layer to form the second image comprises:
setting convolution kernels of the same size in the plurality of parallel convolution layers;
performing feature extraction on the images input into the plurality of parallel convolution layers based on the convolution kernels to form a plurality of feature vector matrices; and
assigning different weight values to the plurality of feature vector matrices, and taking the sum of the weighted feature vector matrices as the representation matrix of the second image.
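Claim 8's fusion can be sketched as learned scalar weights over per-branch feature maps, summed into the second image's representation matrix; the branch count and dilation arrangement are assumptions, while the shared 3x3 kernel size follows the claim:

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, in_ch, out_ch, n_branches=3):
        super().__init__()
        # convolution kernels of the same size in every parallel branch
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r)
            for r in range(1, n_branches + 1))
        self.weights = nn.Parameter(torch.ones(n_branches))  # one weight per matrix

    def forward(self, x):
        mats = [b(x) for b in self.branches]   # feature vector matrices
        return sum(w * m for w, m in zip(self.weights, mats))
```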
9. An image processing apparatus characterized by comprising:
an acquisition module for acquiring a first image containing a target object;
a setting module for setting a segmentation network for processing the first image, wherein the segmentation network comprises a plurality of convolution layers and down-sampling layers arranged alternately, the convolution layers performing feature extraction on the target object in the first image, and the down-sampling layers performing a down-sampling operation on the images output by the convolution layers;
a processing module for setting, after a second down-sampling layer in the segmentation network, a plurality of parallel convolution layers with different sampling rates, the parallel convolution layers processing the image output by the second down-sampling layer, wherein the image features extracted on each parallel convolution layer are fused to form a second image; and
an execution module for acquiring a third image containing the target object by performing target recognition on the second image.
10. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of claims 1 to 8.
11. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the image processing method of any one of the preceding claims 1-8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910403859.XA CN110222726A (en) | 2019-05-15 | 2019-05-15 | Image processing method, device and electronic equipment |
PCT/CN2020/079192 WO2020228405A1 (en) | 2019-05-15 | 2020-03-13 | Image processing method and apparatus, and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910403859.XA CN110222726A (en) | 2019-05-15 | 2019-05-15 | Image processing method, device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110222726A (en) | 2019-09-10 |
Family
ID=67821169
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910403859.XA Pending CN110222726A (en) | 2019-05-15 | 2019-05-15 | Image processing method, device and electronic equipment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110222726A (en) |
WO (1) | WO2020228405A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111369468A (en) * | 2020-03-09 | 2020-07-03 | 北京字节跳动网络技术有限公司 | Image processing method, image processing device, electronic equipment and computer readable medium |
CN111931600A (en) * | 2020-07-21 | 2020-11-13 | 深圳市鹰硕教育服务股份有限公司 | Intelligent pen image processing method and device and electronic equipment |
WO2020228405A1 (en) * | 2019-05-15 | 2020-11-19 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus, and electronic device |
CN113691863A (en) * | 2021-07-05 | 2021-11-23 | 浙江工业大学 | Lightweight method for extracting video key frames |
CN113936220A (en) * | 2021-12-14 | 2022-01-14 | 深圳致星科技有限公司 | Image processing method, storage medium, electronic device, and image processing apparatus |
WO2024012143A1 (en) * | 2022-07-15 | 2024-01-18 | 华为技术有限公司 | Image data processing method and apparatus, and storage medium |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112651983B (en) * | 2020-12-15 | 2023-08-01 | 北京百度网讯科技有限公司 | Splice graph identification method and device, electronic equipment and storage medium |
CN113469083B (en) * | 2021-07-08 | 2024-05-31 | 西安电子科技大学 | SAR image target classification method and system based on antialiasing convolutional neural network |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170046616A1 (en) * | 2015-08-15 | 2017-02-16 | Salesforce.Com, Inc. | Three-dimensional (3d) convolution with 3d batch normalization |
CN106920227A (en) * | 2016-12-27 | 2017-07-04 | 北京工业大学 | Based on the Segmentation Method of Retinal Blood Vessels that deep learning is combined with conventional method |
CN107292352A (en) * | 2017-08-07 | 2017-10-24 | 北京中星微电子有限公司 | Image classification method and device based on convolutional neural networks |
CN107657257A (en) * | 2017-08-14 | 2018-02-02 | 中国矿业大学 | A kind of semantic image dividing method based on multichannel convolutive neutral net |
CN107909113A (en) * | 2017-11-29 | 2018-04-13 | 北京小米移动软件有限公司 | Traffic-accident image processing method, device and storage medium |
CN108022647A (en) * | 2017-11-30 | 2018-05-11 | 东北大学 | The good pernicious Forecasting Methodology of Lung neoplasm based on ResNet-Inception models |
CN108615010A (en) * | 2018-04-24 | 2018-10-02 | 重庆邮电大学 | Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern |
CN108986124A (en) * | 2018-06-20 | 2018-12-11 | 天津大学 | In conjunction with Analysis On Multi-scale Features convolutional neural networks retinal vascular images dividing method |
CN109344878A (en) * | 2018-09-06 | 2019-02-15 | 北京航空航天大学 | A kind of imitative hawk brain feature integration Small object recognition methods based on ResNet |
CN109389030A (en) * | 2018-08-23 | 2019-02-26 | 平安科技(深圳)有限公司 | Facial feature points detection method, apparatus, computer equipment and storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107862287A (en) * | 2017-11-08 | 2018-03-30 | 吉林大学 | A kind of front zonule object identification and vehicle early warning method |
CN110046607A (en) * | 2019-04-26 | 2019-07-23 | 西安因诺航空科技有限公司 | A kind of unmanned aerial vehicle remote sensing image board house or building materials test method based on deep learning |
CN110222726A (en) * | 2019-05-15 | 2019-09-10 | 北京字节跳动网络技术有限公司 | Image processing method, device and electronic equipment |
CN110456805B (en) * | 2019-06-24 | 2022-07-19 | 深圳慈航无人智能系统技术有限公司 | Intelligent tracking flight system and method for unmanned aerial vehicle |
- 2019-05-15: CN application CN201910403859.XA filed; published as CN110222726A (en), status: Pending
- 2020-03-13: WO application PCT/CN2020/079192 filed; published as WO2020228405A1 (en), status: Application Filing
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170046616A1 (en) * | 2015-08-15 | 2017-02-16 | Salesforce.Com, Inc. | Three-dimensional (3d) convolution with 3d batch normalization |
CN106920227A (en) * | 2016-12-27 | 2017-07-04 | 北京工业大学 | Based on the Segmentation Method of Retinal Blood Vessels that deep learning is combined with conventional method |
CN107292352A (en) * | 2017-08-07 | 2017-10-24 | 北京中星微电子有限公司 | Image classification method and device based on convolutional neural networks |
CN107657257A (en) * | 2017-08-14 | 2018-02-02 | 中国矿业大学 | A kind of semantic image dividing method based on multichannel convolutive neutral net |
CN107909113A (en) * | 2017-11-29 | 2018-04-13 | 北京小米移动软件有限公司 | Traffic-accident image processing method, device and storage medium |
CN108022647A (en) * | 2017-11-30 | 2018-05-11 | 东北大学 | The good pernicious Forecasting Methodology of Lung neoplasm based on ResNet-Inception models |
CN108615010A (en) * | 2018-04-24 | 2018-10-02 | 重庆邮电大学 | Facial expression recognizing method based on the fusion of parallel convolutional neural networks characteristic pattern |
CN108986124A (en) * | 2018-06-20 | 2018-12-11 | 天津大学 | In conjunction with Analysis On Multi-scale Features convolutional neural networks retinal vascular images dividing method |
CN109389030A (en) * | 2018-08-23 | 2019-02-26 | 平安科技(深圳)有限公司 | Facial feature points detection method, apparatus, computer equipment and storage medium |
CN109344878A (en) * | 2018-09-06 | 2019-02-15 | 北京航空航天大学 | A kind of imitative hawk brain feature integration Small object recognition methods based on ResNet |
Non-Patent Citations (2)
Title |
---|
LIANG-CHIEH CHEN ET AL.: "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs", arXiv:1606.00915v2 *
LIU Dan et al.: "A Multi-scale CNN Image Semantic Segmentation Algorithm", Remote Sensing Information *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020228405A1 (en) * | 2019-05-15 | 2020-11-19 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus, and electronic device |
CN111369468A (en) * | 2020-03-09 | 2020-07-03 | 北京字节跳动网络技术有限公司 | Image processing method, image processing device, electronic equipment and computer readable medium |
CN111369468B (en) * | 2020-03-09 | 2022-02-01 | 北京字节跳动网络技术有限公司 | Image processing method, image processing device, electronic equipment and computer readable medium |
CN111931600A (en) * | 2020-07-21 | 2020-11-13 | 深圳市鹰硕教育服务股份有限公司 | Intelligent pen image processing method and device and electronic equipment |
CN111931600B (en) * | 2020-07-21 | 2021-04-06 | 深圳市鹰硕教育服务有限公司 | Intelligent pen image processing method and device and electronic equipment |
WO2022016651A1 (en) * | 2020-07-21 | 2022-01-27 | 深圳市鹰硕教育服务有限公司 | Smart pen image processing method and apparatus, and electronic device |
CN113691863A (en) * | 2021-07-05 | 2021-11-23 | 浙江工业大学 | Lightweight method for extracting video key frames |
CN113691863B (en) * | 2021-07-05 | 2023-06-20 | 浙江工业大学 | Lightweight method for extracting video key frames |
CN113936220A (en) * | 2021-12-14 | 2022-01-14 | 深圳致星科技有限公司 | Image processing method, storage medium, electronic device, and image processing apparatus |
WO2024012143A1 (en) * | 2022-07-15 | 2024-01-18 | 华为技术有限公司 | Image data processing method and apparatus, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2020228405A1 (en) | 2020-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222726A (en) | Image processing method, device and electronic equipment | |
CN110378410B (en) | Multi-label scene classification method and device and electronic equipment | |
CN110189246B (en) | Image stylization generation method and device and electronic equipment | |
CN110363753B (en) | Image quality evaluation method and device and electronic equipment | |
CN110399847B (en) | Key frame extraction method and device and electronic equipment | |
CN110211017B (en) | Image processing method and device and electronic equipment | |
CN112200041A (en) | Video motion recognition method and device, storage medium and electronic equipment | |
CN114419322B (en) | Image instance segmentation method and device, electronic equipment and storage medium | |
CN111738316A (en) | Image classification method and device for zero sample learning and electronic equipment | |
CN110555861B (en) | Optical flow calculation method and device and electronic equipment | |
CN110069997B (en) | Scene classification method and device and electronic equipment | |
CN111191553A (en) | Face tracking method and device and electronic equipment | |
CN110287350A (en) | Image search method, device and electronic equipment | |
CN114049403A (en) | Multi-angle three-dimensional face reconstruction method and device and storage medium | |
CN110378936B (en) | Optical flow calculation method and device and electronic equipment | |
CN110197459B (en) | Image stylization generation method and device and electronic equipment | |
CN109977925B (en) | Expression determination method and device and electronic equipment | |
CN110060324B (en) | Image rendering method and device and electronic equipment | |
CN110781809A (en) | Identification method and device based on registration feature update and electronic equipment | |
CN111832354A (en) | Target object age identification method and device and electronic equipment | |
CN116977195A (en) | Method, device, equipment and storage medium for adjusting restoration model | |
CN113808151A (en) | Method, device and equipment for detecting weak semantic contour of live image and storage medium | |
CN114898282A (en) | Image processing method and device | |
CN111862105A (en) | Image area processing method and device and electronic equipment | |
CN111325093A (en) | Video segmentation method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190910 |