WO2019105243A1 - Image processing method, apparatus, and terminal - Google Patents
Image processing method, apparatus, and terminal
- Publication number
- WO2019105243A1 (PCT/CN2018/115987)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- convolution layer
- feature map
- output data
- selection module
- chip selection
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/285—Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Definitions
- the present application relates to the field of image processing technologies, and in particular, to an image processing method, apparatus, and terminal.
- Deep learning has been widely used in video images, speech recognition, natural language processing and other related fields.
- The convolutional neural network, as an important branch of deep learning, has greatly improved the accuracy of prediction results in computer vision tasks such as target detection and classification due to its superior fitting ability and end-to-end global optimization ability.
- However, convolutional neural networks are computationally intensive: their computational complexity is large, their processing speed on a central processing unit is slow, and their task processing efficiency is low, making them difficult to use in tasks with high real-time requirements.
- the embodiments of the present application provide an image processing method, apparatus, and terminal, to solve the prior-art problem that task processing with a convolutional neural network is inefficient.
- an image processing method including: during convolution processing of an image by a convolutional neural network, determining whether a currently pre-called first convolution layer is provided with a first chip selection module;
- the convolutional neural network includes a plurality of convolution layers, each convolution layer comprising a plurality of feature maps; when the first convolution layer is provided with the first chip selection module, the output data of the previous convolution layer is respectively input into the first chip selection module and the first convolution layer; the first chip selection module is called, and the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; the first convolution layer is called, and the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the step of calling the first chip selection module, and the first chip selection module determining the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer, includes: calling the first chip selection module, and generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer, wherein each point in the feature map weight vector corresponds to one feature map in the first convolution layer and one weight value; determining a number of target features N according to a preset acceleration ratio; and adjusting the weight values of the points other than the first N points in the feature map weight vector to 0, and inputting the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- the step of calling the first convolution layer, and performing convolution processing on the output data of the previous convolution layer by the first convolution layer according to the target feature map to obtain output data, includes: calling the first convolution layer, determining, by the first convolution layer, the target feature map according to the adjusted feature map weight vector, and performing convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the method further includes: when the first convolution layer is not provided with the first chip selection module, inputting the output data of the previous convolution layer into the first convolution layer;
- the first convolution layer is called, and the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it includes, to obtain output data.
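The two branches above (a layer with a chip selection module convolves only the target feature maps; a layer without one convolves all of them) can be sketched in a few lines of plain Python/NumPy. This is an illustrative sketch only, not the patent's implementation: `run_layer`, the toy 1-D kernels, and the `half` selector are hypothetical names, and the real scheme operates on 2-D feature maps.

```python
import numpy as np

def run_layer(x, kernels, selection=None):
    """Sketch of one layer's processing. `x` is the previous layer's output
    (a 1-D signal for simplicity); `kernels` holds one filter per feature
    map of the current layer; `selection`, if present, returns the indices
    of the target feature maps to keep."""
    if selection is not None:                 # a chip selection module is attached
        active = selection(x, len(kernels))   # it picks the target feature maps
    else:
        active = list(range(len(kernels)))    # no module: convolve all feature maps
    # convolve only the selected feature maps; the rest are skipped entirely
    return {i: np.convolve(x, kernels[i], mode="same") for i in active}

# A toy selection module that keeps the first half of the feature maps
half = lambda x, n_maps: list(range(n_maps // 2))

x = np.ones(8)
kernels = [np.array([1.0, 1.0])] * 4
assert len(run_layer(x, kernels)) == 4          # no selection: all 4 maps computed
assert len(run_layer(x, kernels, half)) == 2    # with selection: only 2 maps computed
```

The saving comes from the skipped convolutions: the dropped feature maps are never evaluated at all.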
- an image processing apparatus comprising: a determination module configured to determine, during convolution processing of an image by a convolutional neural network, whether a currently pre-called first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes a plurality of convolution layers and each convolution layer includes a plurality of feature maps; a first input module configured to, when the first convolution layer is provided with the first chip selection module, input the output data of the previous convolution layer respectively into the first chip selection module and the first convolution layer; a first calling module configured to call the first chip selection module, the first chip selection module determining a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and a second calling module configured to call the first convolution layer, the first convolution layer performing convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the first chip selection module is configured to: generate a feature map weight vector according to the output data of the previous convolution layer, wherein each point in the feature map weight vector corresponds to one feature map in the first convolution layer and one weight value; determine the number of target features N according to a preset acceleration ratio; and adjust the weight values of the points other than the first N points in the feature map weight vector to 0, and input the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- the first convolution layer is configured to: determine the target feature map according to the adjusted feature map weight vector, and perform convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the apparatus further includes: a second input module configured to, when the first convolution layer is not provided with the first chip selection module, input the output data of the previous convolution layer into the first convolution layer; and a third calling module configured to call the first convolution layer, the first convolution layer performing convolution processing on the output data of the previous convolution layer according to all the feature maps it includes, to obtain output data.
- another image processing method comprising:
- the image is input into a convolutional neural network for convolution processing, wherein the convolutional neural network includes a plurality of convolution layers, and at least one convolution layer is provided with a chip selection module;
- the manner in which the convolutional neural network performs convolution processing on the image includes:
- the first convolution layer provided with the first chip selection module performs convolution processing on the output data of the previous convolution layer to obtain candidate feature maps;
- the first chip selection module determines target feature maps from the candidate feature maps, according to the output data of the previous convolution layer, as the output data of the first convolution layer.
- the first chip selection module determines the target feature image from the candidate feature map according to the output data of the previous convolution layer, including:
- the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer, wherein the weight values in the feature map weight vector are in one-to-one correspondence with the candidate feature maps, and the weight values other than the N largest are adjusted to 0;
- a target feature map is determined from the candidate feature maps according to the adjusted feature map weight vector.
- the first chip selection module includes a fully connected layer
- the first chip selection module generating the feature map weight vector according to the output data of the previous convolution layer includes:
- processing the output data of the previous convolution layer with a global average pooling algorithm, and inputting the processing result into the fully connected layer to obtain the feature map weight vector.
- a terminal including: a memory, a processor, and an image processing program stored on the memory and operable on the processor, the image processing program, when executed by the processor, implementing the steps of any of the image processing methods described in this application.
- a computer readable storage medium having stored thereon an image processing program which, when executed by a processor, implements the steps of any of the image processing methods described in the present application.
- an application program product for performing, at runtime, the steps of any one of the image processing methods described herein.
- the image processing scheme provided by the embodiments of the present application pre-sets a chip selection module for one or more convolution layers in the convolutional neural network; in the process of predicting an image with the convolutional neural network, the chip selection module screens the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as the target feature maps used to compute the convolution output;
- compared with the existing image processing scheme, which does not filter the feature maps in the convolution layer and instead uses every feature map included in the convolution layer as a target feature map to compute the convolution output, this reduces the amount of calculation and improves task processing efficiency.
- FIG. 1 is a flow chart showing the steps of an image processing method according to Embodiment 1 of the present application;
- FIG. 2 is a flow chart showing the steps of an image processing method according to Embodiment 2 of the present application.
- FIG. 3 is a block diagram showing the structure of an image processing apparatus according to Embodiment 3 of the present application.
- FIG. 4 is a structural block diagram of a terminal according to Embodiment 4 of the present application.
- Referring to FIG. 1, a flow chart of the steps of an image processing method according to Embodiment 1 of the present application is shown.
- Step 101 During the convolution processing of the image by the convolutional neural network, determine whether the currently pre-called first convolution layer is provided with the first chip selection module.
- the convolutional neural network includes a plurality of convolution layers, and each convolution layer includes a plurality of feature maps.
- a person skilled in the art may set a chip selection module for one convolution layer according to actual needs, or may separately set a chip selection module for multiple convolution layers.
- the image may be a single frame image in a video, or may be a single multimedia image.
- An image is input into the convolutional neural network and processed by each convolution layer to obtain a feature map.
- the output data of the previous convolution layer will be used as the input data of the next convolution layer, and the final result will be obtained by layer-by-layer convolution processing.
- Step 102 If the first convolution layer is provided with the first chip selection module, the output data of the previous convolution layer is respectively input into the first chip selection module and the first convolution layer.
- the output data of the convolutional layer is a corresponding feature map of the image to be processed in the convolutional layer.
- the image to be processed is an image obtained by convolution processing of the above input convolutional neural network.
- Step 103 The first chip selection module is called, and the first chip selection module determines the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer.
- the output data of the previous convolution layer is a plurality of feature maps;
- the first chip selection module matches these feature maps against the feature maps included in the first convolution layer, to determine a preset number of target feature maps that match the output data.
- Step 104 The first convolution layer is called, and the output data of the previous convolution layer is convoluted by the first convolution layer according to the target feature map to obtain output data.
- when the convolutional neural network performs convolution processing on the image, for a first convolution layer provided with a first chip selection module, the output data of the previous convolution layer is respectively input into the first chip selection module and the first convolution layer;
- the first convolution layer can perform convolution processing on the output data of the previous convolution layer to obtain candidate feature maps;
- the first chip selection module can determine the target feature maps from the candidate feature maps according to the output data of the previous convolution layer, and use the determined target feature maps as the output data of the first convolution layer.
- the second convolution layer can perform convolution processing on the output data of the previous layer to obtain a plurality of feature maps, and these feature maps are the output data of the second convolution layer.
- the data is output to the next convolutional layer; the next convolutional layer performs the processes in steps 101 to 104 to obtain the output data.
- steps 101 to 104 are repeated until all the convolution layers in the convolutional neural network have been executed, and the feature map corresponding to the image is predicted.
- the image processing method provided by the embodiments of the present application pre-sets a chip selection module for one or more convolution layers in the convolutional neural network; in the process of predicting an image with the convolutional neural network, the chip selection module screens the feature maps in the convolution layer, selecting some of the feature maps from the plurality of feature maps included in the convolution layer as the target feature maps used to compute the convolution output;
- compared with the existing image processing method, which does not filter the feature maps in the convolution layer and uses every feature map included in the convolution layer as a target feature map to compute the convolution output, this reduces the amount of calculation and thereby improves task processing efficiency.
- Referring to FIG. 2, a flow chart of the steps of an image processing method according to Embodiment 2 of the present application is shown.
- Step 201 In the process of performing convolution processing on the image by the convolutional neural network, determine whether the currently pre-called first convolution layer is provided with the first chip selection module; if yes, execute step 202; if not, execute the preset operation.
- the convolutional neural network includes a plurality of convolution layers, and each convolution layer includes a plurality of feature maps.
- a person skilled in the art can selectively set a chip selection module for one or more convolution layers according to actual needs.
- the training method of a convolutional neural network provided with a chip selection module is the same as that of a convolutional neural network not provided with one; therefore, for the training of the convolutional neural network, reference may be made to the related art, and it is not specifically limited in the embodiments of the present application.
- An image is input into the convolutional neural network and processed by each convolution layer to obtain a feature map.
- the output data of the previous convolution layer will be used as the input data of the next convolution layer, and the final result will be obtained by layer-by-layer convolution processing.
- the processing flow of the input data is the same for each convolutional layer. In the embodiment of the present application, the processing flow of a single convolution layer is taken as an example for description.
- the preset operation may be configured to: when the first convolution layer is not provided with the first chip selection module, input the output data of the previous convolution layer into the first convolution layer; and call the first convolution layer, the first convolution layer performing convolution processing on the output data of the previous convolution layer according to all the feature maps it includes, to obtain output data.
- for example, if the first convolution layer contains 100 feature maps, the first convolution layer performs convolution processing on the input data according to all 100 feature maps, determines the feature maps in the convolution layer that match the input data as output data, and inputs them to the next convolution layer.
- Step 202 If the first convolution layer is provided with the first chip selection module, the output data of the previous convolution layer is respectively input into the first chip selection module and the first convolution layer.
- the output data of the previous convolution layer is a plurality of feature maps.
- Step 203 The first chip selection module is called, and the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer.
- Each point in the feature map weight vector corresponds to a feature map in a first convolutional layer and a weight value.
- the feature map weight vector can be represented by a designated symbol.
- the weight values in the feature map weight vector are in one-to-one correspondence with the candidate feature maps output by the first convolution layer.
- Step 204 Determine the number N of target features according to the preset acceleration ratio.
- the preset acceleration ratio can be represented by a designated symbol.
- the preset acceleration ratio indicates the degree to which the processing efficiency of the convolutional neural network is to be improved: the larger the preset acceleration ratio, the greater the improvement in processing efficiency, the smaller the number of target features N, and thus the fewer feature maps that need to be processed in the next convolution layer;
- relative to the prior art, this reduction in the number of feature maps that need to be processed improves the processing efficiency of the convolutional neural network.
- a specific value of the acceleration ratio may be set by a person skilled in the art according to actual requirements, which is not specifically limited in the embodiment of the present application.
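As a concrete illustration of how N might be derived from the preset acceleration ratio: the text only specifies that a larger ratio yields a smaller N, so the reciprocal mapping below (keep 1/ratio of the layer's feature maps) is an assumption, chosen to agree with the 100-maps/N-of-50 example given in this embodiment; `target_feature_count` is a hypothetical helper name.

```python
import math

def target_feature_count(num_maps: int, acceleration_ratio: float) -> int:
    """Hypothetical mapping from the preset acceleration ratio to the
    number of target features N: keep 1/ratio of the layer's feature
    maps, with at least one retained. The exact formula is left open."""
    return max(1, math.floor(num_maps / acceleration_ratio))

assert target_feature_count(100, 2.0) == 50   # larger ratio -> smaller N
assert target_feature_count(100, 4.0) == 25
```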
- Step 205 Adjust the weight value of other points except the first N points in the feature map weight vector to 0, and input the adjusted feature map weight vector into the first convolution layer.
- the feature maps corresponding to the first N points in the feature map weight vector are the target feature maps; if the weight value of a point in the feature map weight vector is adjusted to 0, the feature map corresponding to that point does not participate in the convolution processing of the input data in the first convolution layer.
- for example, if the first convolution layer contains 100 feature maps and N is 50, the 50 feature maps that best match the input data are selected from the 100 feature maps to participate in the convolution processing.
- Step 206 Call the first convolution layer, and determine, by the first convolution layer, the target feature map according to the adjusted feature map weight vector.
- the feature map corresponding to the point whose weight value is non-zero is the target feature map.
- Step 207 Perform convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the weight values other than the N largest weight values in the feature map weight vector may be adjusted to 0.
- the target feature map is determined from the above candidate feature map according to the adjusted feature map weight vector.
- the weight values in the feature map weight vector may be sorted in descending order, the weight values ranked in the top N retained, and the weight values ranked outside the top N adjusted to zero.
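The sort-and-retain procedure can be sketched with NumPy; `mask_weight_vector` is a hypothetical helper name, not from the source.

```python
import numpy as np

def mask_weight_vector(weights, n):
    """Sort the weight values in descending order, retain the N largest,
    and adjust all other weight values to zero. Points whose weights
    survive identify the target feature maps."""
    w = np.asarray(weights, dtype=float)
    keep = np.argsort(w)[::-1][:n]       # indices of the N largest weights
    masked = np.zeros_like(w)
    masked[keep] = w[keep]               # all other weights stay at zero
    return masked

adjusted = mask_weight_vector([0.1, 0.9, 0.3, 0.7, 0.2], n=2)
assert list(np.nonzero(adjusted)[0]) == [1, 3]   # maps 1 and 3 are retained
```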
- the first chip selection module may determine the target feature maps from the candidate feature maps according to the one-to-one correspondence between the weight values in the feature map weight vector and the candidate feature maps.
- for example, suppose the first convolution layer outputs 10 candidate feature maps, denoted candidate feature map A through candidate feature map J, and the adjusted feature map weight vector is [0, 0, a, b, 0, c, d, e, 0, f], where a through f denote weight values that are not zero;
- the first chip selection module may then determine candidate feature map C, candidate feature map D, candidate feature map F, candidate feature map G, candidate feature map H, and candidate feature map J, corresponding to the weight values a through f, as the target feature maps; these target feature maps are the output data of the first convolution layer.
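The worked example above can be checked in a few lines of Python (plain illustration; using 1 as a stand-in for each of the nonzero weight values a through f is an assumption for demonstration only).

```python
# Candidate feature maps A..J and the adjusted weight vector from the text,
# with 1 standing in for each nonzero weight value a..f
names = list("ABCDEFGHIJ")
adjusted = [0, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# The target feature maps are the candidates whose weight value is nonzero
targets = [name for name, w in zip(names, adjusted) if w != 0]
assert targets == ["C", "D", "F", "G", "H", "J"]
```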
- the first chip selection module may include a fully connected layer, and the step of the first chip selection module to generate a feature map weight vector according to the output data of the previous convolution layer may include:
- the output data of the previous convolution layer is processed using a global average pooling algorithm, and the processing result is input into the fully connected layer to obtain the feature map weight vector.
- the global average pooling algorithm can be used to process the output data of the previous convolution layer to obtain the processing result; that is, the first chip selection module can globally average each feature map output by the previous convolution layer and output an average value corresponding to each feature map; these average values can then be input into the fully connected layer, which further processes them to obtain the weight values corresponding to the average values; the weight vector output by the fully connected layer can then be used as the feature map weight vector.
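The two-stage forward pass of the chip selection module (global average pooling followed by a fully connected layer) can be sketched in NumPy. The random feature maps and the parameters `W` and `b` are hypothetical stand-ins for learned values, and `selection_weights` is an illustrative name.

```python
import numpy as np

def selection_weights(feature_maps, fc_weight, fc_bias):
    """Global average pooling reduces each incoming feature map to one
    scalar; a fully connected layer then maps those averages to the
    feature map weight vector (one weight per candidate feature map)."""
    pooled = feature_maps.mean(axis=(1, 2))   # global average pooling
    return fc_weight @ pooled + fc_bias       # fully connected layer

rng = np.random.default_rng(0)
maps_in = rng.standard_normal((8, 16, 16))    # 8 incoming 16x16 feature maps
W = rng.standard_normal((100, 8))             # next layer holds 100 feature maps
b = np.zeros(100)
assert selection_weights(maps_in, W, b).shape == (100,)   # one weight per map
```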
- the smaller weight values can be adjusted to 0 so that the corresponding candidate feature maps are discarded and no longer input into the next convolution layer; in this way, the processing efficiency of the convolutional neural network can be improved as much as possible while the accuracy of the image prediction result is preserved.
- the output data is passed to the next convolution layer, and the next convolution layer performs the process in steps 201 to 207 to obtain its own output data; this continues layer by layer until the feature map corresponding to the image is predicted.
- the image processing method provided by the embodiments of the present application pre-sets a chip selection module for one or more convolution layers in the convolutional neural network; in the process of predicting an image with the convolutional neural network, the chip selection module screens the feature maps in the convolution layer, selecting some of the feature maps from the plurality of feature maps included in the convolution layer as the target feature maps used to compute the convolution output;
- compared with the existing image processing method, which does not filter the feature maps in the convolution layer and uses every feature map included in the convolution layer as a target feature map to compute the convolution output, this reduces the amount of calculation and improves task processing efficiency.
- Referring to FIG. 3, a structural block diagram of an image processing apparatus according to Embodiment 3 of the present application is shown.
- the image processing apparatus of the embodiment of the present application may include: a determining module 301 configured to determine, during convolution processing of an image by a convolutional neural network, whether a currently pre-called first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes a plurality of convolution layers and each convolution layer includes a plurality of feature maps; a first input module 302 configured to, when the first convolution layer is provided with the first chip selection module, input the output data of the previous convolution layer respectively into the first chip selection module and the first convolution layer; a first calling module 303 configured to call the first chip selection module, the first chip selection module determining the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and a second calling module 304 configured to call the first convolution layer, the first convolution layer performing convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the first chip selection module is configured to: generate a feature map weight vector according to the output data of the previous convolution layer, wherein each point in the feature map weight vector corresponds to one feature map in the first convolution layer and one weight value; determine the number of target features N according to a preset acceleration ratio; and adjust the weight values of the points other than the first N points in the feature map weight vector to 0, and input the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- the first convolution layer is configured to: determine the target feature map according to the adjusted feature map weight vector, and perform convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the apparatus further includes: a second input module 305 configured to, when the first convolution layer is not provided with the first chip selection module, input the output data of the previous convolution layer into the first convolution layer;
- a third calling module 306 configured to call the first convolution layer, the first convolution layer performing convolution processing on the output data of the previous convolution layer according to all the feature maps it includes, to obtain output data.
- the image processing apparatus of the embodiment of the present application is used to implement the corresponding image processing method in the first embodiment and the second embodiment, and has the beneficial effects corresponding to the method embodiment, and details are not described herein again.
- Referring to FIG. 4, a structural block diagram of a terminal for image processing according to Embodiment 4 of the present application is shown.
- the terminal of the embodiment of the present application may include: a memory, a processor, and an image processing program stored on the memory and operable on the processor, the image processing program, when executed by the processor, implementing the steps of any one of the image processing methods described in the present application.
- FIG. 4 is a block diagram of an image processing terminal 600, according to an exemplary embodiment.
- terminal 600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
- terminal 600 can include one or more of the following components: processing component 602, memory 604, power component 606, multimedia component 608, audio component 610, input/output (I/O) interface 612, sensor component 614, And a communication component 616.
- Processing component 602 typically controls the overall operation of device 600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- Processing component 602 can include one or more processors 620 to execute instructions to perform all or part of the steps of the above described methods.
- processing component 602 can include one or more modules to facilitate interaction between processing component 602 and other components.
- processing component 602 can include a multimedia module to facilitate interaction between multimedia component 608 and processing component 602.
- Memory 604 is configured to store various types of data to support operation at terminal 600. Examples of such data include instructions for any application or method operating on terminal 600, contact data, phone book data, messages, pictures, videos, and the like.
- the memory 604 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
- Power component 606 provides power to various components of terminal 600.
- Power component 606 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for terminal 600.
- the multimedia component 608 includes a screen that provides an output interface between the terminal 600 and the user.
- the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
- the multimedia component 608 includes a front camera and/or a rear camera. When the terminal 600 is in an operation mode such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 610 is configured to output and/or input an audio signal.
- the audio component 610 includes a microphone (MIC) that is configured to receive an external audio signal when the terminal 600 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
- the received audio signal may be further stored in memory 604 or transmitted via communication component 616.
- audio component 610 also includes a speaker for outputting an audio signal.
- the I/O interface 612 provides an interface between the processing component 602 and the peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
- Sensor assembly 614 includes one or more sensors for providing terminal 600 with various aspects of status assessment.
- sensor component 614 can detect an open/closed state of terminal 600 and the relative positioning of components (for example, the display and keypad of terminal 600), and can also detect a change in position of terminal 600 or a component thereof, the presence or absence of user contact with terminal 600, the orientation or acceleration/deceleration of device 600, and temperature changes of terminal 600.
- Sensor assembly 614 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
- Sensor assembly 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor assembly 614 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
- Communication component 616 is configured to facilitate wired or wireless communication between terminal 600 and other devices.
- the terminal 600 can access a wireless network based on a communication standard such as WiFi, 2G or 3G, or a combination thereof.
- communication component 616 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
- the communication component 616 also includes a near field communication (NFC) module to facilitate short range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
- terminal 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing an image processing method; in particular, the image processing method includes:
- in a process of performing convolution processing on an image through a convolutional neural network, determining whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes multiple convolution layers and each convolution layer includes a plurality of feature maps; when the first convolution layer is provided with the first chip selection module, inputting output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively; invoking the first chip selection module, where the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the step of invoking the first chip selection module, where the first chip selection module determines the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer, includes: invoking the first chip selection module, where the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determining a target feature number N according to a preset acceleration ratio; and adjusting the weight values of the points other than the first N points in the feature map weight vector to 0 and inputting the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- the step of invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data, includes: invoking the first convolution layer, where the first convolution layer determines the target feature map according to the adjusted feature map weight vector, and performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- the method further includes: when the first convolution layer is not provided with the first chip selection module, inputting the output data of the previous convolution layer into the first convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
- a non-transitory computer readable storage medium comprising instructions, such as a memory 604 comprising instructions executable by processor 620 of terminal 600 to perform the image processing method described above.
- the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
- the terminal pre-sets a chip selection module for one or more convolution layers in the convolutional neural network; in the process of predicting an image through the convolutional neural network, the chip selection module filters the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as target feature maps for computing the convolution output. Compared with existing image processing schemes, in which the feature maps in a convolution layer are not filtered and every feature map included in the convolution layer is used as a target feature map to compute the convolution output, this reduces the amount of calculation and thereby improves task processing efficiency.
- the embodiment of the present application further provides an application program product, which is configured to perform, at runtime, the steps of any one of the image processing methods described in the present application.
- as for the terminal, the computer readable storage medium, and the application program product embodiments, since they are basically similar to the method embodiments, the description is relatively brief; for relevant details, refer to the description of the method embodiments.
- modules in the devices of the embodiments can be adaptively changed and placed in one or more devices different from the embodiment.
- the modules or units or components of the embodiments may be combined into one module or unit or component, and further they may be divided into a plurality of sub-modules or sub-units or sub-components.
- all features disclosed in this specification (including the accompanying claims, the abstract, and the drawings), and all processes or units of any method or device so disclosed, may be combined in any combination, except where at least some of such features and/or processes or units are mutually exclusive.
- Each feature disclosed in this specification (including the accompanying claims, the abstract and the drawings) may be replaced by alternative features that provide the same, equivalent or similar purpose.
- the various component embodiments of the present application can be implemented in hardware, or in a software module running on one or more processors, or in a combination thereof.
- a microprocessor or digital signal processor may be used in practice to implement some or all of the functionality of some or all of the components of the image processing schemes in accordance with embodiments of the present application.
- the application can also be implemented as a device or device program (e.g., a computer program and a computer program product) for performing some or all of the methods described herein.
- Such a program implementing the present application may be stored on a computer readable medium or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Biodiversity & Conservation Biology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the present application provide an image processing method, device, and terminal, wherein the method includes: in a process of performing convolution processing on an image through a convolutional neural network, determining whether a currently pre-invoked first convolution layer is provided with a first chip selection module; if the first convolution layer is provided with a first chip selection module, inputting output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively; invoking the first chip selection module, where the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data. The image processing method provided by the embodiments of the present application can reduce the amount of calculation and thereby improve task processing efficiency.
Description
This application claims priority to Chinese Patent Application No. 201711219332.9, filed with the China Patent Office on November 28, 2017 and entitled "图像处理方法、装置及终端" (Image processing method, device and terminal), the entire contents of which are incorporated herein by reference.
The present application relates to the field of image processing technologies, and in particular, to an image processing method, device, and terminal.
Deep learning has been widely applied in related fields such as video and image processing, speech recognition, and natural language processing. As an important branch of deep learning, the convolutional neural network, owing to its strong fitting ability and end-to-end global optimization ability, has greatly improved the accuracy of prediction results in computer vision tasks such as object detection and classification.
However, the convolutional neural network is a computation-intensive algorithm; its amount of calculation is large, its processing speed on a central processing unit is slow, and its task processing efficiency is low, which makes it difficult to use in tasks with high real-time requirements.
SUMMARY
Embodiments of the present application provide an image processing method, device, and terminal, to solve the problem in the prior art that the convolutional neural network has low task processing efficiency.
According to one aspect of the present application, an image processing method is provided, including: in a process of performing convolution processing on an image through a convolutional neural network, determining whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes multiple convolution layers and each convolution layer includes a plurality of feature maps; when the first convolution layer is provided with the first chip selection module, inputting output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively; invoking the first chip selection module, where the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the step of invoking the first chip selection module, where the first chip selection module determines the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer, includes: invoking the first chip selection module, where the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determining a target feature number N according to a preset acceleration ratio; and adjusting the weight values of the points other than the first N points in the feature map weight vector to 0 and inputting the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
Optionally, the step of invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data, includes: invoking the first convolution layer, where the first convolution layer determines the target feature map according to the adjusted feature map weight vector, and performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the method further includes: when the first convolution layer is not provided with the first chip selection module, inputting the output data of the previous convolution layer into the first convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
According to another aspect of the present application, an image processing device is provided, including: a determination module, configured to determine, in a process of performing convolution processing on an image through a convolutional neural network, whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes multiple convolution layers and each convolution layer includes a plurality of feature maps; a first input module, configured to input output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively when the first convolution layer is provided with the first chip selection module; a first invoking module, configured to invoke the first chip selection module, where the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and a second invoking module, configured to invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the first chip selection module is configured to: generate a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determine a target feature number N according to a preset acceleration ratio; and adjust the weight values of the points other than the first N points in the feature map weight vector to 0 and input the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
Optionally, the first convolution layer is configured to: determine the target feature map according to the adjusted feature map weight vector; and perform convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the device further includes: a second input module, configured to input the output data of the previous convolution layer into the first convolution layer when the first convolution layer is not provided with the first chip selection module; and a third invoking module, configured to invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
According to another aspect of the present application, another image processing method is provided, including:
inputting an image into a convolutional neural network for convolution processing, wherein the convolutional neural network includes multiple convolution layers, and at least one convolution layer is provided with a chip selection module;
the manner in which the convolutional neural network performs convolution processing on the image includes:
performing, by a first convolution layer provided with a first chip selection module, convolution processing on output data of the previous convolution layer to obtain candidate feature maps; and
determining, by the first chip selection module according to the output data of the previous convolution layer, target feature maps from the candidate feature maps as the output data of the first convolution layer.
Optionally, the determining, by the first chip selection module according to the output data of the previous convolution layer, target feature maps from the candidate feature maps includes:
generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer, wherein the weight values in the feature map weight vector correspond one-to-one to the candidate feature maps;
determining a target feature number N according to a preset acceleration ratio;
adjusting to 0 the weight values other than the N largest weight values in the feature map weight vector; and
determining the target feature maps from the candidate feature maps according to the adjusted feature map weight vector.
Optionally, the first chip selection module includes a fully connected layer;
the generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer includes:
processing the output data of the previous convolution layer with a global average pooling algorithm; and
inputting the processing result into the fully connected layer to obtain the feature map weight vector.
According to still another aspect of the present application, a terminal is provided, including: a memory, a processor, and an image processing program stored on the memory and operable on the processor, where the image processing program, when executed by the processor, implements the steps of any one of the image processing methods described in the present application.
According to yet another aspect of the present application, a computer readable storage medium is provided, where an image processing program is stored on the computer readable storage medium, and the image processing program, when executed by a processor, implements the steps of any one of the image processing methods described in the present application.
According to yet another aspect of the present application, an application program product is provided, which is configured to perform, at runtime, the steps of any one of the image processing methods described in the present application.
Compared with the prior art, the present application has the following advantages:
In the image processing solution provided by the embodiments of the present application, a chip selection module is pre-set for one or more convolution layers in the convolutional neural network. In the process of predicting an image through the convolutional neural network, the chip selection module filters the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as target feature maps for computing the convolution output. Compared with existing image processing solutions, in which the feature maps in a convolution layer are not filtered and every feature map included in the convolution layer is used as a target feature map to compute the convolution output, this can reduce the amount of calculation and thereby improve task processing efficiency.
The above description is only an overview of the technical solutions of the present application. In order to understand the technical means of the present application more clearly so that they can be implemented according to the contents of the specification, and to make the above and other objects, features, and advantages of the present application more apparent and comprehensible, specific embodiments of the present application are set forth below.
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 is a flowchart of the steps of an image processing method according to Embodiment 1 of the present application;
FIG. 2 is a flowchart of the steps of an image processing method according to Embodiment 2 of the present application;
FIG. 3 is a structural block diagram of an image processing device according to Embodiment 3 of the present application;
FIG. 4 is a structural block diagram of a terminal according to Embodiment 4 of the present application.
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present application, it should be understood that the present application can be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present application can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Embodiment 1
Referring to FIG. 1, a flowchart of the steps of an image processing method according to Embodiment 1 of the present application is shown.
The image processing method of this embodiment of the present application may include the following steps:
Step 101: In a process of performing convolution processing on an image through a convolutional neural network, determine whether a currently pre-invoked first convolution layer is provided with a first chip selection module.
The convolutional neural network includes multiple convolution layers, and each convolution layer includes a plurality of feature maps. Those skilled in the art may, according to actual needs, set a chip selection module for one convolution layer, or set chip selection modules for multiple convolution layers respectively.
In this embodiment of the present application, the image may be a single frame of a video, or simply a multimedia image. An image is input into the convolutional neural network and processed by the convolution layers to obtain feature maps. In the convolutional neural network, the output data of one convolution layer serves as the input data of the next convolution layer, and the final result is obtained after layer-by-layer convolution processing.
Step 102: If the first convolution layer is provided with a first chip selection module, input the output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively.
The output data of a convolution layer is the feature maps corresponding, in that convolution layer, to the image to be processed. The image to be processed is the image input into the convolutional neural network for convolution processing as described above.
Step 103: Invoke the first chip selection module, where the first chip selection module determines target feature maps from the feature maps included in the first convolution layer according to the output data of the previous convolution layer.
The output data of the previous convolution layer is a plurality of feature maps. The first chip selection module associates each of these feature maps with each feature map included in the first convolution layer, and determines a preset number of target feature maps that have a high degree of matching with the output data.
Step 104: Invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature maps to obtain output data.
As another implementation, when the convolutional neural network performs convolution processing on an image, for a first convolution layer provided with a first chip selection module, the output data of the previous convolution layer is input to the first chip selection module and the first convolution layer respectively; the first convolution layer may perform convolution processing on the output data of the previous convolution layer to obtain candidate feature maps. The first chip selection module may determine target feature maps from these candidate feature maps according to the output data of the previous convolution layer, and take the determined target feature maps as the output data of the first convolution layer.
For a second convolution layer not provided with a chip selection module, the second convolution layer may perform convolution processing on the output data of the previous layer to obtain a plurality of feature maps, which are then the output data of the second convolution layer.
For the specific manner in which a convolution layer performs convolution processing on input data according to feature maps, reference may be made to the existing related art, and details are not described again in the embodiments of the present application.
After the first convolution layer and the first chip selection module process the output data of the previous convolution layer, the output data is passed to the next convolution layer; the next convolution layer performs the procedure of Step 101 to Step 104 to obtain output data and inputs it to the following convolution layer. Each convolution layer performs Step 101 to Step 104 when processing the output data of its previous convolution layer, until all convolution layers in the convolutional neural network have been executed, and the feature maps corresponding to the image are obtained by prediction.
In the image processing method provided by this embodiment of the present application, a chip selection module is pre-set for one or more convolution layers in the convolutional neural network. In the process of predicting an image through the convolutional neural network, the chip selection module filters the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as target feature maps for computing the convolution output. Compared with existing image processing methods, in which the feature maps in a convolution layer are not filtered and every feature map included in the convolution layer is used as a target feature map to compute the convolution output, this can reduce the amount of calculation and thereby improve task processing efficiency.
Embodiment 2
Referring to FIG. 2, a flowchart of the steps of an image processing method according to Embodiment 2 of the present application is shown.
The image processing method of this embodiment of the present application may specifically include the following steps:
Step 201: In a process of performing convolution processing on an image through a convolutional neural network, determine whether a currently pre-invoked first convolution layer is provided with a first chip selection module; if yes, perform Step 202; if not, perform a preset operation.
The convolutional neural network includes multiple convolution layers, and each convolution layer includes a plurality of feature maps. Those skilled in the art may selectively set chip selection modules for one or more convolution layers according to actual needs. A convolutional neural network provided with chip selection modules is trained in the same way as one without, so for the training of the convolutional neural network in this embodiment, reference may be made to the related art, and no specific limitation is imposed here.
An image is input into the convolutional neural network and processed by the convolution layers to obtain feature maps. In the convolutional neural network, the output data of one convolution layer serves as the input data of the next convolution layer, and the final result is obtained after layer-by-layer convolution processing. Each convolution layer processes its input data in the same way; this embodiment takes the processing flow of a single convolution layer as an example for description.
The preset operation may be set as follows: when the first convolution layer is not provided with a first chip selection module, input the output data of the previous convolution layer into the first convolution layer; invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
For example, if the first convolution layer contains 100 feature maps, then when the first convolution layer performs convolution processing on the output data of the previous convolution layer, the input data of the first convolution layer is convolved according to these 100 feature maps, and the feature maps matched by the input data in this convolution layer are determined as the output data and input to the next convolution layer.
Step 202: If the first convolution layer is provided with a first chip selection module, input the output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively.
The output data of the previous convolution layer is a plurality of feature maps.
Step 203: Invoke the first chip selection module, where the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer.
Each point in the feature map weight vector corresponds to one feature map in the first convolution layer and one weight value. The feature map weight vector may be denoted by σ.
As another implementation, the weight values in the feature map weight vector correspond one-to-one to the candidate feature maps output by the first convolution layer.
Step 204: Determine a target feature number N according to a preset acceleration ratio.
The preset acceleration ratio may be denoted by ζ. The larger the preset acceleration ratio, the smaller the target feature number N; the smaller the preset acceleration ratio, the larger the target feature number N. The preset acceleration ratio indicates the degree to which the processing efficiency of the convolutional neural network is to be improved: the larger the preset acceleration ratio, the greater the required improvement in processing efficiency, and thus the smaller N needs to be, so that the next convolution layer has fewer feature maps to process.
Conversely, the smaller the preset acceleration ratio, the smaller the required improvement in processing efficiency, so the target feature number N can be larger. Even then, the number of feature maps that the next convolution layer needs to process is still reduced compared with the prior art, which can also improve the processing efficiency of the convolutional neural network.
In a specific implementation, those skilled in the art may set the specific value of the acceleration ratio according to actual needs, and no specific limitation is imposed in the embodiments of the present application.
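The inverse relation between the acceleration ratio ζ and the target feature number N described above can be illustrated with a small helper. The specific formula N = total / ζ is an assumption for the sketch — the text only states the monotonic relation, not an exact mapping — and `target_feature_count` is a hypothetical name:

```python
def target_feature_count(total_maps, accel_ratio):
    """Larger acceleration ratio -> fewer target feature maps kept.
    Assumption: N = total_maps / accel_ratio, rounded, at least 1."""
    return max(1, round(total_maps / accel_ratio))

# With 100 feature maps: ratio 2 keeps 50 maps, ratio 4 keeps 25.
```

This matches the worked example below, where a layer of 100 feature maps keeps N = 50 target maps.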
Step 205: Adjust to 0 the weight values of the points other than the first N points in the feature map weight vector, and input the adjusted feature map weight vector into the first convolution layer.
The feature maps corresponding to the first N points in the feature map weight vector are the target feature maps. Adjusting the weight value of a point in the feature weight vector to 0 means that the feature map corresponding to that point does not participate in the convolution processing of the input data in the first convolution layer.
For example, if the first convolution layer contains 100 feature maps and N is 50, then the first 50 feature maps with a high degree of matching with the input data are selected from the 100 feature maps to participate in the convolution processing.
Step 206: Invoke the first convolution layer, where the first convolution layer determines the target feature maps according to the adjusted feature map weight vector.
In the adjusted feature weight vector, the feature maps corresponding to the points with non-zero weight values are the target feature maps.
Step 207: Perform convolution processing on the output data of the previous convolution layer according to the target feature maps to obtain output data.
When computing the output data of the first convolution layer, i.e., the feature map output, Y' = Yσ is used, where Y' is the output data of the first convolution layer. Since the feature maps whose weight values are 0 in the first convolution layer are no longer computed when computing the output data of the first convolution layer, the prediction efficiency of the first convolution layer is accelerated.
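The masking step Y' = Yσ can be sketched as follows on nested-list feature maps. This is an illustrative reading of the formula, not the patent's implementation: each map is scaled by its weight, and zero-weight maps are dropped so no computation is spent on them:

```python
def apply_weight_vector(feature_maps, sigma):
    """Compute Y' = Y * sigma channel-wise: scale each feature map by
    its weight and drop the maps whose weight was adjusted to 0, so
    they cost nothing downstream (a sketch; real implementations
    operate on tensors)."""
    return [[[v * w for v in row] for row in fmap]
            for fmap, w in zip(feature_maps, sigma) if w != 0]

maps = [[[1, 2]], [[3, 4]], [[5, 6]]]   # three tiny 1x2 feature maps
kept = apply_weight_vector(maps, [0.0, 1.0, 0.5])
# kept -> [[[3.0, 4.0]], [[2.5, 3.0]]]  (first map dropped)
```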
As another implementation, after the first chip selection module determines the target feature number N according to the preset acceleration ratio, it may adjust to 0 the weight values other than the N largest weight values in the feature map weight vector, and then determine the target feature maps from the candidate feature maps according to the adjusted feature map weight vector.
After determining the target feature number N according to the preset acceleration ratio, the first chip selection module may sort the weight values in the feature map weight vector in descending order, keep the top N weight values, and adjust the remaining weight values to 0.
The first chip selection module may then determine the target feature maps from the candidate feature maps according to the one-to-one correspondence between the weight values in the feature map weight vector and the candidate feature maps.
For example, suppose the first convolution layer outputs 10 candidate feature maps, denoted candidate feature map A through candidate feature map J, and the adjusted feature map weight vector is [0, 0, a, b, 0, c, d, e, 0, f], where a–f denote non-zero weight values. Then, according to the one-to-one correspondence between the weight values in the feature map weight vector and the candidate feature maps, the first chip selection module can determine that candidate feature maps C, D, F, G, H, and J, corresponding to the weight values a–f, are the target feature maps. These target feature maps are the output data of the first convolution layer.
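The A–J example above can be reproduced directly; numeric stand-ins are used here for the non-zero weights a–f:

```python
candidates = list("ABCDEFGHIJ")  # ten candidate feature maps A..J
# adjusted weight vector [0, 0, a, b, 0, c, d, e, 0, f], with
# arbitrary non-zero values standing in for a-f
sigma = [0, 0, 0.9, 0.8, 0, 0.7, 0.6, 0.5, 0, 0.4]
# one-to-one correspondence: keep the maps at non-zero positions
targets = [m for m, w in zip(candidates, sigma) if w != 0]
# targets -> ['C', 'D', 'F', 'G', 'H', 'J']
```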
In one implementation, the first chip selection module may include a fully connected layer, and the step of generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer may include:
processing the output data of the previous convolution layer with a global average pooling algorithm (global-average-pooling); and inputting the processing result into the fully connected layer to obtain the feature map weight vector.
After obtaining the output data of the previous convolution layer, the first chip selection module may process it with the global average pooling algorithm to obtain a processing result. That is, the first chip selection module may perform global averaging on the feature maps output by the previous convolution layer and output the average value corresponding to each feature map. These average values may then be input into the fully connected layer, which processes them further to obtain the corresponding weight vector; the weight vector output by the fully connected layer may then be used as the feature map weight vector.
The larger a weight value in the feature map weight vector, the more important the image features included in its corresponding candidate feature map; the smaller the weight value, the less important those features. Therefore, when adjusting the weight values in the feature map weight vector, the smaller weight values can be adjusted to 0, and their corresponding candidate feature maps are discarded and no longer input to the next convolution layer. This improves the processing efficiency of the convolutional neural network as much as possible while preserving the accuracy of the image prediction result as far as possible.
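The global-average-pooling plus fully-connected pipeline described above can be sketched in pure Python. The fully connected layer's parameters (`fc_weights`, `fc_bias`) are hypothetical learned values introduced only for illustration:

```python
def gap(feature_map):
    """Global average pooling: collapse one H x W feature map to its mean."""
    vals = [v for row in feature_map for v in row]
    return sum(vals) / len(vals)

def chip_select_weights(feature_maps, fc_weights, fc_bias):
    """Sketch of the chip selection module's weight generation: pool each
    incoming map to a scalar, then a fully connected layer maps the
    pooled vector to one weight per candidate feature map."""
    pooled = [gap(m) for m in feature_maps]
    return [sum(w * p for w, p in zip(row, pooled)) + b
            for row, b in zip(fc_weights, fc_bias)]

weights = chip_select_weights(
    [[[1, 3], [5, 7]], [[2, 2], [2, 2]]],     # two 2x2 input maps
    [[1, 0], [0, 1], [0.5, 0.5]],             # assumed FC weights
    [0, 0, 1])                                 # assumed FC biases
# weights -> [4.0, 2.0, 4.0]
```

The resulting vector would then go through the top-N zeroing step before being applied to the candidate feature maps.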
After the first convolution layer performs convolution processing on the output data of the previous convolution layer, the output data is passed to the next convolution layer; the next convolution layer performs the procedure of Step 201 to Step 207 to obtain output data and inputs it to the following convolution layer, until all convolution layers in the convolutional neural network have completed convolution processing, and the feature maps corresponding to the image are obtained by prediction.
In the image processing method provided by this embodiment of the present application, a chip selection module is pre-set for one or more convolution layers in the convolutional neural network. In the process of predicting an image through the convolutional neural network, the chip selection module filters the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as target feature maps for computing the convolution output. Compared with existing image processing methods, in which the feature maps in a convolution layer are not filtered and every feature map included in the convolution layer is used as a target feature map to compute the convolution output, this can reduce the amount of calculation and thereby improve task processing efficiency.
Embodiment 3
Referring to FIG. 3, a structural block diagram of an image processing device according to Embodiment 3 of the present application is shown.
The image processing device of this embodiment of the present application may include: a determination module 301, configured to determine, in a process of performing convolution processing on an image through a convolutional neural network, whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes multiple convolution layers and each convolution layer includes a plurality of feature maps; a first input module 302, configured to input output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively when the first convolution layer is provided with the first chip selection module; a first invoking module 303, configured to invoke the first chip selection module, where the first chip selection module determines target feature maps from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and a second invoking module 304, configured to invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature maps to obtain output data.
Optionally, the first chip selection module is configured to: generate a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determine a target feature number N according to a preset acceleration ratio; and adjust the weight values of the points other than the first N points in the feature map weight vector to 0 and input the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
Optionally, the first convolution layer is configured to: determine the target feature maps according to the adjusted feature map weight vector; and perform convolution processing on the output data of the previous convolution layer according to the target feature maps to obtain output data.
Optionally, the device further includes: a second input module 305, configured to input the output data of the previous convolution layer into the first convolution layer when the first convolution layer is not provided with the first chip selection module; and a third invoking module 306, configured to invoke the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
The image processing device of this embodiment of the present application is used to implement the corresponding image processing methods in Embodiment 1 and Embodiment 2 above, has beneficial effects corresponding to the method embodiments, and is not described in detail again here.
Embodiment 4
Referring to FIG. 4, a structural block diagram of a terminal for image processing according to Embodiment 4 of the present application is shown.
The terminal of this embodiment of the present application may include: a memory, a processor, and an image processing program stored on the memory and operable on the processor, where the image processing program, when executed by the processor, implements the steps of any one of the image processing methods described in the present application.
FIG. 4 is a block diagram of an image processing terminal 600 according to an exemplary embodiment. For example, the terminal 600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 4, the terminal 600 may include one or more of the following components: a processing component 602, a memory 604, a power component 606, a multimedia component 608, an audio component 610, an input/output (I/O) interface 612, a sensor component 614, and a communication component 616.
The processing component 602 typically controls the overall operation of the device 600, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processing component 602 may include one or more processors 620 to execute instructions to perform all or part of the steps of the above methods. In addition, the processing component 602 may include one or more modules to facilitate interaction between the processing component 602 and other components. For example, the processing component 602 may include a multimedia module to facilitate interaction between the multimedia component 608 and the processing component 602.
The memory 604 is configured to store various types of data to support operation at the terminal 600. Examples of such data include instructions for any application or method operating on the terminal 600, contact data, phone book data, messages, pictures, videos, and the like. The memory 604 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
The power component 606 provides power to the various components of the terminal 600. The power component 606 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal 600.
The multimedia component 608 includes a screen that provides an output interface between the terminal 600 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 608 includes a front camera and/or a rear camera. When the terminal 600 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 610 is configured to output and/or input audio signals. For example, the audio component 610 includes a microphone (MIC), which is configured to receive external audio signals when the terminal 600 is in an operation mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signals may be further stored in the memory 604 or sent via the communication component 616. In some embodiments, the audio component 610 further includes a speaker for outputting audio signals.
The I/O interface 612 provides an interface between the processing component 602 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, or the like. These buttons may include, but are not limited to: a home button, volume buttons, a start button, and a lock button.
The sensor component 614 includes one or more sensors for providing the terminal 600 with status assessments of various aspects. For example, the sensor component 614 may detect the open/closed state of the terminal 600 and the relative positioning of components (for example, the display and keypad of the terminal 600), and may also detect a change in position of the terminal 600 or a component thereof, the presence or absence of user contact with the terminal 600, the orientation or acceleration/deceleration of the device 600, and temperature changes of the terminal 600. The sensor component 614 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 614 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 614 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 616 is configured to facilitate wired or wireless communication between the terminal 600 and other devices. The terminal 600 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 616 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 616 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal 600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing an image processing method; specifically, the image processing method includes:
in a process of performing convolution processing on an image through a convolutional neural network, determining whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network includes multiple convolution layers and each convolution layer includes a plurality of feature maps; when the first convolution layer is provided with the first chip selection module, inputting output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively; invoking the first chip selection module, where the first chip selection module determines a target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the step of invoking the first chip selection module, where the first chip selection module determines the target feature map from the feature maps included in the first convolution layer according to the output data of the previous convolution layer, includes: invoking the first chip selection module, where the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determining a target feature number N according to a preset acceleration ratio; and adjusting the weight values of the points other than the first N points in the feature map weight vector to 0 and inputting the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
Optionally, the step of invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data, includes: invoking the first convolution layer, where the first convolution layer determines the target feature map according to the adjusted feature map weight vector, and performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
Optionally, the method further includes: when the first convolution layer is not provided with the first chip selection module, inputting the output data of the previous convolution layer into the first convolution layer; and invoking the first convolution layer, where the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it contains, to obtain output data.
In an exemplary embodiment, a non-transitory computer readable storage medium including instructions is also provided, such as the memory 604 including instructions executable by the processor 620 of the terminal 600 to perform the above image processing method. For example, the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like. When the instructions in the storage medium are executed by the processor of the terminal, the terminal is enabled to perform the steps of any one of the image processing methods described in the present application.
In the terminal provided by the embodiments of the present application, a chip selection module is pre-set for one or more convolution layers in the convolutional neural network. In the process of predicting an image through the convolutional neural network, the chip selection module filters the feature maps in the convolution layer, selecting part of the feature maps from the plurality of feature maps included in the convolution layer as target feature maps for computing the convolution output. Compared with existing image processing methods, in which the feature maps in a convolution layer are not filtered and every feature map included in the convolution layer is used as a target feature map to compute the convolution output, this can reduce the amount of calculation and thereby improve task processing efficiency.
The embodiments of the present application further provide an application program product, which is configured to perform, at runtime, the steps of any one of the image processing methods described in the present application.
As for the device, terminal, computer readable storage medium, and application program product embodiments, since they are basically similar to the method embodiments, the description is relatively brief; for relevant details, refer to the description of the method embodiments.
The image processing solution provided herein is not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct a system having the solution of the present application is apparent. Moreover, the present application is not directed to any particular programming language. It should be understood that the contents of the present application described herein may be implemented using various programming languages, and the above description of a specific language is intended to disclose the best mode of the present application.
Numerous specific details are set forth in the specification provided herein. However, it can be understood that the embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this specification.
Similarly, it should be understood that, in order to streamline the present application and aid in understanding one or more of the various application aspects, in the above description of exemplary embodiments of the present application, various features of the present application are sometimes grouped together into a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the claims reflect, the application aspect lies in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present application.
Those skilled in the art will understand that the modules in the devices of the embodiments may be adaptively changed and placed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and may further be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, the abstract, and the drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, the abstract, and the drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described herein include certain features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the present application and form different embodiments. For example, in the claims, any one of the claimed embodiments may be used in any combination.
The various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the image processing solution according to the embodiments of the present application. The present application may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing a part or all of the methods described herein. Such a program implementing the present application may be stored on a computer readable medium, or may be in the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present application, and those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present application may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, third, etc., does not indicate any order. These words may be interpreted as names.
Claims (14)
- An image processing method, characterized in that the method comprises: in a process of performing convolution processing on an image through a convolutional neural network, determining whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network comprises multiple convolution layers and each convolution layer comprises a plurality of feature maps; when the first convolution layer is provided with the first chip selection module, inputting output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively; invoking the first chip selection module, wherein the first chip selection module determines a target feature map from the feature maps comprised in the first convolution layer according to the output data of the previous convolution layer; and invoking the first convolution layer, wherein the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- The method according to claim 1, characterized in that the step of invoking the first chip selection module, wherein the first chip selection module determines the target feature map from the feature maps comprised in the first convolution layer according to the output data of the previous convolution layer, comprises: invoking the first chip selection module, wherein the first chip selection module generates a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determining a target feature number N according to a preset acceleration ratio; and adjusting the weight values of the points other than the first N points in the feature map weight vector to 0 and inputting the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- The method according to claim 2, characterized in that the step of invoking the first convolution layer, wherein the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data, comprises: invoking the first convolution layer, wherein the first convolution layer determines the target feature map according to the adjusted feature map weight vector; and performing convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- The method according to claim 1, characterized in that the method further comprises: when the first convolution layer is not provided with the first chip selection module, inputting the output data of the previous convolution layer into the first convolution layer; and invoking the first convolution layer, wherein the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it comprises, to obtain output data.
- An image processing device, characterized in that the device comprises: a determination module, configured to determine, in a process of performing convolution processing on an image through a convolutional neural network, whether a currently pre-invoked first convolution layer is provided with a first chip selection module, wherein the convolutional neural network comprises multiple convolution layers and each convolution layer comprises a plurality of feature maps; a first input module, configured to input output data of the previous convolution layer to the first chip selection module and the first convolution layer respectively when the first convolution layer is provided with the first chip selection module; a first invoking module, configured to invoke the first chip selection module, wherein the first chip selection module determines a target feature map from the feature maps comprised in the first convolution layer according to the output data of the previous convolution layer; and a second invoking module, configured to invoke the first convolution layer, wherein the first convolution layer performs convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- The device according to claim 5, characterized in that the first chip selection module is configured to: generate a feature map weight vector according to the output data of the previous convolution layer, each point in the feature map weight vector corresponding to one feature map in the first convolution layer and one weight value; determine a target feature number N according to a preset acceleration ratio; and adjust the weight values of the points other than the first N points in the feature map weight vector to 0 and input the adjusted feature map weight vector into the first convolution layer, wherein the feature maps corresponding to the first N points are the target feature maps.
- The device according to claim 6, characterized in that the first convolution layer is configured to: determine the target feature map according to the adjusted feature map weight vector; and perform convolution processing on the output data of the previous convolution layer according to the target feature map to obtain output data.
- The device according to claim 5, characterized in that the device further comprises: a second input module, configured to input the output data of the previous convolution layer into the first convolution layer when the first convolution layer is not provided with the first chip selection module; and a third invoking module, configured to invoke the first convolution layer, wherein the first convolution layer performs convolution processing on the output data of the previous convolution layer according to all the feature maps it comprises, to obtain output data.
- An image processing method, characterized in that the method comprises: inputting an image into a convolutional neural network for convolution processing, wherein the convolutional neural network comprises multiple convolution layers and at least one convolution layer is provided with a chip selection module; the manner in which the convolutional neural network performs convolution processing on the image comprises: performing, by a first convolution layer provided with a first chip selection module, convolution processing on output data of the previous convolution layer to obtain candidate feature maps; and determining, by the first chip selection module according to the output data of the previous convolution layer, target feature maps from the candidate feature maps as the output data of the first convolution layer.
- The method according to claim 9, characterized in that the determining, by the first chip selection module according to the output data of the previous convolution layer, target feature maps from the candidate feature maps comprises: generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer, wherein the weight values in the feature map weight vector correspond one-to-one to the candidate feature maps; determining a target feature number N according to a preset acceleration ratio; adjusting to 0 the weight values other than the N largest weight values in the feature map weight vector; and determining the target feature maps from the candidate feature maps according to the adjusted feature map weight vector.
- The method according to claim 10, characterized in that the first chip selection module comprises a fully connected layer; and the generating, by the first chip selection module, a feature map weight vector according to the output data of the previous convolution layer comprises: processing the output data of the previous convolution layer with a global average pooling algorithm; and inputting the processing result into the fully connected layer to obtain the feature map weight vector.
- A terminal, characterized by comprising: a memory, a processor, and an image processing program stored on the memory and operable on the processor, wherein the image processing program, when executed by the processor, implements the steps of the image processing method according to any one of claims 1 to 4 or 9 to 11.
- A computer readable storage medium, characterized in that an image processing program is stored on the computer readable storage medium, and the image processing program, when executed by a processor, implements the steps of the image processing method according to any one of claims 1 to 4 or 9 to 11.
- An application program product, characterized in that the application program product is configured to perform, at runtime, the steps of the image processing method according to any one of claims 1 to 4 or 9 to 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/767,945 US20200293884A1 (en) | 2017-11-28 | 2018-11-16 | Image processing method and device and terminal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711219332.9 | 2017-11-28 | ||
CN201711219332.9A CN108108738B (zh) | 2017-11-28 | 2017-11-28 | 图像处理方法、装置及终端 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019105243A1 true WO2019105243A1 (zh) | 2019-06-06 |
Family
ID=62208575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/115987 WO2019105243A1 (zh) | 2017-11-28 | 2018-11-16 | 图像处理方法、装置及终端 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20200293884A1 (zh) |
CN (1) | CN108108738B (zh) |
WO (1) | WO2019105243A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108108738B (zh) * | 2017-11-28 | 2018-11-16 | 北京达佳互联信息技术有限公司 | 图像处理方法、装置及终端 |
US20200160889A1 (en) * | 2018-11-19 | 2020-05-21 | Netflix, Inc. | Techniques for identifying synchronization errors in media titles |
CN116051848B (zh) * | 2023-02-10 | 2024-01-09 | 阿里巴巴(中国)有限公司 | 图像特征提取方法、网络模型、装置及设备 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106127173A (zh) * | 2016-06-30 | 2016-11-16 | 北京小白世纪网络科技有限公司 | 一种基于深度学习的人体属性识别方法 |
CN106682736A (zh) * | 2017-01-18 | 2017-05-17 | 北京小米移动软件有限公司 | 图像识别方法及装置 |
US20170228870A1 (en) * | 2016-02-05 | 2017-08-10 | International Business Machines Corporation | Tagging Similar Images Using Neural Network |
CN108108738A (zh) * | 2017-11-28 | 2018-06-01 | 北京达佳互联信息技术有限公司 | 图像处理方法、装置及终端 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5638465A (en) * | 1994-06-14 | 1997-06-10 | Nippon Telegraph And Telephone Corporation | Image inspection/recognition method, method of generating reference data for use therein, and apparatuses therefor |
JP2002358523A (ja) * | 2001-05-31 | 2002-12-13 | Canon Inc | パターン認識処理装置及びその方法、画像入力装置 |
US7127106B1 (en) * | 2001-10-29 | 2006-10-24 | George Mason Intellectual Properties, Inc. | Fingerprinting and recognition of data |
EP3259911B1 (en) * | 2015-02-19 | 2021-04-07 | Magic Pony Technology Limited | Enhancing visual data using updated neural networks |
CN106127208A (zh) * | 2016-06-16 | 2016-11-16 | Beijing SenseTime Technology Development Co., Ltd. | Method and system for classifying multiple objects in an image, and computer system |
CN106096602A (zh) * | 2016-06-21 | 2016-11-09 | Soochow University | Chinese license plate recognition method based on convolutional neural network |
CN106127204B (zh) * | 2016-06-30 | 2019-08-09 | South China University of Technology | Multi-directional water meter reading area detection algorithm based on fully convolutional neural network |
CN106250911B (zh) * | 2016-07-20 | 2019-05-24 | Nanjing University of Posts and Telecommunications | Image classification method based on convolutional neural network |
US9947103B1 (en) * | 2017-10-03 | 2018-04-17 | StradVision, Inc. | Learning method and learning device for improving image segmentation and testing method and testing device using the same |
- 2017
  - 2017-11-28 CN CN201711219332.9A patent/CN108108738B/zh active Active
- 2018
  - 2018-11-16 US US16/767,945 patent/US20200293884A1/en not_active Abandoned
  - 2018-11-16 WO PCT/CN2018/115987 patent/WO2019105243A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
US20200293884A1 (en) | 2020-09-17 |
CN108108738A (zh) | 2018-06-01 |
CN108108738B (zh) | 2018-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108256555B (zh) | Image content recognition method, apparatus and terminal | |
WO2019184471A1 (zh) | Image tag determination method, apparatus and terminal | |
JP7238141B2 (ja) | Method and apparatus for detecting a face and a hand in association, electronic device, storage medium, and computer program | |
TWI766286B (zh) | Image processing method, image processing apparatus, electronic device, and computer-readable storage medium | |
CN106651955B (zh) | Method and apparatus for locating a target object in a picture | |
CN108121952B (zh) | Face keypoint localization method, apparatus, device, and storage medium | |
US20210117726A1 (en) | Method for training image classifying model, server and storage medium | |
KR102463101B1 (ko) | Image processing method and apparatus, electronic device, and storage medium | |
KR101694643B1 (ko) | Image segmentation method, apparatus, device, program, and recording medium | |
WO2019141042A1 (zh) | Image classification method, apparatus and terminal | |
TWI782480B (zh) | Image processing method, electronic device, and computer-readable storage medium | |
TWI773945B (zh) | Anchor point determination method, electronic device, and storage medium | |
WO2020134866A1 (zh) | Keypoint detection method and apparatus, electronic device, and storage medium | |
TW202113757A (zh) | Target object matching method and apparatus, electronic device, and computer-readable storage medium | |
CN111160448B (zh) | Training method and apparatus for an image classification model | |
RU2628494C1 (ru) | Method and device for generating an image filter | |
CN105335684B (zh) | Face detection method and apparatus | |
TW202032499A (zh) | Network module, allocation method and apparatus, electronic device, and computer-readable storage medium | |
CN106485567B (zh) | Item recommendation method and apparatus | |
WO2019105243A1 (zh) | Image processing method, apparatus and terminal | |
CN106557759B (zh) | Signboard information acquisition method and apparatus | |
CN110458218B (zh) | Image classification method and apparatus, and classification network training method and apparatus | |
CN107133354B (zh) | Method and apparatus for acquiring image description information | |
CN108009563B (zh) | Image processing method, apparatus and terminal | |
CN108154093B (zh) | Face information recognition method and apparatus, electronic device, and machine-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18882510; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 32PN | Ep: public notification in the ep bulletin as the address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 07/09/2020) |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 18882510; Country of ref document: EP; Kind code of ref document: A1 |