CN108764051A - Image processing method, device and mobile terminal

Info

Publication number
CN108764051A
Authority
CN
China
Prior art keywords
scene
level
images
recognized
classification results
Prior art date
Legal status
Granted
Application number
CN201810399087.2A
Other languages
Chinese (zh)
Other versions
CN108764051B (en)
Inventor
张弓 (Zhang Gong)
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201810399087.2A
Publication of CN108764051A
Application granted
Publication of CN108764051B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 - Scenes; Scene-specific elements
    • G06V 20/10 - Terrestrial scenes
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G06F 18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 - Classification techniques relating to the classification model based on distances to training or reference patterns

Abstract

This application discloses an image processing method, an image processing apparatus, and a mobile terminal. The method includes: identifying first-level scenes in an image to be recognized with a deep convolutional neural network to obtain a first classification result; identifying second-level scenes within the first classification result with a shallow convolutional neural network to obtain a second classification result; and outputting fine-grained scene classification information for the image to be recognized based on the second classification result. By classifying major-class scenes and fine-grained scenes in sequence with cascaded convolutional neural networks, the method avoids the heavy computation that fine-grained classification with a single large network would incur, strikes a good balance between computation and accuracy, and makes deployment on mobile terminals feasible.

Description

Image processing method, device and mobile terminal
Technical field
This application relates to the technical field of mobile terminals, and more particularly to an image processing method, an image processing apparatus, and a mobile terminal.
Background
Existing scene classification methods cover both major-class scenes and fine-grained scenes. Major-class scenes are loosely related categories such as sky, grassland, and food. Users, however, often care more about the fine-grained scenes: the sky may contain clouds, the sun, or atmospheric effects; grassland may be purely green or mixed yellow and green; food may be fruits and vegetables or meat. Recognizing the different fine-grained scenes enables finer post-processing and improves how photos are displayed.
At present, however, fine-grained classification of complex scenes by deep learning is performed mainly with a single large network. This is computationally expensive and slow, and it places enormous pressure on mobile-terminal deployment.
Summary of the invention
In view of the above problems, the present application proposes an image processing method, an image processing apparatus, and a mobile terminal to solve them.
In a first aspect, an embodiment of the present application provides an image processing method. The method includes: identifying first-level scenes in an image to be recognized with a deep convolutional neural network to obtain a first classification result, each class of first-level scene including at least one class of second-level scene; identifying second-level scenes within the first classification result with a shallow convolutional neural network to obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network; and outputting fine-grained scene classification information for the image to be recognized based on the second classification result.
In a second aspect, an embodiment of the present application provides an image processing apparatus. The apparatus includes: a first-level classification module configured to identify first-level scenes in an image to be recognized with a deep convolutional neural network and obtain a first classification result, each class of first-level scene including at least one class of second-level scene; a second-level classification module configured to identify second-level scenes within the first classification result with a shallow convolutional neural network and obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network; and an output module configured to output fine-grained scene classification information for the image to be recognized based on the second classification result.
In a third aspect, an embodiment of the present application provides a mobile terminal including a display, a memory, and a processor. The display and the memory are coupled to the processor, and the memory stores instructions that, when executed by the processor, cause the processor to perform the method of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium carrying program code executable by a processor, the program code causing the processor to perform the method of the first aspect.
Compared with the prior art, the image processing method, apparatus, and mobile terminal provided by the embodiments of the present application classify the first-level scenes in an image to be recognized with a deep convolutional neural network, classify the second-level scenes of each class of first-level scene with a shallow convolutional neural network, and finally output fine-grained scene classification information for the image. By classifying major-class scenes and fine-grained scenes in sequence with cascaded convolutional neural networks, the embodiments avoid the heavy computation that fine-grained classification with a single large network would incur, strike a good balance between computation and accuracy, and make deployment on mobile terminals feasible.
These and other aspects of the application will become more apparent from the following description.
Description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described here show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 shows a flow diagram of the image processing method provided by the first embodiment of the present application;
Fig. 2 shows a flow diagram of the image processing method provided by the second embodiment of the present application;
Fig. 3 shows a module block diagram of the image processing apparatus provided by the third embodiment of the present application;
Fig. 4 shows a module block diagram of the image processing apparatus provided by the fourth embodiment of the present application;
Fig. 5 shows a structural block diagram of a mobile terminal provided by an embodiment of the present application;
Fig. 6 shows a block diagram of a mobile terminal for executing the image processing method according to an embodiment of the present application;
Fig. 7 is a schematic diagram of the traditional process of recognizing and classifying image scenes with a single convolutional neural network.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present application.
With the continuous development of machine learning and deep learning, methods that use machine-learning models to recognize image scenes have been widely applied to fine-grained image scene classification.
Referring to Fig. 7, Fig. 7 is a schematic diagram of the traditional process of recognizing and classifying image scenes with a single convolutional neural network (CNN). In Fig. 7, a test image is input; a selective search algorithm extracts roughly 2000 candidate regions (region proposals) from the image; each candidate region is warped to a size of 227x227 and fed into the CNN; the output of the last fully connected layer of the CNN is taken as the region's feature; and finally the CNN feature extracted from each candidate region is fed into an SVM (support vector machine) for classification.
However, after studying the above scene classification scheme, the inventor found that, because it mainly uses a single large network to classify every fine-grained scene in a complex scene, it is computationally expensive and slow (for example, a VGG16 model needs 47 seconds to process one image) and uses computing resources inefficiently, placing enormous pressure on mobile-terminal deployment. In the course of this research, the inventor studied the reasons for the heavy computation of existing scene classification models, studied how the classification model structure could be optimized to reduce computation and improve scene recognition efficiency, studied feasible mobile-terminal deployment schemes, and on that basis proposed the image processing method, apparatus, and mobile terminal of the embodiments of the present application.
The image processing method, apparatus, mobile terminal, and storage medium provided by the embodiments of the present application are described in detail below through specific embodiments.
First embodiment
Referring to Fig. 1, Fig. 1 shows a flow diagram of the image processing method provided by the first embodiment of the present application. The image processing method classifies the first-level scenes in an image to be recognized with a deep convolutional neural network, then classifies the second-level scenes of each class of first-level scene with a shallow convolutional neural network, and finally outputs fine-grained scene classification information for the image to be recognized. It avoids the heavy computation that fine-grained classification with a single large network would incur, strikes a good balance between computation and accuracy, and makes deployment on mobile terminals feasible. In a specific embodiment, the image processing method is applied to the image processing apparatus 300 shown in Fig. 3 and to the mobile terminal 100 (Fig. 5) configured with the image processing apparatus 300, and is used to improve the efficiency of fine-grained scene classification when the mobile terminal 100 shoots images. The flow shown in Fig. 1 is explained in detail below, taking a mobile phone as an example. The image processing method may include the following steps:
Step S101: in the image to be recognized, identify first-level scenes with a deep convolutional neural network to obtain a first classification result.
In the embodiments of the present application, the image to be recognized may be the image displayed in the image capture mode while the phone camera is shooting, an image stored in the local album after shooting, an image obtained from the cloud, and so on; it may be a two-dimensional or a three-dimensional image. A first-level scene is a scene at the upper level of the scene hierarchy; a second-level scene is a scene at the level below the first-level scene. For example, an image shot with the phone camera may contain sky, grassland, and a house, and the sky may be a noon sky, a dusk sky, or a late-night sky. In this example, sky, grassland, and house, which have no obvious subordination to one another, can be regarded as first-level scenes, while the noon sky, dusk sky, and late-night sky subordinate to sky can be regarded as second-level scenes.
It should be understood that each image to be recognized contains at least one class of first-level scene, and each class of first-level scene contains at least one class of second-level scene. Besides first-level scenes and their subordinate second-level scenes, there may also be third-level scenes subordinate to second-level scenes, fourth-level scenes subordinate to third-level scenes, and so on.
In this embodiment, the first classification result contains the first-level scenes sorted out of the image to be recognized by the deep convolutional neural network. In one approach, each class of first-level scene recognized and sorted by the deep convolutional neural network can serve as an independent image region that is then used separately for the subsequent classification of second-level scenes.
Step S102: within the first classification result, identify second-level scenes with a shallow convolutional neural network to obtain a second classification result.
In this embodiment, the deep convolutional neural network and the shallow convolutional neural network are defined relative to each other: the deep convolutional neural network has more layers than the shallow one, where the number of layers of a convolutional neural network can be taken as the number of its convolutional and fully connected layers.
In one approach, VGG can be chosen as the deep convolutional neural network and AlexNet as the shallow one; VGG has 19 convolutional/fully connected layers, while AlexNet has 8. In other possible implementations, other convolutional neural networks such as GoogLeNet (22 layers) or ResNet (152-1000 layers) can also be chosen to implement the image processing method of this embodiment.
It should be understood that "deep" and "shallow" are relative: the VGG used as the deep convolutional neural network in one implementation may serve as the shallow network in another, for example when a ResNet with more layers acts as the deep network.
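As one concrete, non-authoritative reading of this layer-count comparison, the following sketch counts the convolutional and fully connected layers of torchvision's VGG19 and AlexNet; the torchvision models are an assumption used purely for illustration.

```python
# Counting conv + fully connected layers, the patent's notion of network "depth".
import torch.nn as nn
from torchvision import models

def weight_layer_count(model: nn.Module) -> int:
    """Number of convolutional plus fully connected layers in the model."""
    return sum(1 for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear)))

deep = models.vgg19()       # 16 conv + 3 FC = 19 weight layers
shallow = models.alexnet()  # 5 conv + 3 FC = 8 weight layers
assert weight_layer_count(deep) > weight_layer_count(shallow)
print(weight_layer_count(deep), weight_layer_count(shallow))  # 19 8
```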
In this embodiment, the second classification result contains the second-level scenes sorted out of the first classification result by the further recognition of the shallow convolutional neural network. In one approach, each class of second-level scene recognized and sorted by the shallow convolutional neural network within a class of first-level scene can serve as an independent image region.
Step S103: based on the second classification result, output fine-grained scene classification information for the image to be recognized.
In this embodiment, the second classification result obtained in the previous step contains at least one sorted class of second-level scene in the image to be recognized. To facilitate subsequent operations such as image processing, parameters such as the position and type of each class of second-level scene in the second classification result can be integrated to form fine-grained scene classification information that subsequent processing modules can interpret.
An existing single large network, trained on data sets of many different kinds of scenes, achieves high scene recognition accuracy, but its recognition computation is very large; the current hardware level of mobile terminals cannot sustain such a huge computation load, so the scene classification accuracy of mobile terminals has long been limited by it. By performing scene classification with cascaded convolutional neural networks, the image processing method provided by this embodiment greatly reduces the computation required for scene classification while keeping the classification accuracy up to standard, thereby lowering the hardware requirements of fine-grained scene classification and improving its efficiency.
The image processing method provided by the first embodiment of the present application classifies the first-level scenes in an image to be recognized with a deep convolutional neural network, then classifies the second-level scenes of each class of first-level scene with a shallow convolutional neural network, and finally outputs fine-grained scene classification information for the image. It avoids the heavy computation that fine-grained classification with a single large network would incur, strikes a good balance between computation and accuracy, and makes deployment on mobile terminals feasible.
Second embodiment
Referring to Fig. 2, Fig. 2 shows a flow diagram of the image processing method provided by the second embodiment of the present application. The flow shown in Fig. 2 is explained in detail below, taking a mobile phone as an example. The image processing method may include the following steps:
Step S201: obtain an image to be recognized in an image capture mode.
In this embodiment, the image to be recognized may be an image obtained via components such as the phone's camera in the image capture mode while the phone is shooting. To apply finer post-processing to the image to be recognized, the steps of the image processing method provided by this embodiment can be performed before that image processing takes place.
It should be understood that, in other implementations, the image to be recognized may also be obtained via the phone's local storage, a cloud server, or a browser web page, and the image capture mode may likewise be the capture mode of the local album, the mode in which the mobile terminal fetches image data from a cloud server, the mode in which a browser page loads a picture, and so on.
Step S202: classify the first-level scenes in the image to be recognized with a deep convolutional neural network.
In this embodiment, the data of the image to be recognized can be fed into a trained deep convolutional neural network, which recognizes and classifies the first-level scenes in the image.
Step S203: obtain a first classification result, the first classification result containing at least one class of first-level scene in the image to be recognized.
In this embodiment, the recognition and classification by the deep convolutional neural network can output the image data of at least one sorted class of first-level scene; the image to be recognized contains at least one class of first-level scene. It should be understood that if the image contains only one class of first-level scene, the first classification result may also contain at most that one class.
In this embodiment, step S204 can be executed directly after step S203, or step S208 can be executed first.
Step S204: classify the second-level scenes in each class of first-level scene in the first classification result with a shallow convolutional neural network.
In this embodiment, because the deep convolutional neural network and the shallow convolutional neural networks are cascaded, the image data of each class of first-level scene sorted out by the deep convolutional neural network can be automatically matched to the input layer of the shallow convolutional neural network corresponding to that class, which then classifies the second-level scenes within it, as the sketch below illustrates.
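Purely to illustrate this cascade (steps S202-S205), the sketch below routes each region labeled by a deep network to the shallow network matched to its first-level class. The class names, the tiny stand-in CNNs, and the crop_regions helper are all hypothetical, not taken from the patent.

```python
# Cascaded two-stage scene classification: an illustrative sketch with invented names.
from typing import Dict, List, Tuple
import torch
import torch.nn as nn

LEVEL1 = ["sky", "grass", "house"]
LEVEL2 = {"sky": ["noon_sky", "dusk_sky", "night_sky"],
          "grass": ["green_grass", "mixed_grass"],
          "house": ["brick_house", "wooden_house"]}

def tiny_cnn(num_classes: int, depth: int) -> nn.Sequential:
    """Stand-in CNN; the deep net simply has more conv layers than the shallow ones."""
    layers: List[nn.Module] = []
    ch = 3
    for _ in range(depth):
        layers += [nn.Conv2d(ch, 8, 3, padding=1), nn.ReLU()]
        ch = 8
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, num_classes)]
    return nn.Sequential(*layers)

deep_net = tiny_cnn(len(LEVEL1), depth=6)                         # the "deep" network
shallow_nets: Dict[str, nn.Module] = {
    c: tiny_cnn(len(LEVEL2[c]), depth=2) for c in LEVEL1}         # one per first-level class

def crop_regions(image: torch.Tensor) -> List[torch.Tensor]:
    """Hypothetical region extractor; here it simply yields the whole image."""
    return [image]

def classify_region(net: nn.Module, region: torch.Tensor, labels: List[str]) -> str:
    with torch.no_grad():
        return labels[net(region.unsqueeze(0)).argmax(dim=1).item()]

def cascade(image: torch.Tensor) -> List[Tuple[str, str]]:
    """Route each first-level region to the shallow net matched to its class."""
    results = []
    for region in crop_regions(image):
        lvl1 = classify_region(deep_net, region, LEVEL1)                  # steps S202-S203
        lvl2 = classify_region(shallow_nets[lvl1], region, LEVEL2[lvl1])  # steps S204-S205
        results.append((lvl1, lvl2))
    return results

print(cascade(torch.rand(3, 224, 224)))  # e.g. [('sky', 'dusk_sky')] (random weights)
```

Because each shallow network only has to separate the second-level classes of a single first-level class, it can stay small, which is the computation saving this embodiment describes.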
Step S205: obtain a second classification result, the second classification result containing at least one class of second-level scene in the image to be recognized.
In this embodiment, the recognition and classification by the shallow convolutional neural network can output the image data of at least one sorted class of second-level scene; each class of first-level scene in the first classification result contains at least one class of second-level scene. It should be understood that if the image to be recognized contains only one class of first-level scene, and that first-level scene contains only one class of second-level scene, the second classification result may also contain at most that one class of second-level scene.
A special case may arise in which a first-level scene in the image to be recognized cannot be recognized, or is recognized incorrectly. For example, suppose the trained deep convolutional neural network was originally trained to recognize only sky, grassland, and houses, and the image to be recognized contains only ocean. In the first possible situation, the first classification result output by the deep convolutional neural network contains no class of first-level scene at all, because a network never trained on an ocean data set cannot recognize ocean. In the second possible situation, because ocean and sky are highly similar in certain dimensions of the image data, the first classification result classifies the ocean in the image as sky, which is a classification error. Likewise, the shallow convolutional neural network may run into the same problems when recognizing and classifying second-level scenes.
For the first case, where a new scene cannot be recognized, the new scene type can subsequently be defined and a data set of that type fed in to train the convolutional neural network, so that the network learns to recognize and classify the new scene. For the second case, scene recognition error, the parameters of the convolutional neural network model need to be adjusted to optimize the network's recognition accuracy and reduce misrecognition.
Step S206: set classification markers for the second-level scenes in the second classification result.
In this embodiment, the second classification result can contain the image data of each sorted class of second-level scene. To facilitate subsequent operations such as image processing, a classification marker can be set for each sorted class of second-level scene so that a matching relationship can be established between each class of second-level scene and the corresponding subsequent processing module. In this embodiment, the classification marker may be a tag added after the second-level scene is recognized, or a feature carried by the second-level scene image itself.
Step S207: output fine-grained scene classification information for the image to be recognized containing the classification markers.
The fine-grained scene classification information can contain the image data of all second-level scenes of the image to be recognized after fine-grained classification, each stamped with its classification marker. In one approach, the fine-grained scene classification information can be fed directly into an image processing module that processes the second-level scenes class by class.
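One plausible shape for this information, assuming a simple record per marked second-level region; all field names below are invented for illustration and are not prescribed by the patent.

```python
# Hypothetical record format for the fine-grained scene classification information
# output in step S207.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SceneRegion:
    level1: str                      # first-level class, e.g. "sky"
    level2: str                      # second-level class, e.g. "dusk_sky"
    marker: str                      # classification marker used to match a processing module
    bbox: Tuple[int, int, int, int]  # region position (x, y, w, h) in the image

fine_info: List[SceneRegion] = [
    SceneRegion("sky", "dusk_sky", "marker:sky/dusk", (0, 0, 640, 200)),
    SceneRegion("grass", "green_grass", "marker:grass/green", (0, 200, 640, 280)),
]
```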
In this embodiment, step S208 can also be executed after step S203.
Step S208: judge whether a simple classification instruction has been received.
If a simple classification instruction is received, execute step S209; if not, execute step S204.
Step S209: based on the first classification result, output simple scene classification information for the image to be recognized.
In this embodiment, executing step S209 outputs simple scene classification information for the image to be recognized, which contains only the classification information of the sorted first-level scenes; the specific implementation can refer to steps S206 to S207.
In this embodiment, the judgment step S208 can be added after step S203 to decide whether a simple classification instruction has been received: if so, step S209 is executed and simple scene classification information is output; if not, steps S204 to S207 are executed and fine-grained scene classification information is output. This gives the user a choice of classification fineness: the user can choose simple or fine processing of the image to be recognized according to their own needs. It should be understood that the image processing method provided by this embodiment can also output both the simple and the fine-grained scene classification information of the same image. One way this branch might look in code is sketched below.
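Sketched here under the same hypothetical names as the cascade sketch above, purely to illustrate the control flow of this branch:

```python
# Illustrative control flow for steps S208-S209 vs. S204-S207 (hypothetical API).
def classify_scene(image, simple_requested: bool):
    level1_regions = [(r, classify_region(deep_net, r, LEVEL1))
                      for r in crop_regions(image)]                      # steps S202-S203
    if simple_requested:                                                 # step S208
        return {"simple": [lvl1 for _, lvl1 in level1_regions]}          # step S209
    fine = [(lvl1, classify_region(shallow_nets[lvl1], r, LEVEL2[lvl1]))
            for r, lvl1 in level1_regions]                               # steps S204-S205
    return {"fine": fine}                                                # steps S206-S207
```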
It should be noted that "fine" and "simple" in the fine-grained and simple scene classification information are relative concepts expressing the relationship between second-level and first-level scene classification: second-level classification is finer relative to first-level classification. This does not mean that a second-level scene necessarily contains more elements or classes than a first-level scene, nor that one scene subordinate to a first-level scene is classified more finely than another. For example, suppose the first-level scenes in an image to be recognized are a cat, the sky, and grassland; the second-level classes of the cat include its ears, eyes, nose, and so on, while the second-level classes of the sky in this image include only one class, blue sky. From the classification result, the sky contains only one second-level scene while the cat contains at least three, but one cannot conclude that the cat's classification is finer than the sky's, because the second-level sky classes the shallow convolutional neural network can recognize also include red sky, black sky, and so on. Even if the total number of second-level sky classes the shallow convolutional neural network can recognize is smaller than the total number of second-level cat classes, scenes of the same level are not comparable in fineness; in this embodiment, "fine" and "simple" serve only to distinguish subordinate-level from superior-level scene classification.
In this embodiment, steps S210 and S211 can be executed after step S203.
Step S210: judge whether there is an unclassified first-level scene in the image to be recognized.
If there is an unclassified first-level scene in the image to be recognized, execute step S211; if there is none, end the flow.
Step S211: input a data set containing at least one class of the unclassified first-level scenes, and train the deep convolutional neural network. In this embodiment, step S212 can be executed after step S207.
Step S212: judge whether there is an unclassified second-level scene in the image to be recognized.
If there is an unclassified second-level scene in the image to be recognized, execute step S213; if there is none, end the flow.
Step S213: input a data set containing at least one class of the unclassified second-level scenes, and train the shallow convolutional neural network.
In this embodiment, steps S210-S211 and steps S212-S213 address the above problem of a convolutional neural network failing to recognize a new scene. By training the deep convolutional neural network on data sets of new first-level scenes and the shallow convolutional neural networks on data sets of new second-level scenes, the cascaded convolutional neural networks of this embodiment can be optimized and the scope of application of the image processing method expanded.
In one approach, transfer learning can be used: a convolutional neural network already trained to recognize the old scenes serves as the pre-training model for training a convolutional neural network that recognizes the new scenes, which can improve the efficiency of training the new model.
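A minimal transfer-learning sketch in that spirit, assuming a torchvision AlexNet as the pre-trained network and an invented class count; the patent does not prescribe this particular recipe.

```python
# Transfer learning: reuse a network trained on the old scenes as the pre-training
# model for the new scenes. AlexNet and the class count are assumptions.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

num_new_classes = 4  # hypothetical: the old second-level classes plus one new scene

net = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)  # pre-training model
for p in net.features.parameters():                           # freeze the conv backbone
    p.requires_grad = False
net.classifier[-1] = nn.Linear(4096, num_new_classes)         # fresh output head

optimizer = optim.SGD((p for p in net.parameters() if p.requires_grad),
                      lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels) -> float:
    """One optimization step over a batch from the new-scene data set."""
    optimizer.zero_grad()
    loss = criterion(net(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone and retraining only the output head is one common way the old model's features carry over to the new scenes, which is why training a new model this way tends to be faster.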
In this embodiment, step S214 can also be executed after step S209.
Step S214: based on the simple scene classification information, perform class-by-class processing on each first-level scene image in the image to be recognized.
In this embodiment, step S215 can also be executed after step S207.
Step S215: based on the fine-grained scene classification information, perform class-by-class processing on each second-level scene image in the image to be recognized.
In this embodiment, the image features of each first-level scene in the image to be recognized can be extracted according to the simple scene classification information, and each distinct class of first-level scene image can be routed to the image processing system corresponding to its features; likewise, the image features of each second-level scene can be extracted according to the fine-grained scene classification information, and each distinct class of second-level scene image routed to the image processing system corresponding to its features. After step S214, the individually processed first-level scene images can be reassembled to obtain the image to be recognized after coarse processing; after step S215, the individually processed second-level scene images can be reassembled to obtain the image to be recognized after fine processing.
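As a hedged illustration of this routing for step S215, marked regions could be dispatched to per-class processing functions and written back into the image; every processing function below is invented, and the image is assumed to be a C x H x W float tensor.

```python
# Dispatching marked second-level regions to matched post-processing (illustrative only).
from typing import Callable, Dict

def enhance_sky(region):  return (region * 1.10).clamp(0, 1)   # hypothetical
def boost_grass(region):  return (region * 1.05).clamp(0, 1)   # hypothetical
def passthrough(region):  return region

PIPELINES: Dict[str, Callable] = {"dusk_sky": enhance_sky, "green_grass": boost_grass}

def process_fine(image, fine_info):
    """Process each marked region, then reassemble the whole image."""
    for rec in fine_info:               # records shaped as in the SceneRegion sketch above
        x, y, w, h = rec.bbox
        fn = PIPELINES.get(rec.level2, passthrough)
        image[:, y:y + h, x:x + w] = fn(image[:, y:y + h, x:x + w])
    return image
```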
Compared with the method of the first embodiment, the image processing method provided by the second embodiment of the present application can output both simple and fine-grained scene classification information, which makes it easy to present the contrast between simple and fine-grained classification and to offer the user a personalized choice; and by training the cascaded convolutional neural networks on new scene data sets, recognition accuracy can be continuously optimized and the scope of application of the image processing method expanded, making the scheme more intelligent in use.
Third embodiment
Referring to Fig. 3, Fig. 3 shows a module block diagram of the image processing apparatus 300 provided by the third embodiment of the present application. The following description refers to the block diagram shown in Fig. 3. The image processing apparatus 300 includes a first-level classification module 310, a second-level classification module 320, and an output module 330, where:
the first-level classification module 310 is configured to identify first-level scenes in an image to be recognized with a deep convolutional neural network and obtain a first classification result, each class of first-level scene including at least one class of second-level scene;
the second-level classification module 320 is configured to identify second-level scenes within the first classification result with a shallow convolutional neural network and obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network;
the output module 330 is configured to output fine-grained scene classification information for the image to be recognized based on the second classification result.
The image processing apparatus provided by the third embodiment of the present application classifies the first-level scenes in an image to be recognized with a deep convolutional neural network, then classifies the second-level scenes of each class of first-level scene with a shallow convolutional neural network, and finally outputs fine-grained scene classification information for the image. It avoids the heavy computation that fine-grained classification with a single large network would incur, strikes a good balance between computation and accuracy, and makes deployment on mobile terminals feasible.
Fourth embodiment
Referring to Fig. 4, Fig. 4 shows a module block diagram of the image processing apparatus 400 provided by the fourth embodiment of the present application. The following description refers to the block diagram shown in Fig. 4. The image processing apparatus 400 includes a first-level classification module 410, a second-level classification module 420, an output module 430, an instruction module 440, a simple output module 442, a first-level judgment module 450, a first-level training module 452, a second-level judgment module 460, a second-level training module 462, a first-level processing module 470, and a second-level processing module 480, where:
the first-level classification module 410 is configured to identify first-level scenes in an image to be recognized with a deep convolutional neural network and obtain a first classification result, each class of first-level scene including at least one class of second-level scene. Further, the first-level classification module 410 includes a preview unit 411, a first-level classification unit 412, and a first-level acquisition unit 413, where:
the preview unit 411 is configured to obtain an image to be recognized in an image capture mode;
the first-level classification unit 412 is configured to classify the first-level scenes in the image to be recognized with a deep convolutional neural network;
the first-level acquisition unit 413 is configured to obtain a first classification result containing at least one class of first-level scene in the image to be recognized.
The second-level classification module 420 is configured to identify second-level scenes within the first classification result with a shallow convolutional neural network and obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network. Further, the second-level classification module 420 includes a second-level classification unit 421 and a second-level acquisition unit 422, where:
the second-level classification unit 421 is configured to classify the second-level scenes in each class of first-level scene in the first classification result with a shallow convolutional neural network;
the second-level acquisition unit 422 is configured to obtain a second classification result containing at least one class of second-level scene in the image to be recognized.
The output module 430 is configured to output fine-grained scene classification information for the image to be recognized based on the second classification result. Further, the output module 430 includes a marking unit 431 and a fine output unit 432, where:
the marking unit 431 is configured to set classification markers for the second-level scenes in the second classification result;
the fine output unit 432 is configured to output fine-grained scene classification information for the image to be recognized containing the classification markers.
The instruction module 440 is configured to judge whether a simple classification instruction has been received.
The simple output module 442 is configured to output simple scene classification information for the image to be recognized based on the first classification result.
The first-level judgment module 450 is configured to judge whether there is an unclassified first-level scene in the image to be recognized.
The first-level training module 452 is configured to input a data set containing at least one class of the unclassified first-level scenes and train the deep convolutional neural network.
The second-level judgment module 460 is configured to judge whether there is an unclassified second-level scene in the image to be recognized.
The second-level training module 462 is configured to input a data set containing at least one class of the unclassified second-level scenes and train the shallow convolutional neural network.
The first-level processing module 470 is configured to perform class-by-class processing on each first-level scene image in the image to be recognized based on the simple scene classification information.
The second-level processing module 480 is configured to perform class-by-class processing on each second-level scene image in the image to be recognized based on the fine-grained scene classification information.
Compared with the apparatus of the third embodiment, the image processing apparatus provided by the fourth embodiment of the present application can output both simple and fine-grained scene classification information, which makes it easy to present the contrast between simple and fine-grained classification and to offer the user a personalized choice; and by training the cascaded convolutional neural networks on new scene data sets, recognition accuracy can be continuously optimized and the scope of application of the scheme expanded, making it more intelligent in use.
Fifth embodiment
The fifth embodiment of the present application provides a mobile terminal including a display, a memory, and a processor. The display and the memory are coupled to the processor, and the memory stores instructions that, when executed by the processor, cause the processor to:
identify first-level scenes in an image to be recognized with a deep convolutional neural network to obtain a first classification result, each class of first-level scene including at least one class of second-level scene;
identify second-level scenes within the first classification result with a shallow convolutional neural network to obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network;
output fine-grained scene classification information for the image to be recognized based on the second classification result.
Sixth embodiment
The sixth embodiment of the present application provides a computer-readable storage medium carrying program code executable by a processor, the program code causing the processor to:
identify first-level scenes in an image to be recognized with a deep convolutional neural network to obtain a first classification result, each class of first-level scene including at least one class of second-level scene;
identify second-level scenes within the first classification result with a shallow convolutional neural network to obtain a second classification result, the deep convolutional neural network having more layers than the shallow convolutional neural network;
output fine-grained scene classification information for the image to be recognized based on the second classification result.
In summary, the image processing method, apparatus, and mobile terminal provided by the present application classify the first-level scenes in an image to be recognized with a deep convolutional neural network, then classify the second-level scenes of each class of first-level scene with a shallow convolutional neural network, and finally output fine-grained scene classification information for the image. Compared with the prior art, the embodiments of the present application classify major-class scenes and fine-grained scenes in sequence with cascaded convolutional neural networks, avoid the heavy computation that fine-grained classification with a single large network would incur, strike a good balance between computation and accuracy, and make deployment on mobile terminals feasible.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be referred to one another. Since the apparatus embodiments are basically similar to the method embodiments, their description is relatively brief; for relevant details, see the corresponding parts of the method embodiments. Any processing manner described in a method embodiment can be implemented in an apparatus embodiment by a corresponding processing module and is not repeated there one by one.
Referring to Fig. 5, based on the above image processing method and apparatus, an embodiment of the present application further provides a mobile terminal 100 that includes an electronic body portion 10, the electronic body portion 10 including a housing 12 and a main display 120 arranged on the housing 12. The housing 12 can be made of metal, such as steel or aluminum alloy. In this embodiment, the main display 120 generally includes a display panel 111 and may also include circuitry for responding to touch operations on the display panel 111. The display panel 111 can be a liquid crystal display (LCD) panel; in some embodiments, the display panel 111 is at the same time a touch screen 109.
Referring to Fig. 6, in actual application scenarios, the mobile terminal 100 can be used as a smartphone, in which case the electronic body portion 10 also typically includes one or more (only one is shown in the figure) processors 102, a memory 104, an RF (radio frequency) module 106, an audio circuit 110, a sensor 114, an input module 118, and a power module 122. Those skilled in the art will understand that the structure shown in Fig. 5 is only illustrative and does not limit the structure of the electronic body portion 10; for example, the electronic body portion 10 may include more or fewer components than shown in Fig. 5, or a configuration different from that shown in Fig. 5.
Those skilled in the art will understand that, with respect to the processor 102, all the other components are peripherals, and the processor 102 is coupled to these peripherals through multiple peripheral interfaces 124. The peripheral interfaces 124 can be implemented based on the following standards: Universal Asynchronous Receiver/Transmitter (UART), General Purpose Input/Output (GPIO), Serial Peripheral Interface (SPI), and Inter-Integrated Circuit (I2C), but are not limited to these standards. In some examples, a peripheral interface 124 may include only a bus; in other examples, it may also include other elements, such as one or more controllers, for example a display controller for connecting the display panel 111 or a storage controller for connecting the memory. These controllers may also be separated from the peripheral interface 124 and integrated in the processor 102 or in the corresponding peripheral.
The memory 104 can be used to store software programs and modules; the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102; such remote memory can be connected to the electronic body portion 10 or the main display 120 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The RF module 106 is used to receive and transmit electromagnetic waves and to convert between electromagnetic waves and electric signals, so as to communicate with a communication network or other devices. The RF module 106 may include various existing circuit elements for performing these functions, such as an antenna, an RF transceiver, a digital signal processor, an encryption/decryption chip, a subscriber identity module (SIM) card, and memory. The RF module 106 can communicate with various networks such as the Internet, intranets, and wireless networks, or communicate with other devices through a wireless network. The wireless network may be a cellular telephone network, a wireless local area network, or a metropolitan area network, and may use various communication standards, protocols, and technologies, including but not limited to the Global System for Mobile Communication (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (W-CDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Wireless Fidelity (WiFi) (such as the IEEE standards 802.11a, 802.11b, 802.11g, and/or 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (Wi-Max), other protocols for mail, instant messaging, and short messages, any other suitable communication protocol, and even protocols not yet developed.
The audio circuit 110, a receiver 101, a sound jack 103, and a microphone 105 together provide an audio interface between the user and the electronic body portion 10 or the main display 120. Specifically, the audio circuit 110 receives sound data from the processor 102, converts it into an electric signal, and transmits the electric signal to the receiver 101, which converts the electric signal into sound waves audible to the human ear. The audio circuit 110 also receives electric signals from the microphone 105, converts them into sound data, and transmits the sound data to the processor 102 for further processing. Audio data can be obtained from the memory 104 or through the RF module 106; it can also be stored in the memory 104 or sent through the RF module 106.
The sensor 114 is arranged in the electronic body portion 10 or in the main display 120. Examples of the sensor 114 include, but are not limited to, light sensors, motion sensors, pressure sensors, gravity acceleration sensors, and other sensors.
Specifically, the light sensors may include an ambient light sensor 114F and a pressure sensor 114G. The pressure sensor 114G can detect pressure generated by pressing on the mobile terminal 100, that is, pressure generated by contact or pressing between the user and the mobile terminal, for example between the user's ear and the mobile terminal. The pressure sensor 114G can therefore be used to determine whether contact or pressing has occurred between the user and the mobile terminal 100 and how large the pressure is.
Referring to Fig. 5, in the embodiment shown in Fig. 5, the ambient light sensor 114F and the pressure sensor 114G are arranged adjacent to the display panel 111. Through the ambient light sensor 114F, the processor 102 can turn off the display output when an object approaches the main display 120, for example when the electronic body portion 10 is moved to the ear.
As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally along three axes) and can detect the magnitude and direction of gravity when static. It can be used for applications that recognize the posture of the mobile terminal 100 (such as landscape/portrait switching, related games, and magnetometer pose calibration), vibration-recognition functions (such as a pedometer or tap detection), and so on. The electronic body portion 10 may also be fitted with other sensors such as a gyroscope, a barometer, a hygrometer, and a thermometer, which are not described further here.
In this embodiment, the input module 118 may include the touch screen 109 arranged on the main display 120. The touch screen 109 collects the user's touch operations on or near it (for example, operations performed on or near the touch screen 109 by the user with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected devices according to a preset program. Optionally, the touch screen 109 may include a touch detection device and a touch controller: the touch detection device detects the position of the user's touch and the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, sends them to the processor 102, and can receive and execute commands sent by the processor 102. The touch detection function of the touch screen 109 can be implemented with various technologies, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch screen 109, in other variant embodiments the input module 118 may also include other input devices, such as buttons 107. The buttons 107 may include, for example, character keys for entering characters and control keys for triggering control functions; examples of control keys include a home key and an on/off key.
The main display 120 is used to display information entered by the user, information provided to the user, and the various graphical user interfaces of the electronic body portion 10, which can be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 109 can be arranged on the display panel 111 so as to form a single whole with it.
The power module 122 is used to supply power to the processor 102 and the other components. Specifically, the power module 122 may include a power management system, one or more power sources (such as a battery or alternating current), a charging circuit, a power failure detection circuit, an inverter, a power status indicator, and any other components related to the generation, management, and distribution of electric power within the electronic body portion 10 or the main display 120.
The mobile terminal 100 further includes a locator 119 for determining the physical position of the mobile terminal 100. In this embodiment, the locator 119 locates the mobile terminal 100 by means of a positioning service, which should be understood as a technology or service that obtains the position information of the mobile terminal 100 (for example, longitude and latitude coordinates) through a specific positioning technology and marks the position of the located object on an electronic map.
It should be understood that the mobile terminal 100 is not limited to a smartphone; it refers to a computer device that can be used on the move. Specifically, the mobile terminal 100 is a mobile computer device equipped with an intelligent operating system, including but not limited to a smartphone, a smartwatch, a tablet computer, and the like.
In the description of this specification, reference term " one embodiment ", " some embodiments ", " example ", " specifically show The description of example " or " some examples " etc. means specific features, structure, material or spy described in conjunction with this embodiment or example Point is contained at least one embodiment or example of the application.In the present specification, schematic expression of the above terms are not It must be directed to identical embodiment or example.Moreover, particular features, structures, materials, or characteristics described can be in office It can be combined in any suitable manner in one or more embodiments or example.In addition, without conflicting with each other, the skill of this field Art personnel can tie the feature of different embodiments or examples described in this specification and different embodiments or examples It closes and combines.
In addition, term " first ", " second " are used for description purposes only, it is not understood to indicate or imply relative importance Or implicitly indicate the quantity of indicated technical characteristic.Define " first " as a result, the feature of " second " can be expressed or Implicitly include at least one this feature.In the description of the present application, the meaning of " plurality " is at least two, such as two, three It is a etc., unless otherwise specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of executable instruction code comprising one or more steps for implementing a specific logical function or process. The scope of the preferred embodiments of the present application also includes other implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application pertain.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch and execute instructions from an instruction execution system, apparatus, or device). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, such an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include: an electrical connection portion (electronic device) with one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM). Furthermore, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner where necessary, and then stored in a computer memory.
It should be appreciated that each part of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented with any one of the following technologies known in the art, or a combination thereof: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will appreciate that all or some of the steps carried by the above method embodiments can be completed by instructing the relevant hardware through a program, and the program may be stored in a computer-readable storage medium; when executed, the program performs one of the steps of the method embodiments or a combination thereof. In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The above integrated module may be implemented either in the form of hardware or in the form of a software function module. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present application have been shown and described above, it should be understood that the above embodiments are exemplary and shall not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (13)

1. An image processing method, characterized in that the method comprises:
in an image to be recognized, identifying first-level scenes through a deep convolutional neural network to obtain a first classification result, each class of first-level scene comprising at least one class of second-level scene;
in the first classification result, identifying second-level scenes through a shallow convolutional neural network to obtain a second classification result, the number of layers of the deep convolutional neural network being greater than the number of layers of the shallow convolutional neural network;
based on the second classification result, outputting fine-grained scene classification information of the image to be recognized.
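For illustration only, the following is a minimal Python/PyTorch sketch of the two-stage cascade described in claim 1. The network architectures, layer counts, and the scene hierarchy are assumptions introduced here for readability, not the patent's actual networks.

# Minimal sketch of the cascade in claim 1 (assumed architectures): a deeper
# CNN picks the first-level scene, then a shallower CNN specialized for that
# scene picks the second-level scene.
import torch
import torch.nn as nn

def make_cnn(num_blocks: int, num_classes: int) -> nn.Module:
    """Build a toy CNN; num_blocks controls the layer count, so the
    first-level network is 'deep' and the second-level ones are 'shallow'."""
    layers, channels = [], 3
    for i in range(num_blocks):
        out = 16 * (i + 1)
        layers += [nn.Conv2d(channels, out, 3, padding=1), nn.ReLU(),
                   nn.MaxPool2d(2)]
        channels = out
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(),
               nn.Linear(channels, num_classes)]
    return nn.Sequential(*layers)

# Hypothetical scene hierarchy: each first-level scene owns its second-level scenes.
HIERARCHY = {"landscape": ["beach", "mountain", "forest"],
             "portrait":  ["single person", "group"]}

deep_cnn = make_cnn(num_blocks=6, num_classes=len(HIERARCHY))   # deeper net
shallow_cnns = {k: make_cnn(num_blocks=2, num_classes=len(v))   # shallower nets
                for k, v in HIERARCHY.items()}

def classify(image: torch.Tensor) -> tuple[str, str]:
    """Return (first-level scene, second-level scene) for one batched image."""
    with torch.no_grad():
        level1_names = list(HIERARCHY)
        first = level1_names[deep_cnn(image).argmax(1).item()]   # stage 1
        second_idx = shallow_cnns[first](image).argmax(1).item() # stage 2
        return first, HIERARCHY[first][second_idx]

print(classify(torch.randn(1, 3, 224, 224)))  # e.g. ('portrait', 'group')

Only the first-level network pays the cost of depth; each second-level network stays small, which is the compute/accuracy trade-off the abstract attributes to the cascade.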
2. The method according to claim 1, characterized in that, in an image to be recognized, identifying first-level scenes through a deep convolutional neural network to obtain a first classification result comprises:
obtaining the image to be recognized in an image acquisition mode;
classifying the first-level scenes in the image to be recognized through the deep convolutional neural network;
obtaining the first classification result, the first classification result comprising at least one class of first-level scene in the image to be recognized.
3. The method according to claim 2, characterized in that, in the first classification result, identifying second-level scenes through a shallow convolutional neural network to obtain a second classification result comprises:
classifying, through the shallow convolutional neural network, the second-level scenes in each class of first-level scene in the first classification result;
obtaining the second classification result, the second classification result comprising at least one class of second-level scene in the image to be recognized.
4. The method according to claim 3, characterized in that, based on the second classification result, outputting fine-grained scene classification information of the image to be recognized comprises:
setting classification markers for the second-level scenes in the second classification result;
outputting fine-grained scene classification information of the image to be recognized that includes the classification markers.
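As a rough illustration of claims 3 and 4, the fragment below attaches a classification marker to every second-level scene found in the first classification result; it builds on the earlier sketch, and the data shapes (first-level results as cropped, batched tensors) are assumptions.

# Illustrative only: attach a classification marker to each second-level
# scene region from the first classification result (assumed data shapes;
# shallow_cnns and hierarchy come from the earlier sketch).
def tag_second_level(first_result: dict, shallow_cnns: dict, hierarchy: dict):
    """first_result maps a first-level scene name to batched image tensors."""
    tagged = []
    for scene, crops in first_result.items():
        for crop in crops:
            idx = shallow_cnns[scene](crop).argmax(1).item()
            tagged.append({"region": crop,
                           "marker": f"{scene}/{hierarchy[scene][idx]}"})
    return tagged  # the fine-grained scene classification information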
5. The method according to claim 1, characterized in that, after identifying first-level scenes in an image to be recognized through a deep convolutional neural network and obtaining a first classification result, the method further comprises:
based on the first classification result, outputting coarse scene classification information of the image to be recognized.
6. The method according to claim 1, characterized in that, after identifying first-level scenes in an image to be recognized through a deep convolutional neural network and obtaining a first classification result, the method further comprises:
judging whether a simple classification instruction has been received;
if a simple classification instruction is received, outputting, based on the first classification result, coarse scene classification information of the image to be recognized;
if no simple classification instruction is received, identifying second-level scenes in the first classification result through the shallow convolutional neural network to obtain the second classification result.
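Claims 5 and 6 describe an early exit: when only coarse information is requested, the shallow stage is skipped entirely. A minimal sketch of that control flow follows, reusing names from the earlier sketch and modeling the simple classification instruction as a plain boolean, which is an assumption.

# Sketch of the early-exit branch in claim 6 (instruction modeled as a
# boolean; deep_cnn, shallow_cnns, HIERARCHY from the earlier sketch).
def process(image, simple_requested: bool) -> dict:
    level1_names = list(HIERARCHY)
    first = level1_names[deep_cnn(image).argmax(1).item()]
    if simple_requested:                  # simple classification instruction
        return {"coarse": first}          # skip the shallow stage entirely
    second_idx = shallow_cnns[first](image).argmax(1).item()
    return {"coarse": first, "fine": HIERARCHY[first][second_idx]}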
7. The method according to claim 1, characterized in that the method further comprises:
judging whether an unclassified first-level scene exists in the image to be recognized;
if an unclassified first-level scene exists in the image to be recognized, inputting a data set containing at least one class of the unclassified first-level scenes, and training the deep convolutional neural network.
8. The method according to claim 1, characterized in that the method further comprises:
judging whether an unclassified second-level scene exists in the image to be recognized;
if an unclassified second-level scene exists in the image to be recognized, inputting a data set containing at least one class of the unclassified second-level scenes, and training the shallow convolutional neural network.
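Claims 7 and 8 extend either network when a scene falls outside its known classes. Below is a hedged PyTorch sketch of such retraining; the dataset wrapper, optimizer, loss, and epoch count are all illustrative choices, not specified by the patent.

# Illustrative retraining loop for claims 7/8: fine-tune a network on a
# data set that includes the previously unclassified scene class
# (optimizer, loss, and epoch count are assumptions).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def retrain(model: nn.Module, images: torch.Tensor, labels: torch.Tensor,
            epochs: int = 5) -> None:
    loader = DataLoader(TensorDataset(images, labels), batch_size=32,
                        shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()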
9. The method according to claim 1, characterized in that, after outputting fine-grained scene classification information of the image to be recognized based on the second classification result, the method further comprises:
based on the fine-grained scene classification information, performing classification processing on each second-level scene image in the image to be recognized.
10. The method according to claim 5, characterized in that, after outputting coarse scene classification information of the image to be recognized based on the first classification result, the method further comprises:
based on the coarse scene classification information, performing classification processing on each first-level scene image in the image to be recognized.
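One plausible reading of the "classification processing" in claims 9 and 10 is grouping images by their scene labels, for example into albums. The sketch below is purely illustrative of that reading and introduces its own hypothetical data shapes.

# Purely illustrative: group images by scene label, as one possible form of
# the "classification processing" in claims 9 and 10 (data shapes assumed).
from collections import defaultdict
from pathlib import Path

def group_by_label(results: list) -> dict:
    """results: (image path, scene label) pairs from the classifier."""
    albums = defaultdict(list)
    for path, label in results:
        albums[label].append(path)
    return dict(albums)

print(group_by_label([(Path("a.jpg"), "beach"), (Path("b.jpg"), "beach"),
                      (Path("c.jpg"), "group")]))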
11. An image processing apparatus, characterized in that the apparatus comprises:
a first-level classification module, configured to identify first-level scenes in an image to be recognized through a deep convolutional neural network to obtain a first classification result, each class of first-level scene comprising at least one class of second-level scene;
a second-level classification module, configured to identify second-level scenes in the first classification result through a shallow convolutional neural network to obtain a second classification result, the number of layers of the deep convolutional neural network being greater than the number of layers of the shallow convolutional neural network;
an output module, configured to output fine-grained scene classification information of the image to be recognized based on the second classification result.
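A compact way to picture the three modules of claim 11 is the class below, which reuses the networks from the earlier sketch; the decomposition into methods and the constructor signature are assumptions made for illustration.

# Assumed module decomposition mirroring claim 11; reuses deep_cnn,
# shallow_cnns, and HIERARCHY from the earlier sketch.
class ImageProcessor:
    def __init__(self, deep, shallow, hierarchy):
        self.deep, self.shallow, self.hierarchy = deep, shallow, hierarchy

    def first_level(self, image):                    # first-level module
        return list(self.hierarchy)[self.deep(image).argmax(1).item()]

    def second_level(self, image, first):            # second-level module
        idx = self.shallow[first](image).argmax(1).item()
        return self.hierarchy[first][idx]

    def output(self, image):                         # output module
        first = self.first_level(image)
        return {"scene": first, "subscene": self.second_level(image, first)}

# Usage: proc = ImageProcessor(deep_cnn, shallow_cnns, HIERARCHY)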
12. A mobile terminal, characterized by comprising a display, a memory, and a processor, the display and the memory being coupled to the processor, the memory storing instructions that, when executed by the processor, cause the processor to perform the method of any one of claims 1-10.
13. A computer-readable storage medium having program code executable by a processor, characterized in that the program code causes the processor to perform the method of any one of claims 1-10.
CN201810399087.2A 2018-04-28 2018-04-28 Image processing method and device and mobile terminal Active CN108764051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810399087.2A CN108764051B (en) 2018-04-28 2018-04-28 Image processing method and device and mobile terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810399087.2A CN108764051B (en) 2018-04-28 2018-04-28 Image processing method and device and mobile terminal

Publications (2)

Publication Number Publication Date
CN108764051A true CN108764051A (en) 2018-11-06
CN108764051B CN108764051B (en) 2021-07-13

Family

ID=64012246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810399087.2A Active CN108764051B (en) 2018-04-28 2018-04-28 Image processing method and device and mobile terminal

Country Status (1)

Country Link
CN (1) CN108764051B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363146A (en) * 2019-07-16 2019-10-22 杭州睿琪软件有限公司 A kind of object identification method, device, electronic equipment and storage medium
CN110399840A (en) * 2019-05-22 2019-11-01 西南科技大学 A kind of quick lawn semantic segmentation and boundary detection method
CN110569913A (en) * 2019-09-11 2019-12-13 北京云迹科技有限公司 Scene classifier training method and device, scene recognition method and robot
CN113109666A (en) * 2021-04-09 2021-07-13 河南省博海大数据科技有限公司 Track circuit fault diagnosis method based on deep convolutional neural network
CN117173549A (en) * 2023-08-22 2023-12-05 中国科学院声学研究所 Multi-scale target detection method and system for synthetic aperture sonar image under complex scene

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824054A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded depth neural network-based face attribute recognition method
CN105335710A (en) * 2015-10-22 2016-02-17 合肥工业大学 Fine vehicle model identification method based on multi-stage classifier
CN105975915A (en) * 2016-04-28 2016-09-28 大连理工大学 Front vehicle parameter identification method based on multitask convolution nerve network
CN107077625A (en) * 2014-10-27 2017-08-18 电子湾有限公司 The deep convolutional neural networks of layering
US20170330029A1 (en) * 2010-06-07 2017-11-16 Affectiva, Inc. Computer based convolutional processing for image analysis
US20170356976A1 (en) * 2016-06-10 2017-12-14 Board Of Trustees Of Michigan State University System and method for quantifying cell numbers in magnetic resonance imaging (mri)
CN107688784A (en) * 2017-08-23 2018-02-13 福建六壬网安股份有限公司 A kind of character identifying method and storage medium based on further feature and shallow-layer Fusion Features
CN107798653A (en) * 2017-09-20 2018-03-13 北京三快在线科技有限公司 A kind of method of image procossing and a kind of device
WO2018067978A1 (en) * 2016-10-08 2018-04-12 Purdue Research Foundation Method and apparatus for generating two-dimensional image data describing a three-dimensional image

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170330029A1 (en) * 2010-06-07 2017-11-16 Affectiva, Inc. Computer based convolutional processing for image analysis
CN103824054A (en) * 2014-02-17 2014-05-28 北京旷视科技有限公司 Cascaded depth neural network-based face attribute recognition method
CN107077625A (en) * 2014-10-27 2017-08-18 电子湾有限公司 The deep convolutional neural networks of layering
CN105335710A (en) * 2015-10-22 2016-02-17 合肥工业大学 Fine vehicle model identification method based on multi-stage classifier
CN105975915A (en) * 2016-04-28 2016-09-28 大连理工大学 Front vehicle parameter identification method based on multitask convolution nerve network
US20170356976A1 (en) * 2016-06-10 2017-12-14 Board Of Trustees Of Michigan State University System and method for quantifying cell numbers in magnetic resonance imaging (mri)
WO2018067978A1 (en) * 2016-10-08 2018-04-12 Purdue Research Foundation Method and apparatus for generating two-dimensional image data describing a three-dimensional image
CN107688784A (en) * 2017-08-23 2018-02-13 福建六壬网安股份有限公司 A kind of character identifying method and storage medium based on further feature and shallow-layer Fusion Features
CN107798653A (en) * 2017-09-20 2018-03-13 北京三快在线科技有限公司 A kind of method of image procossing and a kind of device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI YUE: "Facial feature point localization algorithm based on cascaded deep convolutional neural networks", Computer Technology *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110399840A (en) * 2019-05-22 2019-11-01 西南科技大学 A kind of quick lawn semantic segmentation and boundary detection method
CN110399840B (en) * 2019-05-22 2024-04-02 西南科技大学 Rapid lawn semantic segmentation and boundary detection method
CN110363146A (en) * 2019-07-16 2019-10-22 杭州睿琪软件有限公司 A kind of object identification method, device, electronic equipment and storage medium
CN110569913A (en) * 2019-09-11 2019-12-13 北京云迹科技有限公司 Scene classifier training method and device, scene recognition method and robot
CN113109666A (en) * 2021-04-09 2021-07-13 河南省博海大数据科技有限公司 Track circuit fault diagnosis method based on deep convolutional neural network
CN113109666B (en) * 2021-04-09 2024-03-15 河南省博海大数据科技有限公司 Rail circuit fault diagnosis method based on deep convolutional neural network
CN117173549A (en) * 2023-08-22 2023-12-05 中国科学院声学研究所 Multi-scale target detection method and system for synthetic aperture sonar image under complex scene
CN117173549B (en) * 2023-08-22 2024-03-22 中国科学院声学研究所 Multi-scale target detection method and system for synthetic aperture sonar image under complex scene

Also Published As

Publication number Publication date
CN108764051B (en) 2021-07-13

Similar Documents

Publication Publication Date Title
JP7265003B2 (en) Target detection method, model training method, device, apparatus and computer program
CN108764051A (en) Image processing method, device and mobile terminal
CN110348543B (en) Fundus image recognition method and device, computer equipment and storage medium
US20220004794A1 (en) Character recognition method and apparatus, computer device, and storage medium
CN104584513B (en) Select the apparatus and method for sharing the device of operation for content
CN109189950B (en) Multimedia resource classification method and device, computer equipment and storage medium
CN109002759A (en) text recognition method, device, mobile terminal and storage medium
CN109002787B (en) Image processing method and device, storage medium and electronic equipment
CN107273510A (en) Photo recommends method and Related product
KR20160103398A (en) Method and apparatus for measuring the quality of the image
CN109190648B (en) Simulation environment generation method and device, mobile terminal and computer readable storage medium
CN108664190A (en) page display method, device, mobile terminal and storage medium
CN109086680A (en) Image processing method, device, storage medium and electronic equipment
CN108762859A (en) Wallpaper displaying method, device, mobile terminal and storage medium
CN113723378B (en) Model training method and device, computer equipment and storage medium
CN109218982A (en) Sight spot information acquisition methods, device, mobile terminal and storage medium
CN107766548A (en) Method for information display, device, mobile terminal and readable storage medium storing program for executing
CN107368791A (en) Living iris detection method and Related product
CN108958634A (en) Express delivery information acquisition method, device, mobile terminal and storage medium
CN114722937A (en) Abnormal data detection method and device, electronic equipment and storage medium
CN111325220B (en) Image generation method, device, equipment and storage medium
EP4287068A1 (en) Model training method, scene recognition method, and related device
CN111984803B (en) Multimedia resource processing method and device, computer equipment and storage medium
CN108763193A (en) Literal processing method, device, mobile terminal and storage medium
CN110728167A (en) Text detection method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant