CN107944022A - Picture classification method, mobile terminal and computer-readable recording medium - Google Patents
- Publication number
- CN107944022A CN107944022A CN201711316750.XA CN201711316750A CN107944022A CN 107944022 A CN107944022 A CN 107944022A CN 201711316750 A CN201711316750 A CN 201711316750A CN 107944022 A CN107944022 A CN 107944022A
- Authority
- CN
- China
- Prior art keywords
- picture
- main object
- character area
- sorted
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Library & Information Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Biology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Databases & Information Systems (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses a picture classification method, a mobile terminal and a computer-readable storage medium. The method includes: when a triggered picture classification instruction is detected, frame-selecting a preset number of character regions and main object regions in the picture to be classified; performing text information recognition on each framed character region and main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified; and extracting keywords from the recognized text information and classifying the picture to be classified according to the keywords and the main object. The present invention determines the text information of the picture to be classified from multiple character regions and its main object from multiple main object regions, which effectively improves the accuracy of the text information and of the main object; classifying the picture by combining the main object with the keywords in the text information effectively improves the accuracy of picture classification.
Description
Technical field
The present invention relates to the technical field of mobile terminals, and more particularly to a picture classification method, a mobile terminal and a computer-readable storage medium.
Background art

With the rapid development of mobile terminals, their hardware configurations grow ever higher, the functions they can realize ever more numerous, and the integrated APPs (applications) ever more plentiful. In the course of using a mobile terminal, a user stores picture resources such as captured photos, downloaded pictures and screenshots, and as usage time grows these picture resources gradually accumulate, making it inconvenient for the user to manage and look up picture resources.

At present, to ease management and lookup, the main object in a picture resource is identified, such as a person subject, a landscape subject or a food subject, and picture resources are classified according to the category of the main object. However, a picture resource often contains multiple main objects; for example, the same picture may contain both a person subject and a landscape subject, so accurate classification of picture resources cannot be achieved on the basis of the main object alone. In addition, user screenshots and downloaded pictures usually contain text; classifying picture resources only by the main object, without considering the text in the picture, easily makes the classification of picture resources inaccurate.

Therefore, how to improve the accuracy of picture resource classification is a problem urgently to be solved at present.
The above content is only intended to assist understanding of the technical solution of the present invention, and does not represent an admission that the above constitutes prior art.
Summary of the invention

The main object of the present invention is to provide a picture classification method, a mobile terminal and a computer-readable storage medium, aiming to solve the technical problem of how to improve the accuracy of picture resource classification.

To achieve the above object, the present invention provides a picture classification method including the following steps:

when a triggered picture classification instruction is detected, frame-selecting a preset number of character regions and main object regions in the picture to be classified;

performing text information recognition on each framed character region and main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified;

extracting keywords from the recognized text information, and classifying the picture to be classified according to the keywords and the main object.
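The three steps above can be sketched as a minimal pipeline. This is an illustrative sketch only, not part of the claimed disclosure: all of the callables (`select_regions`, `recognize_text`, `recognize_subject`, `extract_keywords`, `categorize`) are hypothetical stand-ins supplied by the caller, not networks described in the patent.

```python
def classify_picture(picture, select_regions, recognize_text,
                     recognize_subject, extract_keywords, categorize):
    """Sketch of the claimed flow; every callable is a hypothetical stand-in."""
    # Step 1: frame-select a preset number of character regions and main object regions.
    text_regions, subject_regions = select_regions(picture)
    # Step 2: recognize text in each character region; recognize the main object.
    text = " ".join(recognize_text(r) for r in text_regions)
    subject = recognize_subject(subject_regions)
    # Step 3: extract keywords and classify by keywords plus main object.
    keywords = extract_keywords(text)
    return categorize(keywords, subject)
```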
Alternatively, the step of frame-selecting a preset number of character regions in the picture to be classified includes:

adjusting the resolution of the picture to be classified to a preset resolution, and obtaining the pixel matrix of the adjusted picture to be classified;

arbitrarily selecting two different pixels from the pixel matrix, and frame-selecting a character region according to the two selected different pixels, until a preset number of non-duplicate character regions are framed.
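A minimal sketch of this frame-selection step, under the assumption that the two different pixels are taken as opposite corners of a rectangle (the patent does not fix the exact construction rule):

```python
import random

def frame_select_regions(width, height, preset_count, seed=None):
    """Frame-select candidate regions by repeatedly picking two distinct pixels
    that define a rectangle, keeping only non-duplicate rectangles.
    Illustrative sketch; the disclosure leaves the concrete algorithm open."""
    rng = random.Random(seed)
    pixels = [(x, y) for x in range(width) for y in range(height)]
    regions = set()
    while len(regions) < preset_count:
        (x1, y1), (x2, y2) = rng.sample(pixels, 2)
        # Normalize the two pixels into a top-left / bottom-right rectangle.
        rect = (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))
        if rect[0] < rect[2] and rect[1] < rect[3]:  # skip degenerate rectangles
            regions.add(rect)
    return sorted(regions)
```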
Alternatively, the step of performing text information recognition on each framed character region and main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified, includes:

cutting each framed character region to obtain the character picture of each character region, and cutting each framed main object region to obtain the main object picture of each main object region;

inputting the character picture of each character region into a preset deep neural network to obtain the text information of the picture to be classified;

inputting the main object picture of each main object region into a preset deep neural network to obtain the main object of the picture to be classified.
Alternatively, the step of inputting the character picture of each character region into a preset deep neural network to obtain the text information of the picture to be classified includes:

inputting the character picture of each character region into the preset deep neural network to obtain the character content of each character region;

performing error correction on the character content of each character region, and generating the text information of the picture to be classified based on the error-corrected character content of each character region.
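A toy illustration of the error-correction step, using a vocabulary lookup within edit distance one as a stand-in for whatever corrector the disclosure leaves unspecified:

```python
def correct_word(word, vocabulary):
    """Keep the word if it is known, otherwise replace it with a same-length
    vocabulary word that differs in exactly one character (toy corrector)."""
    if word in vocabulary:
        return word
    for candidate in vocabulary:
        if len(candidate) == len(word):
            mismatches = sum(a != b for a, b in zip(candidate, word))
            if mismatches == 1:
                return candidate
    return word

def correct_text(region_texts, vocabulary):
    """Error-correct each region's character content, then join the results
    into the picture-level text information."""
    return " ".join(correct_word(w, vocabulary)
                    for text in region_texts for w in text.split())
```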
Alternatively, the step of inputting the character picture of each character region into a preset deep neural network to obtain the text information of the picture to be classified further includes:

inputting the character picture of each character region into the preset deep neural network to obtain the classification feature of each character region;

determining the position of each character region, and merging character regions whose classification features are identical and whose positions are adjacent, to obtain several merged character regions;

inputting the several merged character regions into the preset deep neural network to obtain the character content of each merged character region;

generating the text information of the picture to be classified based on the character content of each merged character region.
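The merging rule might be sketched as follows, where a region is a `(feature, rectangle)` pair and "adjacent" is taken to mean rectangles that touch or overlap — an assumption, since the patent does not define adjacency precisely:

```python
def merge_adjacent_regions(regions):
    """Merge regions whose classification features match and whose rectangles
    touch or overlap. `regions` is a list of (feature, (x1, y1, x2, y2))."""
    def adjacent(a, b):
        ax1, ay1, ax2, ay2 = a
        bx1, by1, bx2, by2 = b
        # Rectangles are adjacent unless separated by at least one pixel.
        return not (ax2 < bx1 - 1 or bx2 < ax1 - 1 or
                    ay2 < by1 - 1 or by2 < ay1 - 1)

    merged = []
    for feature, rect in regions:
        for item in merged:
            if item[0] == feature and adjacent(item[1], rect):
                x1, y1, x2, y2 = item[1]
                item[1] = (min(x1, rect[0]), min(y1, rect[1]),
                           max(x2, rect[2]), max(y2, rect[3]))
                break
        else:
            merged.append([feature, rect])
    return [(f, r) for f, r in merged]
```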
Alternatively, the step of inputting the main object picture of each main object region into a preset deep neural network to obtain the main object of the picture to be classified includes:

inputting the main object picture of each main object region into the preset deep neural network to obtain the main object in each main object picture and the main object classification probability of that main object;

determining the main object with the largest main object classification probability as the main object of the picture to be classified.
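Picking the main object by largest classification probability amounts to an argmax over the per-region predictions, for example:

```python
def pick_main_object(predictions):
    """Choose the main object with the highest classification probability from
    per-region (main_object, probability) predictions."""
    main_object, _ = max(predictions, key=lambda p: p[1])
    return main_object
```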
Alternatively, after the step of frame-selecting a preset number of character regions and main object regions in the picture to be classified, the picture classification method further includes:

performing an adjustment operation on a framed character region to obtain a given number of adjusted character regions;

inputting the image blocks in each adjusted character region into a preset deep neural network to obtain the character classification probability of each adjusted character region;

replacing the framed character region with the adjusted character region with the largest character classification probability.
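The adjustment-and-replacement loop can be sketched as generating candidate variants and keeping the highest-scoring one; `adjust` and `score` are hypothetical stand-ins for the disclosure's adjustment operation and deep-network scorer:

```python
def refine_region(region, adjust, score, preset_count):
    """Generate `preset_count` adjusted variants of a framed character region
    and replace it with the variant whose character classification probability
    (as reported by `score`) is largest."""
    candidates = [adjust(region, i) for i in range(preset_count)]
    return max(candidates, key=score)
```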
Alternatively, after the step of extracting keywords from the recognized text information and classifying the picture to be classified according to the keywords and the main object, the picture classification method further includes:

when a picture resource display instruction is detected, obtaining the picture resource, and displaying the picture resource according to the category of each picture in the picture resource.
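Displaying by category amounts to grouping the picture resource by each picture's class before rendering; a minimal sketch, where the mapping from picture name to category is assumed already computed:

```python
from collections import defaultdict

def group_by_category(picture_resources):
    """Group the picture resource by category so the gallery can display it
    class by class. `picture_resources` maps picture name -> category."""
    groups = defaultdict(list)
    for name, category in sorted(picture_resources.items()):
        groups[category].append(name)
    return dict(groups)
```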
In addition, to achieve the above object, the present invention also provides a mobile terminal including: a memory, a processor, and a picture classification program stored on the memory and runnable on the processor, the picture classification program implementing the steps of the picture classification method described above when executed by the processor.

The present invention also provides a computer-readable storage medium on which a picture classification program is stored, the picture classification program implementing the steps of the picture classification method described above when executed by a processor.
The present invention provides a picture classification method, a mobile terminal and a computer-readable storage medium. When a triggered picture classification instruction is detected, a preset number of character regions and main object regions are frame-selected in the picture to be classified; then text information recognition is performed on each framed character region and main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified; finally, keywords are extracted from the recognized text information and the picture to be classified is classified according to the keywords and the main object. By frame-selecting multiple character regions and multiple main object regions, determining the text information of the picture to be classified from the multiple character regions and its main object from the multiple main object regions, the accuracy of the text information and of the main object is effectively improved; classifying the picture by combining the main object with the keywords in the text information effectively improves the accuracy of picture classification.
Brief description of the drawings

Fig. 1 is a schematic diagram of the hardware structure of a mobile terminal implementing embodiments of the present invention;

Fig. 2 is an architecture diagram of a communications network system provided by an embodiment of the present invention;

Fig. 3 is a flow diagram of the first embodiment of the picture classification method of the present invention;

Fig. 4 is a refined flow diagram of step S101 in the first embodiment of the picture classification method of the present invention;

Fig. 5 is a refined flow diagram of step S102 in the first embodiment of the picture classification method of the present invention;

Fig. 6 is a flow diagram of the fourth embodiment of the picture classification method of the present invention.
The realization of the objects, functional characteristics and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments

It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.

In the following description, suffixes such as "module", "component" or "unit" used to denote elements are only intended to facilitate the explanation of the present invention and have no specific meaning in themselves; therefore "module", "component" and "unit" may be used interchangeably.
Terminals may be implemented in various forms. For example, the terminals described in the present invention may include mobile terminals such as mobile phones, tablet computers, laptops, palmtop computers, personal digital assistants (PDA), portable media players (PMP), navigation devices, wearable devices, smart bracelets and pedometers, as well as fixed terminals such as digital TVs and desktop computers.

The following description takes a mobile terminal as an example; those skilled in the art will appreciate that, apart from elements specifically intended for mobile purposes, constructions according to embodiments of the present invention can also be applied to fixed-type terminals.
Referring to Fig. 1, which is a schematic diagram of the hardware structure of a mobile terminal implementing embodiments of the present invention, the mobile terminal 100 may include components such as an RF (radio frequency) unit 101, a WiFi module 102, an audio output unit 103, an A/V (audio/video) input unit 104, a sensor 105, a display unit 106, a user input unit 107, an interface unit 108, a memory 109, a processor 110 and a power supply 111. Those skilled in the art will understand that the mobile terminal structure shown in Fig. 1 does not constitute a limitation on the mobile terminal, which may include more or fewer components than illustrated, combine some components, or arrange the components differently.
The components of the mobile terminal are described in detail below with reference to Fig. 1:
The radio frequency unit 101 may be used for receiving and sending signals during messaging or a call; specifically, it receives downlink information from a base station and passes it to the processor 110 for processing, and sends uplink data to the base station. In general, the radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low-noise amplifier and a duplexer. In addition, the radio frequency unit 101 may also communicate with a network and other devices by wireless communication, which may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000 (Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division Duplexing-Long Term Evolution) and TDD-LTE (Time Division Duplexing-Long Term Evolution).
WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the mobile terminal can help the user send and receive e-mail, browse web pages, access streaming media and so on, providing the user with wireless broadband internet access. Although Fig. 1 shows the WiFi module 102, it is understood that it is not an essential component of the mobile terminal and may be omitted as needed without changing the essence of the invention.
The audio output unit 103 may, when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a speech recognition mode, a broadcast reception mode or the like, convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output it as sound. Moreover, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (for example, a call signal reception sound or a message reception sound). The audio output unit 103 may include a loudspeaker, a buzzer and the like.
The A/V input unit 104 is used to receive audio or video signals. It may include a graphics processing unit (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capture apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image frames may be displayed on the display unit 106, stored in the memory 109 (or another storage medium), or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 can receive sound (audio data) in operating modes such as a telephone call mode, a recording mode or a speech recognition mode, and can process such sound into audio data. In the telephone call mode, the processed audio (voice) data can be converted into a format that can be sent to a mobile communication base station via the radio frequency unit 101. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference produced in the course of receiving and sending audio signals.
The mobile terminal 100 further includes at least one sensor 105, such as an optical sensor, a motion sensor and other sensors. Specifically, the optical sensor includes an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 1061 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1061 and/or the backlight when the mobile terminal 100 is moved to the ear. As one kind of motion sensor, an accelerometer can detect the magnitude of acceleration in all directions (generally three axes), and can detect the magnitude and direction of gravity when static; it can be used for applications that identify the phone's posture (such as portrait/landscape switching, related games and magnetometer pose calibration) and for vibration-identification functions (such as a pedometer or tap detection). Other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor may also be configured on the phone, and will not be described in detail here.
The display unit 106 is used to display information input by the user or information supplied to the user. The display unit 106 may include a display panel 1061, which may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display or the like.
The user input unit 107 may be used to receive input numeric or character information, and to generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also called a touch screen, collects the user's touch operations on or near it (for example, operations by the user with a finger, a stylus or any other suitable object or accessory on or near the touch panel 1071) and drives the corresponding connection apparatus according to a preset formula. The touch panel 1071 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the user's touch orientation and the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types such as resistive, capacitive, infrared and surface acoustic wave. Besides the touch panel 1071, the user input unit 107 may also include other input devices 1072, which may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse and a joystick, without specific limitation here.

Further, the touch panel 1071 may cover the display panel 1061. When the touch panel 1071 detects a touch operation on or near it, it transmits the operation to the processor 110 to determine the type of the touch event, and the processor 110 then provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although in Fig. 1 the touch panel 1071 and the display panel 1061 are two independent components realizing the input and output functions of the mobile terminal, in some embodiments the touch panel 1071 and the display panel 1061 may be integrated to realize the input and output functions of the mobile terminal, without specific limitation here.
The interface unit 108 serves as an interface through which at least one external device can connect with the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device with an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port and the like. The interface unit 108 may be used to receive input from an external device (for example, data information or electric power) and transfer the received input to one or more elements within the mobile terminal 100, or to transmit data between the mobile terminal 100 and an external device.
The memory 109 may be used to store software programs and various data. The memory 109 may mainly include a program storage area and a data storage area, wherein the program storage area can store an operating system and application programs required for at least one function (such as a sound playing function and an image playing function), and the data storage area can store data created according to the use of the phone (such as audio data and a phone book). In addition, the memory 109 may include a high-speed random access memory, and may also include a non-volatile memory, for example at least one magnetic disk memory, a flash memory device, or another volatile solid-state storage component.
The processor 110 is the control center of the mobile terminal; it connects the various parts of the whole mobile terminal through various interfaces and lines, and performs the various functions of the mobile terminal and processes data by running or executing the software programs and/or modules stored in the memory 109 and calling the data stored in the memory 109, thereby monitoring the mobile terminal as a whole. The processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor, which mainly handles the operating system, user interfaces and application programs, and a modem processor, which mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 110.
The mobile terminal 100 may also include a power supply 111 (such as a battery) supplying power to the various components. Preferably, the power supply 111 may be logically connected to the processor 110 through a power management system, so as to realize functions such as charge management, discharge management and power consumption management through the power management system.
Although not shown in Fig. 1, the mobile terminal 100 may also include a Bluetooth module and the like, which will not be described in detail here.
In the mobile terminal shown in Fig. 1, the processor 110 may be used to call the picture classification program stored in the memory 109 and perform the following steps:

when a triggered picture classification instruction is detected, frame-selecting a preset number of character regions and main object regions in the picture to be classified;

performing text information recognition on each framed character region and main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified;

extracting keywords from the recognized text information, and classifying the picture to be classified according to the keywords and the main object.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

adjusting the resolution of the picture to be classified to a preset resolution, and obtaining the pixel matrix of the adjusted picture to be classified;

arbitrarily selecting two different pixels from the pixel matrix, and frame-selecting a character region according to the two selected different pixels, until a preset number of non-duplicate character regions are framed.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

cutting each framed character region to obtain the character picture of each character region, and cutting each framed main object region to obtain the main object picture of each main object region;

inputting the character picture of each character region into a preset deep neural network to obtain the text information of the picture to be classified;

inputting the main object picture of each main object region into a preset deep neural network to obtain the main object of the picture to be classified.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

inputting the character picture of each character region into the preset deep neural network to obtain the character content of each character region;

performing error correction on the character content of each character region, and generating the text information of the picture to be classified based on the error-corrected character content of each character region.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

inputting the character picture of each character region into the preset deep neural network to obtain the classification feature of each character region;

determining the position of each character region, and merging character regions whose classification features are identical and whose positions are adjacent, to obtain several merged character regions;

inputting the several merged character regions into the preset deep neural network to obtain the character content of each merged character region;

generating the text information of the picture to be classified based on the character content of each merged character region.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

inputting the main object picture of each main object region into the preset deep neural network to obtain the main object in each main object picture and the main object classification probability of that main object;

determining the main object with the largest main object classification probability as the main object of the picture to be classified.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

performing an adjustment operation on a framed character region to obtain a given number of adjusted character regions;

inputting the image blocks in each adjusted character region into a preset deep neural network to obtain the character classification probability of each adjusted character region;

replacing the framed character region with the adjusted character region with the largest character classification probability.
Further, the processor 110 may be used to call the picture classification program stored in the memory 109 and also perform the following steps:

when a picture resource display instruction is detected, obtaining the picture resource, and displaying the picture resource according to the category of each picture in the picture resource.
To facilitate understanding of the embodiments of the present invention, the communications network system on which the mobile terminal of the present invention is based is described below.
Referring to Fig. 2, Fig. 2 is an architecture diagram of a communications network system provided by an embodiment of the present invention. The communications network system is an LTE system of the universal mobile communications technology, and the LTE system includes a UE (User Equipment) 201, an E-UTRAN (Evolved UMTS Terrestrial Radio Access Network) 202, an EPC (Evolved Packet Core) 203 and an operator IP service 204, which are communicatively connected in sequence.
Specifically, the UE 201 may be the above-described terminal 100, which is not described herein again.
The E-UTRAN 202 includes an eNodeB 2021 and other eNodeBs 2022, etc. The eNodeB 2021 may be connected with the other eNodeBs 2022 via a backhaul (e.g., an X2 interface), the eNodeB 2021 is connected to the EPC 203, and the eNodeB 2021 may provide the UE 201 with access to the EPC 203.
The EPC 203 may include an MME (Mobility Management Entity) 2031, an HSS (Home Subscriber Server) 2032, other MMEs 2033, an SGW (Serving Gateway) 2034, a PGW (PDN Gateway) 2035, a PCRF (Policy and Charging Rules Function) 2036, etc. The MME 2031 is a control node that processes signaling between the UE 201 and the EPC 203, and provides bearer and connection management. The HSS 2032 is used to provide some registers to manage functions such as a home location register (not shown), and stores some user-specific information about service features, data rates, etc. All user data may be transmitted through the SGW 2034, the PGW 2035 may provide IP address allocation and other functions for the UE 201, and the PCRF 2036 is a policy and charging control decision point for service data flows and IP bearer resources, which selects and provides available policy and charging control decisions for a policy and charging enforcement function unit (not shown).
The IP service 204 may include the Internet, an intranet, an IMS (IP Multimedia Subsystem) or other IP services, etc.
Although the above description takes the LTE system as an example, those skilled in the art should understand that the present invention is not only applicable to the LTE system, but is also applicable to other wireless communication systems, such as GSM, CDMA2000, WCDMA, TD-SCDMA and future new network systems, which is not limited herein.
Based on the above mobile terminal hardware structure and communications network system, various embodiments of the method of the present invention are proposed.
The present invention provides a picture classification method.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of a first embodiment of the picture classification method of the present invention.
In the present embodiment, the picture classification method includes:
Step S101: when a triggered picture classification instruction is detected, selecting a preset number of character areas and main object regions by frame in a picture to be sorted;
The picture classification method is applied to a mobile terminal, which includes a smartphone, a tablet computer, etc. The mobile terminal stores some pictures; the stored pictures include but are not limited to captured photos, downloaded network pictures and user screenshots, etc., and the stored picture formats include but are not limited to the JPEG (Joint Photographic Experts Group) format, the GIF (Graphics Interchange Format) format, the PNG (Portable Network Graphics) format and the TIFF (Tag Image File Format) format, etc.
When the triggered picture classification instruction is detected, the picture to be sorted is obtained, and a preset number of character areas and main object regions are selected by frame in the picture to be sorted. The triggering modes of the picture classification instruction include but are not limited to timed triggering, real-time triggering and screen-off triggering. Timed triggering means that the stored pictures are read at intervals of a preset time, and the picture classification instruction is triggered when an unclassified picture is read; real-time triggering means that the picture classification instruction is triggered when a currently captured photo, a currently downloaded picture or a current screenshot is detected; screen-off triggering means that the picture classification instruction is triggered when it is detected that the mobile terminal is in a screen-off standby state and the screen-off standby time exceeds a preset threshold. It should be noted that the above preset number, preset time and preset threshold may be configured by those skilled in the art based on actual conditions, which is not specifically limited in the present embodiment.
Specifically, referring to Fig. 4, step S101 includes:
Step S1011: adjusting the resolution of the picture to be sorted to a preset resolution, and obtaining the pixel matrix of the adjusted picture to be sorted;
After obtaining the picture to be sorted, the mobile terminal adjusts the resolution of the picture to be sorted to the preset resolution, and obtains the pixel matrix of the adjusted picture to be sorted. Before the resolution is adjusted, the current resolution of the picture to be sorted may be compared with the preset resolution: if the current resolution of the picture to be sorted is identical to the preset resolution, the resolution of the picture to be sorted does not need to be adjusted; if the current resolution of the picture to be sorted is different from the preset resolution, the resolution of the picture to be sorted is adjusted to the preset resolution. The higher the resolution of the picture to be sorted, the more pixels it has, and when the resolution is higher than a certain value, the processing speed is affected; the lower the resolution of the picture to be sorted, the fewer pixels it has, and when the resolution is lower than a certain value, the text information and main objects in the picture cannot be identified. Therefore, the resolution of the picture to be sorted needs to be adjusted, which improves the processing speed while ensuring that the text information and main objects in the picture can be normally identified. It should be noted that the above preset resolution may be configured by those skilled in the art based on actual conditions, which is not specifically limited in the present embodiment.
Step S1012: arbitrarily selecting two different pixels from the pixel matrix, and selecting a character area by frame according to the two selected different pixels, until a preset number of non-duplicated character areas have been selected by frame.
After obtaining the pixel matrix of the adjusted picture to be sorted, the mobile terminal arbitrarily selects two different pixels from the pixel matrix, and selects a character area by frame according to the two selected different pixels, until a preset number of non-duplicated character areas have been selected by frame. The process of selecting a character area by frame according to the two selected different pixels is as follows: first, one of the two pixels is taken as the top-left vertex of the character area, and then the other of the two pixels is taken as the bottom-right vertex of the character area, thereby determining the position and size of the character area selected by frame. In addition, when a character area is selected by frame, it is determined whether the character area selected by frame contains text information; if the character area selected by frame does not contain text information, a character area is selected by frame again. It should be noted that the above preset number, i.e., the number of character areas selected by frame, may be configured by those skilled in the art based on actual conditions, which is not specifically limited in the present embodiment.
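The two-pixel frame selection of step S1012 can be sketched as follows. This is a minimal illustration, not the terminal's implementation; `has_text` is a hypothetical caller-supplied predicate standing in for the text-information check described above.

```python
import random

def frame_select(width, height, preset_number, has_text, max_tries=1000):
    """Frame-select up to preset_number non-duplicated rectangles.

    One pixel becomes the top-left vertex and the other the bottom-right
    vertex, which fixes the region's position and size (step S1012).
    """
    regions = set()
    rng = random.Random(0)  # seeded for reproducibility of the sketch
    tries = 0
    while len(regions) < preset_number and tries < max_tries:
        tries += 1
        # arbitrarily select two different pixels from the pixel matrix
        p1 = (rng.randrange(width), rng.randrange(height))
        p2 = (rng.randrange(width), rng.randrange(height))
        if p1 == p2:
            continue
        left, right = min(p1[0], p2[0]), max(p1[0], p2[0])
        top, bottom = min(p1[1], p2[1]), max(p1[1], p2[1])
        box = (left, top, right, bottom)
        if box in regions or not has_text(box):
            continue  # re-select when duplicated or no text information present
        regions.add(box)
    return sorted(regions)

# toy usage: accept every candidate region
boxes = frame_select(640, 480, 5, has_text=lambda box: True)
```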
It should be noted that the specific manner of selecting a preset number of main object regions by frame in the picture to be sorted is similar to the specific manner of selecting character areas by frame, and is not described herein again. In addition, the frame selection operation of character areas and the frame selection operation of main objects may be performed in parallel at the same time; alternatively, the frame selection operation of character areas may be performed first and then the frame selection operation of main objects, or the frame selection operation of main objects may be performed first and then the frame selection operation of character areas, which is not specifically limited in the present embodiment.
Step S102: performing text information identification on each character area selected by frame and performing main object identification on each main object region, to obtain the text information and main object of the picture to be sorted;
After a preset number of character areas and main object regions are selected by frame, the mobile terminal performs text information identification on each character area selected by frame and performs main object identification on each main object region, to obtain the text information and main object of the picture to be sorted. The main object includes but is not limited to a person subject, a landscape subject, a food subject, etc.
Specifically, referring to Fig. 5, step S102 includes:
Step S1021: cutting each character area selected by frame, to obtain the word picture of each character area, and cutting each main object region selected by frame, to obtain the main object picture of each main object region;
After a preset number of character areas and main object regions are selected by frame, the mobile terminal cuts each character area selected by frame to obtain the word picture of each character area, and cuts each main object region selected by frame to obtain the main object picture of each main object region. Before the character areas and main object regions are cut, the character areas and main object regions selected by frame may be adjusted to improve the identification accuracy of the text information and the main object. Specifically, whether an incomplete word exists at the edge of a character area is detected; if an incomplete word is detected at the edge of a character area, the corresponding character area is adjusted, and the adjusted character area is cut to obtain the word picture of the corresponding character area; otherwise, the character area is not adjusted. Likewise, whether an incomplete main object exists at the edge of a main object region is detected; if an incomplete main object is detected at the edge of a main object region, the corresponding main object region is adjusted, and the adjusted main object region is cut to obtain the main object picture of the main object region; otherwise, the main object region is not adjusted. It should be noted that the adjustment of the character areas and main object regions includes but is not limited to translation adjustment, zoom adjustment, etc.
Step S1022: inputting the word picture of each character area to a predetermined deep neural network, to obtain the text information of the picture to be sorted;
After cutting the character areas and obtaining the word pictures of the character areas, the mobile terminal inputs the word picture of each character area to the predetermined deep neural network, which outputs some word contents in each word picture and the word classification confidence of each word content. The word classification confidences of the word contents are compared pairwise to determine the word content with the maximum word classification confidence, and that word content is determined as the word content of the corresponding word picture. Based on the above manner, the word content of each word picture is determined, and finally the text information of the picture to be sorted is generated based on the word content of each word picture.
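The pairwise comparison of word classification confidences amounts to taking the candidate with the maximum confidence per word picture. A minimal sketch, with made-up network outputs (the "sky and sea" example comes from the description below):

```python
def pick_word_content(candidates):
    """candidates: (word_content, confidence) pairs output by the network
    for one word picture; returns the word content whose word
    classification confidence is maximum (step S1022)."""
    return max(candidates, key=lambda pair: pair[1])[0]

# hypothetical network outputs for three word pictures
outputs = [
    [("sky", 0.91), ("sly", 0.06), ("sk7", 0.03)],
    [("and", 0.88), ("ant", 0.10)],
    [("sea", 0.95), ("see", 0.04)],
]
# text information generated from the per-picture word contents
text_information = " ".join(pick_word_content(c) for c in outputs)  # "sky and sea"
```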
Step S1023: inputting the main object picture of each main object region to the predetermined deep neural network, to obtain the main object of the picture to be sorted.
After cutting the main object regions and obtaining the main object pictures of the main object regions, the mobile terminal inputs the main object picture of each main object region to the predetermined deep neural network, to obtain the main object of the picture to be sorted.
Specifically, in the present embodiment, step S1023 includes:
Inputting the main object picture of each main object region to the predetermined deep neural network, to obtain the main object in each main object picture and the main object classification probability of the main object;
Determining the main object with the maximum main object classification probability as the main object of the picture to be sorted.
After cutting the main object regions and obtaining the main object pictures of the main object regions, the mobile terminal inputs the main object picture of each main object region to the predetermined deep neural network, and the predetermined deep neural network outputs some main objects in each main object picture and the main object classification probability of each of the some main objects. The main object classification probabilities of the some main objects are compared pairwise to determine the main object with the maximum main object classification probability, and that main object is determined as the main object of the corresponding main object region. Based on the above manner, the main object and main object classification probability of each main object region are determined; the main object classification probabilities of the main objects of the main object regions are then compared pairwise to determine the main object region to which the main object with the maximum main object classification probability belongs, and the main object with the maximum main object classification probability is determined as the main object of the picture to be sorted.
It should be noted that the above predetermined deep neural network is established based on a convolutional neural network: the deep neural network is trained using picture resources of relevant classifications on the mobile terminal, such as character pictures, scenery pictures, person pictures and food pictures, and the training result is checked; when the training result is higher than a preset threshold, the deep neural network is output. The specific establishment steps include:
a. Reading a fixed quantity of image data and label data (the calibrated classification to which each training picture belongs), where the read image data and label data have a one-to-one correspondence. The read data is marked as input_data, the picture information is marked as input_data_image, and the label information is marked as input_data_label. The data may be read randomly or in order;
b. Reading the first convolution parameter weight1, performing a convolution operation on the read data input_data using weight1, and marking the output as kernel1, where the number and size of the convolution kernels used in the convolution operation should correspond to weight1; then reading the first bias parameter bias1, and performing non-linearization processing on kernel1 using bias1 and an activation function, to obtain conv1. The present embodiment does not specifically limit the convolution kernel size, number, stride, padding mode or activation function;
c. Performing a pooling operation on conv1 and marking the output as pool1, then performing local response normalization on pool1 using an LRN function and marking the output as norm1; then reading the second initial convolution parameter weight2, performing a convolution operation on norm1 using weight2, and marking the output as kernel2; subsequently reading the second bias parameter bias2, performing non-linearization processing on kernel2 using bias2 and the activation function to obtain conv2, performing a pooling operation on conv2 and marking the output as pool2, and flattening pool2, with the flattened data marked as reshape;
d. Reading the third convolution parameter weight3 and the third bias parameter bias3, and performing non-linearization processing on reshape using weight3, bias3 and the activation function to obtain local3; reading the fourth convolution parameter weight4 and the fourth bias parameter bias4, and performing non-linearization processing on local3 using weight4, bias4 and the activation function to obtain local4; reading the fifth convolution parameter weight5 and the fifth bias parameter bias5, and performing non-linearization processing on local4 using weight5, bias5 and the activation function to obtain logits;
e. Calculating the prediction label using logits and the softmax function, and marking it as y; calculating the loss of this prediction using y and input_data_label, and marking it as cross_entropy; calculating the mean of cross_entropy, and marking it as cross_entropy_mean; selecting an optimizer and optimizing the network parameters so that cross_entropy_mean for input_data reaches a minimum;
f. For input_data, calculating the prediction accuracy of the optimized network, and repeating the above steps a, b, c, d and e a predetermined number of times so that the accuracy of the network reaches the preset threshold; finally, outputting the deep neural network that meets the requirement, and setting the output deep neural network in the mobile terminal.
It should be noted that the present embodiment does not specifically limit the above first to fifth convolution parameters, the first to fifth bias parameters, the activation function, the pooling mode, the optimizer, the predetermined number of times or the preset threshold.
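Steps a–e describe a conventional convolutional forward pass. The NumPy sketch below mirrors the named intermediate tensors (kernel1, conv1, pool1, reshape, logits, y, cross_entropy) on a single-channel toy input; all shapes and parameter values are invented for illustration, the LRN step is omitted for brevity, and one fully connected layer stands in for the weight3–weight5 stages:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w, b):
    """Valid convolution of a single-channel image with one kernel."""
    kh, kw = w.shape
    h, wd = x.shape
    out = np.empty((h - kh + 1, wd - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * w) + b
    return out

def relu(x):                 # activation function (one possible choice)
    return np.maximum(x, 0.0)

def maxpool(x, size=2):      # pooling operation
    h, w = x.shape
    h2, w2 = h // size, w // size
    return x[:h2*size, :w2*size].reshape(h2, size, w2, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# hypothetical parameters standing in for weight1/weight2 and bias1/bias2
weight1, bias1 = rng.normal(size=(3, 3)), 0.1
weight2, bias2 = rng.normal(size=(3, 3)), 0.1
x = rng.normal(size=(16, 16))              # toy input_data_image

kernel1 = conv2d(x, weight1, bias1)        # step b: convolution
conv1 = relu(kernel1)                      # step b: non-linearization
pool1 = maxpool(conv1)                     # step c: pooling (LRN omitted)
kernel2 = conv2d(pool1, weight2, bias2)
conv2 = relu(kernel2)
pool2 = maxpool(conv2)
reshape = pool2.ravel()                    # step c: flatten

n_classes = 4
weight3 = rng.normal(size=(reshape.size, n_classes)) * 0.1
bias3 = np.zeros(n_classes)
logits = reshape @ weight3 + bias3         # step d (one layer shown)
y = softmax(logits)                        # step e: prediction label
label = 2                                  # toy input_data_label
cross_entropy = -np.log(y[label] + 1e-12)  # step e: loss for this sample
```

In training, the loss would be averaged over the batch (cross_entropy_mean) and minimized by an optimizer, per steps e and f.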
Step S103: extracting keywords from the identified text information, and classifying the picture to be sorted according to the keywords and the main object.
After identifying the text information and main object of the picture to be sorted, the mobile terminal extracts keywords from the identified text information, i.e., based on the syntactic structure of Chinese, keywords are extracted from the identified text information. For example, if the text information is "sky and sea", then based on the syntactic structure of Chinese, the extracted keywords are "sky" and "sea". After extracting the keywords, the mobile terminal classifies the picture to be sorted according to the keywords and the main object, i.e., the picture type associated with the keywords and the main object is obtained from a preset mapping table between keywords, main objects and picture types, and this picture type is determined as the picture type of the picture to be sorted. In specific implementation, the present invention may also establish a search index for the picture to be sorted based on the extracted keywords, so that the user can query pictures by keyword.
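The preset mapping table between keywords, main objects and picture types can be sketched as a simple lookup. The table contents and type names below are invented for illustration; the patent does not specify them:

```python
# hypothetical preset mapping table: (keyword, main_object) -> picture type
MAPPING_TABLE = {
    ("sky", "landscape"): "scenery",
    ("sea", "landscape"): "scenery",
    ("menu", "food"): "food",
    ("birthday", "person"): "people",
}

def classify(keywords, main_object, default="uncategorized"):
    """Return the picture type associated with any extracted keyword
    and the identified main object (step S103)."""
    for kw in keywords:
        picture_type = MAPPING_TABLE.get((kw, main_object))
        if picture_type is not None:
            return picture_type
    return default

# the "sky and sea" example from the description
picture_type = classify(["sky", "sea"], "landscape")
```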
Further, when a picture resource display instruction is detected, the picture resource is obtained, and the picture resource is displayed according to the classification of each picture in the picture resource.
The mobile terminal displays the picture resource according to the classification of each picture in the picture resource, so that the user can conveniently manage and query picture resources.
In the present embodiment, when the triggered picture classification instruction is detected, the present invention obtains the picture to be sorted and selects a preset number of character areas and main object regions by frame in the picture to be sorted; then performs text information identification on each character area selected by frame and performs main object identification on each main object region, to obtain the text information and main object of the picture to be sorted; and finally extracts keywords from the identified text information and classifies the picture to be sorted according to the keywords and the main object. By selecting multiple character areas and multiple main object regions by frame, determining the text information of the picture to be sorted based on the multiple character areas, and determining the main object of the picture to be sorted based on the multiple main object regions, the accuracy of the text information and the main object is effectively improved; the picture is then classified in combination with the main object and the keywords in the text information, effectively improving the accuracy of picture classification.
Further, based on the above first embodiment, a second embodiment of the picture classification method of the present invention is proposed, which differs from the foregoing embodiment in that step S1022 includes:
Inputting the word picture of each character area to the predetermined deep neural network, to obtain the word content of each character area;
Performing error correction on the word content of each character area, and generating the text information of the picture to be sorted based on the word content of each character area after error correction.
It should be noted that, based on the foregoing embodiment, the present invention proposes a specific manner of improving the recognition accuracy of the text information; only this manner is explained below, and for the rest, reference may be made to the foregoing embodiment.
The mobile terminal inputs the word picture of each character area to the predetermined deep neural network, and the word picture of each character area is identified by the predetermined deep neural network to obtain the word content of each character area. Error correction is then performed on the word content of each character area, i.e., the word content of each character area is corrected based on the syntactic structure of Chinese, and the text information of the picture to be sorted is generated based on the word content of each character area after error correction.
In the present embodiment, after obtaining the word content of each character area, the present invention performs error correction on each word content, and generates the text information of the picture to be sorted based on the word content of each character area after error correction, effectively improving the recognition accuracy of the text information.
Further, based on the above first or second embodiment, a third embodiment of the picture classification method of the present invention is proposed, which differs from the foregoing embodiments in that step S1022 further includes:
Inputting the word picture of each character area to the predetermined deep neural network, to obtain the classification feature of each character area;
Determining the position of each character area, and merging the character areas whose classification features are identical and whose positions are adjacent, to obtain some merged character areas;
Inputting the some merged character areas to the predetermined deep neural network, to obtain the word content of each merged character area;
Generating the text information of the picture to be sorted based on the word content of each merged character area.
It should be noted that, based on the foregoing embodiment, the present invention proposes another specific manner of improving the recognition accuracy of the text information; only this manner is explained below, and for the rest, reference may be made to the foregoing embodiment.
The mobile terminal inputs the word picture of each character area to the predetermined deep neural network, and the predetermined deep neural network outputs some word contents in each word picture and the word classification confidence of each word content. The word classification confidences of the word contents are compared pairwise to determine the word content with the maximum word classification confidence, and that word content is determined as the word content of the corresponding word picture. Based on the above manner, the word content of each word picture is determined, and the determined word content of each word picture is taken as the classification feature of the corresponding character area. The position of each character area is then determined, and the character areas whose classification features are identical and whose positions are adjacent are merged, to obtain some merged character areas. Finally, the some merged character areas are input to the predetermined deep neural network, to obtain the word content of each merged character area, and the text information of the picture to be sorted is generated based on the word content of each merged character area.
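Merging character areas whose classification features are identical and whose positions are adjacent can be sketched as taking the union of their bounding boxes. Note that "adjacent" is taken here to mean that the boxes touch or overlap within a small gap, which is an assumption, since the embodiment does not define adjacency precisely:

```python
def adjacent(a, b, gap=1):
    """Boxes are (left, top, right, bottom); treat them as adjacent when
    they overlap or lie within `gap` pixels of each other (assumed)."""
    return not (a[2] + gap < b[0] or b[2] + gap < a[0] or
                a[3] + gap < b[1] or b[3] + gap < a[1])

def merge_regions(regions):
    """regions: list of (classification_feature, box). Repeatedly merge
    pairs with identical features and adjacent positions into the union
    bounding box, until no pair can be merged."""
    regions = list(regions)
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                (fa, a), (fb, b) = regions[i], regions[j]
                if fa == fb and adjacent(a, b):
                    union = (min(a[0], b[0]), min(a[1], b[1]),
                             max(a[2], b[2]), max(a[3], b[3]))
                    regions[i] = (fa, union)
                    del regions[j]
                    merged = True
                    break
            if merged:
                break
    return regions

# two adjacent "word" areas merge; the "digit" area stays separate
out = merge_regions([("word", (0, 0, 10, 10)),
                     ("word", (11, 0, 20, 10)),
                     ("digit", (40, 40, 50, 50))])
```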
In the present embodiment, after merging the character areas whose classification features are identical and whose positions are adjacent, the present invention identifies the merged character areas to obtain the text information of the picture to be sorted, further improving the recognition accuracy of the text information.
Further, referring to Fig. 6, based on the above first, second or third embodiment, a fourth embodiment of the picture classification method of the present invention is proposed, which differs from the foregoing embodiments in that, after step S101, the picture classification method further includes:
Step S104: performing adjustment operations on a character area selected by frame, to obtain a given number of adjusted character areas;
Step S105: inputting the image block in each adjusted character area to the predetermined deep neural network, to obtain the word classification probability of each adjusted character area;
Step S106: replacing the character area selected by frame with the adjusted character area with the maximum word classification probability.
It should be noted that, based on the foregoing embodiment, the present invention proposes another specific manner of improving the recognition accuracy of the text information; only this manner is explained below, and for the rest, reference may be made to the foregoing embodiment.
After a preset number of character areas are selected by frame, the mobile terminal performs adjustment operations on a character area selected by frame, to obtain a given number of adjusted character areas, inputs the image block in each adjusted character area to the predetermined deep neural network, to obtain the word classification probability of each adjusted character area, and then replaces the character area selected by frame with the adjusted character area with the maximum word classification probability. The adjustment operations include but are not limited to translation operations and zoom operations; the translation operations include but are not limited to horizontal translation, vertical translation, first horizontal then vertical translation, and first vertical then horizontal translation; the zoom operations include but are not limited to reduction and enlargement. It should be noted that the above given number may be configured by those skilled in the art based on actual conditions, which is not specifically limited in the present embodiment.
In the present embodiment, after the character areas are selected by frame, the present invention adjusts the character areas selected by frame, further improving the legibility of the word content in the character areas selected by frame and effectively improving the recognition accuracy of the text information.
In addition, an embodiment of the present invention also proposes a computer-readable recording medium, the computer-readable recording medium stores a picture classification program, and the picture classification program, when executed by a processor, implements the following steps:
When a triggered picture classification instruction is detected, selecting a preset number of character areas and main object regions by frame in a picture to be sorted;
Performing text information identification on each character area selected by frame and performing main object identification on each main object region, to obtain the text information and main object of the picture to be sorted;
Extracting keywords from the identified text information, and classifying the picture to be sorted according to the keywords and the main object.
Further, when the picture classification program is executed by the processor, the following steps are also implemented:
Adjusting the resolution of the picture to be sorted to a preset resolution, and obtaining the pixel matrix of the adjusted picture to be sorted;
Arbitrarily selecting two different pixels from the pixel matrix, and selecting a character area by frame according to the two selected different pixels, until a preset number of non-duplicated character areas have been selected by frame.
Further, when the picture classification program is executed by the processor, the following steps are also implemented:
Cutting each character area selected by frame, to obtain the word picture of each character area, and cutting each main object region selected by frame, to obtain the main object picture of each main object region;
Inputting the word picture of each character area to the predetermined deep neural network, to obtain the text information of the picture to be sorted;
Inputting the main object picture of each main object region to the predetermined deep neural network, to obtain the main object of the picture to be sorted.
Further, when the picture classification program is executed by the processor, the following steps are also implemented:
Inputting the word picture of each character area to the predetermined deep neural network, to obtain the word content of each character area;
Performing error correction on the word content of each character area, and generating the text information of the picture to be sorted based on the word content of each character area after error correction.
Further, when the picture classification program is executed by the processor, the following steps are also implemented:
Inputting the word picture of each character area to the predetermined deep neural network, to obtain the classification feature of each character area;
Determining the position of each character area, and merging the character areas whose classification features are identical and whose positions are adjacent, to obtain some merged character areas;
Inputting the some merged character areas to the predetermined deep neural network, to obtain the word content of each merged character area;
Generating the text information of the picture to be sorted based on the word content of each merged character area.
Further, when the picture classification program is executed by the processor, the following steps are also implemented:
Inputting the main object picture of each main object region to the predetermined deep neural network, to obtain the main object in each main object picture and the main object classification probability of the main object;
Determining the main object with the maximum main object classification probability as the main object of the picture to be sorted.
Further, the following steps are implemented when the picture classification program is executed by the processor:
An adjustment operation is performed on the frame-selected character areas, to obtain a given number of adjusted character areas;
The image block of each adjusted character area is input into a preset deep neural network, to obtain the word classification probability of each adjusted character area;
The frame-selected character area is replaced with the adjusted character area with the largest word classification probability.
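The adjustment operation is not detailed in the text; one plausible reading, sketched here under stated assumptions (shift offsets in `deltas` and a `score_fn` standing in for the network's word classification probability are both illustrative), enumerates shifted variants of a box and keeps the highest-scoring one:

```python
def adjust_box(box, deltas=(-4, 0, 4)):
    # Enumerate shifted variants of a frame-selected box (illustrative deltas).
    l, t, r, b = box
    return [(l + dx, t + dy, r + dx, b + dy)
            for dx in deltas for dy in deltas]

def best_adjusted_box(box, score_fn):
    # Keep the adjusted box with the largest word classification probability;
    # score_fn stands in for the deep neural network's scoring of an image block.
    candidates = adjust_box(box)
    return max(candidates, key=score_fn)
```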
Further, the following steps are implemented when the picture classification program is executed by the processor:
When a picture resource display instruction is detected, the picture resource is obtained, and the picture resource is displayed according to the classification of each picture in the picture resource.
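Displaying the picture resource by classification amounts to grouping pictures by their assigned category; a minimal sketch, assuming each picture is a `(name, category)` pair (a shape chosen here for illustration):

```python
from collections import defaultdict

def group_pictures(pictures):
    """Group (name, category) pairs into {category: [names]} so the
    picture resource can be displayed category by category."""
    groups = defaultdict(list)
    for name, category in pictures:
        groups[category].append(name)
    return dict(groups)
```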
The specific embodiments of the computer-readable recording medium of the present invention are essentially the same as the specific embodiments of the picture classification method described above, and are therefore not repeated here.
It should be noted that, herein, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system including a series of elements not only includes those elements, but also includes other elements not explicitly listed, or further includes elements inherent to such a process, method, article or system. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes that element.
The serial numbers of the above embodiments of the present invention are for description only, and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general hardware platform, and certainly also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disc) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, etc.) to perform the method described in each embodiment of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and accompanying drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A picture classification method, characterized in that the picture classification method comprises the following steps:
when a triggered picture classification instruction is detected, frame-selecting a preset number of character areas and main object regions in a picture to be classified;
performing text information recognition on each frame-selected character area and performing main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified;
extracting keywords from the recognized text information, and classifying the picture to be classified according to the keywords and the main object.
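The claim does not prescribe how the keywords are extracted or how they are combined with the main object; a minimal sketch under stated assumptions (the stop-word set, the frequency-based extraction, and the keyword-to-category map are all illustrative choices, not part of the claim):

```python
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on"}  # illustrative

def extract_keywords(text, top_n=3):
    # Frequency-based keyword extraction after stop-word filtering;
    # ties are broken alphabetically for determinism.
    counts = {}
    for word in text.lower().split():
        if word not in STOP_WORDS:
            counts[word] = counts.get(word, 0) + 1
    return [w for w, _ in sorted(counts.items(),
                                 key=lambda kv: (-kv[1], kv[0]))[:top_n]]

def classify_picture(text, main_object, category_map):
    # Each keyword votes for the category it maps to; the main object
    # label casts a vote through the same map.
    votes = {}
    for token in extract_keywords(text) + [main_object]:
        category = category_map.get(token)
        if category:
            votes[category] = votes.get(category, 0) + 1
    return max(votes, key=votes.get) if votes else "uncategorized"
```

Combining text keywords and the main object as independent votes lets either signal carry the classification when the other is absent, which reflects the claim's use of both sources.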
2. The picture classification method as claimed in claim 1, characterized in that the step of frame-selecting a preset number of character areas in the picture to be classified comprises:
adjusting the resolution of the picture to be classified to a preset resolution, and obtaining the pixel matrix of the adjusted picture to be classified;
arbitrarily selecting two different pixels from the pixel matrix, and frame-selecting a character area according to the two selected different pixels, until the preset number of non-repeating character areas has been frame-selected.
3. The picture classification method as claimed in claim 1, characterized in that the step of performing text information recognition on each frame-selected character area and performing main object recognition on each main object region, to obtain the text information and the main object of the picture to be classified, comprises:
cutting each frame-selected character area to obtain the word picture of each character area, and cutting each frame-selected main object region to obtain the main object picture of each main object region;
inputting the word picture of each character area into a preset deep neural network, to obtain the text information of the picture to be classified;
inputting the main object picture of each main object region into the preset deep neural network, to obtain the main object of the picture to be classified.
4. The picture classification method as claimed in claim 3, characterized in that the step of inputting the word picture of each character area into a preset deep neural network, to obtain the text information of the picture to be classified, comprises:
inputting the word picture of each character area into the preset deep neural network, to obtain the word content of each character area;
performing error correction on the word content of each character area, and generating the text information of the picture to be classified based on the error-corrected word content of each character area.
5. The picture classification method as claimed in claim 3, characterized in that the step of inputting the word picture of each character area into a preset deep neural network, to obtain the text information of the picture to be classified, further comprises:
inputting the word picture of each character area into the preset deep neural network, to obtain the classification feature of each character area;
determining the position of each character area, and merging character areas whose classification features are identical and whose positions are adjacent, to obtain several merged character areas;
inputting the several merged character areas into the preset deep neural network, to obtain the word content of each merged character area;
generating the text information of the picture to be classified based on the word content of each merged character area.
6. The picture classification method as claimed in claim 3, characterized in that the step of inputting the main object picture of each main object region into the preset deep neural network, to obtain the main object of the picture to be classified, comprises:
inputting the main object picture of each main object region into the preset deep neural network, to obtain the main object in each main object picture and the classification probability of that main object;
determining the main object with the largest classification probability as the main object of the picture to be classified.
7. The picture classification method as claimed in any one of claims 1-6, characterized in that after the step of frame-selecting a preset number of character areas and main object regions in the picture to be classified, the picture classification method further comprises:
performing an adjustment operation on the frame-selected character areas, to obtain a given number of adjusted character areas;
inputting the image block of each adjusted character area into a preset deep neural network, to obtain the word classification probability of each adjusted character area;
replacing the frame-selected character area with the adjusted character area with the largest word classification probability.
8. The picture classification method as claimed in any one of claims 1-6, characterized in that after the step of extracting keywords from the recognized text information and classifying the picture to be classified according to the keywords and the main object, the picture classification method further comprises:
when a picture resource display instruction is detected, obtaining the picture resource, and displaying the picture resource according to the classification of each picture in the picture resource.
9. A mobile terminal, characterized in that the mobile terminal comprises: a memory, a processor, and a picture classification program stored in the memory and executable on the processor, wherein when the picture classification program is executed by the processor, the steps of the picture classification method as claimed in any one of claims 1 to 8 are implemented.
10. A computer-readable recording medium, characterized in that a picture classification program is stored on the computer-readable recording medium, and when the picture classification program is executed by a processor, the steps of the picture classification method as claimed in any one of claims 1 to 8 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711316750.XA CN107944022A (en) | 2017-12-11 | 2017-12-11 | Picture classification method, mobile terminal and computer-readable recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107944022A true CN107944022A (en) | 2018-04-20 |
Family
ID=61943839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711316750.XA Pending CN107944022A (en) | 2017-12-11 | 2017-12-11 | Picture classification method, mobile terminal and computer-readable recording medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107944022A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103577475A (en) * | 2012-08-03 | 2014-02-12 | 阿里巴巴集团控股有限公司 | Picture automatic sorting method, picture processing method and devices thereof |
CN103714094A (en) * | 2012-10-09 | 2014-04-09 | 富士通株式会社 | Equipment and method for recognizing objects in video |
CN105843816A (en) * | 2015-01-15 | 2016-08-10 | 阿里巴巴集团控股有限公司 | Method and device for determining display information of picture |
CN105518678A (en) * | 2015-06-29 | 2016-04-20 | 北京旷视科技有限公司 | Searching method, searching apparatus, user device and computer program product |
CN106874296A (en) * | 2015-12-14 | 2017-06-20 | 阿里巴巴集团控股有限公司 | A kind of style recognition methods of commodity and device |
CN105654057A (en) * | 2015-12-31 | 2016-06-08 | 中国建设银行股份有限公司 | Picture auditing system and picture auditing method based on picture contents |
CN105809164A (en) * | 2016-03-11 | 2016-07-27 | 北京旷视科技有限公司 | Character identification method and device |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108573070A (en) * | 2018-05-08 | 2018-09-25 | 深圳市万普拉斯科技有限公司 | Picture recognition method for sorting, device and Photo folder method for building up |
CN108710916A (en) * | 2018-05-22 | 2018-10-26 | 重庆完美空间科技有限公司 | The method and device of picture classification |
CN108710916B (en) * | 2018-05-22 | 2020-10-09 | 重庆完美空间科技有限公司 | Picture classification method and device |
CN109766928A (en) * | 2018-12-21 | 2019-05-17 | 创新奇智(重庆)科技有限公司 | A kind of object classification method based on image, system and electronic equipment |
CN109766928B (en) * | 2018-12-21 | 2020-04-17 | 创新奇智(重庆)科技有限公司 | Object classification method and system based on image and electronic equipment |
WO2021003853A1 (en) * | 2019-07-09 | 2021-01-14 | Visuo Technology Pty Limited | A method and a system for processing an image, and for generating a contextually coherent video based on images processed thereby |
CN110489578A (en) * | 2019-08-12 | 2019-11-22 | 腾讯科技(深圳)有限公司 | Image processing method, device and computer equipment |
CN110489578B (en) * | 2019-08-12 | 2024-04-05 | 腾讯科技(深圳)有限公司 | Picture processing method and device and computer equipment |
CN111985467A (en) * | 2020-08-20 | 2020-11-24 | 厦门美图之家科技有限公司 | Chat record screenshot processing method and device, computer equipment and storage medium |
CN111985467B (en) * | 2020-08-20 | 2024-03-29 | 厦门美图之家科技有限公司 | Chat record screenshot processing method and device, computer equipment and storage medium |
CN113536005A (en) * | 2021-09-17 | 2021-10-22 | 网娱互动科技(北京)股份有限公司 | Method and system for searching similar pictures or fonts |
CN113536005B (en) * | 2021-09-17 | 2021-12-24 | 网娱互动科技(北京)股份有限公司 | Method and system for searching similar pictures or fonts |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107944022A (en) | Picture classification method, mobile terminal and computer-readable recording medium | |
CN107682627A (en) | A kind of acquisition parameters method to set up, mobile terminal and computer-readable recording medium | |
CN107705251A (en) | Picture joining method, mobile terminal and computer-readable recording medium | |
CN107748645A (en) | Reading method, mobile terminal and computer-readable recording medium | |
CN107317963A (en) | A kind of double-camera mobile terminal control method, mobile terminal and storage medium | |
CN107748856A (en) | Two-dimensional code identification method, terminal and computer-readable recording medium | |
CN107493426A (en) | A kind of information collecting method, equipment and computer-readable recording medium | |
CN107959795A (en) | A kind of information collecting method, equipment and computer-readable recording medium | |
CN106953684A (en) | A kind of method for searching star, mobile terminal and computer-readable recording medium | |
CN107239205A (en) | A kind of photographic method, mobile terminal and storage medium | |
CN109300099A (en) | A kind of image processing method, mobile terminal and computer readable storage medium | |
CN108241752A (en) | Photo display methods, mobile terminal and computer readable storage medium | |
CN107678650A (en) | A kind of image identification method, mobile terminal and computer-readable recording medium | |
CN108459799A (en) | A kind of processing method of picture, mobile terminal and computer readable storage medium | |
CN107704828A (en) | Methods of exhibiting, mobile terminal and the computer-readable recording medium of reading information | |
CN108965710A (en) | Method, photo taking, device and computer readable storage medium | |
CN107704514A (en) | A kind of photo management method, device and computer-readable recording medium | |
CN107705247A (en) | A kind of method of adjustment of image saturation, terminal and storage medium | |
CN109034317A (en) | A kind of two-dimensional code scanning method, terminal and computer readable storage medium | |
CN109584897A (en) | Vedio noise reduction method, mobile terminal and computer readable storage medium | |
CN108182664A (en) | A kind of image processing method, mobile terminal and computer readable storage medium | |
CN108307111A (en) | A kind of zoom photographic method, mobile terminal and storage medium | |
CN107241504A (en) | A kind of image processing method, mobile terminal and computer-readable recording medium | |
CN109739414A (en) | A kind of image processing method, mobile terminal, computer readable storage medium | |
CN109005354A (en) | Image pickup method, mobile terminal and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180420 |