CN108537283A - Image classification method and convolutional neural network generation method - Google Patents
Image classification method and convolutional neural network generation method
- Publication number: CN108537283A
- Application number: CN201810331479.5A
- Authority
- CN
- China
- Prior art keywords: image, convolutional neural networks, text, classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/24 — Pattern recognition; Analysing; Classification techniques
- G06F18/214 — Pattern recognition; Design or setup of recognition systems; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V20/10 — Scenes; Scene-specific elements; Terrestrial scenes
- G06V20/62 — Scenes; Type of objects; Text, e.g. of license plates, overlay texts or captions on TV images
- G06V30/10 — Character recognition; Document-oriented image-based pattern recognition
Abstract
The invention discloses an image classification method, a convolutional neural network generation method for classifying images, a convolutional neural network generation method for recognizing text in images, a mobile terminal, and a computing device. The image classification method is suitable for execution on a mobile terminal that includes an image library in which multiple images are stored, and comprises the steps of: for each image in the image library, classifying the image to obtain its corresponding category; if the category is the text category, performing text recognition on the image to extract the textual information it contains; and storing the textual information in association with the image's storage path and file name.
Description
Technical field
The present invention relates to the field of image processing, and in particular to an image classification method, a convolutional neural network generation method for classifying images, a convolutional neural network generation method for recognizing text in images, a mobile terminal, and a computing device.
Background
With the continuous development of hardware technology, more and more people use mobile terminals such as smartphones and tablet computers to take and store photographs, recording precious moments. As the number of photographs saved on a mobile terminal grows, the photographs become numerous and spread across different categories, and users often cannot find a particular photograph in time, which leads to a poor user experience.
Existing image classification algorithms typically divide the images in a mobile terminal's album into categories and manage them accordingly, but perform no further processing. Although this approach makes it convenient to look up images by category, it cannot quickly locate an image containing specific information. For example, a user may remember a piece of text appearing in part of an image but have forgotten the rest of its content; in that case the image's category alone is of little help, and the user cannot quickly and accurately retrieve the required image and the information it contains. It is therefore desirable to provide a new image classification method that improves on this process.
Summary of the invention
To this end, the present invention provides an image classification scheme, together with a convolutional neural network generation scheme for classifying images and a convolutional neural network generation scheme for recognizing text in images, in an effort to solve, or at least alleviate, the problems identified above.
According to one aspect of the present invention, an image classification method is provided, suitable for execution on a mobile terminal. The mobile terminal includes an image library in which multiple images are stored. The method comprises the following steps: first, for each image in the image library, classify the image to obtain its corresponding category; if the category is the text category, perform text recognition on the image to extract the textual information it contains; and store the textual information in association with the image's storage path and file name.
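As a minimal illustrative sketch (not the patent's implementation), the per-image flow above can be expressed as follows. The functions classify_image and recognize_text are hypothetical stand-ins for the two convolutional neural networks described later, and the associated storage is modeled as a simple dictionary keyed by the extracted text:

```python
def classify_image(image):
    # Placeholder: a real implementation would run the first CNN.
    return image.get("category", "unknown")

def recognize_text(image):
    # Placeholder: a real implementation would run the second CNN
    # over each character region of the image.
    return image.get("text", "")

def index_library(image_library):
    """Associate extracted text with each text image's path and name."""
    text_index = {}
    for image in image_library:
        if classify_image(image) == "text":
            text = recognize_text(image)
            text_index[text] = (image["path"], image["name"])
    return text_index

library = [
    {"path": "/sdcard/a.jpg", "name": "a.jpg", "category": "landscape"},
    {"path": "/sdcard/b.jpg", "name": "b.jpg", "category": "text",
     "text": "meeting at 3pm"},
]
index = index_library(library)
```

Only text-category images enter the index, so non-text images incur no recognition cost, matching the conditional step in the claim.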
Optionally, in the image classification method according to the present invention, when a search term typed by the user is received, the method further includes: searching for stored textual information identical or similar to the search term; if such information exists, obtaining the image storage path associated with it; and locating the corresponding image via that storage path and presenting the image together with the textual information to the user.
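A sketch of the lookup step, under the assumption (the patent does not specify the similarity measure) that "same or similar" is approximated by exact or substring match against the index built earlier:

```python
def search(text_index, term):
    """Return (path, text) pairs whose stored text equals or contains term.

    Substring matching is a hypothetical stand-in for the patent's
    unspecified same-or-similar comparison.
    """
    results = []
    for text, (path, name) in text_index.items():
        if term == text or term in text:
            results.append((path, text))
    return results

index = {"meeting at 3pm": ("/sdcard/b.jpg", "b.jpg")}
hits = search(index, "3pm")
```

A production version would likely use fuzzy matching or an inverted index rather than a linear scan.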
Optionally, in the image classification method according to the present invention, the step of performing text recognition on the image to extract the textual information it contains includes: obtaining the character image region corresponding to each individual character contained in the image; performing text recognition on each character image region to determine the character it contains; and generating the textual information of the image based on the recognized characters.
Optionally, in the image classification method according to the present invention, the step of generating the textual information of the image based on the recognized characters includes: obtaining the positional relationships between the character image regions in the image; and combining the characters corresponding to the character image regions according to those positional relationships to generate the textual information of the image.
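The combination step can be sketched as follows; ordering regions top-to-bottom, then left-to-right is an assumed interpretation of the positional relationship, which the patent leaves unspecified:

```python
def combine_characters(regions):
    """Order recognized characters by their region positions.

    Each region is (row, col, char). Reading order (top-to-bottom,
    left-to-right) is assumed here.
    """
    ordered = sorted(regions, key=lambda r: (r[0], r[1]))
    return "".join(char for _, _, char in ordered)

# Regions arrive in arbitrary order; positions restore the text "hi" / "ok".
regions = [(1, 10, "k"), (0, 0, "h"), (1, 0, "o"), (0, 10, "i")]
text = combine_characters(regions)
```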
Optionally, in the image classification method according to the present invention, a trained first convolutional neural network for classifying images is stored on the mobile terminal, and the step of classifying the image to obtain its corresponding category includes: inputting the image into the trained first convolutional neural network for image classification; and determining the category of the image from the output of the first convolutional neural network.
Optionally, in the image classification method according to the present invention, a trained second convolutional neural network for recognizing text in images is stored on the mobile terminal, and the step of performing text recognition on the image to extract the textual information it contains includes: obtaining the character image region corresponding to each individual character contained in the image; inputting each character image region into the trained second convolutional neural network for text recognition, and determining the character contained in each region from the output of the second convolutional neural network; and generating the textual information of the image based on the recognized characters.
Optionally, in the image classification method according to the present invention, the trained first convolutional neural network is obtained as follows: build processing blocks, each of which includes a convolutional layer; separately build pooling layers, a fully connected layer, and a classifier; build the first convolutional neural network from multiple processing blocks and the pooling layers combined with the fully connected layer and the classifier, the first convolutional neural network taking a processing block as its input and the classifier as its output; and train the first convolutional neural network on a pre-acquired image category data set so that the output of the classifier indicates the category of the input image, the image category data set comprising multiple image category entries, each of which includes a first image meeting a preset size and the category information corresponding to that first image.
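The topology just described can be sketched as a layer-name sequence. The quantities come from the dependent claims below (three processing blocks, two max pooling layers, one average pooling layer); the exact interleaving is an assumed reading of the preset concatenation rule, which the patent does not spell out:

```python
def build_process_block(index):
    """A processing block: a convolutional layer followed by an
    activation layer (the patent's optional refinement)."""
    return [f"conv{index}", f"relu{index}"]

def build_first_cnn(num_blocks=3):
    """Assemble the first CNN's layer sequence as a list of names.

    Assumed rule: each of the first num_blocks - 1 blocks is followed
    by a max pooling layer, the last block by an average pooling
    layer, then a fully connected layer and a classifier.
    """
    layers = []
    for i in range(1, num_blocks):
        layers += build_process_block(i) + ["maxpool"]
    layers += build_process_block(num_blocks) + ["avgpool"]
    layers += ["fc", "classifier"]
    return layers

net = build_first_cnn()
```

With the default quantities this yields two max pooling layers and one average pooling layer, consistent with the dependent claims.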
Optionally, in the image classification method according to the present invention, the trained second convolutional neural network is obtained as follows: build first processing blocks, each of which includes a first convolutional layer; build a second processing block, which includes a first fully connected layer; separately build first pooling layers, a second fully connected layer, and a first classifier; build the second convolutional neural network from one or more first processing blocks, the first pooling layers, and the second processing block combined with the second fully connected layer and the first classifier, the second convolutional neural network taking a first processing block as its input and the first classifier as its output; and train the second convolutional neural network on a pre-acquired character image data set so that the output of the first classifier indicates the character contained in the input image, the character image data set comprising multiple character image entries, each of which includes a character image meeting a first preset size and the textual information contained in that character image.
According to a further aspect of the invention, a mobile terminal is provided, including one or more processors, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for executing the image classification method according to the present invention.
According to a further aspect of the invention, a computer-readable storage medium storing one or more programs is provided; the one or more programs include instructions which, when executed by a mobile terminal, cause the mobile terminal to execute the image classification method according to the present invention.
According to a further aspect of the invention, a convolutional neural network generation method for classifying images is provided, suitable for execution on a computing device and comprising the following steps: first, build processing blocks, each of which includes a convolutional layer; separately build pooling layers, a fully connected layer, and a classifier; build the convolutional neural network from multiple processing blocks and the pooling layers combined with the fully connected layer and the classifier, the network taking a processing block as its input and the classifier as its output; and train the convolutional neural network on a pre-acquired image category data set so that the output of the classifier indicates the category of the input image, the image category data set comprising multiple image category entries, each of which includes a first image meeting a preset size and the category information corresponding to that first image.
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the step of building a processing block further includes: building an activation layer; and adding the activation layer after the convolutional layer to form the processing block.
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, each pooling layer is either a max pooling layer or an average pooling layer.
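The two pooling variants differ only in how they reduce each window; for a single 2x2 window flattened to four values:

```python
def pool_window(window, mode):
    """Reduce one pooling window, the two variants the patent allows."""
    if mode == "max":
        return max(window)          # max pooling keeps the largest value
    if mode == "average":
        return sum(window) / len(window)  # average pooling takes the mean
    raise ValueError(f"unknown pooling mode: {mode}")

window = [1.0, 3.0, 2.0, 6.0]
max_out = pool_window(window, "max")
avg_out = pool_window(window, "average")
```

Max pooling preserves the strongest activation in each region, while average pooling summarizes the region as a whole; the network below uses both.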
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the step of building the convolutional neural network from multiple processing blocks and the pooling layers combined with the fully connected layer and the classifier includes: according to a preset concatenation rule, connecting each processing block with a max pooling layer and then connecting an average pooling layer; and adding the sequentially connected fully connected layer and classifier after the average pooling layer, to build the convolutional neural network taking a processing block as its input and the classifier as its output.
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the step of training the convolutional neural network on the pre-acquired image category data set so that the output of the classifier indicates the category of the input image includes: for each extracted image category entry, training the convolutional neural network with the first image included in the entry as the input of the first processing block of the network and the category information included in the entry as the output of the classifier.
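The training step pairs each entry's first image with its category as the supervision signal; a minimal sketch of forming those (input, target) pairs, with hypothetical entry keys since the patent does not fix a data format:

```python
def make_training_pairs(category_dataset):
    """Turn image-category entries into (network input, classifier target)
    pairs as the training step describes. Entries are modeled as dicts
    with hypothetical keys 'first_image' and 'category'."""
    return [(entry["first_image"], entry["category"])
            for entry in category_dataset]

dataset = [
    {"first_image": "img_tensor_0", "category": "text"},
    {"first_image": "img_tensor_1", "category": "landscape"},
]
pairs = make_training_pairs(dataset)
```

The optimizer, loss function, and number of epochs are left open by the patent.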
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the number of processing blocks is three.
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the number of max pooling layers is two and the number of average pooling layers is one.
Optionally, in the convolutional neural network generation method for classifying images according to the present invention, the category information is any one of an animal category, a building category, a physical object category, a landscape category, a person category, and a text category.
Optionally, the convolutional neural network generation method for classifying images according to the present invention further includes a step of generating the image category data set in advance, which includes: performing image processing on each pending picture to obtain a corresponding first image meeting the preset size; for each first image meeting the preset size, obtaining the category information associated with its corresponding pending picture and generating a corresponding image category entry from the category information and the first image; and collecting the image category entries to form the image category data set.
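These three steps can be sketched as follows. The preset size of (32, 32) and the dict-based representation are illustrative assumptions; the patent only requires that each first image meet the preset size and be paired with its category:

```python
def resize_to_preset(picture, preset=(32, 32)):
    """Stand-in for the unspecified image processing: record the picture
    as brought to the preset size."""
    return {"name": picture["name"], "size": preset}

def build_category_dataset(pictures, labels, preset=(32, 32)):
    """Generate the image category data set: each entry pairs a first
    image meeting the preset size with its associated category."""
    dataset = []
    for picture in pictures:
        first_image = resize_to_preset(picture, preset)
        category = labels[picture["name"]]       # associated category info
        dataset.append({"first_image": first_image, "category": category})
    return dataset

pictures = [{"name": "p1.jpg", "size": (1024, 768)}]
labels = {"p1.jpg": "landscape"}
dataset = build_category_dataset(pictures, labels)
```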
According to a further aspect of the invention, a convolutional neural network generation method for recognizing text in images is provided, suitable for execution on a computing device and comprising the following steps: first, build first processing blocks, each of which includes a first convolutional layer; build a second processing block, which includes a first fully connected layer; separately build first pooling layers, a second fully connected layer, and a first classifier; build the convolutional neural network from one or more first processing blocks, the first pooling layers, and the second processing block combined with the second fully connected layer and the first classifier, the network taking a first processing block as its input and the first classifier as its output; and train the convolutional neural network on a pre-acquired character image data set so that the output of the first classifier indicates the character contained in the input image, the character image data set comprising multiple character image entries, each of which includes a character image meeting a first preset size and the textual information contained in that character image.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the step of building a first processing block further includes: building a first activation layer; and adding the first activation layer after the first convolutional layer to form the first processing block.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the step of building the second processing block further includes: building a second activation layer; and adding the second activation layer after the first fully connected layer to form the second processing block.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the first pooling layer is a max pooling layer.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the step of building the convolutional neural network from one or more first processing blocks, the first pooling layers, and the second processing block combined with the second fully connected layer and the first classifier includes: according to a preset first concatenation rule, connecting the first processing blocks, the first pooling layers, and the second processing block, then connecting the second fully connected layer; and adding the first classifier after the second fully connected layer, to build the convolutional neural network taking a first processing block as its input and the first classifier as its output.
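A sketch of this second network's topology, using the quantities from the dependent claims below (five first processing blocks, three first pooling layers, one second processing block). The interleaving of pooling layers among the later blocks is an assumption, since the first concatenation rule is preset but not spelled out:

```python
def build_second_cnn(num_blocks=5, num_pools=3):
    """Assemble the second CNN's layer sequence as a list of names.

    Assumed interleaving: a max pooling layer follows each of the
    last num_pools first processing blocks; the second processing
    block (first fully connected layer plus activation), the second
    fully connected layer, and the first classifier close the network.
    """
    layers = []
    for i in range(1, num_blocks + 1):
        layers += [f"conv{i}", f"relu{i}"]   # first processing block
        if i > num_blocks - num_pools:
            layers.append("maxpool")         # first pooling layer
    layers += ["fc1", "relu_fc1"]            # second processing block
    layers += ["fc2", "classifier"]          # second FC layer + classifier
    return layers

net = build_second_cnn()
```

The deeper stack (five blocks versus three) reflects that per-character recognition must discriminate among many more classes than scene-level classification.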
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the step of training the convolutional neural network on the pre-acquired character image data set so that the output of the first classifier indicates the character contained in the input image includes: for each extracted character image entry, training the convolutional neural network with the character image included in the entry as the input of the first of the first processing blocks and the textual information included in the entry as the output of the first classifier.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the number of first processing blocks is five, the number of second processing blocks is one, and the number of first pooling layers is three.
Optionally, in the convolutional neural network generation method for recognizing text in images according to the present invention, the textual information is a single character, and the single character is any one of a numeric character, an alphabetic character, and a Chinese character.
Optionally, the convolutional neural network generation method for recognizing text in images according to the present invention further includes a step of generating the character image data set in advance, which includes: performing image processing on each pending character picture to obtain a corresponding character image meeting the first preset size; for each character image, obtaining the textual information associated with its corresponding pending character picture and generating a corresponding character image entry from the textual information and the character image; and collecting the character image entries to form the character image data set.
According to a further aspect of the invention, a computing device is provided, including one or more processors, a memory, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for executing the convolutional neural network generation method for classifying images and/or the convolutional neural network generation method for recognizing text in images according to the present invention.
According to a further aspect of the invention, a computer-readable storage medium storing one or more programs is provided; the one or more programs include instructions which, when executed by a computing device, cause the computing device to execute the convolutional neural network generation method for classifying images and/or the convolutional neural network generation method for recognizing text in images according to the present invention.
In the image classification method according to the present invention, each image in the image library is first classified to obtain its corresponding category; if the category is the text category, text recognition is performed on the image to extract the textual information it contains, and the textual information is stored in association with the image's storage path and file name. Under this scheme, when a search term typed by the user is received, textual information identical or similar to the search term is looked up; if it exists, the associated image storage path is obtained, the corresponding image is located via that path, and the image and its textual information are presented to the user. This achieves fast and accurate retrieval of the image the user needs, greatly eases searching for half-remembered image content, and improves the user experience. In addition, the trained first convolutional neural network classifies the images and the trained second convolutional neural network recognizes the text in them. Both networks have small network structures, so image classification and text recognition rely on compact yet capable miniature neural networks and can run on a mobile terminal such as a mobile phone or on a small edge device. No communication with a server is required and nothing is uploaded to the cloud, which removes the dependence on communication networks such as 4G, improves availability when there is no network or only a weak one, and, since no large-scale computing service is needed, also reduces the corresponding operation and maintenance costs.
In the convolutional neural network generation method for classifying images according to the present invention, the convolutional neural network is a miniature neural network built by successively stacking processing blocks and max pooling layers according to a preset concatenation rule and then connecting an average pooling layer, a fully connected layer, and a classifier. This ensures that the extracted features are substantially better than hand-designed ones, yielding a marked improvement in recognition accuracy and a large reduction in the error rate. In addition to the convolutional layer, an activation layer may be added to each processing block to mitigate over-fitting. Once training is complete, the trained convolutional neural network can be transplanted to a mobile terminal as an image classification model.
In the convolutional neural network generation method for recognizing text in images according to the present invention, the convolutional neural network is built by connecting the first processing blocks, the first pooling layers, and the second processing block according to a preset first concatenation rule and then connecting the second fully connected layer and the first classifier. This likewise ensures that the extracted features carry rich image information, which helps improve recognition accuracy. A first activation layer may be added to each first processing block and a second activation layer to the second processing block to mitigate over-fitting. Once training is complete, the trained convolutional neural network can be transplanted to a mobile terminal as a text recognition model.
Description of the drawings
To achieve the foregoing and related purposes, certain illustrative aspects are described herein in conjunction with the following description and drawings. These aspects indicate the various ways in which the principles disclosed herein can be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features, and advantages of the disclosure will become apparent from the following detailed description read in conjunction with the drawings. Throughout the disclosure, identical reference numerals generally refer to identical components or elements.
Fig. 1 shows a schematic diagram of a mobile terminal 100 according to an embodiment of the invention;
Fig. 2 shows a flow chart of an image classification method 200 according to an embodiment of the invention;
Fig. 3 shows a structural schematic diagram of a processing block according to an embodiment of the invention;
Fig. 4 shows a structural schematic diagram of a first convolutional neural network according to an embodiment of the invention;
Fig. 5A shows a structural schematic diagram of a first processing block according to an embodiment of the invention;
Fig. 5B shows a structural schematic diagram of a second processing block according to an embodiment of the invention;
Fig. 6 shows a structural schematic diagram of a second convolutional neural network according to an embodiment of the invention;
Fig. 7 shows a schematic diagram of a computing device 700 according to an embodiment of the invention;
Fig. 8 shows a flow chart of a convolutional neural network generation method 800 for classifying images according to an embodiment of the invention; and
Fig. 9 shows a flow chart of a convolutional neural network generation method 900 for recognizing text in images according to an embodiment of the invention.
Detailed description of embodiments
Exemplary embodiments of the disclosure are described more fully below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be realized in various forms and should not be limited by the embodiments set forth here. On the contrary, these embodiments are provided so that the disclosure will be more thoroughly understood and its scope fully conveyed to those skilled in the art.
Fig. 1 is a structural diagram of the mobile terminal 100. The mobile terminal 100 may include a memory interface 102, one or more data processors, image processors and/or central processing units 104, and a peripheral interface 106.
The memory interface 102, the one or more processors 104, and/or the peripheral interface 106 may be either discrete components or integrated in one or more integrated circuits. In the mobile terminal 100, the various elements may be coupled by one or more communication buses or signal lines. Sensors, devices, and subsystems may be coupled to the peripheral interface 106 to help realize a variety of functions.
For example, a motion sensor 110, a light sensor 112, and a range sensor 114 may be coupled to the peripheral interface 106 to facilitate functions such as orientation, illumination, and ranging. Other sensors 116 may likewise be connected to the peripheral interface 106, such as a positioning system (for example a GPS receiver), a temperature sensor, a biometric sensor, or other sensing devices, thereby helping to implement related functions.
A camera subsystem 120 and an optical sensor 122 may be used to facilitate camera functions such as recording photographs and video clips, where the optical sensor may, for example, be a charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) optical sensor. Communication functions may be realized through one or more wireless communication subsystems 124, which may include radio-frequency receivers and transmitters and/or optical (for example infrared) receivers and transmitters. The particular design and embodiment of the wireless communication subsystem 124 may depend on the one or more communication networks supported by the mobile terminal 100. For example, the mobile terminal 100 may include a communication subsystem 124 designed to support LTE, 3G, GSM, GPRS, EDGE, Wi-Fi or WiMax, and Bluetooth™ networks.
An audio subsystem 126 may be coupled with a loudspeaker 128 and a microphone 130 to help implement voice-enabled functions such as speech recognition, speech reproduction, digital recording, and telephony. An I/O subsystem 140 may include a touch screen controller 142 and/or one or more other input controllers 144. The touch screen controller 142 may be coupled to a touch screen 146. For example, the touch screen 146 and the touch screen controller 142 may detect contact and movement or pauses using any of a variety of touch-sensing technologies, including but not limited to capacitive, resistive, infrared, and surface acoustic wave technologies. The one or more other input controllers 144 may be coupled to other input/control devices 148, such as one or more buttons, rocker switches, thumb wheels, infrared ports, USB ports, and/or pointer devices such as a stylus. The one or more buttons (not shown) may include up/down buttons for controlling the volume of the loudspeaker 128 and/or the microphone 130.
The memory interface 102 may be coupled with a memory 150. The memory 150 may include high-speed random access memory and/or non-volatile memory, such as one or more magnetic disk storage devices, one or more optical storage devices, and/or flash memory (for example NAND or NOR). The memory 150 may store an operating system 172, for example Android, iOS, or Windows Phone. The operating system 172 may include instructions for handling basic system services and performing hardware-dependent tasks. The memory 150 may also store programs 174. When the mobile device runs, the operating system 172 may be loaded from the memory 150 and executed by the processor 104. The programs 174 may likewise be loaded from the memory 150 at runtime and executed by the processor 104. The programs 174 run on top of the operating system and use the interfaces provided by the operating system and the underlying hardware to realize various functions desired by the user, such as instant messaging, web browsing, and picture management. A program 174 may be provided independently of the operating system or may be bundled with it. In addition, when a program 174 is installed in the mobile terminal 100, a driver module may be added to the operating system. In some embodiments, the mobile terminal 100 is configured to execute the image classification method according to the present invention, and the one or more programs 174 of the mobile terminal 100 include instructions for executing the image classification method 200 according to the present invention.
Fig. 2 shows a flow chart of an image classification method 200 according to an embodiment of the present invention. The image classification method 200 is suitable for execution in a mobile terminal (e.g., the mobile terminal 100 shown in Fig. 1), the mobile terminal 100 including an image library in which a plurality of images are stored. According to an embodiment of the present invention, the image library of the mobile terminal 100 may be understood as a photo album. The images stored in the album may be photos taken by the user with the camera of the mobile terminal 100, or may be pictures obtained by other means, such as screenshots or saving the page currently displayed on the screen of the mobile terminal 100 as an image; the present invention is not limited in this regard. According to an embodiment of the present invention, 10 images are stored in the image library of the mobile terminal 100, denoted M1, M2, ..., M10, respectively. For ease of description, the method 200 will be described below taking image M1 as an example.
The method 200 starts at step S210. In step S210, each image in the image library is subjected to classification processing to obtain its corresponding category. According to an embodiment of the present invention, a trained first convolutional neural network for classifying images is stored in the mobile terminal 100, and an image may be classified in the following manner to obtain its corresponding image type. First, the image is input into the trained first convolutional neural network for image classification, and the category of the image is then determined according to the output of the first convolutional neural network. For ease of understanding, the process of obtaining the trained first convolutional neural network is described first.
Specifically, a processing block is first constructed, the processing block including a convolutional layer. To control over-fitting, according to an embodiment of the present invention, an activation layer may also be constructed when building the processing block, the activation layer being added after the convolutional layer to form the processing block. Fig. 3 shows a schematic structural diagram of a processing block according to an embodiment of the present invention. As shown in Fig. 3, the processing block includes a convolutional layer and an activation layer connected in sequence. In this embodiment, a ReLU (Rectified Linear Unit) function is used as the activation function of the activation layer to adjust the output of the convolutional layer, so that the output of the next layer is not merely a linear combination of the previous layer and the network can therefore approximate an arbitrary function.
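As a brief illustration of the activation described above, the ReLU function simply passes positive responses through and zeroes out negative ones. A minimal sketch (not part of the patent text):

```python
def relu(x):
    """ReLU activation: keeps positive values, zeroes out negatives, so that
    stacked layers are no longer a purely linear combination of their inputs."""
    return x if x > 0 else 0.0

# Applied element-wise to one row of a convolutional layer's output:
row = [-1.5, 0.0, 2.3, -0.2, 4.0]
activated = [relu(v) for v in row]   # -> [0.0, 0.0, 2.3, 0.0, 4.0]
```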
After the processing block is constructed, a pooling layer, a fully connected layer, and a classifier are constructed respectively. According to an embodiment of the present invention, the pooling layer is either a max pooling layer or an average pooling layer. Pooling exploits the principle of local correlation in images to sub-sample the image, reducing the amount of data to process while retaining useful information.
Next, a first convolutional neural network is constructed from a plurality of processing blocks and pooling layers, combined with the fully connected layer and the classifier; the first convolutional neural network takes a processing block as its input and the classifier as its output. According to an embodiment of the present invention, the first convolutional neural network may be constructed in the following manner. First, according to a preset connection rule, the processing blocks are connected with the max pooling layers, after which an average pooling layer is connected, and the sequentially connected fully connected layer and classifier are then added after the average pooling layer, so as to construct the first convolutional neural network with a processing block as input and the classifier as output. Here, the number of processing blocks is 3, the number of max pooling layers is 2, and the number of average pooling layers is 1.
In this embodiment, 3 processing blocks are connected with 2 max pooling layers according to the preset connection rule, followed by 1 average pooling layer, and the sequentially connected fully connected layer and classifier are added after the average pooling layer, so as to construct the first convolutional neural network with 1 processing block as input and the classifier as output. Fig. 4 shows a schematic structural diagram of the first convolutional neural network according to an embodiment of the present invention. As shown in Fig. 4, in the first convolutional neural network, processing block A1 is the input, followed in sequence by max pooling layer B1, processing block A2, processing block A3, max pooling layer B2, average pooling layer C1, fully connected layer D1, and classifier E1, where classifier E1 is the output. The connection order of the processing units shown in Fig. 4 is arranged according to the preset connection rule. The preset connection rule may be adjusted as appropriate according to the actual application scenario, network training situation, system configuration, performance requirements, and the like; such adjustments will be readily apparent to those skilled in the art who understand the scheme of the present invention, also fall within the protection scope of the present invention, and are not repeated here.
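The unit sequence of Fig. 4 can be written down as a plain ordered list; this is only an illustrative data structure (the unit names A1–E1 follow the description above, while the representation itself is our assumption):

```python
# Unit sequence of the first convolutional neural network per Fig. 4.
# Each entry is (unit_name, unit_kind); a processing block here stands for
# a convolutional layer followed by a ReLU activation layer.
FIRST_CNN = [
    ("A1", "processing_block"),
    ("B1", "max_pool"),
    ("A2", "processing_block"),
    ("A3", "processing_block"),
    ("B2", "max_pool"),
    ("C1", "avg_pool"),
    ("D1", "fully_connected"),
    ("E1", "classifier"),
]

# Count units of each kind, matching the stated 3 / 2 / 1 configuration.
counts = {}
for _, kind in FIRST_CNN:
    counts[kind] = counts.get(kind, 0) + 1
```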
After the first convolutional neural network is constructed, its training begins. The first convolutional neural network is trained according to a pre-obtained image category data set, so that the output of the classifier indicates the category corresponding to the input image. The image category data set includes a plurality of image category information items, each comprising a first image meeting a preset size and the category information corresponding to that first image. According to an embodiment of the present invention, the first convolutional neural network may be trained in the following manner. In this embodiment, for each extracted image category information item, the first image included in the item is the input of the first processing block in the first convolutional neural network and the category information included in the item is the output of the classifier, and the first convolutional neural network is trained accordingly. Here, the preset size is preferably 220px × 220px, the first image is an RGB three-channel image, and the corresponding category information is any one of the animal, building, physical object, landscape, figure, and text categories.
The training process of the first convolutional neural network is described below taking one image category information item X in the image category data set as an example. The image category information item X includes a first image X1 and category information X2 corresponding to the first image; the size of the first image X1 is 220px × 220px, and the category information X2 is the text category. During training, the first image X1 is the input of processing block A1 and the category information X2 is the output of classifier E1, and the first convolutional neural network is trained accordingly.
Table 1 shows an example of parameter settings for processing blocks A1–A3 according to an embodiment of the present invention, and Table 2 shows an example of parameter settings for max pooling layers B1–B2 and average pooling layer C1 according to an embodiment of the present invention. Regarding the value of the boundary zero-padding parameter in Table 1, "0" indicates that no boundary zero-padding is performed, and "1" indicates that a border of 1 pixel unit of zeros is added around each row and each column at the edge of the image input to the convolutional layer. Unless otherwise noted, references to boundary zero-padding below follow this description. The contents of Tables 1 and 2 are as follows:
Processing unit | Convolution kernel size | Boundary zero-padding | Stride | Number of convolution kernels |
Processing block A1 | 5×5 | 0 | 4 | 45 |
Processing block A2 | 1×1 | 0 | 1 | 45 |
Processing block A3 | 3×3 | 1 | 1 | 100 |
Table 1
Processing unit | Pooling block size | Stride |
Max pooling layer B1 | 3×3 | 2 |
Max pooling layer B2 | 3×3 | 2 |
Average pooling layer C1 | 4×4 | 2 |
Table 2
Parameter settings are applied to processing blocks A1–A3 with reference to Table 1 and to max pooling layers B1–B2 and average pooling layer C1 with reference to Table 2, and the first image X1 is processed based on the above parameters. Specifically, the first image X1 is first input into processing block A1; the first image X1 is an RGB three-channel image of size 220px × 220px. The convolutional layer in processing block A1 has 45 convolution kernels, each with 5 × 5 × 3 parameters, which is equivalent to 45 kernels of size 5 × 5 being convolved over the 3 channels with a stride of 4. After convolution by this layer, it follows from ⌊(W − F + 2P)/S⌋ + 1 = ⌊(220 − 5 + 0)/4⌋ + 1 = 54 (where W is the input size, F the kernel size, P the boundary zero-padding, S the stride, and ⌊·⌋ denotes rounding down) that the size of the resulting image is 54px × 54px, i.e., 45 feature maps of size 54px × 54px are obtained. Since the three channels are combined during convolution in this convolutional layer, the input of the activation layer in processing block A1 is 45 single-channel images of 54px × 54px, and after processing by the activation layer, the output of processing block A1 is 45 feature maps of 54px × 54px.
Then, max pooling layer B1 is entered. Max pooling layer B1 uses overlapping max pooling: the 54px × 54px feature map is divided into blocks, the size of each block being 3 × 3 with a stride of 2, and the maximum value of each block is taken as the pixel value of the pooled image. It follows from ⌊(54 − 3)/2⌋ + 1 = 26 that the feature map size after pooling is 26px × 26px, so after max pooling layer B1, 45 feature maps of 26px × 26px are obtained.
Next, the 45 feature maps of 26px × 26px output by max pooling layer B1 are input into processing block A2. The convolutional layer in processing block A2 has 45 convolution kernels, each with 1 × 1 parameters, equivalent to 45 kernels of size 1 × 1 being convolved with a stride of 1. It follows from ⌊(26 − 1)/1⌋ + 1 = 26 that the size of the resulting image is 26px × 26px, i.e., 45 feature maps of size 26px × 26px are obtained. After processing by the activation layer in processing block A2, the output of processing block A2 is 45 feature maps of 26px × 26px. These 45 feature maps of 26px × 26px are then input into processing block A3. The convolutional layer in processing block A3 has 100 convolution kernels, each with 3 × 3 parameters, equivalent to 100 kernels of size 3 × 3 being convolved with a stride of 1. Each row and each column within 1 pixel unit of the edge of the feature map input to this convolutional layer is padded with 0, so after convolution by this layer, it follows from ⌊(26 − 3 + 2 × 1)/1⌋ + 1 = 26 that the size of the resulting image is 26px × 26px, i.e., 100 feature maps of size 26px × 26px are obtained. After processing by the activation layer in processing block A3, the output of processing block A3 is 100 feature maps of 26px × 26px.
At this point, after the 100 feature maps of 26px × 26px output by processing block A3 are processed by max pooling layer B2, it follows from ⌊(26 − 3)/2⌋ + 1 = 12 that the output of max pooling layer B2 is 100 feature maps of 12px × 12px. These 100 feature maps of 12px × 12px serve as the input of average pooling layer C1. Average pooling layer C1 uses overlapping average pooling: the 12px × 12px feature map is divided into blocks, the size of each block being 4 × 4 with a stride of 2, and the average value of each block is taken as the pixel value of the pooled image. It follows from ⌊(12 − 4)/2⌋ + 1 = 5 that the feature map size after pooling is 5px × 5px, so after average pooling layer C1, 100 feature maps of 5px × 5px are obtained. Thereafter, fully connected layer D1 is entered. Since recognizing the category of an image is a multi-class classification problem, and in this embodiment the image category is any one of the 6 types animal, building, physical object, landscape, figure, and text, the fully connected layer D1 also has 6 outputs, corresponding to the probabilities of occurrence of the 6 categories. Classifier E1 is a softmax classifier, whose output is the category with the largest probability, and this category is the category information X2 corresponding to the first image X1. The softmax classifier is a mature technique and is not repeated here. To train the first convolutional neural network, the output of classifier E1 is adjusted against the expected result that the category information X2 corresponding to the input first image X1 is the text category, and each parameter in the first convolutional neural network is adjusted by back-propagation with error minimization. After training with a large number of image category information items in the image category data set, the trained first convolutional neural network is obtained.
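The feature-map sizes derived step by step above all follow the single formula ⌊(W − F + 2P)/S⌋ + 1. A sketch that replays the Table 1/Table 2 pipeline (the layer parameters are as given in the tables; the helper function and its name are ours):

```python
def out_size(w, f, s, p=0):
    """Spatial output size of a conv/pool layer:
    floor((W - F + 2P) / S) + 1, as used throughout the embodiment."""
    return (w - f + 2 * p) // s + 1

w = 220                    # first image X1 is 220px x 220px
w = out_size(w, 5, 4)      # processing block A1: 5x5, stride 4        -> 54
w = out_size(w, 3, 2)      # max pooling layer B1: 3x3, stride 2       -> 26
w = out_size(w, 1, 1)      # processing block A2: 1x1, stride 1        -> 26
w = out_size(w, 3, 1, 1)   # processing block A3: 3x3, stride 1, pad 1 -> 26
w = out_size(w, 3, 2)      # max pooling layer B2: 3x3, stride 2       -> 12
w = out_size(w, 4, 2)      # average pooling layer C1: 4x4, stride 2   -> 5
```

The final value of 5 matches the 100 feature maps of 5px × 5px that feed fully connected layer D1.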
In addition, the image category data set used to train the first convolutional neural network needs to be generated in advance. According to another embodiment of the present invention, the image category data set may be generated in advance in the following manner. First, image processing is performed on each picture to be processed to obtain a first image that corresponds to the picture and meets the preset size. Here, the preset size is 220px × 220px. When processing a picture to be processed, the shortest side of the picture is typically taken as the reference and adjusted to 224px; for example, a 112px × 200px picture to be processed is adjusted to a size of 224px × 400px, and the adjusted picture is then cropped from the middle to obtain the corresponding first image of 220px × 220px. After the first image corresponding to each picture to be processed and meeting the preset size is obtained, for each first image, the category information associated with its corresponding picture to be processed is acquired, the corresponding image category information item is generated from the category information and the first image, and the image category information items are collected to form the image category data set.
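The shortest-side resize followed by a centre crop described above can be sketched as pure arithmetic (the 224px short side and 220px × 220px crop are as stated; the function names and the returned crop-box convention are ours):

```python
def shortest_side_resize(w, h, target=224):
    """Scale so the shortest side becomes `target`, preserving aspect ratio."""
    scale = target / min(w, h)
    return round(w * scale), round(h * scale)

def center_crop_box(w, h, size=220):
    """Top-left and bottom-right corners of a centred size x size crop."""
    left = (w - size) // 2
    top = (h - size) // 2
    return left, top, left + size, top + size

# Example from the description: a 112px x 200px picture to be processed.
new_w, new_h = shortest_side_resize(112, 200)   # -> (224, 400)
box = center_crop_box(new_w, new_h)             # -> (2, 90, 222, 310)
```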
Based on this, according to an embodiment of the present invention, image M1 is input into the trained first convolutional neural network for image classification. In the trained first convolutional neural network, the output of classifier E1 is 6 probability values; the largest probability value is 0.77 and is the 6th output of classifier E1, whose corresponding category is the text category, so it can be determined that the category corresponding to image M1 is the text category.
Next, step S220 is performed: if the category is the text category, text recognition is performed on the image to extract the text information contained in the image. According to an embodiment of the present invention, text recognition may be performed on the image in the following manner to extract the text information contained in the image. First, the character image region corresponding to each single character contained in the image is obtained; text recognition is then performed on each character image region respectively to determine the character contained in each character image region; and finally the text information corresponding to the image is generated based on the characters.
Since it follows from step S210 that the category of image M1 is the text category, the character image region corresponding to each single character contained in image M1 is first obtained, and text recognition is then performed on each character image region respectively to determine the character contained in each character image region. In this embodiment, a trained second convolutional neural network for recognizing the characters in an image is stored in the mobile terminal 100, and text recognition may be performed on image M1 in the following manner to extract the text information contained in the image. First, the character image region corresponding to each single character contained in image M1 is obtained; each character image region is input into the trained second convolutional neural network for text recognition; the character contained in each character image region is then determined according to the output of the second convolutional neural network; and the text information corresponding to the image is generated based on the characters.
Since image M1 contains 3 single characters, the character image regions corresponding to these 3 single characters are denoted Q1, Q2, and Q3, and text recognition needs to be performed on character image regions Q1, Q2, and Q3 respectively to determine the characters they contain. The text recognition process is explained below taking character image region Q1 as an example. For ease of understanding, however, the process of obtaining the trained second convolutional neural network is described first.
Specifically, a first processing block is first constructed, the first processing block including a first convolutional layer. To control over-fitting, according to an embodiment of the present invention, a first activation layer may also be constructed when building the first processing block, the first activation layer being added after the first convolutional layer to form the first processing block. Fig. 5A shows a schematic structural diagram of the first processing block according to an embodiment of the present invention. As shown in Fig. 5A, the first processing block includes a first convolutional layer and a first activation layer connected in sequence. In this embodiment, a ReLU (Rectified Linear Unit) function is used as the activation function of the first activation layer to adjust the output of the first convolutional layer, so that the output of the next layer is not merely a linear combination of the previous layer and the network can therefore approximate an arbitrary function.
A second processing block is then constructed, the second processing block including a first fully connected layer. To control over-fitting, according to an embodiment of the present invention, a second activation layer may also be constructed when building the second processing block, the second activation layer being added after the first fully connected layer to form the second processing block. Fig. 5B shows a schematic structural diagram of the second processing block according to an embodiment of the present invention. As shown in Fig. 5B, the second processing block includes a first fully connected layer and a second activation layer connected in sequence. In this embodiment, a ReLU (Rectified Linear Unit) function is used as the activation function of the second activation layer to adjust the output of the first fully connected layer, so that the output of the next layer is not merely a linear combination of the previous layer and the network can therefore approximate an arbitrary function.
After the first processing block and the second processing block are constructed, a first pooling layer, a second fully connected layer, and a first classifier are constructed respectively. According to an embodiment of the present invention, the first pooling layer is a max pooling layer.
Then, a second convolutional neural network is constructed from one or more first processing blocks, first pooling layers, and the second processing block, combined with the second fully connected layer and the first classifier; the second convolutional neural network takes a first processing block as its input and the first classifier as its output. According to an embodiment of the present invention, the second convolutional neural network may be constructed in the following manner. First, according to a preset first connection rule, the first processing blocks and first pooling layers are connected with the second processing block, the second fully connected layer is then connected, and the first classifier is added after the second fully connected layer, so as to construct the second convolutional neural network with a first processing block as input and the first classifier as output. Here, the number of first processing blocks is 5, the number of second processing blocks is 1, and the number of first pooling layers is 3.
In this embodiment, 5 first processing blocks, 3 first pooling layers, and 1 second processing block are connected according to the preset first connection rule, the second fully connected layer is connected afterwards, and the first classifier is added after the second fully connected layer, so as to construct the second convolutional neural network with 1 first processing block as input and the first classifier as output. Fig. 6 shows a schematic structural diagram of the second convolutional neural network according to an embodiment of the present invention. As shown in Fig. 6, in the second convolutional neural network, first processing block F1 is the input, followed in sequence by first pooling layer G1, first processing block F2, first pooling layer G2, first processing block F3, first processing block F4, first processing block F5, first pooling layer G3, second processing block H1, second fully connected layer J1, and first classifier K1, where the first classifier K1 is the output. The connection order of the processing units shown in Fig. 6 is arranged according to the preset first connection rule. The preset first connection rule may be adjusted as appropriate according to the actual application scenario, network training situation, system configuration, performance requirements, and the like; such adjustments will be readily apparent to those skilled in the art who understand the scheme of the present invention, also fall within the protection scope of the present invention, and are not repeated here.
After the second convolutional neural network is constructed, its training begins. The second convolutional neural network is trained according to a pre-obtained character image data set, so that the output of the first classifier indicates the character contained in the input image. The character image data set includes a plurality of character image information items, each comprising a character image meeting a first preset size and the text information contained in that character image. According to an embodiment of the present invention, the second convolutional neural network may be trained in the following manner. In this embodiment, for each extracted character image information item, the character image included in the item is the input of the first of the first processing blocks in the second convolutional neural network and the text information included in the item is the output of the first classifier, and the second convolutional neural network is trained accordingly. Here, the first preset size is preferably 114px × 114px, the character image is a single-channel image, and the corresponding text information is a single character, the single character being any one of the numeric class, the alphabetic class, and the Chinese-character class. The numeric class includes the 10 digits 0–9; the alphabetic class includes the 26 lowercase English letters a–z and the 26 uppercase English letters A–Z; and the Chinese-character class includes the 3755 first-level Chinese characters of the GB 2312 (Chinese Character Set Code for Information Interchange) standard. The text information is therefore any one of 10 + 26 × 2 + 3755 = 3817 single characters.
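The size of the output layer follows directly from the three character classes above; a short check of the arithmetic (the 3755 level-1 count is taken from the description, the list construction is ours):

```python
import string

digits = list(string.digits)             # the 10 digits 0-9
letters = list(string.ascii_letters)     # a-z and A-Z, 52 letters in total
NUM_FIRST_LEVEL_HANZI = 3755             # first-level Chinese characters of GB 2312

# Total number of possible single characters the classifier must distinguish.
num_classes = len(digits) + len(letters) + NUM_FIRST_LEVEL_HANZI
```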
The training process of the second convolutional neural network is described below taking one character image information item Y in the character image data set as an example. The character image information item Y includes a character image Y1 and text information Y2 corresponding to the character image; the size of the character image Y1 is 114px × 114px, and the text information Y2 is a Chinese-character-class character, rendered in English as "spoon". During training, the character image Y1 is the input of first processing block F1 and the text information Y2 is the output of the first classifier K1, and the second convolutional neural network is trained accordingly.
Table 3 shows an example of parameter settings for first processing blocks F1–F5 according to an embodiment of the present invention, and Table 4 shows an example of parameter settings for first pooling layers G1–G3 according to an embodiment of the present invention. The contents of Tables 3 and 4 are as follows:
Processing unit | Convolution kernel size | Boundary zero-padding | Stride | Number of convolution kernels |
First processing block F1 | 11×11 | 0 | 4 | 96 |
First processing block F2 | 5×5 | 1 | 1 | 256 |
First processing block F3 | 3×3 | 1 | 1 | 384 |
First processing block F4 | 3×3 | 1 | 1 | 384 |
First processing block F5 | 3×3 | 1 | 1 | 256 |
Table 3
Processing unit | Pooling block size | Stride |
First pooling layer G1 | 3×3 | 2 |
First pooling layer G2 | 3×3 | 2 |
First pooling layer G3 | 3×3 | 2 |
Table 4
Parameter settings are applied to first processing blocks F1–F5 with reference to Table 3 and to first pooling layers G1–G3 with reference to Table 4, and the character image Y1 is processed based on the above parameters. After the character image Y1 is input into first processing block F1 and undergoes the relevant processing of the subsequent processing units, the output of first pooling layer G3 is 256 feature maps of 3px × 3px. It should be noted that the processing of images by first processing blocks F1–F5 can refer to the processing procedures of processing blocks A2 and A3 above, and the processing of images by first pooling layers G1–G3 can refer to the processing procedures of max pooling layers B1 and B2 above; they differ only in parameter settings, such as the number and size of convolution kernels, pooling block size, stride, and whether boundary zero-padding is applied, and are not repeated here.
Next, the output of first pooling layer G3 is input into second processing block H1, which includes a first fully connected layer and a second activation layer connected in sequence. After the above 256 feature maps of 3px × 3px pass through the first fully connected layer of second processing block H1, 4096 feature maps of 1px × 1px are obtained. Since a 1px × 1px feature map actually has only 1 pixel value, the output of the first fully connected layer can be regarded as a 1 × 4096 feature vector. These 4096 feature maps of 1px × 1px are input into the activation layer in second processing block H1, and after processing by the activation layer, the output of second processing block H1 is 4096 feature maps of 1px × 1px.
Finally, the second fully connected layer J1 is entered; after the output of second processing block H1 is processed by the second fully connected layer J1, 4096 feature maps of 1px × 1px are obtained. Since recognizing characters is a multi-class classification problem, and in this embodiment the text information is any one of 3817 single characters, the first classifier K1 also has 3817 outputs, corresponding to the probabilities of occurrence of the 3817 single characters; a softmax classifier is used, whose output is the single character with the largest probability, and this single character is the text information Y2 corresponding to the character image Y1. To train the second convolutional neural network, the output of the first classifier K1 is adjusted against the expected result that the text information Y2 corresponding to the input character image Y1 is the character for "spoon", and each parameter in the second convolutional neural network is adjusted by back-propagation with error minimization. After training with a large number of character image information items in the character image data set, the trained second convolutional neural network is obtained.
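The "adjust the classifier output toward the expected result by back-propagation with error minimization" step can be illustrated with the textbook softmax/cross-entropy gradient. This is a generic toy sketch over 4 classes rather than 3817, updating raw scores directly rather than network weights; it is not the patent's actual optimizer:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def grad_step(scores, target_idx, lr=0.5):
    """One gradient step on the scores: for softmax with cross-entropy loss,
    d(loss)/d(score_i) = p_i - 1[i == target]."""
    probs = softmax(scores)
    return [s - lr * (p - (1.0 if i == target_idx else 0.0))
            for i, (s, p) in enumerate(zip(scores, probs))]

scores = [0.0, 0.0, 0.0, 0.0]            # toy logits for 4 classes
for _ in range(50):                      # repeatedly minimize the error
    scores = grad_step(scores, target_idx=2)

# After these updates, class 2 carries the largest probability.
```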
In addition, the character image data set used to train the second convolutional neural network needs to be generated in advance. According to another embodiment of the present invention, the character image data set may be generated in advance in the following manner. First, image processing is performed on each character picture to be processed to obtain a character image that corresponds to the character picture and meets the first preset size. Here, the first preset size is 114px × 114px; when processing a character picture to be processed, the picture is typically scaled to the first preset size to form the corresponding character image. Afterwards, for each character image, the text information associated with its corresponding character picture to be processed is acquired, the corresponding character image information item is generated from the text information and the character image, and the character image information items are collected to form the character image data set.
Based on this, according to one embodiment of the present invention, character image region Q1 is input into the trained second convolutional neural network for text recognition. Since the input of the second convolutional neural network is a single-channel image, character image region Q1 is usually first subjected to gray-scale processing: the original RGB three-channel image is converted into a gray-level image to generate the corresponding single-channel image, which is then input into the trained second convolutional neural network. Accordingly, after gray-scale processing is performed on character image region Q1, its corresponding single-channel image, character image region R1, is obtained. After character image region R1 is processed by the trained second convolutional neural network, the output of the first classifier K1 is 3817 probability values; the maximum probability value is 0.63 and lies at the 965th output of the first classifier K1, whose associated word is "small", so it can be determined that the word contained in character image region Q1 is "small". Likewise, based on the above processing procedure, it can be determined that the words contained in character image regions Q2 and Q3 are "sesame" and "fiber crops" respectively.
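The gray-scale conversion and the probability-to-word lookup described above can be sketched as follows. This is a minimal illustration only: the ITU-R 601 luma weights, the direct use of 965 as a zero-based index, and the `vocab` lookup table are assumptions, not values taken from the patent.

```python
import numpy as np

def to_single_channel(rgb):
    """Convert an H x W x 3 RGB region to a single-channel gray image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b  # ITU-R 601 luma weights

def decode_word(probabilities, vocab):
    """Pick the word whose classifier output has the highest probability."""
    index = int(np.argmax(probabilities))
    return vocab[index], float(probabilities[index])

# Toy example: a 3817-way output whose maximum, 0.63, sits at output 965.
probs = np.full(3817, (1.0 - 0.63) / 3816)
probs[965] = 0.63
vocab = {i: f"word_{i}" for i in range(3817)}
vocab[965] = "small"  # hypothetical mapping of output 965 to the word "small"
word, p = decode_word(probs, vocab)
```

In a real deployment the `vocab` table would be part of the word data shipped with the model, mapping each classifier output to one of the 3817 supported words.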
After the word contained in each character image region of image M1 has been obtained, the text message corresponding to image M1 needs to be generated from these words. According to one embodiment of the present invention, the text message corresponding to the image can be generated from the words in the following way. First, the positional relationships among the character image regions in the image are obtained; the corresponding words of the character image regions are then combined according to those positional relationships to generate the text message corresponding to the image. In this embodiment, the positional relationships among character image regions Q1, Q2 and Q3 are obtained first. The positional relationship here is not limited to coordinate position; it also covers front-rear order, overlap, and the like. Character image regions Q1, Q2 and Q3 are found to be side by side in sequential order, and semantic-association technology is then used to combine "small", "sesame" and "fiber crops", yielding "small sesame" as the text message corresponding to image M1. It should be noted that the division of an image into character image regions and their acquisition, as well as the generation of the text message according to location information and semantic-association technology, can rely on existing mature techniques and are not repeated here.
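The position-based combination step above can be sketched with a simple left-to-right, top-to-bottom ordering. This stands in for the full positional-relationship analysis and omits the semantic-association step entirely; the letter placeholders merely mark three side-by-side regions such as Q1, Q2 and Q3.

```python
def combine_regions(regions):
    """Combine recognized words by left-to-right, top-to-bottom position.

    `regions` is a list of (x, y, word) tuples, where (x, y) is the
    top-left corner of each character image region.
    """
    ordered = sorted(regions, key=lambda r: (r[1], r[0]))  # row, then column
    return "".join(word for _, _, word in ordered)

# Three side-by-side regions, given out of order:
regions = [(40, 10, "B"), (10, 10, "A"), (70, 10, "C")]
text = combine_regions(regions)
```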
Finally, in step S230, the text message is stored in association with the image store path and the image name of the image. According to one embodiment of the present invention, the image store path of image M1 is known to be /storage/emulated/0/DCIM/Camera/IMG_20171213_185253.jpg and its image name is IMG_20171213_185253.jpg. The text message "small sesame" is stored in association with the image store path and the image name of image M1, for example in the memory 150 of the mobile terminal 100. It is worth noting that if the generated text message includes multiple different pieces of content, a symbol such as an underscore can be used to separate them, for example "small sesame_7.59 yuan/jin".
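The separator convention and the associated storage of step S230 can be sketched as follows. The in-memory list is only a stand-in for whatever store the mobile terminal's memory 150 actually provides (a database table, for instance).

```python
def make_text_message(parts, separator="_"):
    """Join multiple recognized contents with a separator symbol."""
    return separator.join(parts)

class TextImageIndex:
    """Associate a text message with an image's store path and name."""

    def __init__(self):
        self._records = []  # (text_message, store_path, image_name)

    def store(self, text_message, store_path, image_name):
        self._records.append((text_message, store_path, image_name))

    def records(self):
        return list(self._records)

index = TextImageIndex()
msg = make_text_message(["small sesame", "7.59 yuan/jin"])
index.store(msg,
            "/storage/emulated/0/DCIM/Camera/IMG_20171213_185253.jpg",
            "IMG_20171213_185253.jpg")
```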
In practical applications, the image classification model based on the above trained first convolutional neural network and the text recognition model based on the trained second convolutional neural network are typically encapsulated in mobile applications involving picture storage, query functions and the like, such as camera applications and mobile phone photo albums. Before such a mobile application is downloaded and installed, or during system configuration in mobile terminal manufacture, the image classification model, the text recognition model, the category data and the word data are deployed directly on the mobile terminal 100. The occupied storage space is small, memory usage is low, and the models have high recognition precision, accuracy and response speed, providing the user with a better experience.
After the text message has been stored in association with the image store path and the image name of the corresponding image, this association can be used to quickly and accurately show the user the image and the text message related to a search term they key in. According to another embodiment of the present invention, when a search term keyed in by the user is received, a search is first made, according to the search term, for a text message identical or similar to it. If such a text message exists, the image store path associated with that text message is obtained, the corresponding image is found according to the image store path, and the image and the text message are shown to the user. In this embodiment, the search term keyed in by the user is "bank". A similar text message is found according to the search term, namely "promote trade and investment Bank_all-purpose card_622588120816xxxx_Unionpay", which contains the word "bank". Next, the image store path associated with this text message is obtained as /storage/emulated/0/DCIM/Camera/IMG_20171210_185214.jpg, the corresponding image, image M2, is found according to the image store path, and image M2 and the text message are shown to the user.
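The lookup in this embodiment can be sketched as below. A case-insensitive substring test stands in for the "identical or similar" matching, which in practice could be anything from exact matching to fuzzy or semantic search.

```python
def search(records, term):
    """Return (store_path, text_message) pairs whose text contains the term.

    `records` is a list of (text_message, store_path) pairs.
    """
    term = term.lower()
    return [(path, text) for text, path in records if term in text.lower()]

records = [
    ("small sesame_7.59 yuan/jin",
     "/storage/emulated/0/DCIM/Camera/IMG_20171213_185253.jpg"),
    ("promote trade and investment Bank_all-purpose card_"
     "622588120816xxxx_Unionpay",
     "/storage/emulated/0/DCIM/Camera/IMG_20171210_185214.jpg"),
]
hits = search(records, "bank")  # matches only the image M2 record
```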
Fig. 7 is a block diagram of an example computing device 700. In a basic configuration 702, the computing device 700 typically comprises a system memory 706 and one or more processors 704. A memory bus 708 can be used for communication between the processor 704 and the system memory 706.
Depending on the desired configuration, the processor 704 can be any type of processor, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination of them. The processor 704 may include caches of one or more levels, such as a level-1 cache 710 and a level-2 cache 712, a processor core 714 and registers 716. An example processor core 714 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination of them. An example memory controller 718 can be used together with the processor 704, or in some implementations the memory controller 718 can be an internal part of the processor 704.
Depending on the desired configuration, the system memory 706 can be any type of memory, including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or any combination of them. The system memory 706 may include an operating system 720, one or more programs 722 and program data 724. In some embodiments, the programs 722 may be arranged to be executed by the one or more processors 704 on the operating system using the program data 724.
The computing device 700 can also include an interface bus 740 that facilitates communication from various interface devices (for example, output devices 742, peripheral interfaces 744 and communication devices 746) to the basic configuration 702 via a bus/interface controller 730. Example output devices 742 include a graphics processing unit 748 and an audio processing unit 750, which can be configured to facilitate communication with various external devices such as a display or loudspeakers via one or more A/V ports 752. Example peripheral interfaces 744 may include a serial interface controller 754 and a parallel interface controller 756, which can be configured to facilitate communication, via one or more I/O ports 758, with external devices such as input devices (for example, a keyboard, mouse, pen, voice input device or touch input device) or other peripherals (such as a printer or scanner). An example communication device 746 may include a network controller 760, which can be arranged to facilitate communication with one or more other computing devices 762 over a network communication link via one or more communication ports 764.
A network communication link can be an example of a communication medium. Communication media can typically be embodied as computer-readable instructions, data structures or program modules in a modulated data signal such as a carrier wave or another transmission mechanism, and can include any information delivery medium. A "modulated data signal" can be a signal in which one or more of its characteristics are set or changed in such a manner as to encode information in the signal. As a non-limiting example, communication media can include wired media such as a cable network or a private-line network, and various wireless media such as sound, radio frequency (RF), microwave, infrared (IR) or other wireless media. The term computer-readable medium as used herein may include both storage media and communication media.
The computing device 700 can be implemented as a server, for example a file server, a database server, an application server or a WEB server, or as a part of a small-size portable (or mobile) electronic device, such as a cellular phone, a personal digital assistant (PDA), a personal media player device, a wireless web-browsing device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. The computing device 700 can also be implemented as a personal computer, including desktop computer and notebook computer configurations.
In some embodiments, the computing device 700 is configured to execute the convolutional neural network generation method for performing classification processing on images and/or the convolutional neural network generation method for identifying the words in an image according to the present invention. The one or more programs 722 of the computing device 700 include instructions for executing the convolutional neural network generation method 800 for performing classification processing on images and/or the convolutional neural network generation method 900 for identifying the words in an image according to the present invention.
Fig. 8 shows the flow chart of a convolutional neural network generation method 800 for performing classification processing on images according to an embodiment of the present invention. The convolutional neural network generation method 800 for performing classification processing on images is suitable to be executed in a computing device (such as the computing device 700 shown in Fig. 7).
As shown in Fig. 8, method 800 starts at step S810. In step S810, a process block is built; the process block includes a convolutional layer. According to one embodiment of the present invention, the process block can be built in the following way. First, an activation layer is built; the activation layer is then added after the convolutional layer to form the process block.
Then step S820 is entered, in which a pooling layer, a fully connected layer and a classifier are built respectively. The pooling layer is either a maximum pooling layer or an average pooling layer.
Next, in step S830, a convolutional neural network is built from the multiple process blocks and pooling layers combined with the fully connected layer and the classifier; the convolutional neural network takes a process block as its input and the classifier as its output. According to one embodiment of the present invention, the convolutional neural network can be built from the multiple process blocks and pooling layers, combined with the fully connected layer and the classifier, in the following way. In this embodiment, according to a preset concatenation rule, each process block is first connected with a maximum pooling layer, the average pooling layer is then connected, and the sequentially connected fully connected layer and classifier are added after the average pooling layer, so as to build the convolutional neural network that takes a process block as input and the classifier as output. Here the quantity of process blocks is 3, the quantity of maximum pooling layers is 2, and the quantity of average pooling layers is 1.
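With the stated counts (3 process blocks, 2 maximum pooling layers, 1 average pooling layer), one plausible concatenation is block, pool, block, pool, block, average pool, fully connected layer, classifier. A minimal PyTorch sketch of that topology follows; the channel widths, the 3 × 224 × 224 input size, the ReLU activation and the six-way linear classifier are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

def process_block(in_ch, out_ch):
    """A process block: a convolutional layer followed by an activation layer."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

# block -> max pool -> block -> max pool -> block -> average pool -> FC -> classifier
net = nn.Sequential(
    process_block(3, 16),
    nn.MaxPool2d(2),
    process_block(16, 32),
    nn.MaxPool2d(2),
    process_block(32, 64),
    nn.AdaptiveAvgPool2d(1),  # the single average pooling layer
    nn.Flatten(),
    nn.Linear(64, 32),        # fully connected layer
    nn.Linear(32, 6),         # classifier: six categories (animal, building, ...)
)

logits = net(torch.zeros(1, 3, 224, 224))  # one dummy RGB image
```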
Finally, step S840 is executed, in which the convolutional neural network is trained according to an image category data set obtained in advance, so that the output of the classifier indicates the category corresponding to the input picture. The image category data set includes multiple pieces of image category information, and each piece of image category information includes a first image meeting the pre-set dimension and the classification information corresponding to that first image. According to one embodiment of the present invention, the convolutional neural network can be trained in the following way. Specifically, for each piece of image category information extracted, the first image included in that image category information is taken as the input of the first process block in the convolutional neural network, the classification information included in that image category information is taken as the output of the classifier, and the convolutional neural network is trained accordingly. The classification information is any one of the animal class, building class, physical-object class, landscape class, figure class and text class.
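A single supervised training step matching this input/output pairing might look as follows. This is a sketch only: the deliberately tiny stand-in model, the 8 × 8 input, the SGD optimizer, the learning rate and the choice of index 5 for the text class are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A deliberately tiny stand-in for the classification network.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 6))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# One (first image, classification information) pair from the data set.
first_image = torch.rand(1, 3, 8, 8)
category = torch.tensor([5])  # e.g. index 5 stands for the text class

def train_step(image, label):
    """The first image is the network input; the category is the target output."""
    optimizer.zero_grad()
    loss = loss_fn(model(image), label)
    loss.backward()
    optimizer.step()
    return loss.item()

losses = [train_step(first_image, category) for _ in range(20)]
```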
The image category data set used to train the convolutional neural network according to the present invention needs to be generated in advance. In another embodiment, the image category data set can be generated in advance in the following way. Image processing is performed on each picture to be processed, so as to obtain, for each such picture, a corresponding first image meeting the pre-set dimension. For each first image meeting the pre-set dimension, the classification information associated with its corresponding picture to be processed is obtained, the corresponding image category information is generated from that classification information and the first image, and all image category information is collected to form the image category data set.
It should be noted that the process of generating the convolutional neural network for performing classification processing on images in the above steps S810~S840, and the process of generating in advance the image category data set for training that convolutional neural network, follow the same processing details and embodiments as the related content concerning the first convolutional neural network in step S210 of method 200, which is not described again here.
Fig. 9 shows the flow chart of a convolutional neural network generation method 900 for identifying the words in an image according to an embodiment of the present invention. The convolutional neural network generation method 900 for identifying the words in an image is suitable to be executed in a computing device (such as the computing device 700 shown in Fig. 7).
As shown in Fig. 9, method 900 starts at step S910. In step S910, a first process block is built; the first process block includes a first convolutional layer. According to one embodiment of the present invention, the first process block can be built in the following way. First, a first activation layer is built; the first activation layer is then added after the first convolutional layer to form the first process block.
In step S920, a second processing block is built; the second processing block includes a first fully connected layer. According to one embodiment of the present invention, the second processing block can be built in the following way. First, a second activation layer is built; the second activation layer is then added after the first fully connected layer to form the second processing block.
Then step S930 is entered, in which a first pooling layer, a second fully connected layer and a first classifier are built respectively. The first pooling layer is a maximum pooling layer.
Next, in step S940, a convolutional neural network is built from one or more first process blocks, first pooling layers and the second processing block, combined with the second fully connected layer and the first classifier; the convolutional neural network takes a first process block as its input and the first classifier as its output. According to one embodiment of the present invention, the convolutional neural network can be built in the following way. In this embodiment, according to a preset first concatenation rule, the first process blocks, the first pooling layers and the second processing block are connected, the second fully connected layer is then connected, and the first classifier is added after the second fully connected layer, so as to build the convolutional neural network that takes a first process block as input and the first classifier as output. Here the quantity of first process blocks is 5, the quantity of second processing blocks is 1, and the quantity of first pooling layers is 3.
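Combining the stated counts with the 114px × 114px single-channel character input described earlier, one plausible arrangement interleaves the 3 maximum pooling layers among the 5 first process blocks before the second processing block, the second fully connected layer and the 3817-way first classifier. The PyTorch sketch below uses that arrangement; the channel widths, layer ordering, ReLU activations and fully-connected sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

def first_process_block(in_ch, out_ch):
    """A first process block: first convolutional layer + first activation layer."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU())

net = nn.Sequential(
    first_process_block(1, 8),
    nn.MaxPool2d(2),                   # 114 -> 57
    first_process_block(8, 16),
    nn.MaxPool2d(2),                   # 57 -> 28
    first_process_block(16, 32),
    nn.MaxPool2d(2),                   # 28 -> 14
    first_process_block(32, 32),
    first_process_block(32, 32),
    nn.Flatten(),
    # second processing block: first fully connected layer + second activation layer
    nn.Sequential(nn.Linear(32 * 14 * 14, 256), nn.ReLU()),
    nn.Linear(256, 128),               # second fully connected layer
    nn.Linear(128, 3817),              # first classifier K1: 3817 outputs
)

# One dummy single-channel 114 x 114 character image region.
probs = net(torch.zeros(1, 1, 114, 114)).softmax(dim=1)
```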
Finally, step S950 is executed, in which the convolutional neural network is trained according to a text image data set obtained in advance, so that the output of the first classifier indicates the word included in the input picture. The text image data set includes multiple pieces of character image information, and each piece of character image information includes a character image meeting the first pre-set dimension and the text information included in that character image. According to one embodiment of the present invention, the convolutional neural network can be trained in the following way. Specifically, for each piece of character image information extracted, the character image included in that character image information is taken as the input of the first of the first process blocks in the convolutional neural network, the text information included in that character image information is taken as the output of the first classifier, and the convolutional neural network is trained accordingly. The text information is a single word, and the single word is any one of a numeric-class word, an alphabetic-class word and a Chinese-character-class word.
The text image data set used to train the convolutional neural network according to the present invention needs to be generated in advance. In another embodiment, the text image data set can be generated in advance in the following way. Image processing is performed on each word picture to be processed, so as to obtain, for each such word picture, a corresponding character image meeting the first pre-set dimension. For each character image, the text information associated with its corresponding word picture to be processed is obtained, the corresponding character image information is generated from that text information and the character image, and all character image information is collected to form the text image data set.
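The "image processing" step, zooming each word picture to the 114px × 114px first pre-set dimension, can be sketched with a simple nearest-neighbour rescale. This is a stand-in for whatever interpolation a real implementation would use; a production pipeline would more likely call an image library's resize routine.

```python
import numpy as np

FIRST_PRESET = (114, 114)  # the first pre-set dimension, in pixels

def zoom_to_preset(picture, size=FIRST_PRESET):
    """Nearest-neighbour zoom of a 2-D gray word picture to the pre-set dimension."""
    h, w = picture.shape
    th, tw = size
    rows = np.arange(th) * h // th   # source row for each target row
    cols = np.arange(tw) * w // tw   # source column for each target column
    return picture[rows][:, cols]

# Zoom an arbitrary 50 x 30 word picture up to the first pre-set dimension.
character_image = zoom_to_preset(np.arange(50 * 30).reshape(50, 30))
```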
It should be noted that the process of generating the convolutional neural network for identifying the words in an image in the above steps S910~S950, and the process of generating in advance the text image data set for training that convolutional neural network, follow the same processing details and embodiments as the related content concerning the second convolutional neural network in step S220 of method 200, which is not described again here.
Existing image classification algorithms typically divide the images in the photo album of a mobile terminal into categories and manage the images by category, but perform no further operation, so quickly locating a particular image carrying specific information cannot be achieved. In the image classification method according to the embodiments of the present invention, each image in the image library is first classified to obtain its corresponding category; if the category is the text class, text recognition is performed on the image to extract the text message it contains, and the text message is stored in association with the image store path and the image name of the image. In the above scheme, when a search term keyed in by the user is received, whether an identical or similar text message exists is looked up according to the search term; if one exists, the image store path associated with that text message is obtained, the corresponding image is found according to the image store path, and the image and the text message are shown to the user. This realizes quick and precise positioning of the image the user needs, greatly facilitates the user's search for vaguely remembered image content, and improves the usage experience. In addition, the trained first convolutional neural network is used to classify images and the trained second convolutional neural network is used to identify the words in images. Since both the first convolutional neural network and the second convolutional neural network have small network structures, image classification and text recognition are realized by relying on robust small neural networks, and the processing can be implemented on a mobile phone, another mobile terminal or a small edge device. No communication with a server is needed in use and nothing is uploaded to the cloud, which avoids dependence on a communication network such as a 4G network, improves availability without a network or with a weak network, and, since no large-scale computing service is needed, also reduces the corresponding operation and maintenance cost.
A5. The method as described in any one of A1-4, wherein a trained first convolutional neural network for performing classification processing on images is stored in the mobile terminal, and the step of performing classification processing on the image to obtain its corresponding image type includes: inputting the image into the trained first convolutional neural network for image classification; and determining the category of the image according to the output of the first convolutional neural network. A6. The method as described in any one of A1-5, wherein a trained second convolutional neural network for identifying the words in images is stored in the mobile terminal, and the step of performing text recognition on the image to extract the text message the image contains includes: obtaining the character image region corresponding to each single word the image contains; inputting each character image region respectively into the trained second convolutional neural network for text recognition, and determining the word each character image region contains according to the output of the second convolutional neural network; and generating the text message corresponding to the image based on each word. A7. The method as described in A5 or 6, wherein the trained first convolutional neural network is acquired in the following manner: building a process block, the process block including a convolutional layer; building a pooling layer, a fully connected layer and a classifier respectively; building the first convolutional neural network from multiple process blocks and pooling layers in conjunction with the fully connected layer and classifier, the first convolutional neural network taking a process block as input and the classifier as output; and training the first convolutional neural network according to an image category data set obtained in advance, so that the output of the classifier indicates the category corresponding to the input picture, the image category data set including multiple pieces of image category information, each piece of image category information including a first image meeting the pre-set dimension and classification information corresponding to the first image. A8. The method as described in any one of A5-7, wherein the trained second convolutional neural network is acquired in the following manner: building a first process block, the first process block including a first convolutional layer; building a second processing block, the second processing block including a first fully connected layer; building a first pooling layer, a second fully connected layer and a first classifier respectively; building the second convolutional neural network from one or more first process blocks, first pooling layers and the second processing block in conjunction with the second fully connected layer and the first classifier, the second convolutional neural network taking a first process block as input and the first classifier as output; and training the second convolutional neural network according to a text image data set obtained in advance, so that the output of the first classifier indicates the word included in the input picture, the text image data set including multiple pieces of character image information, each piece of character image information including a character image meeting the first pre-set dimension and the text information included in the character image.
B12. The method as described in B11, wherein the step of building the process block further includes: building an activation layer; and adding the activation layer after the convolutional layer to form the process block. B13. The method as described in B11 or 12, wherein the pooling layer is either a maximum pooling layer or an average pooling layer. B14. The method as described in B13, wherein the step of building the convolutional neural network from multiple process blocks and pooling layers in conjunction with the fully connected layer and classifier includes: according to a preset concatenation rule, connecting each process block with a maximum pooling layer and then connecting the average pooling layer; and adding the sequentially connected fully connected layer and classifier after the average pooling layer, so as to build the convolutional neural network taking a process block as input and the classifier as output. B15. The method as described in any one of B11-14, wherein the step of training the convolutional neural network according to the image category data set obtained in advance, so that the output of the classifier indicates the category corresponding to the input picture, includes: for each piece of image category information extracted, taking the first image included in the image category information as the input of the first process block in the convolutional neural network and the classification information included in the image category information as the output of the classifier, and training the convolutional neural network accordingly. B16. The method as described in any one of B11-15, wherein the quantity of process blocks is 3. B17. The method as described in any one of B14-16, wherein the quantity of maximum pooling layers is 2 and the quantity of average pooling layers is 1. B18. The method as described in any one of B11-17, wherein the classification information is any one of the animal class, building class, physical-object class, landscape class, figure class and text class. B19. The method as described in any one of B11-18, further including generating an image category data set in advance, wherein the step of generating the image category data set in advance includes: performing image processing on each picture to be processed, so as to obtain, for each such picture, a corresponding first image meeting the pre-set dimension; for each first image meeting the pre-set dimension, obtaining the classification information associated with its corresponding picture to be processed, and generating the corresponding image category information from the classification information and the first image; and collecting each piece of image category information to form the image category data set.
C21. The method as described in C20, wherein the step of building the first process block further includes: building a first activation layer; and adding the first activation layer after the first convolutional layer to form the first process block. C22. The method as described in C20 or 21, wherein the step of building the second processing block further includes: building a second activation layer; and adding the second activation layer after the first fully connected layer to form the second processing block. C23. The method as described in any one of C20-22, wherein the first pooling layer is a maximum pooling layer. C24. The method as described in any one of C20-23, wherein the step of building the convolutional neural network from one or more first process blocks, first pooling layers and the second processing block in conjunction with the second fully connected layer and the first classifier includes: according to a preset first concatenation rule, connecting each first process block and first pooling layer with the second processing block and then connecting the second fully connected layer; and adding the first classifier after the second fully connected layer, so as to build the convolutional neural network taking a first process block as input and the first classifier as output. C25. The method as described in any one of C20-24, wherein the step of training the convolutional neural network according to the text image data set obtained in advance, so that the output of the first classifier indicates the word included in the input picture, includes: for each piece of character image information extracted, taking the character image included in the character image information as the input of the first of the first process blocks in the convolutional neural network and the text information included in the character image information as the output of the first classifier, and training the convolutional neural network accordingly. C26. The method as described in any one of C20-25, wherein the quantity of first process blocks is 5, the quantity of second processing blocks is 1, and the quantity of first pooling layers is 3. C27. The method as described in any one of C20-26, wherein the text information is a single word, and the single word is any one of a numeric-class word, an alphabetic-class word and a Chinese-character-class word. C28. The method as described in any one of C20-27, further including generating a text image data set in advance, wherein the step of generating the text image data set in advance includes: performing image processing on each word picture to be processed, so as to obtain, for each such word picture, a corresponding character image meeting the first pre-set dimension; for each character image, obtaining the text information associated with its corresponding word picture to be processed, and generating the corresponding character image information from the text information and the character image; and collecting each piece of character image information to form the text image data set.
In the description provided here, numerous specific details are set forth. It is to be appreciated, however, that embodiments of the present invention can be practised without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to simplify the disclosure and to aid understanding of one or more of the various inventive aspects, in the above description of exemplary embodiments of the present invention, various features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, the disclosed method should not be construed as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the specific embodiments are hereby expressly incorporated into those specific embodiments, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art should understand that the modules, units or components of the devices in the examples disclosed herein can be arranged in a device as described in the embodiments, or can alternatively be located in one or more devices different from the devices in the examples. The modules in the foregoing examples can be combined into one module or additionally divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the devices of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units or components in an embodiment can be combined into one module, unit or component, and can additionally be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) can be replaced by an alternative feature serving an identical, equivalent or similar purpose.
Furthermore, those skilled in the art will appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other means of carrying out the function. Thus, a processor having the necessary instructions for implementing such a method or method element forms a means for implementing the method or method element. Furthermore, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
The various techniques described herein may be implemented in connection with hardware or software, or a combination thereof. Thus, the methods and apparatuses of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embedded in a tangible medium, such as a floppy disk, CD-ROM, hard disk drive, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device generally comprises a processor, a processor-readable storage medium (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to execute, according to the instructions in the program code stored in the memory, the image classification method of the present invention, the convolutional neural network generation method for performing classification processing on images, and/or the convolutional neural network generation method for recognizing characters in an image.
By way of example, and not limitation, computer-readable media comprise computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules, or other data. Communication media generally embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third", etc., to describe a common object merely denotes different instances of like objects, and is not intended to imply that the objects so described must be in a given sequence, whether temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be envisioned within the scope of the invention thus described. Additionally, it should be noted that the language used in this specification has been principally selected for readability and instructional purposes, and not to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, the scope of the invention being defined by the appended claims.
Claims (10)
1. An image classification method, adapted to be executed in a mobile terminal, the mobile terminal comprising an image library in which a plurality of images are stored, the method comprising the steps of:
for each image in the image library, performing classification processing on the image to obtain its corresponding category;
if the category is a text category, performing text recognition on the image to extract the text information contained in the image; and
storing the text information in association with the image store path and image name of the image.
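The indexing flow recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `classify_image` and `recognize_text` are hypothetical stubs standing in for the networks of claims 7 and 8, and the association store is a plain dictionary rather than a real database.

```python
from pathlib import Path

def classify_image(path):
    # Hypothetical stub: a real implementation would run the
    # classification network of claim 7 on the image data.
    return "text" if path.suffix == ".png" else "other"

def recognize_text(path):
    # Hypothetical stub for the character-recognition network of claim 8.
    return "hello world"

def index_image_library(paths):
    """For each image classified as a text image, store the extracted
    text in association with the image's store path and image name."""
    index = {}
    for path in paths:
        if classify_image(path) == "text":
            index[str(path)] = {"name": path.name,
                                "text": recognize_text(path)}
    return index

index = index_image_library([Path("a.png"), Path("b.jpg")])
```

Only images whose category is the text category are recognized and indexed; non-text images are skipped entirely, which is what keeps the method cheap over a large image library.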
2. The method of claim 1, wherein, upon receiving a search term keyed in by a user, the method further comprises:
searching, according to the search term, for identical or similar text information;
if such text information exists, obtaining the image store path associated with the text information; and
finding the corresponding image according to the image store path, and presenting the image and the text information to the user.
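The retrieval step in claim 2 reduces, in its simplest reading, to a match over the stored associations. A minimal sketch, assuming the text information has been stored in a dictionary keyed by image store path; substring matching stands in for the claim's broader "identical or similar" matching, which a real implementation would likely handle with fuzzy matching.

```python
def search_images(index, term):
    """Return (store_path, text) pairs whose associated text
    information contains the search term keyed in by the user."""
    return [(path, record["text"])
            for path, record in index.items()
            if term in record["text"]]

# index maps image store paths to the text associated with each image
index = {"/pics/a.png": {"name": "a.png", "text": "meeting notes 2018"},
         "/pics/b.png": {"name": "b.png", "text": "train ticket"}}
hits = search_images(index, "notes")
```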
3. The method of claim 1 or 2, wherein the step of performing text recognition on the image to extract the text information contained in the image comprises:
obtaining the character image region corresponding to each single character contained in the image;
performing text recognition on each character image region separately to determine the character contained in each character image region; and
generating the text information corresponding to the image based on the characters.
4. The method of claim 3, wherein the step of generating the text information corresponding to the image based on the characters comprises:
obtaining the positional relationships between the character image regions in the image; and
combining the characters corresponding to the character image regions according to the positional relationships, so as to generate the text information corresponding to the image.
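One plausible reading of the positional combination in claim 4 is to sort the character regions into row-major reading order by their coordinates before concatenating. The coordinate convention below (top-left origin, y increasing downward) is an assumption for illustration; the claim does not specify it.

```python
def combine_characters(regions):
    """regions: (x, y, char) triples, one per single-character image
    region, with (x, y) the region's top-left corner. Characters are
    combined top-to-bottom, then left-to-right within a row, yielding
    the text information for the whole image."""
    ordered = sorted(regions, key=lambda r: (r[1], r[0]))
    return "".join(char for _, _, char in ordered)

# Three characters on one line plus one on the next line
text = combine_characters([(30, 0, "c"), (0, 0, "a"), (15, 0, "b"),
                           (0, 20, "d")])
```

A production version would also need to tolerate small vertical jitter within a line (e.g. by clustering y-coordinates into rows first).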
5. A mobile terminal, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method according to any one of claims 1-4.
6. A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a mobile terminal, cause the mobile terminal to perform the method according to any one of claims 1-4.
7. A convolutional neural network generation method for performing classification processing on images, adapted to be executed in a computing device, the method comprising the steps of:
building a processing block, the processing block comprising a convolutional layer;
building a pooling layer, a fully connected layer, and a classifier, respectively;
building a convolutional neural network from a plurality of the processing blocks and pooling layers, in combination with the fully connected layer and the classifier, the convolutional neural network taking a processing block as its input and the classifier as its output; and
training the convolutional neural network on a pre-acquired image category data set, so that the output of the classifier indicates the category corresponding to an input image, wherein the image category data set comprises a plurality of pieces of image category information, each piece of image category information comprising a first image meeting a preset size and the category information corresponding to the first image.
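The topology recited in claim 7 (processing blocks containing convolutional layers, pooling layers, then a fully connected layer feeding a classifier) can be laid out schematically. This is a structural sketch in plain Python rather than a trainable network, and the strict alternation of processing blocks and pooling layers is an assumption; the claim only requires a plurality of each.

```python
def build_classification_network(num_blocks, num_classes):
    """Schematic layer list for the claim-7 topology: the network
    takes a processing block as its input and the classifier as
    its output."""
    layers = []
    for i in range(num_blocks):
        # Each processing block contains a convolutional layer.
        layers.append(("process_block", {"conv": f"conv{i}"}))
        layers.append(("pool", {}))
    layers.append(("fully_connected", {}))
    layers.append(("classifier", {"num_classes": num_classes}))
    return layers

net = build_classification_network(num_blocks=3, num_classes=10)
```

In a deep-learning framework, each `process_block`/`pool` pair would become a convolution plus pooling stage, and training on the image category data set would fit the classifier to emit the category of the input image.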
8. A convolutional neural network generation method for recognizing characters in an image, adapted to be executed in a computing device, the method comprising the steps of:
building a first processing block, the first processing block comprising a first convolutional layer;
building a second processing block, the second processing block comprising a first fully connected layer;
building a first pooling layer, a second fully connected layer, and a first classifier, respectively;
building a convolutional neural network from one or more of the first processing blocks, the first pooling layer, and the second processing block, in combination with the second fully connected layer and the first classifier, the convolutional neural network taking a first processing block as its input and the first classifier as its output; and
training the convolutional neural network on a pre-acquired character image data set, so that the output of the first classifier indicates the characters contained in an input image, wherein the character image data set comprises a plurality of pieces of character image information, each piece of character image information comprising a character image meeting a first preset size and the text information contained in the character image.
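The claim-8 topology differs from claim 7 in that one processing block holds a fully connected layer rather than a convolution. A schematic sketch under the same caveats as before (plain Python layer list, not a trainable model); the vocabulary size passed in is an arbitrary example value, not taken from the patent.

```python
def build_recognition_network(num_first_blocks, vocab_size):
    """Schematic layer list for claim 8: one or more first processing
    blocks (each with a first convolutional layer), a first pooling
    layer, a second processing block (containing a first fully
    connected layer), a second fully connected layer, and a first
    classifier whose output indicates the recognized character."""
    layers = [("first_process_block", {"conv": "first_conv"})
              for _ in range(num_first_blocks)]
    layers.append(("first_pool", {}))
    layers.append(("second_process_block", {"fc": "first_fc"}))
    layers.append(("second_fc", {}))
    layers.append(("first_classifier", {"num_classes": vocab_size}))
    return layers

net = build_recognition_network(num_first_blocks=2, vocab_size=3755)
```

Since the character image data set pairs each fixed-size character image with its text information, the first classifier is effectively trained as a per-character classifier over the character vocabulary.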
9. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method according to claim 7 and/or the method according to claim 8.
10. A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the method according to claim 7 and/or the method according to claim 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810331479.5A CN108537283A (en) | 2018-04-13 | 2018-04-13 | A kind of image classification method and convolutional neural networks generation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108537283A true CN108537283A (en) | 2018-09-14 |
Family
ID=63480410
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810331479.5A Pending CN108537283A (en) | 2018-04-13 | 2018-04-13 | A kind of image classification method and convolutional neural networks generation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108537283A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102193918A (en) * | 2010-03-01 | 2011-09-21 | 汉王科技股份有限公司 | Video retrieval method and device |
CN106503732A (en) * | 2016-10-13 | 2017-03-15 | 北京云江科技有限公司 | Text image and the sorting technique and categorizing system of non-textual image |
CN106776710A (en) * | 2016-11-18 | 2017-05-31 | 广东技术师范学院 | A kind of picture and text construction of knowledge base method based on vertical search engine |
CN107239802A (en) * | 2017-06-28 | 2017-10-10 | 广东工业大学 | A kind of image classification method and device |
CN107562742A (en) * | 2016-06-30 | 2018-01-09 | 苏宁云商集团股份有限公司 | A kind of image processing method and device |
Non-Patent Citations (2)
Title |
---|
Sima Haifeng et al.: "Intelligent Computing Methods in Remote Sensing Image Classification" (《遥感图像分类中的智能计算方法》), 31 January 2018 *
Lu Yongxiang: "Popular Encyclopedia of Modern Science and Technology: Science Volume" (《现代科学技术大众百科-科学篇》), 30 August 1999 *
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109753580A (en) * | 2018-12-21 | 2019-05-14 | Oppo广东移动通信有限公司 | A kind of image classification method, device, storage medium and electronic equipment |
CN111369489A (en) * | 2018-12-24 | 2020-07-03 | Tcl集团股份有限公司 | Image identification method and device and terminal equipment |
CN111369489B (en) * | 2018-12-24 | 2024-04-16 | Tcl科技集团股份有限公司 | Image identification method and device and terminal equipment |
CN111383302A (en) * | 2018-12-29 | 2020-07-07 | 中兴通讯股份有限公司 | Image collocation method and device, terminal and computer readable storage medium |
CN110222168A (en) * | 2019-05-20 | 2019-09-10 | 平安科技(深圳)有限公司 | A kind of method and relevant apparatus of data processing |
CN110222168B (en) * | 2019-05-20 | 2023-08-18 | 平安科技(深圳)有限公司 | Data processing method and related device |
CN110489578A (en) * | 2019-08-12 | 2019-11-22 | 腾讯科技(深圳)有限公司 | Image processing method, device and computer equipment |
CN110489578B (en) * | 2019-08-12 | 2024-04-05 | 腾讯科技(深圳)有限公司 | Picture processing method and device and computer equipment |
CN112634123A (en) * | 2019-10-08 | 2021-04-09 | 北京京东尚科信息技术有限公司 | Image processing method and device |
CN111147891A (en) * | 2019-12-31 | 2020-05-12 | 杭州威佩网络科技有限公司 | Method, device and equipment for acquiring information of object in video picture |
CN111209423B (en) * | 2020-01-07 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Image management method and device based on electronic album and storage medium |
CN111209423A (en) * | 2020-01-07 | 2020-05-29 | 腾讯科技(深圳)有限公司 | Image management method and device based on electronic album and storage medium |
CN111783624A (en) * | 2020-06-29 | 2020-10-16 | 厦门市美亚柏科信息股份有限公司 | Pedestrian re-identification method and device with translation invariance reserved and storage medium |
CN111783624B (en) * | 2020-06-29 | 2023-03-24 | 厦门市美亚柏科信息股份有限公司 | Pedestrian re-identification method and device with translation invariance reserved and storage medium |
CN111783786A (en) * | 2020-07-06 | 2020-10-16 | 上海摩勤智能技术有限公司 | Picture identification method and system, electronic equipment and storage medium |
CN111860672A (en) * | 2020-07-28 | 2020-10-30 | 北京邮电大学 | Fine-grained image classification method based on block convolutional neural network |
CN112149653A (en) * | 2020-09-16 | 2020-12-29 | 北京达佳互联信息技术有限公司 | Information processing method, information processing device, electronic equipment and storage medium |
CN112149653B (en) * | 2020-09-16 | 2024-03-29 | 北京达佳互联信息技术有限公司 | Information processing method, information processing device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108537283A (en) | A kind of image classification method and convolutional neural networks generation method | |
CN109299315B (en) | Multimedia resource classification method and device, computer equipment and storage medium | |
KR101887558B1 (en) | Training method and apparatus for convolutional neural network model | |
CA2792336C (en) | Intuitive computing methods and systems | |
CN110149541A (en) | Video recommendation method, device, computer equipment and storage medium | |
CN108197602A (en) | A kind of convolutional neural networks generation method and expression recognition method | |
WO2018113512A1 (en) | Image processing method and related device | |
CN110020140A (en) | Recommendation display methods, apparatus and system | |
WO2020173115A1 (en) | Network module, distribution method and apparatus, and electronic device and storage medium | |
CN110162604B (en) | Statement generation method, device, equipment and storage medium | |
CN107909016A (en) | A kind of convolutional neural networks generation method and the recognition methods of car system | |
CN110147533B (en) | Encoding method, apparatus, device and storage medium | |
JP7324838B2 (en) | Encoding method and its device, apparatus and computer program | |
CN108537193A (en) | Ethnic attribute recognition approach and mobile terminal in a kind of face character | |
CN110162956B (en) | Method and device for determining associated account | |
CN109963072B (en) | Focusing method, focusing device, storage medium and electronic equipment | |
CN111428645A (en) | Method and device for detecting key points of human body, electronic equipment and storage medium | |
CN111881813B (en) | Data storage method and system of face recognition terminal | |
CN109241437A (en) | A kind of generation method, advertisement recognition method and the system of advertisement identification model | |
CN108351962A (en) | Object detection with adaptivity channel characteristics | |
CN113505256A (en) | Feature extraction network training method, image processing method and device | |
CN113763931B (en) | Waveform feature extraction method, waveform feature extraction device, computer equipment and storage medium | |
CN117910478A (en) | Training method of language model and text generation method | |
CN111694768B (en) | Operation method, device and related product | |
WO2021016932A1 (en) | Data processing method and apparatus, and computer-readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180914 |