CN109118456A - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN109118456A
CN109118456A
Authority
CN
China
Prior art keywords
image
candidate frame
scale
size range
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811124831.4A
Other languages
Chinese (zh)
Other versions
CN109118456B (en)
Inventor
胡耀全
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN201811124831.4A priority Critical patent/CN109118456B/en
Priority to PCT/CN2018/115969 priority patent/WO2020062494A1/en
Publication of CN109118456A publication Critical patent/CN109118456A/en
Application granted granted Critical
Publication of CN109118456B publication Critical patent/CN109118456B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T5/94
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application disclose an image processing method and apparatus. One specific embodiment of the method includes: acquiring an image containing targets and performing scale transformation on the image to obtain processed images at at least one scale; inputting the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets; determining, among the candidate boxes in each image, the candidate boxes whose size falls within a preset size range; determining the region in the feature map corresponding to at least one of the candidate boxes within the size range, obtaining the features corresponding to that region, and inputting them into the fully connected layer of the convolutional neural network. By determining candidate boxes of different size ranges from images of different scales, the method provided by the embodiments of the present application can acquire richer features for targets of different sizes.

Description

Image processing method and device
Technical field
Embodiments of the present application relate to the field of computer technology, in particular to the field of Internet technology, and more particularly to an image processing method and apparatus.
Background technique
Because convolutional neural networks process images both quickly and accurately, they are increasingly widely applied. Some images contain very rich targets: not only are the targets numerous, but their sizes may also differ greatly.
Summary of the invention
Embodiments of the present application propose an image processing method and apparatus.
In a first aspect, an embodiment of the present application provides an image processing method, comprising: acquiring an image containing targets and performing scale transformation on the image to obtain processed images at at least one scale; inputting the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, wherein each target corresponds to at least two candidate boxes; determining, among the candidate boxes in each image, the candidate boxes whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ; and determining the region in the feature map corresponding to at least one of the candidate boxes within the size range, obtaining the features corresponding to that region, and inputting them into the fully connected layer of the convolutional neural network.
In some embodiments, before determining the region in the feature map corresponding to at least one candidate box within the size range, the method further includes: performing non-maximum suppression on the candidate boxes within the preset size range to obtain the at least one candidate box.
In some embodiments, performing scale transformation on the image comprises: up-sampling and/or down-sampling the image, wherein the size range of the candidate boxes corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate boxes corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
In some embodiments, the size range of the candidate boxes corresponding to the acquired image lies between a third preset threshold and a fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
In some embodiments, in response to at least two of the processed images having scales greater than the scale of the acquired image, among those at least two images the size range of the candidate boxes corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
In some embodiments, in response to more than two of the processed images having scales less than the scale of the acquired image, among those images the size range of the candidate boxes corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
In a second aspect, an embodiment of the present application provides an image processing apparatus, comprising: an acquiring unit configured to acquire an image containing targets and perform scale transformation on the image to obtain processed images at at least one scale; an input unit configured to input the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, wherein each target corresponds to at least two candidate boxes; a determination unit configured to determine, among the candidate boxes in each image, the candidate boxes whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ; and a region determination unit configured to determine the region in the feature map corresponding to at least one of the candidate boxes within the size range, obtain the features corresponding to the region, and input them into the fully connected layer of the convolutional neural network.
In some embodiments, the apparatus further includes: a selection unit configured to perform non-maximum suppression on the candidate boxes within the preset size range to obtain the at least one candidate box.
In some embodiments, the acquiring unit is further configured to up-sample and/or down-sample the image, wherein the size range of the candidate boxes corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate boxes corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
In some embodiments, the size range of the candidate boxes corresponding to the acquired image lies between a third preset threshold and a fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
In some embodiments, in response to at least two of the processed images having scales greater than the scale of the acquired image, among those at least two images the size range of the candidate boxes corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
In some embodiments, in response to more than two of the processed images having scales less than the scale of the acquired image, among those images the size range of the candidate boxes corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the image processing method.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the method of any embodiment of the image processing method.
The image processing scheme provided by the embodiments of the present application first acquires an image containing targets and performs scale transformation on it to obtain processed images at at least one scale. The acquired image and the processed images are then input into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, each target corresponding to at least two candidate boxes. Next, among the candidate boxes in each image, the candidate boxes whose size falls within a preset size range are determined, the size ranges differing across images of different scales. Finally, the region in the feature map corresponding to at least one of those candidate boxes is determined, and the features of that region are obtained and input into the fully connected layer of the convolutional neural network. By determining candidate boxes of different size ranges from images of different scales, the method can acquire richer features for targets of different sizes.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the image processing method according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the image processing method according to the present application;
Fig. 4 is a flowchart of another embodiment of the image processing method according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the image processing apparatus according to the present application;
Fig. 6 is a structural schematic diagram of a computer system adapted to implement the electronic device of the embodiments of the present application.
Detailed description of embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein serve only to explain the related invention and do not limit it. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the image processing method or image processing apparatus of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as image processing applications, video applications, live-streaming applications, instant messaging tools, mailbox clients and social platform software.
The terminal devices 101, 102, 103 here may be hardware or software. When they are hardware, they may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, e-book readers, laptop portable computers and desktop computers. When they are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, for providing distributed services) or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example a background server providing support to the terminal devices 101, 102, 103. The background server may analyze and otherwise process received data such as images, and feed the processing results (such as features) back to the terminal devices.
It should be noted that the image processing method provided by the embodiments of the present application may be executed by the server 105 or by the terminal devices 101, 102, 103; correspondingly, the image processing apparatus may be disposed in the server 105 or in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a process 200 of one embodiment of the image processing method according to the present application is shown. The image processing method comprises the following steps:
Step 201: acquire an image containing targets, and perform scale transformation on the image to obtain processed images at at least one scale.
In the present embodiment, the executing subject of the image processing method (such as the server or a terminal device shown in Fig. 1) may acquire an image containing targets and perform scale transformation on the acquired image to obtain processed images at at least one scale. A target is an object of certain significance presented by the image, such as a tree or a house. An image may contain identical targets, or various targets of different sizes and different appearances.
Here, scale refers to the number of pixels of an image. For example, the scale of the acquired image may be 224 × 224, while the scale of the image obtained after scale transformation is 256 × 256. Specifically, the scale transformation may use at least one of up-sampling and down-sampling.
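As an illustration of this scale-transformation step, the sketch below builds a small image pyramid with nearest-neighbour resampling. This is a minimal stand-in under stated assumptions: the patent does not specify the resampling method, and the sizes and function names here are hypothetical.

```python
import numpy as np

def rescale_nearest(image, out_h, out_w):
    # Nearest-neighbour resampling: each output pixel copies the source
    # pixel whose index scales proportionally with the size change.
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows][:, cols]

# One 224x224 original, plus a down-sampled and an up-sampled copy.
original = np.zeros((224, 224, 3), dtype=np.uint8)
pyramid = [original] + [rescale_nearest(original, s, s) for s in (112, 256)]
```

Each image in `pyramid` would then be fed to the convolutional neural network of step 202.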
Step 202: input the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, wherein each target corresponds to at least two candidate boxes.
In the present embodiment, the executing subject may input the acquired image into the convolutional neural network, and may also input the images obtained by scale transformation, to obtain multiple candidate boxes (proposals) indicating the positions of targets and a feature map. Specifically, the executing subject may determine the candidate boxes in various ways. For example, where the convolutional neural network includes a Region Proposal Network (RPN), the candidate boxes may be determined using the region proposal network; candidate boxes may also be determined using Selective Search. The feature map can be obtained through the convolutional layers of the convolutional neural network, and the feature maps obtained by convolving different images differ. A candidate box here may be expressed by a position and a size: the position may be indicated by the coordinates of some point of the candidate box, such as its midpoint or top-left vertex, and the size may be expressed by area, perimeter, or width and height.
Step 203: among the candidate boxes in each image, determine the candidate boxes whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ.
In the present embodiment, the executing subject may determine, among the candidate boxes of each image, the candidate boxes whose size falls within the preset size range. Because the size ranges of the candidate boxes corresponding to images of different scales differ, the sizes of the candidate boxes determined for images of different scales are not identical. The candidate boxes corresponding to an image are the candidate boxes obtained by inputting that image into the convolutional neural network.
For example, the executing subject may obtain an original image whose scale is 224 × 224 and down-sample it to obtain a small image whose scale is 112 × 112. The candidate boxes corresponding to the original image and those corresponding to the small image may be given preset size ranges in advance, for example smaller than 8 × 8 and larger than 8 × 8, or smaller than 9 × 9 and larger than 8 × 8.
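The per-scale selection in step 203 can be sketched as a simple size filter. The (x, y, w, h) box format and the concrete thresholds below are illustrative assumptions, since the patent leaves the preset ranges open:

```python
def filter_boxes_by_size(boxes, min_side, max_side):
    # Keep boxes whose width and height both fall in [min_side, max_side).
    return [b for b in boxes
            if min_side <= b[2] < max_side and min_side <= b[3] < max_side]

# Hypothetical ranges: the down-sampled image keeps large boxes, the
# original keeps medium ones, and the up-sampled image keeps small ones.
size_ranges = {
    "downsampled": (64, 10_000),
    "original": (16, 64),
    "upsampled": (0, 16),
}

boxes = [(0, 0, 8, 8), (0, 0, 32, 32), (0, 0, 128, 128)]
kept = {name: filter_boxes_by_size(boxes, lo, hi)
        for name, (lo, hi) in size_ranges.items()}
```

With these ranges, each of the three example boxes survives at exactly one scale, so each target size is handled by the image best suited to it.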
In some optional implementations of the present embodiment, in response to at least two of the processed images having scales greater than the scale of the acquired image, among those at least two images the size range of the candidate boxes corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
In response to more than two of the processed images having scales less than the scale of the acquired image, among those images the size range of the candidate boxes corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate boxes corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
In these optional implementations, the values in the size range of the candidate boxes corresponding to a larger-scale image are smaller, while the values in the size range of the candidate boxes corresponding to a smaller-scale image are larger; the two size ranges may partially overlap.
For example, the scale of the original image is 128 × 128, and after up-sampling the obtained images are image A with a scale of 224 × 224 and image B with a scale of 256 × 256. The size range of the candidate boxes corresponding to image A may be smaller than 6 × 6 (the two sixes being the numbers of pixels in width and height, respectively), and the size range of the candidate boxes corresponding to image B may be smaller than 5 × 5.
In these implementations, the features of targets in a larger-scale image are easier to acquire and embody more details of the targets, while targets in a smaller-scale image better reflect the targets' overall features. Therefore, smaller targets can be determined with emphasis from the larger-scale image and larger targets with emphasis from the smaller-scale image, so as to acquire the features of targets of different sizes more accurately.
Step 204: determine the region in the feature map corresponding to at least one candidate box among the candidate boxes within the size range, obtain the features of the region, and input them into the fully connected layer of the convolutional neural network.
In the present embodiment, the executing subject may determine the region in the feature map corresponding to at least one of the candidate boxes within the size range. Afterwards, the features of the region are obtained, and the acquired features are input into the fully connected layer of the convolutional neural network for subsequent processing by the network (for example, classification and regression may be performed on the output of the fully connected layer) to obtain the final output of the convolutional neural network. When obtaining the features of the region, the executing subject may determine, from the feature matrix corresponding to the feature map, the part of the feature matrix corresponding to the above region, and extract it.
The feature maps corresponding to different images differ. Where there are multiple candidate boxes within the size range corresponding to an image, a different corresponding region in the feature map can be determined for each candidate box.
The above step 204 may be implemented by a specific pooling layer (RoI pooling layer) in the convolutional neural network.
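The mapping and pooling of step 204 can be sketched as below. Here `stride` stands for the network's total down-sampling factor, and the 2 × 2 output grid is an arbitrary choice for illustration (RoI pooling layers commonly use 7 × 7); both values are assumptions, not taken from the patent.

```python
import numpy as np

def roi_pool(feature_map, box, stride, out_size=2):
    # Project an image-space (x, y, w, h) box onto the feature map,
    # then max-pool the projected region to an out_size x out_size grid.
    x, y, w, h = (v // stride for v in box)
    region = feature_map[y:y + max(h, 1), x:x + max(w, 1)]
    rh, rw = region.shape
    pooled = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            r0 = i * rh // out_size
            r1 = max((i + 1) * rh // out_size, r0 + 1)
            c0 = j * rw // out_size
            c1 = max((j + 1) * rw // out_size, c0 + 1)
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled

# A 14x14 feature map, as a 224x224 input would give at stride 16.
fmap = np.arange(14 * 14, dtype=float).reshape(14, 14)
pooled = roi_pool(fmap, box=(32, 32, 64, 64), stride=16)
```

The flattened `pooled` features are what would enter the fully connected layer.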
In some optional implementations of the present embodiment, before the above step 204, the method may further include:
performing non-maximum suppression on the candidate boxes within the preset size range to obtain the above at least one candidate box.
In these optional implementations, the executing subject may perform non-maximum suppression (NMS) on the candidate boxes within the preset size range, so as to generate the above at least one candidate box through the non-maximum suppression process. The executing subject may then determine the region in the feature map corresponding to the at least one generated candidate box. Non-maximum suppression screens the candidate boxes, yielding candidate boxes positioned relatively close to the annotation boxes used to label the targets.
These implementations can remove candidate boxes of poor accuracy through non-maximum suppression, increasing the accuracy of the features acquired for the targets.
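The non-maximum suppression referred to above can be sketched greedily as follows; the (x1, y1, x2, y2) box format, the scores and the 0.5 overlap threshold are illustrative assumptions rather than values from the patent:

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_threshold=0.5):
    # Greedy NMS: visit boxes in descending score order and keep a box
    # only if it does not overlap any already-kept box too strongly.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```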
With continued reference to Fig. 3, which is a schematic diagram of an application scenario of the image processing method according to the present embodiment: in the application scenario of Fig. 3, an executing subject 301 may acquire an image 302 containing targets and perform scale transformation on the image 302 to obtain processed images 303 at at least one scale; input the acquired image and the processed images into a convolutional neural network to obtain a feature map 304 and multiple candidate boxes 305 indicating the positions of targets, wherein each target corresponds to at least two candidate boxes; among the candidate boxes in each image, determine candidate boxes 306 whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ; and determine the region 307 in the feature map corresponding to at least one candidate box among the candidate boxes within the size range, obtain the features 308 corresponding to the region, and input them into the fully connected layer of the convolutional neural network.
By determining candidate boxes of different size ranges from images of different scales, the method provided by the above embodiment of the present application can acquire richer and more accurate features for targets of all sizes.
With further reference to Fig. 4, a process 400 of another embodiment of the image processing method is illustrated. The process 400 of the image processing method comprises the following steps:
Step 401: acquire an image containing targets, and up-sample and/or down-sample the image to obtain processed images at at least one scale, wherein the size range of the candidate boxes corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate boxes corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
In the present embodiment, the executing subject on which the image processing method runs (such as the server or a terminal device shown in Fig. 1) may acquire an image containing targets, up-sample and down-sample it, and obtain the processed images, which include at least two scales. Specifically, the values in the size range of the candidate boxes corresponding to the large-scale image obtained by up-sampling are smaller, while the values in the size range of the candidate boxes corresponding to the small-scale image obtained by down-sampling are larger.
In some optional implementations of the present embodiment, the size range of the candidate boxes corresponding to the acquired image lies between a third preset threshold and a fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
In these implementations, the values of the size range of the candidate boxes corresponding to the acquired original image are intermediate. In this way, targets of moderate size can be determined from the original image, and features can be obtained for these targets from the original image at their sizes, so that moderately sized targets are detected relatively accurately.
Step 402: input the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, wherein each target corresponds to at least two candidate boxes.
In the present embodiment, the executing subject may input the acquired image into the convolutional neural network, and may also input the images obtained by scale transformation, to obtain multiple candidate boxes indicating the positions of targets and a feature map. Specifically, the executing subject may determine the candidate boxes in various ways.
Step 403: among the candidate boxes in each image, determine the candidate boxes whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ.
In the present embodiment, the executing subject may determine, among the candidate boxes of each image, the candidate boxes whose size falls within the preset size range. Because the size ranges of the candidate boxes corresponding to images of different scales differ, the sizes of the candidate boxes determined for images of different scales are not identical. The candidate boxes corresponding to an image are the candidate boxes obtained by inputting that image into the convolutional neural network.
Step 404: determine the region in the feature map corresponding to at least one candidate box among the candidate boxes within the size range, obtain the features of the region, and input them into the fully connected layer of the convolutional neural network.
In the present embodiment, the executing subject may determine the region in the feature map corresponding to at least one of the candidate boxes within the size range. Afterwards, the features of the region are obtained, and the acquired features are input into the fully connected layer of the convolutional neural network for subsequent processing to obtain the final output of the network. When obtaining the features of the region, the executing subject may determine, from the feature matrix corresponding to the feature map, the part of the feature matrix corresponding to the target region, and extract it.
The present embodiment can obtain images of different scales through up-sampling and down-sampling, so as to acquire rich features for targets of different sizes. Further, by using candidate boxes of at least three size ranges, the present embodiment can more accurately acquire the features of targets of different sizes in an image.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an image processing apparatus. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may specifically be applied in various electronic devices.
As shown in Fig. 5, the image processing apparatus 500 of the present embodiment includes: an acquiring unit 501, an input unit 502, a determination unit 503 and a region determination unit 504. The acquiring unit 501 is configured to acquire an image containing targets and perform scale transformation on the image to obtain processed images at at least one scale; the input unit 502 is configured to input the acquired image and the processed images into a convolutional neural network to obtain a feature map and multiple candidate boxes indicating the positions of targets, wherein each target corresponds to at least two candidate boxes; the determination unit 503 is configured to determine, among the candidate boxes in each image, the candidate boxes whose size falls within a preset size range, wherein the size ranges of the candidate boxes corresponding to images of different scales differ; and the region determination unit 504 is configured to determine the region in the feature map corresponding to at least one candidate box among the candidate boxes within the size range, obtain the features corresponding to the region, and input them into the fully connected layer of the convolutional neural network.
In some embodiments, the acquiring unit 501 may acquire an image containing a target and perform scale transformation on the acquired image to obtain processed images at at least one scale. A target is an object of interest presented by the image, such as a tree or a house.
In some embodiments, the input unit 502 may input the acquired image into the convolutional neural network, and may also input the images obtained by scale transformation into the convolutional neural network, to obtain candidate frames indicating the positions of a plurality of targets as well as a feature map. The candidate frames may be determined in various ways.
In some embodiments, the determination unit 503 may determine, among the candidate frames of each image, the candidate frames whose sizes fall within a preset size range. Because the size ranges of the candidate frames corresponding to images of different scales differ, the sizes of the candidate frames selected for images of different scales also differ. The candidate frames corresponding to an image are the candidate frames obtained by inputting that image into the convolutional neural network.
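The size filtering performed by the determination unit can be sketched as follows. The patent does not fix a concrete measure of a box's "size"; the sketch assumes the longer side of the box, and the concrete ranges are invented for illustration.

```python
def filter_boxes_by_size(boxes, size_range):
    """Keep candidate boxes whose size falls within the preset range.

    boxes: iterable of (x1, y1, x2, y2); a box's size is taken to be
    its longer side (an assumption made for this sketch).
    """
    lo, hi = size_range
    return [b for b in boxes if lo <= max(b[2] - b[0], b[3] - b[1]) <= hi]

# Illustrative ranges: the up-sampled image (scale 2.0) keeps small boxes,
# the original image the middle band, the down-sampled image large boxes.
ranges = {2.0: (0, 64), 1.0: (65, 128), 0.5: (129, 10 ** 9)}
boxes = [(0, 0, 40, 40), (0, 0, 100, 90), (0, 0, 300, 200)]
for scale, size_range in ranges.items():
    print(scale, filter_boxes_by_size(boxes, size_range))
```

Each image in the pyramid thus contributes only the candidate frames whose sizes its scale is best suited to detect.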
In some embodiments, the area determination unit 504 may determine, for at least one candidate frame among the candidate frames within the size range, the corresponding region in the feature map. The features of that region are then obtained and input into the fully connected layer of the convolutional neural network, which carries out the subsequent processing and produces the network's final output.
In some optional implementations of the present embodiment, the apparatus further includes a selection unit configured to perform non-maximum suppression on the candidate frames within the preset size range to obtain the at least one candidate frame.
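Non-maximum suppression, as used by the optional selection unit, can be sketched greedily. The box format, the scores, and the IoU threshold are assumptions; the patent only names the technique.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop overlapping rivals."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
print(nms(boxes, [0.9, 0.8, 0.7]))  # [0, 2]: the second box overlaps the first
```

This collapses the "at least two candidate frames per target" down to the single best frame per target before region features are extracted.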
In some optional implementations of the present embodiment, the acquiring unit is further configured to up-sample and/or down-sample the image, wherein the size range of the candidate frames corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate frames corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
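The threshold constraint in this implementation can be expressed as a small assignment rule. The concrete values 128 and 64 are illustrative; only the ordering (the first threshold exceeds the second) comes from the text.

```python
def size_range_for(scale_factor, first_threshold=128, second_threshold=64):
    """Map an image's scale factor to its candidate-box size range.

    Down-sampled images (factor < 1) handle boxes of size >= the first
    threshold; up-sampled images (factor > 1) handle boxes of size <=
    the second threshold; the first threshold exceeds the second.
    """
    assert first_threshold > second_threshold
    if scale_factor < 1.0:                      # down-sampled: large objects
        return (first_threshold, float("inf"))
    if scale_factor > 1.0:                      # up-sampled: small objects
        return (0, second_threshold)
    return (second_threshold, first_threshold)  # original: the middle band

print([size_range_for(s) for s in (0.5, 1.0, 2.0)])
# [(128, inf), (64, 128), (0, 64)]
```

The intuition is that down-sampling shrinks large objects into the network's comfortable detection range, while up-sampling enlarges small ones.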
In some optional implementations of the present embodiment, the size range of the candidate frames corresponding to the acquired image lies between a third preset threshold and a fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
In some optional implementations of the present embodiment, in response to there being at least two processed images whose scales are greater than the scale of the acquired image, among those at least two images the size range of the candidate frames corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate frames corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
In some optional implementations of the present embodiment, in response to there being more than two processed images whose scales are less than the scale of the acquired image, among those images the size range of the candidate frames corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate frames corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 of an electronic device suitable for implementing the embodiments of the present application is shown. The electronic device shown in Fig. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU and/or GPU) 601, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The central processing unit 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a loudspeaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the driver 610 as needed so that a computer program read therefrom can be installed into the storage section 608 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit 601, the above-described functions defined in the method of the present application are performed. It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to the various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as comprising an acquiring unit, an input unit, a determination unit, and an area determination unit. The names of these units do not, under certain circumstances, constitute limitations on the units themselves; for example, the acquiring unit may also be described as "a unit for acquiring an image containing a target, performing scale transformation on the image, and obtaining processed images at at least one scale".
As another aspect, the present application further provides a computer-readable medium. The computer-readable medium may be included in the apparatus described in the above embodiments, or it may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire an image containing a target and perform scale transformation on the image to obtain processed images at at least one scale; input the acquired image and the processed images into a convolutional neural network to obtain a feature map and a plurality of candidate frames indicating the positions of targets, wherein each target corresponds to at least two candidate frames; determine, among the candidate frames in each image, the candidate frames whose sizes fall within a preset size range, wherein the size ranges of the candidate frames corresponding to images of different scales differ; and determine, for at least one candidate frame among the candidate frames within the size range, the corresponding region in the feature map, obtain the features corresponding to the region, and input them into the fully connected layer of the convolutional neural network.
The above description is merely a preferred embodiment of the present application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (14)

1. An image processing method, comprising:
acquiring an image containing a target, and performing scale transformation on the image to obtain processed images at at least one scale;
inputting the acquired image and the processed images into a convolutional neural network to obtain a feature map and candidate frames indicating the positions of a plurality of targets, wherein each target corresponds to at least two candidate frames;
determining, among the candidate frames in each image, candidate frames whose sizes fall within a preset size range, wherein the size ranges of the candidate frames corresponding to images of different scales differ; and
determining, for at least one candidate frame among the candidate frames within the size range, the corresponding region in the feature map, obtaining the features corresponding to the region, and inputting them into the fully connected layer of the convolutional neural network.
2. The method according to claim 1, wherein, before determining, for the at least one candidate frame among the candidate frames within the size range, the corresponding region in the feature map, the method further comprises:
performing non-maximum suppression on the candidate frames within the preset size range to obtain the at least one candidate frame.
3. The method according to any one of claims 1-2, wherein performing scale transformation on the image comprises:
up-sampling and/or down-sampling the image, wherein the size range of the candidate frames corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate frames corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
4. The method according to claim 3, wherein the size range of the candidate frames corresponding to the acquired image lies between the third preset threshold and the fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
5. The method according to claim 1, wherein, in response to there being at least two processed images whose scales are greater than the scale of the acquired image, among the at least two images the size range of the candidate frames corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate frames corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
6. The method according to claim 1, wherein, in response to there being more than two processed images whose scales are less than the scale of the acquired image, among the more than two images the size range of the candidate frames corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate frames corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
7. An image processing apparatus, comprising:
an acquiring unit, configured to acquire an image containing a target and perform scale transformation on the image to obtain processed images at at least one scale;
an input unit, configured to input the acquired image and the processed images into a convolutional neural network to obtain a feature map and candidate frames indicating the positions of a plurality of targets, wherein each target corresponds to at least two candidate frames;
a determination unit, configured to determine, among the candidate frames in each image, candidate frames whose sizes fall within a preset size range, wherein the size ranges of the candidate frames corresponding to images of different scales differ; and
an area determination unit, configured to determine, for at least one candidate frame among the candidate frames within the size range, the corresponding region in the feature map, obtain the features corresponding to the region, and input them into the fully connected layer of the convolutional neural network.
8. The apparatus according to claim 7, wherein the apparatus further comprises:
a selection unit, configured to perform non-maximum suppression on the candidate frames within the preset size range to obtain the at least one candidate frame.
9. The apparatus according to any one of claims 7-8, wherein the acquiring unit is further configured to:
up-sample and/or down-sample the image, wherein the size range of the candidate frames corresponding to a down-sampled image is greater than or equal to a first preset threshold, the size range of the candidate frames corresponding to an up-sampled image is less than or equal to a second preset threshold, and the first preset threshold is greater than the second preset threshold.
10. The apparatus according to claim 9, wherein the size range of the candidate frames corresponding to the acquired image lies between the third preset threshold and the fourth preset threshold, wherein the third preset threshold is greater than the fourth preset threshold, the third preset threshold is greater than or equal to the first preset threshold, and the fourth preset threshold is less than or equal to the second preset threshold.
11. The apparatus according to claim 7, wherein, in response to there being at least two processed images whose scales are greater than the scale of the acquired image, among the at least two images the size range of the candidate frames corresponding to the smaller-scale image is less than a first specified threshold, the size range of the candidate frames corresponding to the larger-scale image is less than a second specified threshold, and the first specified threshold is greater than the second specified threshold.
12. The apparatus according to claim 7, wherein, in response to there being more than two processed images whose scales are less than the scale of the acquired image, among the more than two images the size range of the candidate frames corresponding to the smaller-scale image is greater than a third specified threshold, the size range of the candidate frames corresponding to the larger-scale image is greater than a fourth specified threshold, and the third specified threshold is greater than the fourth specified threshold.
13. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-6.
14. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
CN201811124831.4A 2018-09-26 2018-09-26 Image processing method and device Active CN109118456B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811124831.4A CN109118456B (en) 2018-09-26 2018-09-26 Image processing method and device
PCT/CN2018/115969 WO2020062494A1 (en) 2018-09-26 2018-11-16 Image processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811124831.4A CN109118456B (en) 2018-09-26 2018-09-26 Image processing method and device

Publications (2)

Publication Number Publication Date
CN109118456A true CN109118456A (en) 2019-01-01
CN109118456B CN109118456B (en) 2021-07-23

Family

ID=64856261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811124831.4A Active CN109118456B (en) 2018-09-26 2018-09-26 Image processing method and device

Country Status (2)

Country Link
CN (1) CN109118456B (en)
WO (1) WO2020062494A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110348522A (en) * 2019-07-12 2019-10-18 创新奇智(青岛)科技有限公司 A kind of image detection recognition methods and system, electronic equipment, image classification network optimized approach and system
CN110796649A (en) * 2019-10-29 2020-02-14 北京市商汤科技开发有限公司 Target detection method and device, electronic equipment and storage medium
CN111986072A (en) * 2019-05-21 2020-11-24 顺丰科技有限公司 Image normalization method, device, equipment and storage medium
CN112784629A (en) * 2019-11-06 2021-05-11 株式会社理光 Image processing method, apparatus and computer-readable storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112052837A (en) * 2020-10-09 2020-12-08 腾讯科技(深圳)有限公司 Target detection method and device based on artificial intelligence

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104978580A (en) * 2015-06-15 2015-10-14 国网山东省电力公司电力科学研究院 Insulator identification method for unmanned aerial vehicle polling electric transmission line
US20150347820A1 (en) * 2014-05-27 2015-12-03 Beijing Kuangshi Technology Co., Ltd. Learning Deep Face Representation
CN106650740A (en) * 2016-12-15 2017-05-10 深圳市华尊科技股份有限公司 License plate identification method and terminal
CN106778731A (en) * 2017-01-13 2017-05-31 深圳市华尊科技股份有限公司 A kind of license plate locating method and terminal
CN107688786A (en) * 2017-08-30 2018-02-13 南京理工大学 A kind of method for detecting human face based on concatenated convolutional neutral net
CN108121931A (en) * 2017-12-18 2018-06-05 广州市动景计算机科技有限公司 two-dimensional code data processing method, device and mobile terminal
US10007865B1 (en) * 2017-10-16 2018-06-26 StradVision, Inc. Learning method and learning device for adjusting parameters of CNN by using multi-scale feature maps and testing method and testing device using the same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557778B (en) * 2016-06-17 2020-02-07 北京市商汤科技开发有限公司 General object detection method and device, data processing device and terminal equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150347820A1 (en) * 2014-05-27 2015-12-03 Beijing Kuangshi Technology Co., Ltd. Learning Deep Face Representation
CN104978580A (en) * 2015-06-15 2015-10-14 国网山东省电力公司电力科学研究院 Insulator identification method for unmanned aerial vehicle polling electric transmission line
CN106650740A (en) * 2016-12-15 2017-05-10 深圳市华尊科技股份有限公司 License plate identification method and terminal
CN106778731A (en) * 2017-01-13 2017-05-31 深圳市华尊科技股份有限公司 A kind of license plate locating method and terminal
CN107688786A (en) * 2017-08-30 2018-02-13 南京理工大学 A kind of method for detecting human face based on concatenated convolutional neutral net
US10007865B1 (en) * 2017-10-16 2018-06-26 StradVision, Inc. Learning method and learning device for adjusting parameters of CNN by using multi-scale feature maps and testing method and testing device using the same
CN108121931A (en) * 2017-12-18 2018-06-05 广州市动景计算机科技有限公司 two-dimensional code data processing method, device and mobile terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DAI J F et al.: "Object detection via region-based fully convolutional networks", NIPS *
ZHOU Xiaoyan et al.: "A survey of deep-learning-based object detection algorithms", Electronic Measurement Technology *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111986072A (en) * 2019-05-21 2020-11-24 顺丰科技有限公司 Image normalization method, device, equipment and storage medium
CN110348522A (en) * 2019-07-12 2019-10-18 创新奇智(青岛)科技有限公司 A kind of image detection recognition methods and system, electronic equipment, image classification network optimized approach and system
CN110348522B (en) * 2019-07-12 2021-12-07 创新奇智(青岛)科技有限公司 Image detection and identification method and system, electronic equipment, and image classification network optimization method and system
CN110796649A (en) * 2019-10-29 2020-02-14 北京市商汤科技开发有限公司 Target detection method and device, electronic equipment and storage medium
WO2021082231A1 (en) * 2019-10-29 2021-05-06 北京市商汤科技开发有限公司 Method and device for detecting target, electronic apparatus, and storage medium
CN110796649B (en) * 2019-10-29 2022-08-30 北京市商汤科技开发有限公司 Target detection method and device, electronic equipment and storage medium
CN112784629A (en) * 2019-11-06 2021-05-11 株式会社理光 Image processing method, apparatus and computer-readable storage medium

Also Published As

Publication number Publication date
CN109118456B (en) 2021-07-23
WO2020062494A1 (en) 2020-04-02

Similar Documents

Publication Publication Date Title
CN109118456A (en) Image processing method and device
CN106911697B (en) Access rights setting method, device, server and storage medium
CN108446698A (en) Method, apparatus, medium and the electronic equipment of text are detected in the picture
CN109389640A (en) Image processing method and device
CN109377508A (en) Image processing method and device
CN109472264B (en) Method and apparatus for generating an object detection model
CN109242801A (en) Image processing method and device
CN109272050B (en) Image processing method and device
CN108830329A (en) Image processing method and device
CN109711508B (en) Image processing method and device
CN108882025B (en) Video frame processing method and device
CN109344762A (en) Image processing method and device
CN109255767A (en) Image processing method and device
CN109215121A (en) Method and apparatus for generating information
CN109344752A (en) Method and apparatus for handling mouth image
CN110288625B (en) Method and apparatus for processing image
CN108960110A (en) Method and apparatus for generating information
CN108182457A (en) For generating the method and apparatus of information
CN108595448A (en) Information-pushing method and device
CN110427915A (en) Method and apparatus for output information
CN109255814A (en) Method and apparatus for handling image
CN109688086A (en) Authority control method and device for terminal device
CN110895699B (en) Method and apparatus for processing feature points of image
CN108376177B (en) For handling the method and distributed system of information
CN108256477B (en) Method and device for detecting human face

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.

CP01 Change in the name or title of a patent holder