CN110287955A - Target area determination model training method, apparatus, and computer-readable storage medium

Info

Publication number: CN110287955A
Application number: CN201910485063.3A
Authority: CN
Inventors: 朱延东; 王长虎
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Douyin Vision Co Ltd; Douyin Vision Beijing Co Ltd
Priority date: 2019-06-05
Filing date: 2019-06-05
Publication date: 2019-09-27
Anticipated expiration: 2039-06-05
Also published as: CN110287955B

Abstract

The present disclosure provides a target area determination model training method, apparatus, electronic device, and computer-readable storage medium. The method includes: performing target area identification on sample images to obtain image regions containing the target area; forming a training sample set from multiple image regions of a fixed size, where each image region in the training sample set is labeled with multiple key points located in the target area; and training each training channel of a convolutional neural network independently until it meets its own convergence condition, yielding a target area determination model that contains multiple training channels. The training channels are respectively used to predict the displacements of the key points relative to a reference point. In the embodiments of the present disclosure, multiple parallel training channels are each trained on the training sample set, and the resulting channels respectively predict the displacements of the key points relative to the reference point, so that the target area can be obtained from the key points, which can improve the accuracy of target area identification.

Description

Target area determination model training method, apparatus, and computer-readable storage medium

Technical field

The present disclosure relates to the technical field of target area determination model training, and in particular to a target area determination model training method, apparatus, and computer-readable storage medium.

Background

Many captured video images contain automobiles, and images that contain automobiles generally also contain license plates. Because license plates involve privacy, the license plates in video images need to be processed or covered with other images. When processing an image that contains a license plate, identifying the license plate area in the image is the key step.

In the prior art, license plate areas are usually identified by using a model to judge whether each pixel in the image is foreground or background, where the foreground is the license plate area to be identified and the background is the part of the image outside the license plate. With this approach, however, the classifier only considers whether a pixel belongs to the license plate. When that is the only criterion, a few misjudged pixels can easily distort the precise extent of the license plate area.

Summary of the invention

The technical problem solved by the present disclosure is to provide a target area determination model training method that at least partially solves the technical problem of target areas being misjudged in the prior art. In addition, a target area determination model training apparatus, a target area determination model training hardware apparatus, a computer-readable storage medium, and a target area determination model training terminal are also provided.

To achieve the above object, according to one aspect of the present disclosure, the following technical solutions are provided:

A target area determination model training method, comprising:

Performing target area identification on a sample image to obtain an image region containing the target area;

Scaling the image region to a fixed size, and forming a training sample set from multiple image regions of the fixed size, where each image region in the training sample set is labeled with multiple key points, and the multiple key points are located in the target area;

Inputting the training sample set into a convolutional neural network, where a convolutional layer of the convolutional neural network includes multiple parallel convolution kernels, and each convolution kernel corresponds to one training channel;

Training each training channel independently on the training sample set until it meets its own convergence condition, to obtain a target area determination model containing multiple training channels, where the multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, and the reference point is located in the target area.

Further, training each training channel independently on the training sample set until it meets its own convergence condition, to obtain a target area determination model containing multiple training channels, comprises:

Determining the parameters of each training channel;

Computing, in each training channel, a predicted displacement from the determined parameters, the labeled key points, and the reference point;

Determining a prediction error from the predicted displacement and the real displacement of each image region;

Readjusting the parameters of any training channel whose prediction error is greater than a preset error, and repeating the training process of that training channel until its prediction error is less than or equal to the preset error, at which point the training process of that training channel ends.

Further, the target area is a polygonal region, and the key points are the vertices of the polygonal region.

Further, the reference point is the center point of the image region.

Further, the target area is a license plate area.

A target area determination method, comprising:

Performing target area identification on the currently input video frame to obtain an image region containing the target area;

Scaling the image region to a fixed size;

Inputting the fixed-size image region into a target area determination model trained by the target area determination model training method of any one of claims 1-5;

Predicting, through the multiple training channels of the target area determination model respectively, multiple displacements relative to a reference point, where the reference point is located in the target area;

Determining multiple key points from the multiple displacements relative to the reference point and the coordinates of the reference point;

Determining the target area from the multiple key points.

Further, determining the target area from the multiple key points comprises:

Taking the closed region formed by connecting the multiple key points as the target area.

Further, the method also comprises:

Obtaining the next video frame, and taking the next video frame as the currently input video frame;

Expanding the image region outward by a preset area, and scaling the expanded image region to the fixed size to serve as the image region containing the target area for the next video frame.

Further, the target area is a license plate area.

A target area determination model training apparatus, comprising:

A sample identification module, configured to perform target area identification on a sample image to obtain an image region containing the target area;

A training set determination module, configured to scale the image region to a fixed size and form a training sample set from multiple image regions of the fixed size, where each image region in the training sample set is labeled with multiple key points, and the multiple key points are located in the target area;

A training set input module, configured to input the training sample set into a convolutional neural network, where a convolutional layer of the convolutional neural network includes multiple parallel convolution kernels, and each convolution kernel corresponds to one training channel;

A model training module, configured to train each training channel independently on the training sample set until it meets its own convergence condition, to obtain a target area determination model containing multiple training channels, where the multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, and the reference point is located in the target area.

Further, the model training module is specifically configured to: determine the parameters of each training channel; compute, in each training channel, a predicted displacement from the determined parameters, the labeled key points, and the reference point; determine a prediction error from the predicted displacement and the real displacement of each image region; and readjust the parameters of any training channel whose prediction error is greater than a preset error, repeating the training process of that training channel until its prediction error is less than or equal to the preset error, at which point the training process of that training channel ends.

Further, the reference point is the center point of the image region.

Further, the target area is a license plate area.

A target area determination apparatus, comprising:

A region identification module, configured to perform target area identification on the currently input video frame to obtain an image region containing the target area;

An image scaling module, configured to scale the image region to a fixed size;

An image input module, configured to input the fixed-size image region into a target area determination model trained by the target area determination model training method of any one of claims 1-5;

A prediction module, configured to predict, through the multiple training channels of the target area determination model respectively, multiple displacements relative to a reference point, where the reference point is located in the target area;

A key point determination module, configured to determine multiple key points from the multiple displacements relative to the reference point and the coordinates of the reference point;

A region determination module, configured to determine the target area from the multiple key points.

Further, the region determination module is specifically configured to take the closed region formed by connecting the multiple key points as the target area.

Further, the region identification module is also configured to: obtain the next video frame and take it as the currently input video frame; and expand the image region outward by a preset area and scale the expanded image region to the fixed size to serve as the image region containing the target area for the next video frame.

Further, the target area is a license plate area.

An electronic device, comprising:

A memory, configured to store non-transitory computer-readable instructions; and

A processor, configured to run the computer-readable instructions so that, when executed, the processor implements the target area determination model training method of any one of the above.

A computer-readable storage medium, configured to store non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the target area determination model training method of any one of the above.

An electronic device, comprising:

A memory, configured to store non-transitory computer-readable instructions; and

A processor, configured to run the computer-readable instructions so that, when executed, the processor implements the target area determination method of any one of the above.

A computer-readable storage medium, configured to store non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the target area determination method of any one of the above.

To achieve the above object, according to another aspect of the present disclosure, the following technical solutions are also provided:

A target area determination model training terminal, including any of the target area determination model training apparatuses described above.

A data reading terminal, including any of the data reading apparatuses described above.

In the embodiments of the present disclosure, multiple parallel training channels are each trained on the training sample set, and the resulting training channels are respectively used to predict the displacements, relative to a reference point, of multiple key points associated with the target area. The target area is then obtained from the displacements of the key points relative to the reference point, which can improve the accuracy of target area identification.

The above description is only an overview of the technical solutions of the present disclosure. So that the technical means of the present disclosure can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features, and advantages of the present disclosure are more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings

Fig. 1a is a flow diagram of a target area determination model training method according to an embodiment of the present disclosure;

Fig. 1b is a schematic diagram of the convolution process of a convolutional layer in the target area determination model training method according to an embodiment of the present disclosure;

Fig. 1c is a schematic diagram of the convolution result of a convolutional layer in the target area determination model training method according to an embodiment of the present disclosure;

Fig. 2 is a flow diagram of a target area determination method according to an embodiment of the present disclosure;

Fig. 3 is a structural diagram of a target area determination model training apparatus according to an embodiment of the present disclosure;

Fig. 4 is a structural diagram of a target area determination apparatus according to an embodiment of the present disclosure;

Fig. 5 is a structural diagram of an electronic device according to an embodiment of the present disclosure.

Detailed description

The embodiments of the present disclosure are illustrated below through specific examples, and those skilled in the art can readily understand other advantages and effects of the present disclosure from the content disclosed in this specification. The described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed from different viewpoints and for different applications without departing from the spirit of the present disclosure. It should be noted that, where there is no conflict, the following embodiments and the features in them can be combined with each other. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present disclosure without creative work fall within the protection scope of the present disclosure.

It should be noted that various aspects of embodiments within the scope of the appended claims are described below. The aspects described herein can be embodied in a wide variety of forms, and any specific structure and/or function described herein is only illustrative. Based on the present disclosure, those skilled in the art will understand that an aspect described herein can be implemented independently of any other aspect, and that two or more of these aspects can be combined in various ways. For example, an apparatus can be implemented and/or a method practiced using any number of the aspects set forth herein. In addition, such an apparatus can be implemented and/or such a method practiced using structures and/or functionality other than one or more of the aspects set forth herein.

It should also be noted that the drawings provided in the following embodiments only illustrate the basic concept of the present disclosure schematically. The drawings show only the components related to the present disclosure and are not drawn according to the number, shape, and size of components in an actual implementation. In an actual implementation, the form, number, and proportion of each component may vary arbitrarily, and the component layout may be more complex.

In addition, specific details are provided in the following description for a thorough understanding of the examples. However, those skilled in the art will understand that the aspects can be practiced without these specific details.

Embodiment one

To solve the technical problem of the low correct identification rate of target areas in the prior art, an embodiment of the present disclosure provides a target area determination model training method. As shown in Fig. 1a, the target area determination model training method mainly includes the following steps S11 to S14.

Step S11: performing target area identification on a sample image to obtain an image region containing the target area.

The target area can be a license plate area.

A sample image is an image that is input into the convolutional neural network as a training sample. Specifically, it can be captured by the camera of a terminal device, or stored in advance in a local database and read from there.

Specifically, an existing convolutional neural network model can be used to identify the sample image and obtain the image region containing the target area.

Step S12: scaling the image region to a fixed size, and forming a training sample set from multiple image regions of the fixed size, where each image region in the training sample set is labeled with multiple key points, and the multiple key points are located in the target area.

So that the trained target area determination model is generally applicable, that is, so that it can make predictions for image regions of any size, this embodiment unifies the image regions to a fixed size. The fixed size can be user-defined.

A key point is a pixel in the target area, and specifically can be a pixel on the boundary of the target area. A key point can be represented by its coordinates. For example, if the target area is a license plate area, then because a license plate area is usually rectangular, the key points can be the pixels at the four corners of the license plate area.
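The scaling in step S12 can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the nearest-neighbor interpolation, the 64x128 fixed size, the helper names `resize_nearest` and `scale_key_points`, and the assumption that key points were labeled on the unscaled region (so their coordinates must be scaled too) are all hypothetical choices made for the example.

```python
import numpy as np

def resize_nearest(region, out_h, out_w):
    """Scale an image region (H x W or H x W x C) to a fixed size.

    Nearest-neighbor sampling is an assumption; the source does not
    name an interpolation method.
    """
    in_h, in_w = region.shape[:2]
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return region[rows[:, None], cols]

def scale_key_points(points, in_size, out_size):
    """Map (x, y) key points from the original region to the scaled region."""
    in_h, in_w = in_size
    out_h, out_w = out_size
    return np.asarray(points, dtype=float) * [out_w / in_w, out_h / in_h]

# Hypothetical 50 x 200 region scaled to a hypothetical fixed size of 64 x 128.
region = np.zeros((50, 200))
fixed = resize_nearest(region, 64, 128)
scaled_points = scale_key_points([(20, 10), (180, 40)], region.shape, fixed.shape)
```

Scaling the labels together with the pixels keeps each key point on the same image content after resizing.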

Step S13: inputting the training sample set into a convolutional neural network, where a convolutional layer of the convolutional neural network includes multiple parallel convolution kernels, and each convolution kernel corresponds to one training channel.

A convolutional neural network (CNN) is a feedforward neural network that involves convolution computation and has a deep structure. It mainly includes an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer, and one convolutional neural network can include multiple convolutional layers. Here, the convolutional neural network can be a plain, sequentially stacked convolutional neural network or a deep learning convolutional neural network; no specific limitation is imposed.

A convolutional layer includes convolution kernels. A convolution kernel can be a matrix used to convolve the input image: each element of a local patch of the input image is multiplied by the element at the corresponding position of the convolution kernel matrix, and the products are then summed. Here, each training channel corresponds to a different convolution kernel.

For example, as shown in Fig. 1b, the input is a two-dimensional 3x4 matrix and the convolution kernel is a 2x2 matrix. Assuming the convolution moves one pixel at a time, the upper-left 2x2 part of the input is first convolved with the kernel, that is, the elements at each position are multiplied and then added, giving the element S00 of the output matrix S with value aw+bx+ey+fz. The window is then shifted one pixel to the right, so that the matrix formed by the four elements (b, c, f, g) is convolved with the kernel, which gives the element S01 of the output matrix S. The elements S02, S10, S11, and S12 of the output matrix S are obtained in the same way. As shown in Fig. 1c, the final convolution output is a 2x3 matrix S.
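The 3x4 by 2x2 example can be reproduced with a short NumPy sketch. The numeric values are placeholders chosen for illustration, and the third input row (i, j, k, l) is an assumed continuation of the lettering, since the text only names the elements a through g.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` one pixel at a time (stride 1, no padding),
    multiplying overlapping elements and summing the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# 3x4 input with rows (a, b, c, d), (e, f, g, h), (i, j, k, l); values are placeholders.
image = np.arange(1, 13, dtype=float).reshape(3, 4)
# 2x2 kernel with rows (w, x), (y, z); values are placeholders.
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])

S = conv2d_valid(image, kernel)   # 2x3 output matrix
# S[0, 0] equals a*w + b*x + e*y + f*z for the placeholder values above.
```

With a stride of one and no padding, an input of height H and width W and a kernel of height kh and width kw give an output of size (H - kh + 1) x (W - kw + 1), which is 2x3 here.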

Each training channel is independent. Apart from using different convolution kernels in the convolutional layer, the multiple training channels share the other layers of the convolutional neural network.

The number of training channels is determined by the number of labeled key points. If 6 key points are labeled, there are 6 corresponding training channels, with one key point per training channel.
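As a rough illustration of one kernel per key point, the sketch below applies several parallel kernels to the same image region and produces one feature map per channel. The random kernels, the 3x3 kernel size, the 64x128 region, and the choice of four key points are assumptions for the example only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

num_key_points = 4                      # assumed: four labeled corners
rng = np.random.default_rng(0)
image_region = rng.random((64, 128))    # placeholder fixed-size image region

# One kernel per training channel, i.e. one per labeled key point.
kernels = rng.normal(size=(num_key_points, 3, 3))

# All 3x3 patches of the region: shape (62, 126, 3, 3).
windows = sliding_window_view(image_region, (3, 3))

# Multiply every patch by every kernel and sum: one feature map per channel.
channel_maps = np.einsum("hwij,kij->khw", windows, kernels)
```

Each slice `channel_maps[k]` is the convolution result of channel `k`; in the scheme described above, these results would then pass through the shared later layers.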

Step S14: training each training channel independently on the training sample set until it meets its own convergence condition, to obtain a target area determination model containing multiple training channels, where the multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, and the reference point is located in the target area.

The reference point can be user-defined. Specifically, the reference point can be labeled before training, or it can be computed from the labeled key points. For example, when the target area is a license plate area and the key points are the four vertices of the license plate area, the center point of the license plate area can be determined from the four labeled key points and used as the reference point.
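One way to derive the reference point and the training targets from four labeled corners is sketched below. Taking the center as the mean of the four vertices is an assumption (the source does not say how the center is computed), and the coordinates are hypothetical.

```python
import numpy as np

# Four labeled corner key points (x, y) of a plate in a scaled image region
# (hypothetical values): upper-left, upper-right, lower-right, lower-left.
key_points = np.array([
    [20.0, 30.0],
    [100.0, 32.0],
    [98.0, 60.0],
    [18.0, 58.0],
])

# Assumed definition of the center point: the mean of the four vertices.
reference_point = key_points.mean(axis=0)

# Real displacement of each key point relative to the reference point;
# each row would be the regression target of one training channel.
true_displacements = key_points - reference_point
```

Adding a displacement back to the reference point recovers the corresponding key point, which is the relationship the determination method relies on at inference time.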

In this embodiment, multiple parallel training channels are each trained on the training sample set, so that the trained target area determination model contains multiple training channels, and the training channels are respectively used to predict multiple displacements, relative to the reference point, that are associated with the target area. Multiple key points related to the target area can be obtained from these displacements, and the target area can then be obtained from the key points, which can improve the accuracy of target area identification.

In an optional embodiment, step S14 includes:

Step S141: determining the parameters of each training channel.

The parameters include the parameters of the convolution kernels of the convolutional layers, such as the size of the convolution matrix, which can for example be set to 3*3; different convolutional layers can use different convolution kernels. The parameters can also include the parameters of the pooling layers, such as the size of the pooling matrix, which can for example be 3*3, or the parameters of the output layer, such as the linear coefficient matrix and the bias vector. The parameters of each training channel are different.

Step S142: computing, in each training channel, a predicted displacement from the determined parameters, the labeled key points, and the reference point.

Specifically, the training sample set first passes through the input layer of the convolutional neural network, where it is converted into multi-dimensional vectors. Convolution is then computed by the convolutional layer to obtain the feature images of the convolution stage. Because the convolutional layer includes multiple parallel convolution kernels, an input image that enters the convolutional layer is convolved with each of the different kernels, giving multiple convolution results, which then pass through the pooling layer, the fully connected layer, and the output layer for prediction.

Step S143: determining a prediction error from the predicted displacement and the real displacement of each image region.

The preset error can be user-defined.

Specifically, a loss function can be used to compute the loss between the predicted feature data and the real feature data, and the loss is taken as the prediction error.

Step S144: readjusting the parameters of any training channel whose prediction error is greater than the preset error, and repeating the training process of that training channel until its prediction error is less than or equal to the preset error, at which point the training process of that training channel ends.
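The control flow of steps S141 to S144 (independent channels, each stopping at its own preset error) can be sketched as below. To keep the example short and runnable, each channel is replaced by a linear regressor trained with gradient descent on synthetic features; this stand-in, the mean squared error loss, the learning rate, the iteration cap, and all variable names are assumptions and not the network of the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
num_channels = 4            # one training channel per labeled key point
num_samples, num_features = 200, 5
preset_error = 1e-4         # hypothetical preset error threshold
learning_rate = 0.1
max_iters = 5000            # safety cap; not part of the source

# Synthetic stand-in features shared by all channels, and the real
# (x, y) displacement of each channel's key point for every sample.
features = rng.normal(size=(num_samples, num_features))
true_weights = rng.normal(size=(num_channels, num_features, 2))
true_disp = np.einsum("nd,kdc->knc", features, true_weights)

weights = np.zeros_like(true_weights)        # S141: parameters of each channel
final_errors = np.full(num_channels, np.inf)

for k in range(num_channels):                # each channel trains independently
    for _ in range(max_iters):
        pred = features @ weights[k]                     # S142: predicted displacement
        residual = pred - true_disp[k]
        final_errors[k] = np.mean(residual ** 2)         # S143: prediction error (MSE)
        if final_errors[k] <= preset_error:              # S144: this channel has converged
            break
        grad = 2.0 * features.T @ residual / residual.size
        weights[k] -= learning_rate * grad               # S144: readjust parameters
```

Because the stopping test is evaluated per channel, a channel that converges early stops while the others continue, which mirrors the "respective convergence condition" wording of step S14.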

In an optional embodiment, the target area is a polygonal region, and the key points are the vertices of the polygonal region.

Specifically, when the target area is a license plate area, the corresponding polygon is a quadrilateral, namely a rectangle. When the training samples are labeled, the vertices of the target area are labeled in the image region, for example the four vertices of the rectangle.

Embodiment two

To solve the technical problem of low target area determination accuracy in the prior art, an embodiment of the present disclosure also provides a target area determination method which, as shown in Fig. 2, specifically includes:

S21: performing target area identification on the currently input video frame to obtain an image region containing the target area.

The input video frame can be captured in real time by a camera, or taken from a locally stored video image.

The target area can be a license plate area, in which case the target area is a rectangular region.

Specifically, an existing convolutional neural network model can be used to perform a preliminary identification on the input video frame and obtain the image region containing the target area.

S22: scaling the image region to a fixed size.

The fixed size is the same as the size of the image regions of the training samples that were input when the target area determination model was trained.

S23: inputting the fixed-size image region into the target area determination model trained by the target area determination model training method.

The target area determination model is trained using the target area determination model training method described in embodiment one above; see embodiment one for the specific training process.

S24: predicting, through the multiple training channels of the target area determination model respectively, multiple displacements relative to a reference point, where the reference point is located in the target area.

Each training channel predicts one displacement relative to the reference point.

The reference point can be preset, and specifically can be set to the center point of the target area or to another pixel of the target area.

S25: determining multiple key points from the multiple displacements relative to the reference point and the coordinates of the reference point.

For example, if the target area is a license plate area, the target area corresponds to a rectangular region. When the target area determination model is trained, the upper-left, lower-left, upper-right, and lower-right points of the rectangular region can be chosen as the key points, giving four training channels that respectively predict the displacements of the upper-left, lower-left, upper-right, and lower-right key points relative to the reference point. In this embodiment, the four training channels of the target area determination model then predict four displacements relative to the reference point, and the upper-left, lower-left, upper-right, and lower-right key points are computed from these four displacements and the coordinates of the reference point.
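Step S25 amounts to adding each predicted displacement to the reference point coordinates. The sketch below assumes (x, y) pixel coordinates and uses hypothetical values; `key_points_from_displacements` is an illustrative helper name.

```python
import numpy as np

def key_points_from_displacements(reference_point, displacements):
    """Recover key point coordinates as reference point plus displacement."""
    ref = np.asarray(reference_point, dtype=float)
    return ref + np.asarray(displacements, dtype=float)

# Hypothetical reference point and the four displacements predicted by the
# four training channels: upper-left, lower-left, upper-right, lower-right.
reference_point = (64.0, 32.0)
predicted_displacements = [
    (-40.0, -12.0),
    (-40.0, 12.0),
    (40.0, -12.0),
    (40.0, 12.0),
]

key_points = key_points_from_displacements(reference_point, predicted_displacements)
```

The rows of `key_points` follow the same order as the channels, so the first row is the upper-left corner and the last row is the lower-right corner.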

S26: determining the target area from the multiple key points.

In this embodiment, the multiple training channels of the target area determination model respectively predict multiple displacements relative to the reference point, multiple key points can be obtained from these displacements, and the target area is then obtained from the key points, which can improve the accuracy of target area identification.

In an optional embodiment, step S26 specifically includes:

Taking the closed region formed by connecting the multiple key points as the target area.

Specifically, referring to the example in step S25 above, if the upper-left, lower-left, upper-right, and lower-right key points are obtained, connecting these four key points gives a rectangular region, and that rectangular region is the target area.
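The closed region formed by connecting the key points can be turned into a pixel mask, for instance to cover a license plate. The sketch below assumes the key points form a convex polygon and orders them by angle around their centroid so that connecting them in sequence does not self-intersect; the helper names, the half-plane test, and the coordinates are illustrative assumptions.

```python
import numpy as np

def order_around_centroid(points):
    """Sort points by angle around their centroid so that consecutive
    points are neighbors on the polygon boundary."""
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - center[1], pts[:, 0] - center[0])
    return pts[np.argsort(angles)]

def polygon_mask(points, height, width):
    """Boolean mask of pixels inside the convex polygon through `points`."""
    poly = order_around_centroid(points)
    ys, xs = np.mgrid[0:height, 0:width]
    inside = np.ones((height, width), dtype=bool)
    # A pixel is inside if it lies on the same side of every edge.
    for (x0, y0), (x1, y1) in zip(poly, np.roll(poly, -1, axis=0)):
        cross = (x1 - x0) * (ys - y0) - (y1 - y0) * (xs - x0)
        inside &= cross >= 0
    return inside

# Upper-left, lower-left, upper-right, lower-right key points (hypothetical).
key_points = [(24, 20), (24, 44), (104, 20), (104, 44)]
target_mask = polygon_mask(key_points, height=64, width=128)
```

Sorting by angle matters here because the upper-left, lower-left, upper-right, lower-right order would otherwise trace a self-crossing outline rather than the rectangle.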

In an optional embodiment, the method also includes:

Step S27: obtaining the next video frame, and taking the next video frame as the currently input video frame.

Step S28: expanding the image region outward by a preset area, and scaling the expanded image region to the fixed size to serve as the image region containing the target area for the next video frame.

Specifically, when determining the target area of the next video frame, the image region from step S21 can be expanded by a certain amount and used as the image region of the next video frame, after which steps S22-S26 are performed. Because the position of the target area changes little across several consecutive video frames, the target area generally still lies within the image region after each expansion. License plate area identification can therefore be performed directly with steps S22-S26, without using a complex target area identification model to perform an initial identification on the next frame to obtain the image region containing the target area, which is more efficient.
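The expansion in step S28 can be sketched as growing the previous frame's bounding box by a margin. The (x0, y0, x1, y1) box format, the uniform pixel margin, and the clamping to the frame bounds are assumptions added for the example; the source only says that a preset area is added around the region.

```python
def expand_region(box, margin, frame_width, frame_height):
    """Grow a (x0, y0, x1, y1) box by `margin` pixels on every side,
    clamped to the frame (clamping is an assumption, not from the source)."""
    x0, y0, x1, y1 = box
    return (
        max(0, x0 - margin),
        max(0, y0 - margin),
        min(frame_width, x1 + margin),
        min(frame_height, y1 + margin),
    )

# Hypothetical image region from the previous frame in a 640 x 480 video.
previous_region = (10, 200, 170, 260)
next_region = expand_region(previous_region, margin=16,
                            frame_width=640, frame_height=480)
```

The expanded box would then be cropped from the next frame and scaled to the fixed size before steps S22-S26 run on it.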

Those skilled in the art will understand that, on the basis of the above embodiments, obvious variants (for example, combinations of the listed modes) or equivalent replacements can also be made.

Although the steps of the target area determination model training method embodiment are described above in the stated order, those skilled in the art should understand that the steps in the embodiments of the present disclosure need not be performed in that order and can also be performed in reverse, in parallel, interleaved, or in other orders. Moreover, those skilled in the art can add other steps to the above steps. These obvious variants or equivalent replacements should also fall within the protection scope of the present disclosure and are not described again here.

The following are apparatus embodiments of the present disclosure, which can be used to perform the steps implemented by the method embodiments of the present disclosure. For ease of description, only the parts relevant to the embodiments of the present disclosure are shown; for specific technical details that are not disclosed, refer to the method embodiments of the present disclosure.

Embodiment three

Determine that the low technical problem of accuracy, the embodiment of the present disclosure provide one kind to solve target area in the prior art Target area determines model training apparatus.The device can execute target area described in above-described embodiment one and determine model training Step in embodiment of the method.As shown in figure 3, the device mainly includes: specimen discerning module 31, training set determining module 32, training set input module 33 and model training module 34；Wherein,

The sample identification module 31 is used to perform target area identification on a sample image to obtain an image region containing the target area;

The training set determining module 32 is used to scale the image region to a fixed size and to form a training sample set from multiple image regions of the fixed size, where the image regions in the training sample set are labeled with multiple key points and the multiple key points lie in the target area;

The training set input module 33 is used to input the training sample set into a convolutional neural network, where the convolutional layer of the convolutional neural network contains multiple parallel convolution kernels and each convolution kernel corresponds to one training channel;

The model training module 34 is used to train each training channel independently on the training sample set until each channel meets its own convergence condition, obtaining a target area determination model that contains the multiple training channels. The multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, where the reference point is located in the target area.

Further, the model training module 34 is specifically used to: determine the parameters of each training channel; have each training channel calculate a predicted displacement according to the determined parameters, the labeled key points and the reference point; determine a prediction error according to the predicted displacement and the real displacement of each image region; and readjust the parameters of any training channel whose prediction error is greater than a preset error and repeat the training process of that channel until its prediction error is less than or equal to the preset error, at which point the training process of that channel ends.
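The per-channel loop above (determine parameters, predict, measure error, readjust until the error is within the preset error) can be sketched as follows. This is an illustrative assumption, not the network described in the disclosure: each channel is replaced by a small linear map trained with plain gradient descent on a mean squared error, and the names `train_channel`, `train_all_channels`, `max_error` and `lr` are hypothetical.

```python
def train_channel(features, true_disp, max_error=1e-3, lr=0.1, max_steps=10_000):
    """Train one channel until its mean squared prediction error is <= max_error.

    A linear map stands in for a convolution kernel (an assumption for brevity).
    features: list of feature vectors; true_disp: list of (dx, dy) real displacements.
    """
    n_feat, n_out = len(features[0]), len(true_disp[0])
    weights = [[0.0] * n_out for _ in range(n_feat)]  # determine initial parameters
    error = float("inf")
    for _ in range(max_steps):
        # Residual = predicted displacement - real displacement, per sample and axis.
        residuals = [
            [sum(x[k] * weights[k][j] for k in range(n_feat)) - y[j] for j in range(n_out)]
            for x, y in zip(features, true_disp)
        ]
        count = len(residuals) * n_out
        error = sum(r * r for row in residuals for r in row) / count  # prediction error
        if error <= max_error:  # this channel's convergence condition is met
            break
        for k in range(n_feat):  # readjust parameters and repeat
            for j in range(n_out):
                grad = 2.0 * sum(x[k] * row[j] for x, row in zip(features, residuals)) / count
                weights[k][j] -= lr * grad
    return weights, error


def train_all_channels(features, disp_per_keypoint, max_error=1e-3):
    """One channel per key point; each is trained independently of the others."""
    return [train_channel(features, disp, max_error) for disp in disp_per_keypoint]
```

Because each call to `train_channel` stops on its own error threshold, a channel that converges early does not wait for, or get disturbed by, channels that need more iterations.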

Further, the reference point is the center point of the image region.
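With the center of the image region as the reference point, the real displacement used as a training label can be computed as sketched below. The function name `displacement_labels` and the assumption that key points are given in the coordinates of the unscaled image region are illustrative choices, not details from the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def displacement_labels(
    keypoints: List[Point],
    region_size: Tuple[float, float],
    fixed_size: Tuple[float, float],
) -> List[Point]:
    """Rescale key points to the fixed-size region and return their
    displacements relative to the center of that fixed-size region."""
    (w, h), (fw, fh) = region_size, fixed_size
    sx, sy = fw / w, fh / h        # scale factors applied when resizing the region
    cx, cy = fw / 2.0, fh / 2.0    # reference point: center of the scaled region
    return [(x * sx - cx, y * sy - cy) for x, y in keypoints]
```

For a 200x80 region scaled to 100x40, the corner key point (10, 5) maps to (5, 2.5) and its label is the displacement (-45, -17.5) from the center (50, 20).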

Further, the target area is license plate area.

For a detailed description of the working principle and technical effects of the target area determination model training apparatus embodiment, refer to the related description in the foregoing target area determination model training method embodiment. Details are not repeated here.

Embodiment four

To solve the technical problem of low accuracy in target area determination in the prior art, an embodiment of the present disclosure provides a target area determining apparatus. The apparatus can execute the steps in the target area determination model training method embodiment described in Embodiment two. As shown in Fig. 4, the apparatus mainly includes: a region identification module 41, an image scaling module 42, an image input module 43, a prediction module 44, a key point determining module 45 and an area determination module 46, wherein:

The region identification module 41 is used to perform target area identification on the currently input video frame to obtain an image region containing the target area;

The image scaling module 42 is used to scale the image region to a fixed size;

The image input module 43 is used to input the image region of the fixed size into a target area determination model trained with the target area determination model training method of any one of claims 1-5;

The prediction module 44 is used to obtain multiple displacements relative to a reference point, each predicted by one of the multiple training channels of the target area determination model, where the reference point is located in the target area;

The key point determining module 45 is used to determine multiple key points according to the multiple displacements relative to the reference point and the coordinates of the reference point;

The area determination module 46 is used to determine the target area according to the multiple key points.

Further, the area determination module 46 is specifically used to take the closed region formed by connecting the multiple key points as the target area.
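The last two modules can be sketched as follows: key points are recovered by adding each predicted displacement to the reference point coordinates, and connecting them in order gives a closed polygon. The function names are hypothetical, and the shoelace area is included only as a simple sanity check on the recovered polygon; it is not part of the disclosure.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def keypoints_from_displacements(reference: Point, displacements: List[Point]) -> List[Point]:
    """Key point = reference point coordinates + predicted displacement."""
    rx, ry = reference
    return [(rx + dx, ry + dy) for dx, dy in displacements]


def polygon_area(vertices: List[Point]) -> float:
    """Area of the closed region formed by connecting the vertices in order (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0
```

A polygon with near-zero area or with vertices outside the fixed-size region would suggest that at least one channel produced an implausible displacement.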

Further, the target area is license plate area.

For a detailed description of the working principle and technical effects of the target area determining apparatus embodiment, refer to the related description in the foregoing target area determination method embodiment. Details are not repeated here.

Embodiment five

Referring now to Fig. 5, it shows a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present disclosure. The electronic device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a laptop, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable media player) and a vehicle-mounted terminal (for example, a vehicle navigation terminal), as well as fixed terminals such as a digital TV and a desktop computer. The electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.

As shown in Fig. 5, the electronic device may include a processing unit (such as a central processing unit or a graphics processor) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage device 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the electronic device. The processing unit 501, the ROM 502 and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

In general, the following devices can be connected to the I/O interface 505: an input device 506 including, for example, a touch screen, a touch pad, a keyboard, a mouse, an image sensor, a microphone, an accelerometer or a gyroscope; an output device 507 including, for example, a liquid crystal display (LCD), a loudspeaker or a vibrator; a storage device 508 including, for example, a magnetic tape or a hard disk; and a communication device 509. The communication device 509 allows the electronic device to communicate with other devices, wirelessly or by wire, to exchange data. Although Fig. 5 shows an electronic device with various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.

In particular, according to embodiments of the present disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product that comprises a computer program carried on a computer-readable medium, where the computer program contains program code for executing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 509, installed from the storage device 508, or installed from the ROM 502. When the computer program is executed by the processing unit 501, the above functions defined in the method of the embodiments of the present disclosure are performed.

It should be noted that the computer-readable medium of the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and it can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: electric wire, optical cable, RF (radio frequency), or any suitable combination of the above.

The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.

The above computer-readable medium carries one or more programs. When the one or more programs are executed by the electronic device, they cause the electronic device to: perform target area identification on a sample image to obtain an image region containing the target area; scale the image region to a fixed size and form a training sample set from multiple image regions of the fixed size, where the image regions in the training sample set are labeled with multiple key points and the multiple key points lie in the target area; input the training sample set into a convolutional neural network, where the convolutional layer of the convolutional neural network contains multiple parallel convolution kernels and each convolution kernel corresponds to one training channel; and train each training channel independently on the training sample set until each channel meets its own convergence condition, obtaining a target area determination model that contains the multiple training channels. The multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, where the reference point is located in the target area.

Computer program code for executing the operations of the present disclosure may be written in one or more programming languages or a combination thereof. These programming languages include object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flow charts and block diagrams in the accompanying drawings illustrate the possible architectures, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each box in a flow chart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flow charts, and combinations of boxes in the block diagrams and/or flow charts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.

The above description is only a description of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of disclosure involved in the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features. It should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above disclosed concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims

1. A target area determination model training method, characterized by comprising:

performing target area identification on a sample image to obtain an image region containing the target area;

scaling the image region to a fixed size and forming a training sample set from multiple image regions of the fixed size, where the image regions in the training sample set are labeled with multiple key points and the multiple key points lie in the target area;

inputting the training sample set into a convolutional neural network, where the convolutional layer of the convolutional neural network contains multiple parallel convolution kernels and each convolution kernel corresponds to one training channel;

training each training channel independently on the training sample set until each channel meets its own convergence condition, obtaining a target area determination model that contains the multiple training channels, where the multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, and the reference point is located in the target area.

2. The method according to claim 1, characterized in that training each training channel independently on the training sample set until each channel meets its own convergence condition, obtaining a target area determination model that contains the multiple training channels, comprises:

determining the parameters of each training channel;

calculating, by each training channel, a predicted displacement according to the determined parameters, the labeled key points and the reference point;

determining a prediction error according to the predicted displacement and the real displacement of each image region;

readjusting the parameters of any training channel whose prediction error is greater than a preset error, and repeating the training process of that training channel until its prediction error is less than or equal to the preset error, at which point the training process of that training channel ends.

3. The method according to claim 1 or 2, characterized in that the target area is a polygonal region and the key points are the vertices of the polygonal region.

4. The method according to claim 1 or 2, characterized in that the reference point is the center point of the image region.

5. The method according to claim 1 or 2, characterized in that the target area is a license plate area.

6. A target area determination method, characterized by comprising:

performing target area identification on the currently input video frame to obtain an image region containing the target area;

scaling the image region to a fixed size;

inputting the image region of the fixed size into a target area determination model trained with the target area determination model training method according to any one of claims 1-5;

obtaining multiple displacements relative to a reference point, each predicted by one of the multiple training channels of the target area determination model, where the reference point is located in the target area;

determining multiple key points according to the multiple displacements relative to the reference point and the coordinates of the reference point;

determining the target area according to the multiple key points.

7. The method according to claim 6, characterized in that determining the target area according to the multiple key points comprises:

taking the closed region formed by connecting the multiple key points as the target area.

8. The method according to claim 6, characterized in that the method further comprises:

expanding the image region outward by a preset area, and scaling the enlarged image region to the fixed size to serve as the image region containing the target area for the next video frame.

9. The method according to any one of claims 6-8, characterized in that the target area is a license plate area.

10. A target area determination model training apparatus, characterized by comprising:

a sample identification module for performing target area identification on a sample image to obtain an image region containing the target area;

a training set determining module for scaling the image region to a fixed size and forming a training sample set from multiple image regions of the fixed size, where the image regions in the training sample set are labeled with multiple key points and the multiple key points lie in the target area;

a training set input module for inputting the training sample set into a convolutional neural network, where the convolutional layer of the convolutional neural network contains multiple parallel convolution kernels and each convolution kernel corresponds to one training channel;

a model training module for training each training channel independently on the training sample set until each channel meets its own convergence condition, obtaining a target area determination model that contains the multiple training channels, where the multiple training channels of the target area determination model are respectively used to predict the displacements of the multiple key points relative to a reference point, and the reference point is located in the target area.

11. A target area determining apparatus, characterized by comprising:

a region identification module for performing target area identification on the currently input video frame to obtain an image region containing the target area;

an image scaling module for scaling the image region to a fixed size;

an image input module for inputting the image region of the fixed size into a target area determination model trained with the target area determination model training method according to any one of claims 1-5;

a prediction module for obtaining multiple displacements relative to a reference point, each predicted by one of the multiple training channels of the target area determination model, where the reference point is located in the target area;

a key point determining module for determining multiple key points according to the multiple displacements relative to the reference point and the coordinates of the reference point;

an area determination module for determining the target area according to the multiple key points.

12. An electronic device, comprising:

a memory for storing non-transitory computer-readable instructions; and

a processor for running the computer-readable instructions, so that when they are executed by the processor, the target area determination model training method according to any one of claims 1-5 is implemented.

13. A computer-readable storage medium for storing non-transitory computer-readable instructions which, when executed by a computer, cause the computer to execute the target area determination model training method according to any one of claims 1-5.

14. An electronic device, comprising:

a memory for storing non-transitory computer-readable instructions; and

a processor for running the computer-readable instructions, so that when they are executed by the processor, the target area determination method according to any one of claims 6-9 is implemented.

15. A computer-readable storage medium for storing non-transitory computer-readable instructions which, when executed by a computer, cause the computer to execute the target area determination method according to any one of claims 6-9.