CN110443824A - Method and apparatus for generating information - Google Patents
- Publication number: CN110443824A
- Application number: CN201810410251.5A
- Authority: CN (China)
- Prior art keywords: image, target, tracked, class, tracking
- Prior art date
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the present application disclose a method and apparatus for generating information. One specific embodiment of the method includes: acquiring a video obtained by shooting a target to be tracked; classifying every frame of the video, where the image categories include a detection class and a tracking class; for each frame of the video, if the frame is a detection-class image, inputting the frame into a pre-trained object detection model to obtain position information of the target to be tracked in that frame, and if the frame is a tracking-class image, tracking the position information of the target to be tracked in the previous frame to determine the target's position information in the tracking-class frame; and aggregating the position information of the target to be tracked in every detection-class frame and every tracking-class frame to generate the target's position information in every frame of the video. This embodiment improves the accuracy of detecting and tracking the target.
Description
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a method and apparatus for generating information.
Background technique
In-video object detection is now widely used in smart home and industrial security applications. It obtains target regions by analyzing the video captured by a camera frame by frame, and the motion trajectory of a target can also be obtained in this way.
Existing in-video object detection is usually based on detecting moving targets. Specifically, a static image is taken as the reference image, and the difference between each frame and the reference image is analyzed in real time; for the pixel at each position, the probability that it belongs to a static target or a moving target is established, and the positions belonging to moving targets are merged into a target region.
Summary of the invention
Embodiments of the present application propose a method and apparatus for generating information.
In a first aspect, an embodiment of the present application provides a method for generating information, comprising: acquiring a video obtained by shooting a target to be tracked; classifying every frame of the video, where the image categories include a detection class and a tracking class; for each frame of the video, if the frame is a detection-class image, inputting the frame into a pre-trained object detection model to obtain position information of the target to be tracked in that frame, and if the frame is a tracking-class image, tracking the position information of the target to be tracked in the previous frame of the tracking-class image to determine the target's position information in the tracking-class image, where the object detection model is used to detect the position of a target in an image; and aggregating the position information of the target to be tracked in every detection-class frame and every tracking-class frame to generate the target's position information in every frame of the video.
In some embodiments, the object detection model is trained as follows: a set of training samples is acquired, where each training sample includes a sample image containing a sample target and annotation information obtained by annotating the position of the sample target in the sample image; an initial object detection model is then trained with the sample images in the training sample set as input and the annotation information of the input sample images as expected output, yielding the object detection model.
In some embodiments, tracking the position information of the target to be tracked in the previous frame of the tracking-class image to determine the target's position in the tracking-class image comprises: analyzing, with a mean-shift-based target tracking algorithm, the position information of the target in the previous frame of the tracking-class image together with the tracking-class image, to obtain the target's position information in the tracking-class image.
In some embodiments, classifying every frame of the video comprises: ordering the frames of the video by shooting time; dividing the frames at odd positions into detection-class images; and dividing the frames at even positions into tracking-class images.
In some embodiments, classifying every frame of the video comprises: taking the first frame of the video as the frame to be classified; executing the following classification step: dividing the frame to be classified into the detection class, dividing the subsequent preset number of frames into the tracking class, determining whether any unclassified frames remain in the video, and finishing the classification if none remain; and, in response to determining that unclassified frames remain in the video, taking the frame that follows the preset number of frames after the current frame to be classified as the new frame to be classified, and continuing to execute the classification step.
In a second aspect, an embodiment of the present application provides an apparatus for generating information, comprising: an acquiring unit configured to acquire a video obtained by shooting a target to be tracked; a classifying unit configured to classify every frame of the video, where the image categories include a detection class and a tracking class; a detecting-and-tracking unit configured to, for each frame of the video, input the frame into a pre-trained object detection model if the frame is a detection-class image to obtain position information of the target to be tracked in that frame, and, if the frame is a tracking-class image, track the position information of the target in the previous frame of the tracking-class image to determine the target's position in the tracking-class image, where the object detection model is used to detect the position of a target in an image; and a generating unit configured to aggregate the position information of the target in every detection-class frame and every tracking-class frame to generate the target's position information in every frame of the video.
In some embodiments, the object detection model is trained as follows: a set of training samples is acquired, where each training sample includes a sample image containing a sample target and annotation information obtained by annotating the position of the sample target in the sample image; an initial object detection model is then trained with the sample images in the training sample set as input and the annotation information of the input sample images as expected output, yielding the object detection model.
In some embodiments, the detecting-and-tracking unit is further configured to analyze, with a mean-shift-based target tracking algorithm, the position information of the target in the previous frame of the tracking-class image together with the tracking-class image, to obtain the target's position information in the tracking-class image.
In some embodiments, the classifying unit is further configured to order the frames of the video by shooting time, divide the frames at odd positions into detection-class images, and divide the frames at even positions into tracking-class images.
In some embodiments, the classifying unit is further configured to take the first frame of the video as the frame to be classified and execute the following classification step: divide the frame to be classified into the detection class, divide the subsequent preset number of frames into the tracking class, determine whether any unclassified frames remain in the video, and finish the classification if none remain; in response to determining that unclassified frames remain in the video, take the frame that follows the preset number of frames after the current frame to be classified as the new frame to be classified, and continue to execute the classification step.
In a third aspect, an embodiment of the present application provides an electronic device comprising one or more processors and a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium storing a computer program which, when executed by a processor, implements the method described in any implementation of the first aspect.
With the method and apparatus for generating information provided by the embodiments of the present application, after a video obtained by shooting a target to be tracked is acquired, the frames of the video are classified; then, if a frame is a detection-class image, an object detection model is used to obtain the target's position information in that frame, and if a frame is a tracking-class image, the target's position information in the previous frame is tracked to obtain the target's position information in the tracking-class frame. In this way, the target's position information in every frame of the video is obtained. Dividing the frames of the video into detection-class and tracking-class images, detecting the detection-class frames with an object detection model, and tracking the tracking-class frames with a tracking algorithm improves the accuracy of detecting and tracking the target.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent from reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is an exemplary system architecture to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for generating information according to the present application;
Fig. 3 is a flowchart of an application scenario of the method for generating information of Fig. 2;
Fig. 4 is a flowchart of another embodiment of the method for generating information according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the apparatus for generating information according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing the electronic device of the embodiments of the present application.
Detailed description
The present application will be described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, not to limit it. It should also be noted that, for ease of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features therein may be combined with each other. The present application will now be described in detail with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 in which embodiments of the method or apparatus for generating information of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include a shooting device 101, a network 102 and a server 103. The network 102 serves as a medium providing a communication link between the shooting device 101 and the server 103, and may include various connection types such as wired or wireless communication links or fiber-optic cables.
The shooting device 101 may interact with the server 103 through the network 102 to receive or send messages. The shooting device 101 may be hardware or software. When it is hardware, it may be any of various electronic devices with a video shooting function, including but not limited to cameras, video cameras, webcams and the like. When it is software, it may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module; this is not specifically limited here.
The server 103 may be a server providing various services, for example a video processing server that processes the video uploaded by the shooting device 101. The video processing server may analyze the received video obtained by shooting the target to be tracked and generate a processing result (for example, the position information of the target to be tracked in every frame of the video).
It should be noted that the server 103 may be hardware or software. When it is hardware, it may be implemented as a distributed cluster of multiple servers or as a single server. When it is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module; this is not specifically limited here.
It should be noted that the method for generating information provided by the embodiments of the present application is generally executed by the server 103, and correspondingly the apparatus for generating information is generally disposed in the server 103.
It should be understood that the numbers of shooting devices, networks and servers in Fig. 1 are merely illustrative; there may be any number of shooting devices, networks and servers according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for generating information according to the present application is shown. The method for generating information comprises the following steps:
Step 201: acquire a video obtained by shooting a target to be tracked.
In this embodiment, the executing body of the method for generating information (for example, the server 103 shown in Fig. 1) may acquire, from a shooting device through a wired or wireless connection, a video obtained by shooting the target to be tracked. The shooting device may be any of various electronic devices with a video shooting function, including but not limited to cameras, video cameras, webcams and the like. The target to be tracked may be any movable object, including but not limited to a person, an animal, a vehicle and the like. For example, the shooting device may shoot a specified person and upload the captured video to the executing body in real time.
Step 202: classify every frame of the video.
In this embodiment, the executing body may classify every frame of the video, for example frame by frame in shooting order. The image categories may include a detection class and a tracking class: a detection-class image will subsequently undergo object detection, while a tracking-class image will subsequently undergo object tracking.
In some optional implementations of this embodiment, the executing body may classify the frames of the video as follows:
First, take the first frame of the video as the frame to be classified.
Then execute the following classification step: divide the frame to be classified into the detection class, and divide the subsequent preset number of frames (for example, two frames) into the tracking class.
Next, determine whether any unclassified frames remain in the video.
If none remain, classification is complete.
If unclassified frames remain, take the frame that follows the preset number of frames after the current frame to be classified as the new frame to be classified, and continue to execute the classification step.
As an example, if the video contains six frames, the executing body may divide the first frame into the detection class, divide the two frames after it (the second and third frames) into the tracking class, then divide the fourth frame into the detection class and the two frames after it (the fifth and sixth frames) into the tracking class. That is, a detection-class frame is chosen every two frames, and the two frames in between are divided into the tracking class, until all frames of the video are classified.
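The interleaving above can be sketched as a small helper, assuming frames are simply indexed in shooting order (the function and parameter names are illustrative, not from the patent):

```python
def classify_frames(num_frames, tracking_per_detection=2):
    """Label each frame index 'detect' or 'track': one detection-class
    frame followed by a preset number of tracking-class frames."""
    labels = []
    period = 1 + tracking_per_detection
    for i in range(num_frames):
        labels.append("detect" if i % period == 0 else "track")
    return labels

# Six frames with two tracking frames per detection frame,
# matching the six-frame example in the text:
print(classify_frames(6))
# → ['detect', 'track', 'track', 'detect', 'track', 'track']
```

The preset number of tracking frames per detection frame trades accuracy (more detections) against speed (more cheap tracking steps).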
Step 203: for every frame of the video, if the frame is a detection-class image, input the detection-class image into a pre-trained object detection model to obtain position information of the target to be tracked in that frame.
In this embodiment, the executing body may detect or track every frame of the video, for example frame by frame in shooting order.
For every frame of the video, if the frame is a detection-class image, the executing body may detect it with the object detection model to obtain the position information of the target to be tracked in the detection-class image. The object detection model is used to detect the position of a target in an image and characterizes the correspondence between an image and the position information of the target in the image. The position information of a target in an image may be the position information of a target frame (bounding box) containing the target; the target frame may be a rectangular or square box of a preset size.
In some optional implementations of this embodiment, those skilled in the art may statistically analyze a large number of sample images and the position information of sample targets in those images, so as to produce a correspondence table storing multiple sample images and the position information of the sample targets in them, and use this table as the object detection model. The executing body may then compute the similarity between the detection-class image and each sample image in the correspondence table, look up the position information of the sample target in the sample image most similar to the detection-class image, and take it as the position information of the target to be tracked in the detection-class image.
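A minimal sketch of this lookup-table variant, assuming grayscale images of equal size stored as NumPy arrays and using negative mean squared difference as the similarity measure (the patent does not fix a particular similarity function, so this choice is an assumption):

```python
import numpy as np

def build_table(sample_images, sample_positions):
    """The 'model' is just paired sample images and the annotated
    (x, y, w, h) position of the sample target in each image."""
    return list(zip(sample_images, sample_positions))

def detect_by_lookup(table, image):
    """Return the stored position of the sample image most similar
    to the query image (highest similarity = lowest MSE)."""
    best_pos, best_sim = None, -np.inf
    for sample, pos in table:
        sim = -np.mean((sample.astype(float) - image.astype(float)) ** 2)
        if sim > best_sim:
            best_sim, best_pos = sim, pos
    return best_pos

rng = np.random.default_rng(0)
samples = [rng.integers(0, 256, (32, 32)) for _ in range(3)]
table = build_table(samples, [(1, 1, 4, 4), (8, 8, 4, 4), (20, 20, 4, 4)])
# A query identical to the second sample returns its stored box:
print(detect_by_lookup(table, samples[1]))  # → (8, 8, 4, 4)
```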
In some optional implementations of this embodiment, the object detection model may be obtained by supervised training of an existing machine learning model (for example, any of various artificial neural networks) with various machine learning methods and training samples. Specifically, the executing body may first acquire a set of training samples, then train an initial object detection model with the sample images as input and the annotation information of the input sample images as expected output, yielding the object detection model. Each training sample in the set may include a sample image and the annotation information of that sample image; the sample image contains a sample target, and the annotation information is obtained by annotating the position of the sample target in the sample image. For example, a target frame containing the sample target may be labeled in the sample image, and the labeled sample image then serves as the annotation information. The initial object detection model may be an object detection model that is untrained or whose training is not completed (for example, Faster R-CNN, R-FCN, SSD and the like). For an untrained model, the parameters (for example, weight parameters and bias parameters) are initialized with different small random numbers: "small" ensures the model does not enter saturation because of excessive weights, which would cause training to fail, and "different" ensures the model can learn normally. For a model whose training is not completed, the parameters may already have been adjusted, but the model's detection performance does not yet satisfy a preset constraint condition.
Step 203': for every frame of the video, if the frame is a tracking-class image, track the position information of the target to be tracked in the previous frame of the tracking-class image, and determine the target's position information in the tracking-class image.
In this embodiment, for every frame of the video, if the frame is a tracking-class image, the executing body may track it with a tracking algorithm to obtain the position information of the target to be tracked in the tracking-class image. Specifically, the executing body may track the position information of the target in the previous frame of the tracking-class image, thereby determining the target's position in the tracking-class image. For example, the executing body may first extract the target's features from the region indicated by its position information in the previous frame, then match those features against the tracking-class image to determine the region of the tracking-class image in which those features are present, and take the position information of that region as the position information of the target to be tracked in the tracking-class image.
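The feature-matching idea above can be sketched with plain sum-of-squared-differences patch matching over a small search window, assuming grayscale frames as NumPy arrays (an illustrative stand-in, not the patent's prescribed matcher):

```python
import numpy as np

def track_by_matching(prev_frame, cur_frame, box, search=5):
    """box = (x, y, w, h) of the target in prev_frame. Slide that patch
    over cur_frame within +/-search pixels and return the best-matching
    box, i.e. the target's position in the new frame."""
    x, y, w, h = box
    patch = prev_frame[y:y + h, x:x + w].astype(float)
    best, best_cost = box, np.inf
    H, W = cur_frame.shape
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            nx, ny = x + dx, y + dy
            if nx < 0 or ny < 0 or nx + w > W or ny + h > H:
                continue
            cand = cur_frame[ny:ny + h, nx:nx + w].astype(float)
            cost = np.sum((cand - patch) ** 2)  # SSD dissimilarity
            if cost < best_cost:
                best_cost, best = cost, (nx, ny, w, h)
    return best

rng = np.random.default_rng(1)
prev = rng.integers(0, 256, (40, 40))
cur = np.roll(prev, shift=(2, 3), axis=(0, 1))  # content moved right 3, down 2
print(track_by_matching(prev, cur, (10, 10, 8, 8)))  # → (13, 12, 8, 8)
```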
Step 204: aggregate the position information of the target to be tracked in every detection-class image and every tracking-class image, and generate the target's position information in every frame of the video.
In this embodiment, the executing body may aggregate the position information detected in each detection-class frame by the object detection model in step 203 and the position information tracked in each tracking-class frame by the tracking algorithm in step 203', thereby obtaining the position information of the target to be tracked in every frame of the video. Here, the executing body may also connect, in shooting order, the center points of the regions indicated by the target's position information in each frame, to generate the trajectory of the target to be tracked.
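Aggregation and trajectory generation reduce to ordering the per-frame boxes by shooting time and connecting their centers, for example (box format and names are illustrative):

```python
def trajectory(frame_boxes):
    """frame_boxes: list of (x, y, w, h) boxes, one per frame in
    shooting order (from detection- and tracking-class frames alike).
    Returns the list of box centers, i.e. the target's trajectory."""
    return [(x + w / 2.0, y + h / 2.0) for (x, y, w, h) in frame_boxes]

boxes = [(10, 10, 8, 8), (13, 12, 8, 8), (16, 14, 8, 8)]
print(trajectory(boxes))  # → [(14.0, 14.0), (17.0, 16.0), (20.0, 18.0)]
```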
With continued reference to Fig. 3, Fig. 3 is a flow 300 of an application scenario of the method for generating information according to this embodiment. In the application scenario of Fig. 3, as shown at 301, when the current time falls within a preset sleep period, a camera installed in a room captures a video of the user in the room and sends it to a cloud server; as shown at 302, the cloud server divides every frame of the user's video into a detection-class image or a tracking-class image; as shown at 303, the cloud server detects or tracks every frame of the video in shooting order: for a detection-class image, it inputs the image into a pre-trained object detection model to obtain the user's position information in that image, and for a tracking-class image, it tracks the user's position information in the previous frame to determine the user's position in the tracking-class image; as shown at 304, the server aggregates the user's position information in every detection-class and tracking-class image and generates the user's motion trajectory; as shown at 305, the server analyzes the trajectory and determines that the user is moving within a small region, indicating that the user may be resting; as shown at 306, the server sends the control device of the smart home in the room an instruction to switch the air conditioner to sleep mode; as shown at 307, the control device sends the air conditioner in the room an instruction to switch to sleep mode; as shown at 308, the air conditioner switches its mode to sleep mode.
With the method for generating information provided by the embodiments of the present application, after a video obtained by shooting a target to be tracked is acquired, the frames of the video are classified; then, if a frame is a detection-class image, an object detection model is used to obtain the target's position information in that frame, and if a frame is a tracking-class image, the target's position information in the previous frame is tracked to obtain the target's position in the tracking-class frame. In this way, the target's position information in every frame of the video is obtained. Dividing the frames of the video into detection-class and tracking-class images, detecting the detection-class frames with the object detection model, and tracking the tracking-class frames with a tracking algorithm improves the accuracy of detecting and tracking the target.
With further reference to Fig. 4, a flow 400 of another embodiment of the method for generating information according to the present application is shown. The method for generating information comprises the following steps:
Step 401: acquire a video obtained by shooting a target to be tracked.
In this embodiment, the concrete operation of step 401 is substantially the same as that of step 201 in the embodiment shown in Fig. 2 and is not repeated here.
Step 402: order every frame of the video by shooting time.
In this embodiment, the executing body may order the frames of the video by shooting time, for example arranging the frames from front to back in the order in which they were shot.
Step 403: divide the frames at odd positions into detection-class images, and divide the frames at even positions into tracking-class images.
In this embodiment, the executing body may divide the frames at odd positions into the detection class and the frames at even positions into the tracking class. Specifically, the executing body may number the frames of the ordered video: the frame in first position is labeled 1, the frame in second position is labeled 2, the frame in third position is labeled 3, and so on, until all frames of the video are labeled. Then, the executing body may divide the frames with odd labels into detection-class images and the frames with even labels into tracking-class images.
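This odd/even split is the special case of one tracking frame per detection frame; a sketch with the 1-based numbering used in the text:

```python
def classify_odd_even(num_frames):
    """Frames at odd 1-based positions -> detection class,
    frames at even positions -> tracking class."""
    return ["detect" if n % 2 == 1 else "track"
            for n in range(1, num_frames + 1)]

print(classify_odd_even(5))
# → ['detect', 'track', 'detect', 'track', 'detect']
```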
Step 404: for every frame of the video, if the frame is a detection-class image, input the detection-class image into a pre-trained object detection model to obtain position information of the target to be tracked in that frame.
In this embodiment, the concrete operation of step 404 is substantially the same as that of step 203 in the embodiment shown in Fig. 2 and is not repeated here.
Step 404': for every frame of the video, if the frame is a tracking-class image, analyze, with a mean-shift-based target tracking algorithm, the position information of the target to be tracked in the previous frame of the tracking-class image together with the tracking-class image, and obtain the target's position information in the tracking-class image.
In this embodiment, for every frame of the video, if the frame is a tracking-class image, the executing body may analyze the target's position information in the previous frame of the tracking-class image together with the tracking-class image using a mean-shift (MeanShift) based target tracking algorithm, to obtain the target's position information in the tracking-class image.
First, the execution body may locate the target to be tracked in the previous frame according to the position information of the target in that frame. The execution body may then build a description of the target: the target region is converted to the HSV (Hue, Saturation, Value) color space, and the distribution histogram of the H channel is computed. Given this description, the execution body may search the tracking-class image for the region that most resembles it. Specifically, the execution body may measure the similarity between a found candidate region and the target region with a similarity function; the larger the similarity value, the more similar the candidate region is to the target region, so the candidate region with the maximum similarity value is the region of the target to be tracked in the tracking-class image. Here, the execution body may obtain the region of maximum similarity by iterating the mean-shift-based target tracking algorithm. Under this algorithm, the search window repeatedly moves in the direction in which the candidate model approaches the target model most rapidly, until the distance moved between the last two iterations falls below a threshold, at which point the region of the target to be tracked in the tracking-class image has been found. The target model and the candidate model are obtained by computing the feature-value probabilities of the pixels in the target region and the candidate region, respectively.
It should be noted that mean-shift-based target tracking is a well-known technique that is currently widely studied and applied, and is not described in detail here.
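The iterative search described above can be illustrated with a stripped-down sketch. Here a grid of per-pixel weights stands in for the similarity between the candidate model and the H-channel histogram of the target model (a production tracker would derive these weights by histogram back-projection); the function name and its parameters are illustrative assumptions, not the embodiment's implementation:

```python
def mean_shift(weights, start, radius=4, threshold=0.5, max_iter=50):
    """Move a square search window to the weighted centroid of the
    per-pixel similarity weights, stopping once the last shift is
    smaller than `threshold` (the convergence test in the text)."""
    height, width = len(weights), len(weights[0])
    cy, cx = start
    for _ in range(max_iter):
        total = sum_y = sum_x = 0.0
        for y in range(max(0, int(cy) - radius), min(height, int(cy) + radius + 1)):
            for x in range(max(0, int(cx) - radius), min(width, int(cx) + radius + 1)):
                w = weights[y][x]
                total += w
                sum_y += w * y
                sum_x += w * x
        if total == 0.0:
            break  # no similarity mass inside the window; give up
        new_cy, new_cx = sum_y / total, sum_x / total
        shift = abs(new_cy - cy) + abs(new_cx - cx)
        cy, cx = new_cy, new_cx
        if shift < threshold:
            break  # window has (almost) stopped moving: converged
    return cy, cx

# Synthetic 20x20 similarity map with all of the mass near (12, 14).
weights = [[0.0] * 20 for _ in range(20)]
for y in range(11, 14):
    for x in range(13, 16):
        weights[y][x] = 1.0

center = mean_shift(weights, start=(9, 11))
```

Started near (9, 11), the window is pulled onto the high-similarity blob and converges at its centroid (12, 14).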
Step 405: aggregate the position information of the target to be tracked in each detection-class image and each tracking-class image, and generate position information of the target in every frame of the video.
In the present embodiment, the specific operation of step 405 is substantially the same as that of step 204 in the embodiment shown in Fig. 2, and is not repeated here.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the method for generating information in the present embodiment highlights the step of tracking the tracking-class images with a mean-shift-based target tracking algorithm. Because the mean-shift-based target tracking algorithm spends relatively little time on each frame (typically no more than 1 ms), the scheme described in the present embodiment, which applies mean-shift tracking to every other frame, improves the efficiency of detecting and tracking the target.
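The efficiency gain from tracking every other frame can be estimated with a simple cost model. Only the ~1 ms tracking bound comes from the text above; the 50 ms detector latency below is a hypothetical figure used purely for illustration:

```python
def average_frame_time(detect_ms, track_ms):
    """Average per-frame cost when detection and tracking alternate 1:1,
    as in the odd/even classification of steps 402-403."""
    return (detect_ms + track_ms) / 2

# Hypothetical example: a 50 ms detector paired with 1 ms mean-shift tracking.
all_detection = average_frame_time(50.0, 50.0)  # detector runs on every frame
alternating = average_frame_time(50.0, 1.0)     # detector on every other frame
```

Under these assumed figures the alternating scheme roughly halves the average per-frame cost (25.5 ms versus 50 ms).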
With further reference to Fig. 5, as an implementation of the methods shown in the figures above, the present application provides an embodiment of an apparatus for generating information. The apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus may be applied to various electronic devices.
As shown in Fig. 5, the apparatus 500 for generating information of the present embodiment may include: an acquiring unit 501, a classification unit 502, a detection and tracking unit 503, and a generation unit 504. The acquiring unit 501 is configured to acquire a video obtained by shooting a target to be tracked. The classification unit 502 is configured to classify every frame in the video, the image categories including a detection class and a tracking class. The detection and tracking unit 503 is configured to, for each frame in the video: if the frame is a detection-class image, input the detection-class image to a pre-trained target detection model to obtain position information of the target to be tracked in the detection-class image, the target detection model being used to detect the position of a target in an image; and if the frame is a tracking-class image, track the position information of the target to be tracked in the previous frame of the tracking-class image to determine position information of the target in the tracking-class image. The generation unit 504 is configured to aggregate the position information of the target to be tracked in each detection-class image and each tracking-class image, and to generate position information of the target in every frame of the video.
In the present embodiment, for the specific processing of the acquiring unit 501, the classification unit 502, the detection and tracking unit 503 and the generation unit 504 of the apparatus 500 for generating information, and for the technical effects brought thereby, reference may be made to the descriptions of step 201, step 202, steps 203 and 203', and step 204 in the embodiment corresponding to Fig. 2; details are not repeated here.
In some optional implementations of the present embodiment, the target detection model may be obtained by training as follows: acquire a training sample set, wherein each training sample includes a sample image and annotation information of the sample image, a sample target is present in the sample image, and the annotation information is obtained by annotating the position of the sample target in the sample image; then take the sample images in the training sample set as input and the annotation information of the input sample images as output, and train an initial target detection model to obtain the target detection model.
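Assembling the training sample set described above might be sketched as follows. The `TrainingSample` container and the (x, y, width, height) box format are illustrative assumptions; the embodiment does not prescribe any particular storage format:

```python
from dataclasses import dataclass

@dataclass
class TrainingSample:
    """A sample image paired with the annotated position of the sample target."""
    image: list        # pixel data (placeholder representation)
    annotation: tuple  # (x, y, width, height) bounding box of the sample target

def build_training_set(images, boxes):
    """Pair each sample image with its position annotation to form the set."""
    return [TrainingSample(img, box) for img, box in zip(images, boxes)]

samples = build_training_set([[0] * 4, [1] * 4], [(0, 0, 2, 2), (1, 1, 2, 2)])
```

The resulting pairs would then be fed to the training procedure: images as input, annotations as the expected output.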
In some optional implementations of the present embodiment, the detection and tracking unit 503 may be further configured to: analyze the tracking-class image and the position information of the target to be tracked in the previous frame of the tracking-class image using a mean-shift-based target tracking algorithm, to obtain position information of the target in the tracking-class image.
In some optional implementations of the present embodiment, the classification unit 502 may be further configured to: sort every frame in the video by shooting order; and classify the frames at odd positions as detection-class images and the frames at even positions as tracking-class images.
In some optional implementations of the present embodiment, the classification unit 502 may be further configured to: take the first frame of the video as the image to be classified, and perform the following classifying step: classify the image to be classified as a detection-class image, classify the subsequent preset number of frames as tracking-class images, and determine whether any still-unclassified image remains in the video, wherein if none remains, the classification is complete; and in response to determining that a still-unclassified image remains in the video, take the frame that follows the image to be classified by the preset number plus one as the new image to be classified, and continue to perform the classifying step.
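This interval-based variant can be sketched as follows, assuming 1-based frame indices and a hypothetical `preset` count of tracking frames after each detection frame:

```python
def classify_with_interval(num_frames, preset):
    """Classify frame indices (1-based): each detection frame is followed
    by `preset` tracking frames, then the cycle repeats until every
    frame in the video has been classified."""
    classes = {}
    i = 1
    while i <= num_frames:
        classes[i] = "detection"                       # current image to be classified
        for j in range(i + 1, min(i + preset, num_frames) + 1):
            classes[j] = "tracking"                    # next `preset` frames
        i += preset + 1                                # advance preset + 1 frames
    return classes

interval_labels = classify_with_interval(7, 2)
```

With seven frames and a preset of 2, frames 1, 4 and 7 are detection-class and the rest are tracking-class; a preset of 1 reduces to the odd/even split of step 403.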
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 of an electronic device suitable for implementing the embodiments of the present application is shown. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom may be installed into the storage section 608 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program, where the program may be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: wireless, wire, optical cable, RF, or any suitable combination of the above.
The computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, and also include conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to the various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the boxes may occur in an order different from that noted in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the function involved. It should also be noted that each box in the block diagrams and/or flowcharts, and any combination of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a classification unit, a detection and tracking unit, and a generation unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit for acquiring a video obtained by shooting a target to be tracked".
In another aspect, the present application further provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist alone without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a video obtained by shooting a target to be tracked; classify every frame in the video, the image categories including a detection class and a tracking class; for each frame in the video, if the frame is a detection-class image, input the detection-class image to a pre-trained target detection model to obtain position information of the target to be tracked in the detection-class image, the target detection model being used to detect the position of a target in an image, and if the frame is a tracking-class image, track the position information of the target to be tracked in the previous frame of the tracking-class image to determine position information of the target in the tracking-class image; and aggregate the position information of the target to be tracked in each detection-class image and each tracking-class image to generate position information of the target in every frame of the video.
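The method summarized above can be sketched end to end with stub components. The `detect` and `track` callables below are placeholders standing in for the pre-trained detection model and the mean-shift tracking step, and the frame data is synthetic:

```python
def generate_positions(frames, detect, track):
    """Alternate detection and tracking over frames sorted by shooting
    order, and aggregate the per-frame positions of the target."""
    positions = []
    for index, frame in enumerate(frames):
        if index % 2 == 0:                   # odd 1-based position: detection class
            positions.append(detect(frame))
        else:                                # even position: tracking class
            positions.append(track(positions[-1], frame))
    return positions

# Stubs: each "frame" is simply the true target center, so the stub detector
# returns it directly and the stub tracker ignores the previous position.
frames = [(10, 10), (11, 10), (13, 11), (14, 11)]
detected = generate_positions(frames, detect=lambda f: f,
                              track=lambda prev, f: f)
```

Replacing the stubs with a real detector and tracker yields one position per frame, which is exactly the aggregated output the method generates.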
The above description is only a preferred embodiment of the present application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover, without departing from the above inventive concept, other technical solutions formed by any combination of the above technical features or their equivalent features, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.
Claims (12)
1. A method for generating information, comprising:
acquiring a video obtained by shooting a target to be tracked;
classifying every frame in the video, wherein the image categories include a detection class and a tracking class;
for each frame in the video: if the frame is a detection-class image, inputting the detection-class image to a pre-trained target detection model to obtain position information of the target to be tracked in the detection-class image, wherein the target detection model is used to detect a position of a target in an image; and if the frame is a tracking-class image, tracking the position information of the target to be tracked in a previous frame of the tracking-class image to determine position information of the target to be tracked in the tracking-class image; and
aggregating the position information of the target to be tracked in each detection-class image and each tracking-class image, and generating position information of the target to be tracked in every frame of the video.
2. The method according to claim 1, wherein the target detection model is obtained by training as follows:
acquiring a training sample set, wherein each training sample includes a sample image and annotation information of the sample image, a sample target is present in the sample image, and the annotation information of the sample image is information obtained by annotating a position of the sample target in the sample image; and
taking the sample images in the training sample set as input and the annotation information of the input sample images as output, and training an initial target detection model to obtain the target detection model.
3. The method according to claim 1, wherein the tracking the position information of the target to be tracked in the previous frame of the tracking-class image to determine the position information of the target to be tracked in the tracking-class image comprises:
analyzing the tracking-class image and the position information of the target to be tracked in the previous frame of the tracking-class image using a mean-shift-based target tracking algorithm, to obtain the position information of the target to be tracked in the tracking-class image.
4. The method according to any one of claims 1-3, wherein the classifying every frame in the video comprises:
sorting every frame in the video by shooting order; and
classifying the frames at odd positions as detection-class images and the frames at even positions as tracking-class images.
5. The method according to any one of claims 1-3, wherein the classifying every frame in the video comprises:
taking a first frame of the video as an image to be classified;
performing the following classifying step: classifying the image to be classified as a detection-class image, classifying a subsequent preset number of frames of the image to be classified as tracking-class images, and determining whether any still-unclassified image remains in the video, wherein if no unclassified image remains, the classification is complete; and
in response to determining that a still-unclassified image remains in the video, taking the frame that follows the image to be classified by the preset number plus one as the image to be classified, and continuing to perform the classifying step.
6. An apparatus for generating information, comprising:
an acquiring unit configured to acquire a video obtained by shooting a target to be tracked;
a classification unit configured to classify every frame in the video, wherein the image categories include a detection class and a tracking class;
a detection and tracking unit configured to, for each frame in the video: if the frame is a detection-class image, input the detection-class image to a pre-trained target detection model to obtain position information of the target to be tracked in the detection-class image, wherein the target detection model is used to detect a position of a target in an image; and if the frame is a tracking-class image, track the position information of the target to be tracked in a previous frame of the tracking-class image to determine position information of the target to be tracked in the tracking-class image; and
a generation unit configured to aggregate the position information of the target to be tracked in each detection-class image and each tracking-class image, and to generate position information of the target to be tracked in every frame of the video.
7. The apparatus according to claim 6, wherein the target detection model is obtained by training as follows:
acquiring a training sample set, wherein each training sample includes a sample image and annotation information of the sample image, a sample target is present in the sample image, and the annotation information of the sample image is information obtained by annotating a position of the sample target in the sample image; and
taking the sample images in the training sample set as input and the annotation information of the input sample images as output, and training an initial target detection model to obtain the target detection model.
8. The apparatus according to claim 6, wherein the detection and tracking unit is further configured to:
analyze the tracking-class image and the position information of the target to be tracked in the previous frame of the tracking-class image using a mean-shift-based target tracking algorithm, to obtain the position information of the target to be tracked in the tracking-class image.
9. The apparatus according to any one of claims 6-8, wherein the classification unit is further configured to:
sort every frame in the video by shooting order; and
classify the frames at odd positions as detection-class images and the frames at even positions as tracking-class images.
10. The apparatus according to any one of claims 6-8, wherein the classification unit is further configured to:
take a first frame of the video as an image to be classified;
perform the following classifying step: classify the image to be classified as a detection-class image, classify a subsequent preset number of frames of the image to be classified as tracking-class images, and determine whether any still-unclassified image remains in the video, wherein if no unclassified image remains, the classification is complete; and
in response to determining that a still-unclassified image remains in the video, take the frame that follows the image to be classified by the preset number plus one as the image to be classified, and continue to perform the classifying step.
11. An electronic device, comprising:
one or more processors; and
a storage device on which one or more programs are stored,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-5.
12. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810410251.5A CN110443824A (en) | 2018-05-02 | 2018-05-02 | Method and apparatus for generating information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110443824A true CN110443824A (en) | 2019-11-12 |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106296725A (en) * | 2015-06-12 | 2017-01-04 | 富泰华工业(深圳)有限公司 | Moving target detects and tracking and object detecting device in real time |
CN106196509A (en) * | 2016-08-19 | 2016-12-07 | 珠海格力电器股份有限公司 | Air-conditioning sleep pattern control method and system |
CN106875425A (en) * | 2017-01-22 | 2017-06-20 | 北京飞搜科技有限公司 | A kind of multi-target tracking system and implementation method based on deep learning |
CN106991396A (en) * | 2017-04-01 | 2017-07-28 | 南京云创大数据科技股份有限公司 | A kind of target relay track algorithm based on wisdom street lamp companion |
CN107564034A (en) * | 2017-07-27 | 2018-01-09 | 华南理工大学 | The pedestrian detection and tracking of multiple target in a kind of monitor video |
CN107576022A (en) * | 2017-09-12 | 2018-01-12 | 广东美的制冷设备有限公司 | Control method, air conditioner and the storage medium of air conditioner |
CN107590482A (en) * | 2017-09-29 | 2018-01-16 | 百度在线网络技术(北京)有限公司 | information generating method and device |
CN107911753A (en) * | 2017-11-28 | 2018-04-13 | 百度在线网络技术(北京)有限公司 | Method and apparatus for adding digital watermarking in video |
Non-Patent Citations (3)
Title |
---|
Chinese Association for Artificial Intelligence, "Advances in Artificial Intelligence in China, 2009", Beijing University of Posts and Telecommunications Press, 31 December 2009 *
Zhang Tianyi et al., "UAV Target Tracking Method Based on Continuously Adaptive Mean Shift and Stereo Vision", Applied Science and Technology *
Yang Jie et al., "Video Object Detection, Tracking and Applications", Shanghai Jiao Tong University Press, 31 August 2012 *
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111626263A (en) * | 2020-06-05 | 2020-09-04 | 北京百度网讯科技有限公司 | Video interesting area detection method, device, equipment and medium |
CN111626263B (en) * | 2020-06-05 | 2023-09-05 | 北京百度网讯科技有限公司 | Video region of interest detection method, device, equipment and medium |
CN111985385A (en) * | 2020-08-14 | 2020-11-24 | 杭州海康威视数字技术股份有限公司 | Behavior detection method, device and equipment |
CN111985385B (en) * | 2020-08-14 | 2023-08-29 | 杭州海康威视数字技术股份有限公司 | Behavior detection method, device and equipment |
CN112381858A (en) * | 2020-11-13 | 2021-02-19 | 成都商汤科技有限公司 | Target detection method, device, storage medium and equipment |
CN112528932A (en) * | 2020-12-22 | 2021-03-19 | 北京百度网讯科技有限公司 | Method and device for optimizing position information, road side equipment and cloud control platform |
CN112528932B (en) * | 2020-12-22 | 2023-12-08 | 阿波罗智联(北京)科技有限公司 | Method and device for optimizing position information, road side equipment and cloud control platform |
CN113536914A (en) * | 2021-06-09 | 2021-10-22 | 重庆中科云从科技有限公司 | Object tracking identification method, system, equipment and medium |
CN113436100A (en) * | 2021-06-28 | 2021-09-24 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium and product for repairing video |
CN113436100B (en) * | 2021-06-28 | 2023-11-28 | 北京百度网讯科技有限公司 | Method, apparatus, device, medium, and article for repairing video |
CN113822137A (en) * | 2021-07-23 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Data annotation method, device and equipment and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | |

Application publication date: 20191112