CN109815936A

CN109815936A - Target object analysis method and apparatus, computer device, and storage medium

Info

Publication number: CN109815936A
Application number: CN201910130040.0A
Authority: CN
Inventors: 颜佺
Original assignee: Shenzhen Sensetime Technology Co Ltd
Current assignee: Shenzhen Sensetime Technology Co Ltd
Priority date: 2019-02-21
Filing date: 2019-02-21
Publication date: 2019-05-28
Anticipated expiration: 2039-02-21
Also published as: CN109815936B

Abstract

Embodiments of the present application provide a target object analysis method and apparatus, a computer device, and a storage medium. The method includes: determining photographing information of an acquired video to be analyzed, where the video to be analyzed contains an image of at least one target object; and determining, according to the photographing information, a preset model for processing the video to be analyzed, so as to determine the total number of target objects contained in the video to be analyzed. The preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from the photographing information of a video to be analyzed that is processed with the preset object count model.

Description

Target object analysis method and apparatus, computer device, and storage medium

Technical field

Embodiments of the present application relate to the field of computer vision and communications, and relate to, but are not limited to, a target object analysis method and apparatus, a computer device, and a storage medium.

Background

Crowd analysis is a popular application field of intelligent security. In the related art, crowd counting technology based on deep convolutional neural networks can detect the crowd density and the crowd foreground map of a video frame: it analyzes the head-and-shoulder information of the crowd in the video frame and outputs a crowd density map according to that information. Because of this principle, the technology is difficult to apply across different video scenes. If the human body area in a video frame is too large, the output count will be too high; if the background color of the scene is similar to the color of the targets, the foreground segmentation will be inaccurate; and the viewing angle of the video frame also affects the final output.

Summary of the invention

In view of this, embodiments of the present application provide a target object analysis method and apparatus, a computer device, and a storage medium.

The technical solutions of the embodiments of the present application are implemented as follows:

An embodiment of the present application provides a target object analysis method, the method including:

determining photographing information of an acquired video to be analyzed, where the video to be analyzed contains an image of at least one target object; and

determining, according to the photographing information, a preset model for processing the video to be analyzed, so as to determine the total number of target objects contained in the video to be analyzed, where the preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from that of a video to be analyzed that is processed with the preset object count model.

In the above method, the photographing information of the video to be analyzed includes: the photographed scene to which the video to be analyzed belongs and/or the photographing period of the video to be analyzed.

In the above method, before the determining, according to the photographing information, of the preset model for processing the video to be analyzed, the method further includes:

using the preset object count model as an initialization model;

correspondingly, adjusting, according to the photographing information, the preset model for processing the video to be analyzed.

In the above method, the adjusting, according to the photographing information, of the preset model for processing the video to be analyzed includes:

if the scene to which the video to be analyzed belongs is included in a preset scene set, and/or the period to which the video to be analyzed belongs is within a preset period, adjusting the initialization model to the preset target detection model;

processing the video to be analyzed with the preset target detection model; and

if the scene to which the video to be analyzed belongs is not included in the preset scene set, and the period to which the video to be analyzed belongs is not within the preset period, adjusting the initialization model to the preset object count model.
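
The selection rule described above can be sketched as follows. This is a minimal illustration only, not the implementation disclosed in the application: the names `ShootingInfo`, `in_period` and `select_model`, the string labels for the two models, and the handling of periods that wrap past midnight are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Tuple

# Hypothetical labels for the two preset models.
DETECTION = "target_detection"
COUNTING = "object_count"


@dataclass(frozen=True)
class ShootingInfo:
    """Photographing information: scene label and time of day (assumed layout)."""
    scene: str
    shot_at: time


def in_period(t: time, period: Tuple[time, time]) -> bool:
    """True if t lies in [start, end); supports periods that wrap past midnight."""
    start, end = period
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def select_model(info: ShootingInfo,
                 preset_scenes: FrozenSet[str],
                 preset_period: Tuple[time, time]) -> str:
    """Start from the object count model and adjust it per the photographing info."""
    model = COUNTING  # initialization model
    scene_ok = info.scene in preset_scenes
    period_ok = in_period(info.shot_at, preset_period)
    if scene_ok or period_ok:  # the "and/or" condition
        model = DETECTION
    return model
```

With a preset scene set of `{"subway"}` and a preset period of 22:00 to 05:00, a subway video shot at 22:30 would be routed to the detection model, while a video from an unlisted scene shot at 08:00 would stay with the object count model.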

In the above method, before the determining, according to the photographing information, of the preset model for processing the video to be analyzed, the method further includes:

decoding the video to be analyzed with a video decoder to obtain consecutive multi-frame images.

In the above method, determining to process the video to be analyzed with the preset target detection model, so as to determine the total number of target objects contained in the video to be analyzed, includes:

scanning each frame of the multi-frame images with the preset target detection model to obtain a body feature of each target object;

generating a detection box for each target object according to the body feature of that target object; and

determining the total number of target objects contained in the video to be analyzed according to the number of detection boxes.

In the above method, the scanning of each frame of the multi-frame images with the preset target detection model to obtain the body feature of each target object includes:

scanning each frame with the preset target detection model at a preset step size, and determining the body feature of each target object that appears in each frame.
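
One common way to scan a frame at a preset step size is a sliding window. The sketch below only enumerates the window positions that a detector would be evaluated on. The text does not specify the scanning scheme, so the fixed window size and the function name `window_positions` are assumptions for illustration.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def window_positions(frame_w: int, frame_h: int,
                     win_w: int, win_h: int, step: int) -> Iterator[Box]:
    """Yield every window of size win_w x win_h that fits in the frame,
    moving `step` pixels at a time horizontally and vertically."""
    if step <= 0:
        raise ValueError("step must be a positive number of pixels")
    for top in range(0, frame_h - win_h + 1, step):
        for left in range(0, frame_w - win_w + 1, step):
            yield (left, top, left + win_w, top + win_h)
```

A smaller step examines more positions and costs more computation; a larger step is cheaper but may miss small targets that fall between windows.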

In the above method, the multi-frame images include M frames, M being an integer greater than or equal to 2, and the generating of a detection box for each target object according to the body feature of that target object includes:

scanning the i-th frame of the M frames with the preset target detection model, and determining the body features of the N target objects contained in the i-th frame, where i and N are integers greater than 0 and i is less than or equal to M; and

if the body feature of the j-th target object among the N target objects differs from the body features of the target objects in the frames other than the i-th frame, generating a detection box for the j-th target object, where j is an integer greater than 0 and less than or equal to N.
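
The rule above amounts to counting a target object only once across frames. A minimal sketch follows, assuming that body features are numeric vectors and that two features count as the same object when their cosine similarity reaches a threshold. Neither the vector representation nor the threshold value comes from the application, and for simplicity each frame is compared only against earlier frames.

```python
import math
from typing import List, Sequence

Feature = Sequence[float]


def cosine_similarity(a: Feature, b: Feature) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def count_unique_objects(frames: Sequence[Sequence[Feature]],
                         same_threshold: float = 0.9) -> int:
    """Count objects whose body feature differs from every feature seen
    in earlier frames; each such object would get one detection box."""
    seen: List[Feature] = []
    for frame_features in frames:
        new_in_frame = [
            feat for feat in frame_features
            if all(cosine_similarity(feat, old) < same_threshold for old in seen)
        ]
        # Objects within one frame are distinct by construction,
        # so they are added only after the whole frame is checked.
        seen.extend(new_in_frame)
    return len(seen)
```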

In the above method, determining to process the video to be analyzed with the preset object count model, so as to determine the total number of target objects contained in the video to be analyzed, includes:

processing the video to be analyzed with the preset object count model to obtain a foreground segmentation map of the video to be analyzed and an object group density map of the video to be analyzed; and

determining the total number of target objects contained in the video to be analyzed according to the foreground segmentation map and the object group density map.

In the above method, the processing of the video to be analyzed with the preset object count model to obtain the foreground segmentation map and the object group density map of the video to be analyzed includes:

performing edge detection on each frame of the multi-frame images of the video to be analyzed with the preset object count model, and determining the region covered by the head of each target object in each frame;

segmenting the target objects from the background in each frame to obtain the foreground segmentation map of each frame; and

generating, according to the region covered by the head of each target object in each frame, an object group density map that characterizes the density of target objects in each frame.
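
The text does not state how the foreground segmentation map and the density map are combined into a count. One plausible reading, shown here purely as an assumption, is to sum the density values over foreground pixels only, so that density falsely assigned to the background is ignored. Both maps are represented as plain nested lists of equal shape, and `count_from_density` is a hypothetical name.

```python
from typing import Sequence


def count_from_density(density: Sequence[Sequence[float]],
                       foreground: Sequence[Sequence[int]]) -> float:
    """Sum the density map over pixels marked as foreground (mask value != 0)."""
    if len(density) != len(foreground):
        raise ValueError("density map and foreground mask differ in height")
    total = 0.0
    for density_row, mask_row in zip(density, foreground):
        if len(density_row) != len(mask_row):
            raise ValueError("density map and foreground mask differ in width")
        for value, is_foreground in zip(density_row, mask_row):
            if is_foreground:
                total += value
    return total
```

In this reading, each head contributes density values that sum to roughly one, so rounding the masked sum gives an estimate of the number of target objects in the frame.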

In the above method, the determining of the total number of target objects contained in the video to be analyzed according to the number of detection boxes includes:

if the number of detection boxes in the i-th frame of the video to be analyzed is greater than a preset quantity threshold, switching from the preset target detection model to the preset object count model; and

processing, with the preset object count model, a first remaining video in the video to be analyzed that has not been processed by the preset target detection model, to obtain the total number of target objects.

In the above method, the processing, with the preset object count model, of the first remaining video in the video to be analyzed that has not been processed by the preset target detection model, to obtain the total number of target objects, includes:

determining, with the preset object count model, a foreground segmentation sub-map of the first remaining video and an object group density sub-map of the video to be analyzed;

determining a second quantity of target objects contained in the first remaining video according to the foreground segmentation sub-map and the object group density sub-map; and

determining the second quantity as the total number of target objects.

In the above method, the determining of the total number of target objects contained in the video to be analyzed according to the foreground segmentation map and the object group density map includes:

determining a third quantity of target objects contained in the L-th frame according to the foreground segmentation map of the L-th frame of the multi-frame images and the object group density map of the target objects in the L-th frame, where L is an integer greater than 0;

if the third quantity is less than the preset quantity threshold, switching from the preset object count model to the preset target detection model; and

processing, with the preset target detection model, a second remaining video in the video to be analyzed that has not been processed by the preset object count model, to obtain the total number of target objects.

In the above method, the processing, with the preset target detection model, of the second remaining video in the video to be analyzed that has not been processed by the preset object count model, to obtain the total number of target objects, includes:

determining, with the preset target detection model, a sub detection box for each target object in the second remaining video;

determining a fourth quantity of target objects contained in the second remaining video according to the number of sub detection boxes; and

determining the fourth quantity as the total number of target objects.
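
The two switching rules above can be pictured as a per-frame loop: the active model processes a frame, and if its count crosses the preset quantity threshold, the other model takes over for the remaining frames. The sketch below is an illustration under assumptions. The two models are passed in as plain callables that return a count for a frame, and `analyze_with_switching` and its log format are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def analyze_with_switching(frames: Iterable[Frame],
                           detect_count: Callable[[Frame], int],
                           density_count: Callable[[Frame], int],
                           threshold: int,
                           start_with: str = "counting") -> List[Tuple[int, str, int]]:
    """Return (frame index, model used, count) for every frame.

    detection -> counting when the number of detection boxes exceeds threshold;
    counting -> detection when the density-based count falls below threshold.
    """
    model = start_with
    log: List[Tuple[int, str, int]] = []
    for index, frame in enumerate(frames):
        if model == "detection":
            count = detect_count(frame)
            log.append((index, model, count))
            if count > threshold:
                model = "counting"  # remaining frames go to the count model
        else:
            count = density_count(frame)
            log.append((index, model, count))
            if count < threshold:
                model = "detection"  # remaining frames go to the detector
    return log
```

The count reported for the frames processed after a switch corresponds to the "second quantity" or "fourth quantity" in the text, which is then taken as the total number of target objects.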

In the above method, the method further includes:

generating, according to the numerical range to which the total number of target objects belongs and the duration for which the number of target objects in the video to be analyzed remains at that total number, an alarm event that matches the duration and the numerical range.
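
As an illustration of matching an alarm event to a numerical range and a duration, the sketch below checks timestamped counts against a list of rules. The rule structure (`AlarmRule`), the half-open range convention, and measuring the duration as the longest uninterrupted run inside the range are all assumptions, since the text does not define them.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class AlarmRule:
    low: int            # inclusive lower bound of the count range
    high: int           # exclusive upper bound of the count range
    min_seconds: float  # how long the count must stay in range
    event: str          # name of the alarm event to raise


def longest_run_seconds(samples: Sequence[Tuple[float, int]],
                        low: int, high: int) -> Optional[float]:
    """Longest uninterrupted time span (seconds) with low <= count < high.
    Returns None if the count never enters the range."""
    best: Optional[float] = None
    run_start: Optional[float] = None
    for timestamp, count in samples:
        if low <= count < high:
            if run_start is None:
                run_start = timestamp
            span = timestamp - run_start
            best = span if best is None else max(best, span)
        else:
            run_start = None
    return best


def match_alarms(samples: Sequence[Tuple[float, int]],
                 rules: Sequence[AlarmRule]) -> List[str]:
    """Return the events whose range and duration are both satisfied."""
    events = []
    for rule in rules:
        run = longest_run_seconds(samples, rule.low, rule.high)
        if run is not None and run >= rule.min_seconds:
            events.append(rule.event)
    return events
```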

An embodiment of the present application provides a target object analysis apparatus, the apparatus including a first acquisition module and a first determining module, where:

the first acquisition module is configured to determine photographing information of an acquired video to be analyzed, where the video to be analyzed contains an image of at least one target object; and

the first determining module is configured to determine, according to the photographing information, a preset model for processing the video to be analyzed, so as to determine the total number of target objects contained in the video to be analyzed, where the preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from that of a video to be analyzed that is processed with the preset object count model.

In the above apparatus, the photographing information of the video to be analyzed includes: the photographed scene to which the video to be analyzed belongs and/or the photographing period of the video to be analyzed.

In the above apparatus, the apparatus further includes:

a first initialization module, configured to use the preset object count model as an initialization model;

correspondingly, the first determining module includes a first adjustment submodule, configured to adjust, according to the photographing information, the preset model for processing the video to be analyzed.

In the above apparatus, the first adjustment submodule includes:

a first judging unit, configured to adjust the initialization model to the preset target detection model if the scene to which the video to be analyzed belongs is included in a preset scene set and/or the period to which the video to be analyzed belongs is within a preset period;

a first processing unit, configured to process the video to be analyzed with the preset target detection model; and

a second judging unit, configured to adjust the initialization model to the preset object count model if the scene to which the video to be analyzed belongs is not included in the preset scene set and the period to which the video to be analyzed belongs is not within the preset period.

In the above apparatus, the apparatus further includes:

a first decoding module, configured to decode the video to be analyzed with a video decoder to obtain consecutive multi-frame images.

In the above apparatus, when it is determined that the video to be analyzed is to be processed with the preset target detection model, the first determining module includes:

a first scanning submodule, configured to scan each frame of the multi-frame images with the preset target detection model to obtain a body feature of each target object;

a first generation submodule, configured to generate a detection box for each target object according to the body feature of that target object; and

a first determining submodule, configured to determine the total number of target objects contained in the video to be analyzed according to the number of detection boxes.

In the above apparatus, the first scanning submodule includes:

a first scanning unit, configured to scan each frame with the preset target detection model at a preset step size, and to determine the body feature of each target object that appears in each frame.

In the above apparatus, the multi-frame images include M frames, M being an integer greater than or equal to 2, and the first generation submodule includes:

a second scanning unit, configured to scan the i-th frame of the M frames with the preset target detection model and to determine the body features of the N target objects contained in the i-th frame, where i and N are integers greater than 0 and i is less than or equal to M; and

a first generation unit, configured to generate a detection box for the j-th target object among the N target objects if the body feature of the j-th target object differs from the body features of the target objects in the frames other than the i-th frame, where j is an integer greater than 0 and less than or equal to N.

In the above apparatus, when it is determined that the video to be analyzed is to be processed with the preset object count model, the first determining module includes:

a second judgment submodule, configured to process the video to be analyzed with the preset object count model to obtain a foreground segmentation map of the video to be analyzed and an object group density map of the video to be analyzed; and

a second determining submodule, configured to determine the total number of target objects contained in the video to be analyzed according to the foreground segmentation map and the object group density map.

In the above apparatus, the second judgment submodule includes:

a first detection unit, configured to perform edge detection on each frame of the multi-frame images of the video to be analyzed with the preset object count model, and to determine the region covered by the head of each target object in each frame;

a first segmentation unit, configured to segment the target objects from the background in each frame to obtain the foreground segmentation map of each frame; and

a second generation unit, configured to generate, according to the region covered by the head of each target object in each frame, an object group density map that characterizes the density of target objects in each frame.

In the above apparatus, the first determining submodule includes:

a first switching unit, configured to switch from the preset target detection model to the preset object count model if the number of detection boxes in the i-th frame of the video to be analyzed is greater than a preset quantity threshold; and

a second processing unit, configured to process, with the preset object count model, a first remaining video in the video to be analyzed that has not been processed by the preset target detection model, to obtain the total number of target objects.

In the above apparatus, the second processing unit includes:

a first determining subunit, configured to determine, with the preset object count model, a foreground segmentation sub-map of the first remaining video and an object group density sub-map of the video to be analyzed;

a second determining subunit, configured to determine a second quantity of target objects contained in the first remaining video according to the foreground segmentation sub-map and the object group density sub-map; and

a third determining subunit, configured to determine the second quantity as the total number of target objects.

In the above apparatus, the second determining submodule includes:

a first determining unit, configured to determine a third quantity of target objects contained in the L-th frame according to the foreground segmentation map of the L-th frame of the multi-frame images and the object group density map of the target objects in the L-th frame, where L is an integer greater than 0;

a second switching unit, configured to switch from the preset object count model to the preset target detection model if the third quantity is less than the preset quantity threshold; and

a second processing unit, configured to process, with the preset target detection model, a second remaining video in the video to be analyzed that has not been processed by the preset object count model, to obtain the total number of target objects.

In the above apparatus, the second processing unit includes:

a fourth determining subunit, configured to determine, with the preset target detection model, a sub detection box for each target object in the second remaining video;

a fifth determining subunit, configured to determine a fourth quantity of target objects contained in the second remaining video according to the number of sub detection boxes; and

a sixth determining subunit, configured to determine the fourth quantity as the total number of target objects.

In the above apparatus, the apparatus further includes:

a first alarm module, configured to generate, according to the numerical range to which the total number of target objects belongs and the duration for which the number of target objects in the video to be analyzed remains at that total number, an alarm event that matches the duration and the numerical range.

Correspondingly, an embodiment of the present application provides a computer storage medium storing computer-executable instructions which, when executed, implement the steps of the target object analysis method provided by the embodiments of the present application.

An embodiment of the present application provides a computer device including a memory and a processor, where the memory stores computer-executable instructions, and the processor implements the steps of the target object analysis method provided by the embodiments of the present application when running the computer-executable instructions on the memory.

Embodiments of the present application provide a target object analysis method and apparatus, a computer device, and a storage medium. First, photographing information of an acquired video to be analyzed is determined, where the video to be analyzed contains an image of at least one target object. Then, according to the photographing information, a preset model for processing the video to be analyzed is determined, so as to determine the total number of target objects contained in the video to be analyzed. In this way, different preset models can be used under different conditions to determine the total number of target objects in the video to be analyzed, which ensures that the total number of target objects is determined more accurately in images that contain few target objects.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions of the present disclosure.

Fig. 1A is a schematic structural diagram of the network architecture of an embodiment of the present application;

Fig. 1B is a schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 2A is a schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 2B is another schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 2C is yet another schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 2D is still another schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 2E is a further schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 3 is a schematic flowchart of the target object analysis method of an embodiment of the present application;

Fig. 4 is a schematic structural diagram of the target object analysis apparatus of an embodiment of the present application;

Fig. 5 is a schematic structural diagram of the computer device of an embodiment of the present application.

Detailed description

To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the specific technical solutions of the invention are described in further detail below with reference to the drawings of the embodiments. The following embodiments are intended to illustrate the present application, not to limit its scope.

This embodiment first provides a network architecture. Fig. 1A is a schematic structural diagram of the network architecture of an embodiment of the present application. As shown in Fig. 1A, the network architecture includes two or more computer devices 11 to 1N and a server 30, where the computer devices 11 to 1N interact with the server 31 through a network 21. In implementation, a computer device may be any type of computer device with information processing capability; for example, it may be a mobile phone, a tablet computer, a desktop computer, a personal digital assistant, and so on.

This embodiment proposes a target object analysis method. For a video containing few people, a target detection model is used to perform human body detection; when there are many people, an object count model is used to determine the number of target objects in the video to be processed. This ensures that the number of people is determined more accurately in videos containing few people. The method is applied to a computer device, and the functions it implements can be realized by a processor in the computer device calling program code. The program code can of course be stored in a computer storage medium, so the computer device includes at least a processor and a storage medium.

Fig. 1B is a schematic flowchart of the target object analysis method of an embodiment of the present application. As shown in Fig. 1B, the method includes the following steps:

Step S101: determine photographing information of an acquired video to be analyzed.

Here, the video to be analyzed contains an image of at least one target object, for example a video containing many people in a subway scene, or a video containing a flock of sheep in a grazing scene. Step S101 may be implemented by a computer device. Further, the computer device may be an intelligent terminal, for example a mobile terminal device with wireless communication capability such as a mobile phone, a tablet computer, or a laptop, or an intelligent terminal that is not convenient to move, such as a desktop computer. The computer device is used for image recognition or processing.

Step S102: determine, according to the photographing information, a preset model for processing the video to be analyzed, so as to determine the total number of target objects contained in the video to be analyzed.

Here, the photographing information of the video to be analyzed includes: the photographed scene to which the video to be analyzed belongs and/or the photographing period of the video to be analyzed. The preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from that of a video to be analyzed that is processed with the preset object count model. The preset target detection model may be a model obtained by training a neural network with any target detection approach. For example, sample images are traversed at a certain pixel interval to obtain the head features (or face, hair, or body features) in the sample images, and a detection box is determined for each person in a sample image; this detection box is then compared with the known detection box of that person in the sample image, thereby completing the training of the preset target detection model. Step S102 can be understood as follows: depending on the photographed scene to which the video to be analyzed belongs and/or the photographing period of the video to be analyzed, different preset models are used to count the target objects contained in the video to be analyzed. For example, if the scene to which the video to be analyzed belongs is included in a preset scene set, and/or the period to which the video to be analyzed belongs is within a preset period (that is, the video satisfies the preset condition), the initialization model is adjusted to the preset target detection model. If the scene to which the video to be analyzed belongs is not included in the preset scene set, and the period to which the video to be analyzed belongs is not within the preset period (that is, the video does not satisfy the preset condition), the initialization model is adjusted to the preset object count model; if the initialization model is already the preset object count model, the video to be analyzed continues to be processed with the preset object count model.

In actual implementation, the computer device outputs the total number of target objects. It may display the total on its own display screen, or it may output the total to another device, that is, send it to another device, for example a user's intelligent terminal.

In this embodiment of the application, the photographing information of the input video to be analyzed is analyzed to judge whether it satisfies a specific scene; if so, the target detection model is used to determine how many target objects are in the video. This ensures that the total number of target objects is determined more accurately in images that contain few target objects, and using the target detection model when crowds are sparse during off-peak periods makes more efficient use of hardware resources.

The present embodiment provides a kind of target object analysis method again, and Fig. 2A is the embodiment of the present application target object analysis method Implementation process schematic diagram the described method comprises the following steps as shown in Figure 2 A:

Step S201 uses the preset object count model or preset object count model for initialization model.

Here, after the step S201 is it is to be understood that get video to be analyzed, it is analysed to target pair in video As number the initialization model being determined be preset object count model, that is, after getting video to be analyzed, default Model be preset object count model, then determine the photographing information of video to be captured, based on the photographing information decision be It is no by preset object count models switching be preset target detection model.

Step S202 determines the photographing information of the video to be analyzed obtained.

Here, after step S202, start to judge photographing information, if field belonging to the video to be analyzed Scape be included in preset scene set in, and the period belonging to the video to be analyzed in preset period of time, enter step S203； If scene belonging to the video to be analyzed is not comprised in preset scene set or belonging to the video to be analyzed Period not in preset period of time, determine that the photographing information is unsatisfactory for preset condition, enter step S205.

Step S203, if scene belonging to the video to be analyzed is included in preset scene set and/or described Period belonging to video to be analyzed is adjusted to the preset target detection in preset period of time, by the initialization model Model.

Here, the period belonging to scene belonging to the video to be analyzed for including in photographing information and video to be analyzed is carried out Judgement, if scene belonging to video to be analyzed is included in preset scene set and/or the period belonging to video to be analyzed In preset period of time, it is determined that the photographing information meets preset condition, by the preset object count mould as initialization model Type is adjusted to preset target detection model.The preset scene set may include: subway scene, supermarket's scene, campus Scene etc.；The video for meeting preset condition is it is to be understood that include video fewer in number.For example, determine at night 10 points it The video shot under subway scene afterwards does not meet the video of preset condition, it is clear that the number for including in the video under this scene It is less, it is switched to target detection model and human body is detected, the number in video is determined more accurately.When determining morning peak The video shot under the subway scene of section is the video for being unsatisfactory for preset condition, it is clear that this period number is numerous, in this implementation Target object is counted using preset object count model in example；In this way for using target in video fewer in number Detection model uses preset object count model in the video more for number, keeps the number counted on more accurate.To The image of a target object corresponds to a detection block in analysis video.For example, target object is people, in the video Each generates a detection block per capita, by the number of statistic mixed-state frame, that is, can determine the number in the video.

Step S204, process the video to be analyzed using the preset target detection model.

Step S205, if the scene to which the video to be analyzed belongs is not included in the preset scene set, and the period to which the video to be analyzed belongs does not fall within the preset period, adjust the initialization model to the preset object count model.

Here, if the scene to which the video to be analyzed belongs is not included in the preset scene set and the period to which it belongs does not fall within the preset period, the video contains many target objects. For example, for a subway scene the period concerned is the peak period, such as 7 a.m.; that is, if the video to be analyzed was shot in a subway scene at 7 a.m., the photographing information of the video is determined not to satisfy the preset condition. For a bar scene the period concerned is 11 p.m. to 4 a.m.; that is, if the video to be analyzed was shot in a bar scene between 11 p.m. and 4 a.m., the video contains many people, and its photographing information is determined not to satisfy the preset condition. In step S205, therefore, when the scene is not included in the preset scene set and the period does not fall within the preset period, the video contains many target objects, and the initialization model (i.e. the preset object count model) continues to be used to count the target objects.

In this embodiment, the photographing information of the input video to be analyzed is analyzed to judge whether it matches a specific scene. If it does, the default preset model is switched to the preset target detection model to determine how many target objects the video contains. Different preset models are thus used under different conditions to determine the total number of target objects in the video to be analyzed, so that the total is determined more accurately for images containing few target objects.
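The selection rule of steps S203 to S205 can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function names, the string labels for the two models, the example scene set and the example period are all assumptions, and the "and/or" of step S203 is read here as an inclusive "or".

```python
from datetime import time

# Hypothetical labels for the two preset models named in the text.
DETECTION = "target_detection"
COUNTING = "object_count"


def in_period(moment, start, end):
    """Return True if moment lies in [start, end); periods may wrap past midnight."""
    if start <= end:
        return start <= moment < end
    return moment >= start or moment < end


def select_model(scene, moment, preset_scenes, preset_periods):
    """Choose a model from the photographing information (scene and time).

    The initialization model is the object count model. It is replaced by the
    target detection model when the scene is in the preset scene set or the
    shooting time falls within one of the preset periods.
    """
    scene_hit = scene in preset_scenes
    period_hit = any(in_period(moment, s, e) for s, e in preset_periods)
    return DETECTION if (scene_hit or period_hit) else COUNTING


# Assumed configuration: two sparse scenes and one late-night period.
preset_scenes = {"campus", "supermarket"}
preset_periods = [(time(22, 0), time(5, 0))]

late_subway = select_model("subway", time(23, 30), preset_scenes, preset_periods)
peak_subway = select_model("subway", time(7, 30), preset_scenes, preset_periods)
```

Under this assumed configuration, the late-night subway video is routed to the target detection model and the morning-peak subway video stays with the object count model.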

This embodiment further provides a target object analysis method. Fig. 2B is a schematic flowchart of another implementation of the target object analysis method of the embodiments of the present application. As shown in Fig. 2B, the method includes the following steps:

Step S221, determine the photographing information of the acquired video to be analyzed.

Step S222, decode the video to be analyzed using a video decoder to obtain consecutive multi-frame images.

Here, the multi-frame images include M frames of images, where M is an integer greater than or equal to 2.

Step S222 above provides a way of decoding the video to be analyzed. After the video to be analyzed is acquired, it is first decoded to obtain consecutive multi-frame images; whether the photographing information satisfies the preset condition is then judged, so that each frame is processed with the corresponding preset model.

Step S223, if the scene to which the video to be analyzed belongs is included in the preset scene set and/or the period to which the video to be analyzed belongs falls within the preset period, adjust the initialization model to the preset target detection model.

Step S224, scan each frame of the multi-frame images using the preset target detection model to obtain the body features of each target object.

Here, each frame is scanned at a preset step size using the preset target detection model to determine the body features of each target object appearing in that frame. That is, once the photographing information of the video is determined to satisfy the preset condition, the preset object count model is switched to the preset target detection model, which then scans each frame of the multi-frame images at a certain step size to obtain the body features of each target object (for example, each frame is traversed at a certain pixel interval to determine the body features of each target object).

Step S225, generate a detection box for each target object according to the body features of that target object.

Here, step S225 can be implemented through the following process:

In the first step, the i-th frame of the M frames of images is scanned using the preset target detection model to determine the body features of the N target objects contained in the i-th frame.

Here, i and N are integers greater than 0, and i is less than or equal to M.

In the second step, if the body features of the j-th target object among the N target objects differ from the body features of the target objects in the frames other than the i-th frame, a detection box is generated for the j-th target object.

Here, j is an integer greater than 0 and less than or equal to N. If the body features of a target object among the N target objects in one frame differ from the body features of the target objects in the other frames, that target object does not appear in the other frames, i.e. no detection box has been generated for it in the other frames, so a detection box is generated for it in this frame. This guarantees that each target object in the video corresponds to one detection box and that no target object corresponds to multiple detection boxes, which ensures that the total number of target objects determined from the number of detection boxes is accurate.

Step S226, determine the total number of target objects contained in the video to be analyzed according to the number of detection boxes.

Steps S223 to S226 above give one way of implementing, after the video to be analyzed has been decoded, "if the scene to which the video to be analyzed belongs is included in the preset scene set and/or the period to which the video to be analyzed belongs falls within the preset period, adjust the initialization model to the preset target detection model". In this way, the multi-frame images obtained by decoding are detected frame by frame to obtain the detection box corresponding to each target object, from which the total number of target objects in the video to be analyzed is determined.
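The one-box-per-target rule of steps S224 to S226 can be illustrated with a short sketch. It assumes, purely for illustration, that the body features of a target object reduce to a hashable identifier; a real detector would compare feature vectors by similarity, and the function name is hypothetical.

```python
def count_unique_targets(frames):
    """Generate one detection box per distinct target across all frames.

    frames: list of per-frame lists of body-feature identifiers (assumed to be
    hashable stand-ins for real feature vectors).
    Returns (total, boxes), where boxes lists (frame_index, feature) for every
    detection box that was generated.
    """
    seen = set()
    boxes = []
    for i, features in enumerate(frames):
        for feature in features:
            # A box is generated only if this target has no box from another frame.
            if feature not in seen:
                seen.add(feature)
                boxes.append((i, feature))
    return len(boxes), boxes


# Three frames in which targets "a" and "b" reappear.
total, boxes = count_unique_targets([["a", "b"], ["b", "c"], ["a"]])
```

Here `total` is 3, because the reappearances of "a" and "b" in later frames do not produce additional detection boxes.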

In this embodiment, when there are few people in the scene, the initialization preset model, i.e. the preset object count model, is switched to the target detection model to determine the number of people in the video. Using the target detection model is the better choice in this case: it effectively compensates for the inaccurate counting of the object count model in video scenes where the crowd is sparse and the targets are large, and it also reduces the energy consumption of target analysis.

This embodiment further provides a target object analysis method. Fig. 2C is a schematic flowchart of another implementation of the target object analysis method of the embodiments of the present application. As shown in Fig. 2C, the method includes the following steps:

Step S231, determine the photographing information of the acquired video to be analyzed.

Step S232, decode the video to be analyzed using a video decoder to obtain consecutive multi-frame images.

Step S233, if the scene to which the video to be analyzed belongs is not included in the preset scene set, and the period to which the video to be analyzed belongs does not fall within the preset period, process the video to be analyzed using the preset object count model to obtain the foreground segmentation map of the video to be analyzed and the object group density map of the video to be analyzed.

Here, step S233 can be implemented through the following process:

First, the video to be analyzed is decoded using a video decoder to obtain consecutive multi-frame images. Second, edge detection is performed on each frame of the multi-frame images using the preset object count model to determine the region covered by the head of each target object in each frame; for example, if the target objects are people, the pixel region occupied by each person's head in each frame is determined. Third, the target objects and the background in each frame are segmented to obtain the foreground segmentation map of each frame. Finally, according to the region covered by the head of each target object in each frame, an object group density map characterizing the density of target objects in each frame is generated.

Step S234, determine the total number of target objects contained in the video to be analyzed according to the foreground segmentation map and the object group density map.

Here, if the target objects in the image to be analyzed are a crowd, step S234 may proceed as follows: given the known position of each head, the size of the head at that position is estimated, which yields the region covered by the head; this region is converted into the probability that the region is a head, such that the probabilities over the region sum to 1, and a crowd density map is thereby obtained. Once the crowd density map is obtained, integrating (summing) the density map yields the number of people in the crowd. In this embodiment, the density map may of course be determined in other ways, and the total number of target objects may be determined from the density map in other ways; for example, the pixel region occupied by each person's head is determined, and the density map is determined from the pixel regions to obtain the total number of people.
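The density-map counting described for step S234 can be sketched in plain Python. The square head footprint, the uniform spread of probability over it and the function names are assumptions made for illustration; the text only requires that the probabilities over each head region sum to 1.

```python
def density_map(height, width, heads):
    """Build a density map in which every head region sums to 1.

    heads: list of (row, col, size) giving a head centre and the side length
    of the square region it is assumed to cover.
    """
    density = [[0.0] * width for _ in range(height)]
    for row, col, size in heads:
        half = size // 2
        # Cells covered by this head, clipped to the image borders.
        cells = [
            (r, c)
            for r in range(max(0, row - half), min(height, row + half + 1))
            for c in range(max(0, col - half), min(width, col + half + 1))
        ]
        for r, c in cells:
            density[r][c] += 1.0 / len(cells)
    return density


def count_from_density(density):
    """Integrate (sum) the density map to obtain the number of people."""
    return sum(sum(row) for row in density)


# Three heads on an 8 x 8 image, one of them clipped by the image corner.
crowd = density_map(8, 8, [(2, 2, 3), (5, 5, 3), (0, 0, 3)])
people = count_from_density(crowd)
```

Because each head contributes a total mass of 1 regardless of its size or clipping, the sum over the map recovers the number of heads, here 3 up to floating-point rounding.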

In this embodiment, whether the scene information of the video to be analyzed satisfies the preset condition is judged to make a preliminary determination of whether the video is one in which target objects are sparse. If so, the preset target detection model is used; if not, the preset object count model is used. For example, in a subway scene, the crowd density is high and the number of people is large during the morning and evening peaks, so the object count model can be used; in off-peak periods the crowd is sparse, so the method switches to the target detection model. Hardware resources can thus be used more efficiently.

This embodiment further provides a target object analysis method. Fig. 2D is a schematic flowchart of another implementation of the target object analysis method of the embodiments of the present application. As shown in Fig. 2D, the method includes the following steps:

Step S241, determine the photographing information of the acquired video to be analyzed.

Step S242, if the scene to which the video to be analyzed belongs is included in the preset scene set and/or the period to which the video to be analyzed belongs falls within the preset period, scan the i-th frame of the M frames of images using the preset target detection model to determine the body features of the N target objects contained in the i-th frame.

Here, the body features may be feature points of the body of a target object, for example facial feature points, upper-body feature points and lower-body feature points; the detection box of the target object can be determined from these feature points. i and N are integers greater than 0, and i is less than or equal to M. Step S242 can be understood as scanning the video to be analyzed frame by frame to determine the body features of each target object.

Step S243, if the body features of the j-th target object among the N target objects differ from the body features of the target objects in the frames other than the i-th frame, generate a detection box for the j-th target object.

Here, j is an integer greater than 0 and less than or equal to N.

Steps S242 and S243 above give one way of implementing "if the scene to which the video to be analyzed belongs is included in the preset scene set and/or the period to which the video to be analyzed belongs falls within the preset period, switch the preset object count model to the preset target detection model to process the video to be analyzed and obtain the detection box of each target object in the video to be analyzed". In this way, the decoded video is scanned frame by frame to obtain the body features of each target object, from which the detection boxes of the target objects are generated.

Step S244, if the number of detection boxes in the i-th frame is greater than a preset quantity threshold, switch the preset target detection model to the preset object count model.

Here, first, the first quantity of target objects contained in the i-th frame is determined according to the number of detection boxes in the i-th frame; then, if the first quantity is greater than the preset quantity threshold, the preset target detection model is switched to the preset object count model. Step S244 can be understood as follows: if the scene to which the video to be analyzed belongs is included in the preset scene set and/or the period to which the video to be analyzed belongs falls within the preset period, the preset object count model is switched to the preset target detection model to detect the video; when a large number of detection boxes is detected in the i-th frame, the method switches directly to the preset object count model to count the target objects in the remaining frames. For example, although the video shows a subway scene after 10 p.m., the day is a public holiday and the passenger flow in the subway is still very large. The photographing information satisfies the preset condition, so the preset target detection model is automatically used to detect the video; but as soon as a large number of people is detected, the method switches to the preset object count model, which suits scenes with many people.

Step S245, process, using the preset object count model, the first remaining video, i.e. the part of the video to be analyzed that has not been processed by the preset target detection model, to obtain the total number of target objects.

Here, when the preset target detection model detects a large number of people, the method switches to the preset object count model and processes the remaining video with it. First, the foreground segmentation sub-map of the first remaining video and the object group density sub-map of the video to be analyzed are determined; then, according to the foreground segmentation sub-map and the object group density sub-map, the second quantity of target objects contained in the first remaining video is determined; finally, the second quantity is determined as the total number of target objects. In this way, the model used to count target objects is switched promptly according to the number of target objects in each frame, so that the task of analyzing the target object group is completed more accurately, flexibly and efficiently. This addresses the problem that, in real security scenes, the number and density of people change in real time, the flow of people at different moments in different scenes may be very large, and the viewing angle of some scenes may also change (for example, video shot by a dome camera), so that a single analysis technique may struggle to meet actual analysis needs.

In this embodiment, if the preset target detection model detects a large number of target objects in a certain frame, the method automatically switches to the object count model to count the target objects in the remaining frames, and the number of target objects finally counted by the object count model is taken as the total number of target objects of the video to be analyzed. Crowd analysis is thus achieved more accurately, flexibly and efficiently.
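The hand-off of steps S244 and S245 can be sketched as a loop that starts with detection and passes the remaining frames to the count model once a frame is too crowded. The two callables stand in for the real models and are hypothetical, as is the choice to treat the frames after the triggering frame as the first remaining video.

```python
def analyze_detection_first(frames, threshold, detect_boxes, count_targets):
    """Detect frame by frame; hand the rest to the count model when crowded.

    detect_boxes(frame) -> iterable of detection-box identifiers in the frame
    count_targets(frames) -> number of target objects in the given frames
    Both callables are hypothetical stand-ins for the two preset models.
    Returns (model_that_produced_the_total, total).
    """
    seen = set()
    for i, frame in enumerate(frames):
        boxes = list(detect_boxes(frame))
        if len(boxes) > threshold:
            # First remaining video: frames not yet processed by detection.
            # If the trigger is the last frame, this slice is empty.
            remaining = frames[i + 1:]
            return "object_count", count_targets(remaining)
        seen.update(boxes)
    return "target_detection", len(seen)
```

For example, with frames modelled as lists of person identifiers, a threshold of 3 and a count model that reports the largest per-frame crowd, a video whose second frame holds five people is handed over after that frame, and the total then comes from the count model alone, as in step S245.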

In other embodiments, after the total number of target objects is determined, the method further includes the following alarm process:

Here, the alarm process can be understood as follows: if the scene of the video to be analyzed is an airport, the total number of target objects is hundreds of thousands of people, and these people have remained in the airport for several hours, it can be determined that the airport is overcrowded and that people are stranded there. Alarm information for an overcrowding event and alarm information for a crowd retention event are then generated to prompt airport staff to evacuate the crowd.

In this embodiment, a corresponding alarm event is generated according to the numerical range of the total number of target objects and the duration for which the number of target objects remains at that total, prompting the user to respond to the alarm event appropriately, so that events such as overcrowding or crowd retention can be handled effectively.
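One way to express the alarm rule is as a small table of rules, each pairing a numerical range with a minimum duration. The sketch below is illustrative only: the rule format, the event names and the thresholds are assumptions, not values taken from the application.

```python
def generate_alarms(samples, rules):
    """Return the alarm events triggered by a time series of totals.

    samples: time-ordered list of (timestamp_seconds, total)
    rules: list of (event_name, low, high, min_duration_seconds); an event
           fires once the total has stayed within [low, high) for at least
           min_duration_seconds. The rule format is hypothetical.
    """
    events = []
    for name, low, high, min_duration in rules:
        start = None  # time at which the total entered the range
        for timestamp, total in samples:
            if low <= total < high:
                if start is None:
                    start = timestamp
                if timestamp - start >= min_duration and name not in events:
                    events.append(name)
            else:
                start = None  # the total left the range, so the duration resets
    return events


# Assumed rules: overcrowding fires immediately, retention after one hour.
rules = [
    ("overcrowding", 400, float("inf"), 0),
    ("retention", 400, float("inf"), 3600),
]
samples = [(0, 500), (1800, 520), (3600, 510), (5400, 90)]
events = generate_alarms(samples, rules)
```

With these assumed values, both events fire, because the total stays above 400 for a full hour before dropping.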

This embodiment further provides a target object analysis method. Fig. 2E is a schematic flowchart of another implementation of the target object analysis method of the embodiments of the present application. As shown in Fig. 2E, the method includes the following steps:

Step S251, determine the photographing information of the acquired video to be analyzed.

Step S252, if the scene to which the video to be analyzed belongs is not included in the preset scene set, and the period to which the video to be analyzed belongs does not fall within the preset period, continue to use the preset object count model to perform edge detection on each frame of the multi-frame images and determine the region covered by the head of each target object in each frame.

Here, before step S252, the video to be analyzed is first decoded using a video decoder to obtain consecutive multi-frame images. In step S252, edge detection is performed on each frame of the multi-frame images to determine the pixel region occupied by the head of each target object.

Step S253, segment the target objects and the background in each frame to obtain the foreground segmentation map of each frame.

Here, the foreground segmentation map of each frame is obtained so as to highlight the target objects and facilitate the subsequent counting of target objects.

Step S254, generate, according to the region covered by the head of each target object in each frame, an object group density map characterizing the density of target objects in each frame.

Step S255, determine the third quantity of target objects contained in the L-th frame according to the foreground segmentation map of the L-th frame of the multi-frame images and the object group density map of the target objects in the L-th frame.

Here, L is an integer greater than 0. Step S255 can be understood as determining the number of target objects contained in the L-th frame of the multi-frame images, i.e. the third quantity.

Step S256, if the third quantity is less than the preset quantity threshold, switch the preset object count model to the preset target detection model.

Here, step S256 can be understood as follows: if the photographing information of the video does not satisfy the preset condition, the preset object count model is used to count the target objects in the video; when few target objects are detected in the L-th frame, the method switches directly to the preset target detection model to detect the remaining frames and determine the number of target objects in them. For example, although the video shows a subway scene after 8 a.m., subway service is suspended that day and the passenger flow is very small. The photographing information does not satisfy the preset condition, so the preset object count model is automatically used to analyze the video; but as soon as the analysis finds few people, the method switches to the preset target detection model, which suits scenes with few people.

Step S257, process, using the preset target detection model, the second remaining video, i.e. the part of the video to be analyzed that has not been processed by the preset object count model, to obtain the total number of target objects.

Here, first, the sub-detection box of each target object in the second remaining video is determined using the preset target detection model; then, the fourth quantity of target objects contained in the second remaining video is determined according to the number of sub-detection boxes; finally, the fourth quantity is determined as the total number of target objects.

In this embodiment, if the preset object count model finds a small number of target objects in a certain frame, the method automatically switches to the target detection model to count the target objects in the remaining frames, and the number of target objects finally counted by the target detection model is taken as the total number of target objects of the video to be analyzed. This effectively compensates for the inaccurate counting of crowd counting technology in video scenes where the crowd is sparse and human targets are large, and also reduces the energy consumption of crowd analysis.

In the related art, the crowd density and the crowd foreground map of a video frame can be computed by means such as deep learning. From the crowd density and the crowd foreground map, information such as the number of people in a monitored area and crowd stagnation can be derived. This information can guide control of the flow of people in the monitored area and assist in dispersing dense crowds, and is of considerable value for public security in public places such as stations, squares and shopping malls. However, if the human body area in the video frame is too large, this approach produces counts that are too high; if the background color of the scene is similar to the color of the targets, the foreground map is segmented inaccurately; and the scene angle of the video frame also affects the final output.

To address the problem mentioned above that existing crowd counting technology cannot complete crowd analysis tasks flexibly, efficiently and accurately across multiple scenes and multiple moments, the embodiments of the present application provide a target object analysis method that uses a more efficient and more energy-saving analysis mode at different moments in the same scene. For example, when there are few people in the scene, the target detection model is used in place of the object count model, because the object count models used in the related art are comparatively demanding of hardware resources. In this case the target detection model is a better complement: target detection models based on deep convolutional neural networks are already widely used in security monitoring scenes and achieve good detection results there, so they effectively compensate for the inaccurate counting of crowd counting technology in video scenes where the crowd is sparse and human targets are large, and they also reduce the energy consumption of crowd analysis.

The target object analysis method provided by the embodiments of the present application combines the object count model and the target detection model, and the model can be switched automatically or manually in different scenes. For example, in scenes with a high flow and high density of people, such as a subway or a square, the method can switch to the object count model; indoors or in scenes with large targets, it can switch to the human body detection engine, because in such scenes the targets are large, there is little occlusion between targets, single-target detection works well, and the resulting number of people and density are more accurate. Alternatively, at different moments in the same scene, the analysis model is switched according to the number of people in the scene: in a subway scene, for example, the crowd density is high and the number of people is large during the morning and evening peaks, so the object count model can be used; in off-peak periods the crowd is sparse, so the method switches to human body detection. Hardware resources can thus be used more efficiently.

Fig. 3 is a schematic flowchart of an implementation of the target object analysis method of the embodiments of the present application. As shown in Fig. 3, the method includes the following steps:

Step S301, determine, according to the initialization parameters of the acquired video to be analyzed, the model engine that the video needs to use for crowd analysis.

Here, the model engine may be an identifier indicating which model is used to analyze the crowd in the video; for example, the engine of the target detection model is "1" and the engine of the object count model is "0". The initialization parameters are the photographing information of the video, for example the scene corresponding to the video and the period corresponding to the video. Step S301 can be understood as engine initialization, which mainly creates the models needed for crowd analysis and loads the corresponding deep learning models (i.e. the target detection model and the object count model). Both the target detection model and the object count model need to be loaded at this point, and the initial operating mode of the engine is specified: if the flow of people in the scene is known to be large and the crowd dense, the engine can be set to the object count model; if there are few people in the scene and individual targets are large, it is set to the target detection model. This requires prior information about the number of people in the scene to be obtained before analysis. During initialization the engine also needs to read parameters such as the analysis region, the event threshold, the people-count threshold and the head-foot annotation information of the video scene. All parameters used are read from a configuration file, so that the initialization parameters of the engine can be adjusted conveniently and flexibly.
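Engine initialization from a configuration file might look like the sketch below. The JSON format, every key name and the `EngineConfig` class are assumptions introduced for illustration; the text only states that the parameters are read from a configuration file and that "1" and "0" identify the two models.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EngineConfig:
    mode: int  # 1 = target detection model, 0 = object count model
    analysis_region: list = field(default_factory=list)
    event_threshold: int = 0
    people_threshold: int = 0
    head_foot_marks: list = field(default_factory=list)


def load_engine_config(text):
    """Parse a JSON configuration (hypothetical key names) into EngineConfig."""
    raw = json.loads(text)
    if "mode" in raw:
        mode = raw["mode"]
    else:
        # No explicit mode: fall back on the prior about crowd size in the
        # scene. A crowded scene starts with the object count model.
        mode = 0 if raw.get("crowded", True) else 1
    if mode not in (0, 1):
        raise ValueError("mode must be 0 (object count) or 1 (target detection)")
    return EngineConfig(
        mode=mode,
        analysis_region=raw.get("analysis_region", []),
        event_threshold=raw.get("event_threshold", 0),
        people_threshold=raw.get("people_threshold", 0),
        head_foot_marks=raw.get("head_foot_marks", []),
    )


# A sparse scene with a people-count threshold of 30.
config = load_engine_config('{"crowded": false, "people_threshold": 30}')
```

In this example the engine starts in mode 1 (target detection) because the scene is marked as not crowded.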

Step S302, decode the video to obtain consecutive multi-frame images.

Here, step S302 can be understood as decoding, by a video decoder, the video stream data of an offline video or a network camera into consecutive video frame data; the consecutive multi-frame images can be denoted in temporal order as F₀, F₁, F₂, …, F_t.

Step S303, input the multi-frame images in sequence into the model indicated by the model engine in step S301.

Here, if the model engine is "1", i.e. the model indicated by the model engine is the target detection model, the method proceeds to step S304; if the model engine is "0", i.e. the model indicated by the model engine is the object count model, the method proceeds to step S305. Which model the multi-frame images are input into depends on the engine, and the output differs accordingly. If the target detection model is specified at initialization, the analysis process mainly comprises modules for target detection, tracking and extraction of human body structure information: the number of detection boxes can be converted directly into the number of people, the tracking module can output the tracking identifier and tracking box of the same human body across multiple temporally consecutive video frames, and human body attribute information may or may not be extracted, depending on the specific business requirements. If the object count model is specified at initialization, the analysis process mainly comprises crowd density detection and foreground map segmentation; these two tasks can be completed by the same deep convolutional neural network.

Step S304, detect each frame of the input multi-frame images using the preset target detection model to obtain the detection box of each person.

Here, step S304 can be understood as determining the people in each video frame using any person detection method, for example detecting the faces in each frame or detecting the human body regions of the people in each frame, to generate a detection box for each person. After step S304, the method proceeds to step S306. If model switching by a people-count threshold is configured, then after step S304 is completed the current number of people is compared with the people-count threshold, and the model is switched on the principle that the target detection model is kept below the threshold and the object count model is used above it. The rationale is that a count below the threshold indicates few people in the current scene, and the target detection model is more accurate when there are few people. If a switching strategy by period is configured, the object count model is used during peak crowd periods (for example, the morning and evening peaks at a subway station) and the target detection model is used during off-peak periods. The strategy can thus be set flexibly according to the actual scene, achieving a good balance between analysis quality and energy consumption.

Step S305, detect each frame of the input multi-frame images using the preset object count model to obtain the foreground segmentation map and the object density map of the video.

Here, the foreground segmentation map is used to distinguish the crowd from the background in the video, and the object density map is used to indicate the density of the crowd in the video.

Step S306, determine the number of people contained in the current frame.

Here, if the target detection model is used, the number of people contained in the current frame is determined according to the number of detection boxes; if the object count model is used, the number of people contained in the frame is determined according to the foreground segmentation map and the object density map of the frame.

Step S307, if the number of people contained in the current frame as determined by the target detection model in use is greater than the preset quantity threshold, switch the target detection model to the object count model.

Here, if the number of people contained in the current frame as determined by the object count model in use is less than the preset quantity threshold, the object count model is switched to the target detection model. This ensures that when the video contains few people, the target detection model is used to detect the people in the video, making the resulting count more accurate.

Step S308, process, using the preset object count model, the first remaining video, i.e. the part of the video to be analyzed that has not been processed by the preset target detection model, to obtain the total number of target objects.

Step S309, generate, according to the numerical range to which the total belongs and the duration for which the number of people in the video remains at the final number of people, an alarm event matching the duration and the numerical range, and output the alarm event.

Here, for example, the scene to which the video belongs is a certain subway station, the final number of people in the video is tens of thousands, and the number of people in the video has been tens of thousands throughout one hour. This indicates that the subway is overcrowded, so an overcrowding event alarm is generated; and since these people have been held up in the subway for a long time, a retention event alarm is generated. The alarm events are output to prompt the relevant personnel to carry out crowd diversion and similar work.

Step S310, determine whether decoding of the video is complete.

Here, if decoding is complete, the method proceeds to step S311; if decoding is not complete, the method returns to step S302.

Step S311, if decoding is complete, end the target object analysis.

In this embodiment, the photographing scene of the video is analyzed to determine the model the video needs to use. This addresses the insufficient accuracy of crowd counting alone in different scenes; combined with human body detection technology, crowd analysis tasks for more categories of scene can be completed, giving full play to the technical advantages of both.
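The per-frame loop of Fig. 3 (steps S303 to S307) with switching in both directions can be sketched as follows. The callables `detect` and `count` are hypothetical stand-ins for the two loaded models, and the per-frame result list is an illustrative choice rather than the output format of the application.

```python
def run_engine(frames, mode, threshold, detect, count):
    """Process frames in order, switching the model engine as crowds change.

    mode: 1 = target detection model, 0 = object count model (as in S301)
    detect(frame) / count(frame) -> number of people in the frame; both are
    hypothetical stand-ins for the two loaded models.
    Returns a list of (mode_used, people) pairs, one per frame.
    """
    results = []
    for frame in frames:
        people = detect(frame) if mode == 1 else count(frame)
        results.append((mode, people))
        if mode == 1 and people > threshold:
            mode = 0  # crowded: switch to the object count model
        elif mode == 0 and people < threshold:
            mode = 1  # sparse: switch back to the target detection model
    return results


# Each frame is modelled by its true head count; both models are exact here.
exact = lambda frame: frame
trace = run_engine([2, 5, 40, 45, 3, 4], 1, 10, exact, exact)
```

In this trace the engine stays in detection mode until the frame with 40 people, runs the count model on the next two frames, and returns to detection once the count drops to 3.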

An embodiment of the present application provides a target object analysis apparatus. Fig. 4 is a schematic diagram of the composition of the target object analysis apparatus of the embodiment of the present application. As shown in Fig. 4, the apparatus 400 includes a first obtaining module 401 and a first determining module 402, wherein:

The first obtaining module 401 is configured to determine the photographing information of the obtained video to be analyzed, wherein the video to be analyzed includes an image of at least one target object;

The first determining module 402 is configured to determine, according to the photographing information, the preset model for processing the video to be analyzed, so as to determine the total number of target objects included in the video to be analyzed, wherein the preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from the photographing information of a video to be analyzed that is processed with the preset object count model.
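The decision made by the first determining module can be sketched as follows, using the selection rule stated in the claims: the target detection model when the scene is in a preset scene set and/or the shooting period falls within a preset period, otherwise the object count model. The function name `select_model`, the scene labels and the representation of the preset period as a start and end time that does not cross midnight are assumptions made for illustration.

```python
from datetime import time
from typing import Set, Tuple


def select_model(scene: str, shot_at: time,
                 preset_scenes: Set[str],
                 preset_period: Tuple[time, time]) -> str:
    """Pick the preset model from the photographing information.

    Assumption: preset_period is (start, end) with start <= end, i.e. the
    period does not wrap around midnight.
    """
    start, end = preset_period
    in_period = start <= shot_at <= end
    if scene in preset_scenes or in_period:
        return "target_detection"
    # Scene not in the preset set and period outside the preset period.
    return "object_count"
```

Treating "and/or" as a logical OR keeps the two branches complementary, which matches the claim wording that the object count model is chosen only when both conditions fail.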

In the above apparatus, the apparatus further includes:

In the above apparatus, the first adjustment submodule includes:

In the above apparatus, the apparatus further includes:

In the above apparatus, the first scanning submodule includes:

In the above apparatus, the second judgment submodule includes:

In the above apparatus, the first determining submodule includes:

In the above apparatus, the second processing unit includes:

In the above apparatus, the second processing unit includes:

In the above apparatus, the apparatus further includes:

A first alarm module, configured to generate, according to the numerical range to which the total number of target objects belongs and the duration for which the number of target objects in the video to be analyzed remains at that total, an alarm event that matches the duration and the numerical range.

It should be noted that the description of the above apparatus embodiment is similar to the description of the above method embodiment and has beneficial effects similar to those of the method embodiment. For technical details not disclosed in the apparatus embodiment of the present application, refer to the description of the method embodiment of the present application.

It should be noted that, in the embodiments of the present application, if the above target object analysis method is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause an instant messaging device (which may be a terminal, a server, etc.) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a magnetic disk or an optical disc. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.

Correspondingly, an embodiment of the present application provides a computer storage medium in which computer-executable instructions are stored; when the computer-executable instructions are executed, the steps of the target object analysis method provided by the embodiments of the present application can be implemented.

The description of the above computer device and storage medium embodiments is similar to the description of the above method embodiment and has beneficial effects similar to those of the method embodiment. For technical details not disclosed in the device and storage medium embodiments of the present application, refer to the description of the method embodiment of the present application.

Fig. 5 is a schematic diagram of the composition of the computer device of the embodiment of the present application. As shown in Fig. 5, the hardware entities of the computer device 500 include a processor 501, a communication interface 502 and a memory 503, wherein:

The processor 501 generally controls the overall operation of the computer device 500.

The communication interface 502 enables the computer device to communicate with other terminals or servers over a network.

The memory 503 is configured to store instructions and applications executable by the processor 501, and may also cache data that is to be processed or has been processed by the processor 501 and by the modules of the computer device 500 (for example, image data, audio data, voice communication data and video communication data). It may be implemented by flash memory (FLASH) or random access memory (Random Access Memory, RAM).

It should be understood that references throughout the specification to "one embodiment" or "an embodiment" mean that a particular feature, structure or characteristic related to the embodiment is included in at least one embodiment of the present application. Therefore, occurrences of "in one embodiment" or "in an embodiment" throughout the specification do not necessarily refer to the same embodiment. In addition, these particular features, structures or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the magnitude of the sequence numbers of the above processes does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application. The sequence numbers of the above embodiments of the present application are for description only and do not represent the advantages or disadvantages of the embodiments.

It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or apparatus that includes the element.

In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.

The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve individually as a unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.

Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium, and when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (Read Only Memory, ROM), a magnetic disk or an optical disc.

Alternatively, if the above integrated unit of the present application is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, etc.) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk or an optical disc.

The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A target object analysis method, characterized in that the method comprises:

determining the photographing information of the obtained video to be analyzed, wherein the video to be analyzed includes an image of at least one target object;

determining, according to the photographing information, the preset model for processing the video to be analyzed, so as to determine the total number of target objects included in the video to be analyzed, wherein the preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from the photographing information of a video to be analyzed that is processed with the preset object count model.

2. The method according to claim 1, characterized in that the photographing information of the video to be analyzed comprises: the shooting scene to which the video to be analyzed belongs and/or the shooting period of the video to be analyzed.

3. The method according to claim 1 or 2, characterized in that, before the determining, according to the photographing information, of the preset model for processing the video to be analyzed, the method further comprises:

4. The method according to claim 3, characterized in that the adjusting, according to the photographing information, of the preset model for processing the video to be analyzed comprises:

if the scene to which the video to be analyzed belongs is included in a preset scene set and/or the period to which the video to be analyzed belongs is within a preset period, adjusting the initialization model to the preset target detection model;

processing the video to be analyzed using the preset target detection model;

if the scene to which the video to be analyzed belongs is not included in the preset scene set and the period to which the video to be analyzed belongs is not within the preset period, adjusting the initialization model to the preset object count model.

5. The method according to any one of claims 1 to 4, characterized in that, before the determining, according to the photographing information, of the preset model for processing the video to be analyzed, the method further comprises:

6. The method according to any one of claims 1 to 5, characterized in that determining to process the video to be analyzed using the preset target detection model, so as to determine the total number of target objects included in the video to be analyzed, comprises:

scanning each frame image of the multiple frames of images using the preset target detection model to obtain the body features of each target object;

7. The method according to claim 6, characterized in that the scanning of each frame image of the multiple frames of images using the preset target detection model to obtain the body features of each target object comprises:

scanning each frame image according to a preset step length using the preset target detection model, and determining the body features of each target object appearing in each frame image.

8. A target object analysis apparatus, characterized in that the apparatus comprises a first obtaining module and a first determining module, wherein:

the first obtaining module is configured to determine the photographing information of the obtained video to be analyzed, wherein the video to be analyzed includes an image of at least one target object;

the first determining module is configured to determine, according to the photographing information, the preset model for processing the video to be analyzed, so as to determine the total number of target objects included in the video to be analyzed, wherein the preset model is a preset target detection model or a preset object count model, and the photographing information of a video to be analyzed that is processed with the preset target detection model differs from the photographing information of a video to be analyzed that is processed with the preset object count model.

9. A computer storage medium, characterized in that computer-executable instructions are stored in the computer storage medium, and when the computer-executable instructions are executed, the method steps of any one of claims 1 to 7 can be implemented.

10. A computer device, characterized in that the computer device comprises a memory and a processor, computer-executable instructions are stored on the memory, and the method steps of any one of claims 1 to 7 can be implemented when the processor runs the computer-executable instructions on the memory.