CN109426793A - A kind of image behavior recognition methods, equipment and computer readable storage medium - Google Patents
A kind of image behavior recognition methods, equipment and computer readable storage medium
- Publication number
- CN109426793A (application CN201710780212.XA / CN201710780212A)
- Authority
- CN
- China
- Prior art keywords
- subassembly
- region
- target
- image
- behavior
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses an image behavior recognition method, a device, and a computer-readable storage medium. The method comprises: dividing the region where the target is located in an image to be recognized into sub-components, and determining the region where each sub-component is located; extracting the features of each sub-component from the region where it is located, and determining the behavior category of the target according to the features of each sub-component. By performing image recognition with a local-region fully convolutional network (LRFCN) model, the present invention effectively improves recognition accuracy while adding only a small computational overhead.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an image behavior recognition method, a device, and a computer-readable storage medium.
Background technique
In recent years, as electronic monitoring equipment has become ubiquitous in every field, the demand for efficiently extracting valuable information from surveillance video has grown increasingly prominent. Traditional monitoring relies on manual inspection, but manually watching video is inefficient and its accuracy is hard to guarantee. There is therefore an urgent need for a method that can intelligently recognize behaviors in video and detect the behaviors of interest.
Summary of the invention
The present invention provides an image behavior recognition method, a device, and a computer-readable storage medium, so as to improve the accuracy of behavior recognition while adding only a small computational overhead.
To achieve the above object, the present invention adopts the following technical solutions:
According to one aspect of the present invention, an image behavior recognition method is provided, the method comprising:
dividing the region where the target is located in an image to be recognized into sub-components, and determining the region where each sub-component is located;
extracting the features of each sub-component from the region where it is located, and determining the behavior category of the target according to the features of each sub-component.
Optionally, dividing the region where the target is located in the image to be recognized into sub-components and determining the region where each sub-component is located comprises:
dividing the target region into sub-components according to preset sub-component average proportion values;
performing foreground/background segmentation on each divided sub-component region using a region segmentation algorithm, to obtain the foreground segmentation result of each sub-component.
Optionally, before dividing the target region into sub-components according to the preset sub-component average proportion values, the method further comprises:
labeling the sub-components in the images containing targets in a sample data set;
determining, from the labeled sub-component regions, the proportion of the image occupied by each sub-component;
summing, over the sample data set, the proportion values of each kind of sub-component, and determining the sub-component average proportion values from these sums, wherein the sub-component average proportion values are the ratios between the sums for the different sub-components.
Optionally, the region segmentation algorithm comprises at least one of: the GrabCut algorithm, the GraphCut algorithm, and the RandomWalker algorithm.
Optionally, extracting the features of each sub-component from the region where it is located and determining the behavior category of the target according to the features of each sub-component comprises:
performing feature extraction on the region of each sub-component and on the target region, respectively;
cascading the features extracted from the sub-components with the features extracted from the target region, and taking the cascaded features as the target features;
determining the behavior category of the target from a preset classification model according to the target features.
Optionally, determining the behavior category of the target from the preset classification model according to the target features comprises:
determining, from the preset classification model and the target features, the probability of each behavior category;
selecting the behavior category with the highest probability as the behavior category of the target.
Optionally, before determining the behavior category of the target from the preset classification model according to the target features, the method further comprises:
obtaining a pre-trained classification model;
establishing a sample data set containing multiple categories of behavior, labeling the target regions, behavior categories, and sub-component regions in the sample data set, and training the pre-trained classification model on the labeled sample data set to obtain the preset classification model.
Optionally, after obtaining the preset classification model, the method further comprises:
cropping the images in the sample data set to expand the sample data set;
optimizing the loss function on the expanded sample data set to obtain an optimized preset classification model.
According to one aspect of the present invention, an image behavior recognition device is provided, comprising a memory and a processor, wherein the memory stores computer instructions that, when executed by the processor, implement all or part of the steps of the above image behavior recognition method.
According to one aspect of the present invention, a computer-readable storage medium is provided, which stores one or more programs that, when executed by a processor, implement all or part of the steps of the above image behavior recognition method.
The present invention has the following beneficial effects:
The image behavior recognition method, device, and computer-readable storage medium provided by the embodiments of the present invention use a local-region fully convolutional network with an improved pooling process: the recognized target region is divided into sub-components, and the final behavior category is determined from the features obtained for each sub-component. By extracting local features, the present invention effectively improves recognition accuracy while adding only a small computational overhead.
The above is only an overview of the technical scheme of the present invention. In order to better understand the technical means of the present invention so that it can be implemented in accordance with the contents of the specification, and to make the above and other objects, features, and advantages of the present invention clearer, specific embodiments of the present invention are given below.
Detailed description of the invention
In order to more clearly illustrate the embodiments of the present invention or the existing schemes, the accompanying drawings required by the embodiments or the existing description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without any creative labor.
Fig. 1 is a flow chart of the image behavior recognition method provided in an embodiment of the present invention;
Fig. 2 is the network structure of the image behavior recognition method provided in an embodiment of the present invention;
Fig. 3 is a schematic diagram of feature cascading in an embodiment of the present invention;
Fig. 4 is a functional block diagram of the image behavior recognition device provided in an embodiment of the present invention.
Specific embodiment
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, not to limit it.
In the field of computer vision, many methods can be used for behavior recognition. However, in many cases the real-time performance and precision of background modeling and of foreground target detection and tracking are difficult to bring up to the required level. Deep learning, as a new branch of machine learning, brings good improvements in both real-time performance and accuracy. In the field of object detection there are some typical deep learning model schemes, broadly divided into two classes: regression-based methods such as YOLO (You Only Look Once) and SSD (Single Shot Multibox Detector), which are relatively efficient but of limited precision, and candidate-region-based methods such as Faster RCNN (region-based convolutional neural networks) and RFCN (Region-based Fully Convolutional Networks), which are more precise but somewhat less efficient.
Considering that the behavior recognition problem has a certain similarity to the object detection problem but is more difficult, the present invention improves on RFCN, currently the most accurate of these models, and proposes a behavior recognition method based on a local-region fully convolutional network (Local Region-based Fully Convolutional Networks, abbreviated LRFCN) for behavior recognition in video.
Embodiment of the method
As shown in Fig. 1 and Fig. 2, the image behavior recognition method provided by the embodiment of the present invention specifically comprises the following steps:
Step 101: divide the region where the target is located in the image to be recognized into sub-components, and determine the region where each sub-component is located.
Step 102: extract the features of each sub-component from the region where it is located, and determine the behavior category of the target according to the features of each sub-component.
The embodiment of the present invention divides the target region into sub-components and determines the final behavior category of the target from the features extracted for each sub-component. On this basis, by recognizing with local features, the present invention effectively improves recognition accuracy while adding only a small computational overhead.
In an alternative embodiment of the present invention, when recognizing the target region of the image to be recognized (also called the region of interest, RoI), a region proposal network (RPN) can be used to identify the RoI. Since the region proposal network RPN is a technique already known to those skilled in the art, it is not described here. Other recognition techniques can of course also be used to identify the region of interest; no undue restriction is placed on this here. Before the target region of the image to be recognized is identified, the image to be recognized is normalized, so that after normalization all images take a unified, standard form.
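The normalization step can be sketched as rescaling every input so that its longest side matches a standard bound; the helper name is illustrative, and the 600-pixel bound is taken from step 2041 of the training example later in this description:

```python
def normalized_size(width, height, max_side=600):
    """Return (new_width, new_height) scaled so the longest side
    does not exceed max_side, preserving the aspect ratio."""
    scale = min(1.0, max_side / float(max(width, height)))
    return int(round(width * scale)), int(round(height * scale))

# A 1920x1080 surveillance frame is shrunk so its longest side is 600.
print(normalized_size(1920, 1080))  # -> (600, 338)
```

Images already within the bound are left unchanged, so the normalization is idempotent.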
In an alternative embodiment of the present invention, dividing the target region in the image to be recognized into sub-components includes dividing the target region according to preset sub-component average proportion values, and then performing foreground/background segmentation on each divided sub-component region with a region segmentation algorithm to obtain the foreground segmentation result of each sub-component.
Here the sub-component regions are preliminarily divided by the sub-component average proportion values. The average proportion values are determined from the sample data set used to train the classification model (the LRFCN model), which guarantees the accuracy of the values.
Specifically, before dividing the target region into sub-components according to the preset sub-component average proportion values, the average proportion values need to be obtained. They are determined as follows:
label the sub-components in the images containing targets in the sample data set;
determine, from the labeled sub-component regions, the proportion of the image occupied by each sub-component;
sum, over the sample data set, the proportion values of each kind of sub-component, and determine the sub-component average proportion values from these sums, wherein the sub-component average proportion values are the ratios between the sums for the different sub-components.
Specifically, the sub-component average proportion values are calculated as follows:

$$\overline{part_1} : \overline{part_2} : \cdots : \overline{part_K} = \sum_{i=1}^{n}(part_1)_i : \sum_{i=1}^{n}(part_2)_i : \cdots : \sum_{i=1}^{n}(part_K)_i$$

where $(part_1)_i + (part_2)_i + \cdots + (part_K)_i = 1$; $(part_k)_i$ is the proportion of the target region occupied by the $k$-th sub-component in the $i$-th target; $K$ is the number of sub-components; and $n$ is the number of targets contained in the training library of the sample data set.
That is to say, in the present invention the proportions of the RoI occupied by sub-component 1, sub-component 2, ..., sub-component K are summed separately over all targets in the sample data set, and the average proportion values between the sub-components are then determined from the ratio of these sums. For example, in one specific embodiment the human body is divided into three parts: head, body, and lower limbs. The head, body, and lower-limb proportions of everyone in the sample data set are averaged. Suppose the normalized proportions of the $i$-th person are $Head_i : BodyUp_i : BodyDown_i$, where $Head_i + BodyUp_i + BodyDown_i = 1$. If the sample data set contains $n$ people in total, the average proportion values are

$$\sum_{i=1}^{n}Head_i : \sum_{i=1}^{n}BodyUp_i : \sum_{i=1}^{n}BodyDown_i$$
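The computation above can be sketched in a few lines; the function name and the three-part example annotations are illustrative, not part of the patent:

```python
def average_proportions(per_target_ratios):
    """per_target_ratios: one tuple per annotated target, e.g.
    (head, body, lower), each tuple summing to 1.
    Returns the average proportion values as a normalized tuple:
    the ratio of the per-part sums over all n targets."""
    n_parts = len(per_target_ratios[0])
    sums = [sum(t[k] for t in per_target_ratios) for k in range(n_parts)]
    total = sum(sums)  # equals n, since each target's proportions sum to 1
    return tuple(s / total for s in sums)

# Three annotated people with slightly different head:body:lower splits.
people = [(0.15, 0.45, 0.40), (0.20, 0.40, 0.40), (0.10, 0.50, 0.40)]
print(average_proportions(people))  # approximately (0.15, 0.45, 0.40)
```

The returned tuple is itself normalized to sum to 1, matching the constraint on each target's proportions.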
Here, in order to guarantee the accuracy of the sub-component region division, the preliminarily divided regions need to be further segmented precisely. Specifically, when segmenting, a region segmentation algorithm is used to exclude background interference: background and foreground are distinguished within each sub-component region, yielding the foreground segmentation result of the sub-component.
Preferably, the region segmentation algorithm uses any one of the GrabCut, GraphCut, or RandomWalker algorithms. Other algorithms can of course also be used; they are not introduced here, and as long as they do not depart from the core idea of the present invention they fall within its scope. Here, taking the GrabCut algorithm as an example, the specific segmentation process is explained.
First, an energy function $E$ describing the optimization objective of the segmentation is defined as follows:

$$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z)$$

where the function $U$ represents the region data term of the energy function and the function $V$ represents its smoothness (boundary) term; $\alpha$ is the initial label of each pixel (background label 0, foreground label 1); $k$ is the number of Gaussian components of the GMM (Gaussian mixture model); $\theta$ is the set of statistical parameters of the GMM (the weights, mean vectors, and covariance matrices of the Gaussian components); and $z$ is the image data of the sub-component.
Then, solving the min-cut of this energy function yields the foreground/background pixel segmentation.
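As a toy illustration of the energy being minimized (not the real GrabCut solver, which fits a GMM colour model and runs a graph min-cut), the following sketch brute-forces the labelling of a short 1-D pixel strip under a simplified $E = U + V$ with fixed foreground/background intensity means; all constants are invented for the example:

```python
from itertools import product

def energy(labels, pixels, fg_mean=200.0, bg_mean=50.0, smooth=40.0):
    """Simplified E(alpha, z) = U + V: U penalizes the squared distance of
    each pixel to the mean of its assigned model; V charges `smooth` for
    every pair of neighbours with different labels (the boundary term)."""
    u = sum((p - (fg_mean if a else bg_mean)) ** 2 / 100.0
            for a, p in zip(labels, pixels))
    v = smooth * sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return u + v

def min_cut_brute_force(pixels):
    """Enumerate all labelings (feasible only for tiny strips) and return
    the one of minimum energy -- a stand-in for the min-cut step."""
    return min(product([0, 1], repeat=len(pixels)),
               key=lambda labels: energy(labels, pixels))

strip = [40, 55, 60, 190, 210, 205]   # dark background then bright foreground
print(min_cut_brute_force(strip))     # -> (0, 0, 0, 1, 1, 1)
```

The smoothness term is what keeps the labelling from flipping at every noisy pixel; the real algorithm replaces the fixed means with a learned GMM and the enumeration with an efficient min-cut.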
In an alternative embodiment of the present invention, extracting the features of each sub-component from the region where it is located and determining the behavior category of the target according to the features of each sub-component comprises:
performing feature extraction on the region of each sub-component and on the target region, respectively;
cascading the features extracted from the sub-components with the features extracted from the target region, and taking the cascaded features as the target features;
determining the behavior category of the target from the preset classification model according to the target features.
Specifically, when the features of each sub-component are extracted, the pixels of the sub-component region are convolved with convolution kernels, and the values after convolution are the features of the sub-component. However, because the LRFCN network usually has many layers, the convolution operation is iterated many times, so the range a feature actually corresponds to in the initial original image is larger than the region of the segmentation result.
In order to improve the accuracy of image recognition, the global features of the region where the target is located are extracted at the same time as the features of each sub-component. For example, as shown in Fig. 3, the local features corresponding to the three partial regions (head, body, lower limbs) and the global features corresponding to the entire human-body region are combined into one cascade, which constitutes the features ultimately used to describe the whole region. On this basis, by cascading the features of each sub-component (local pooling) with the features extracted from the entire RoI (global pooling), the feature quantity available for recognition is increased at only a small additional computational cost, effectively improving recognition accuracy. Of course, in an alternative embodiment of the present invention, the cascade of the sub-component features alone can also be used as the target features for recognition; compared with recognizing only with features extracted from the entire RoI, this too can effectively improve recognition accuracy.
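The cascade of Fig. 3 can be sketched with plain lists standing in for pooled feature vectors; in the real LRFCN the features come from position-sensitive score maps, so the pooling, the row-slice part layout, and the dimensions here are purely illustrative:

```python
def avg_pool(region):
    """Average-pool a rectangular region (a list of rows) to one value."""
    vals = [v for row in region for v in row]
    return sum(vals) / len(vals)

def cascade_features(feature_map, part_slices):
    """Local pooling over each sub-component slice of the RoI feature map,
    followed by global pooling over the whole RoI; the results are
    concatenated (cascaded) into the final target feature vector."""
    local = [avg_pool(feature_map[top:bottom]) for top, bottom in part_slices]
    global_feat = avg_pool(feature_map)
    return local + [global_feat]

# A 6-row RoI map split head/body/lower as rows 0-1 / 2-3 / 4-5.
roi = [[1, 1], [1, 1], [2, 2], [2, 2], [3, 3], [3, 3]]
print(cascade_features(roi, [(0, 2), (2, 4), (4, 6)]))  # -> [1.0, 2.0, 3.0, 2.0]
```

The last entry is the global (whole-RoI) pooling; dropping it gives the sub-component-only variant described above.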
In an alternative embodiment of the present invention, determining the behavior category of the target from the preset classification model according to the target features comprises: determining the probability of each behavior category for the target features, and selecting the behavior category with the highest probability as the behavior category of the target.
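The pick-the-most-probable-category step is the usual softmax-plus-argmax; the scores and label names below are invented for illustration:

```python
import math

def softmax(scores):
    """Convert per-class scores into probabilities (numerically stabilized
    by subtracting the maximum score before exponentiating)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_behavior(scores, labels):
    """Select the behavior category with the highest probability."""
    probs = softmax(scores)
    return labels[probs.index(max(probs))]

labels = ["eating", "watching TV", "playing device", "falling", "abuse"]
print(predict_behavior([0.2, 1.1, 3.4, 0.5, -0.7], labels))  # -> playing device
```

Because softmax is monotonic, the argmax over probabilities equals the argmax over raw scores; the probabilities matter when a confidence threshold is applied.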
Further, in one embodiment of the present invention, before determining the behavior category of the target from the preset classification model according to the target features, the preset classification model (the LRFCN model) needs to be determined. The preset classification model is determined as follows:
obtain a pre-trained classification model;
establish a sample data set containing multiple categories of behavior, label the target regions, behavior categories, and sub-component regions in the sample data set, and train the pre-trained classification model on the labeled sample data set to obtain the preset classification model.
Here, the pre-trained classification model is obtained by training on a large database, such as the large ImageNet database. Specifically, when the multi-category behavior sample data set is established, the images in the data set should differ to a certain extent in background, shooting angle, illumination, and scale. The target regions, behavior types, and sub-component regions in the images are then labeled manually, and the pre-trained model is trained on the labeled sample data set so as to adjust the parameters of the LRFCN model.
Further and optionally, after training the pre-trained classification model on the labeled sample data set to obtain the preset classification model, the method further includes:
randomly cropping the images in the sample data set to expand the sample data set;
optimizing the loss function on the expanded sample data set to obtain an optimized preset classification model.
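The random-crop expansion can be sketched as generating several crop boxes per training image; the crop scale, helper names, and per-image crop count are illustrative choices, not values fixed by the patent:

```python
import random

def random_crop(width, height, scale=0.8, rng=random):
    """Return one random crop box (left, top, right, bottom) covering
    `scale` of each image dimension."""
    cw, ch = int(width * scale), int(height * scale)
    left = rng.randrange(width - cw + 1)
    top = rng.randrange(height - ch + 1)
    return (left, top, left + cw, top + ch)

def expand_dataset(sizes, crops_per_image, rng):
    """Generate several crop boxes per image to enlarge the training library."""
    return [random_crop(w, h, rng=rng)
            for w, h in sizes
            for _ in range(crops_per_image)]

rng = random.Random(0)  # seeded for reproducibility
boxes = expand_dataset([(600, 400), (500, 300)], crops_per_image=3, rng=rng)
print(len(boxes))  # -> 6: the two-image set has been tripled
```

Each crop keeps most of the frame, so the annotated target usually survives the crop; crops that lose the target would be filtered out during labeling.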
Specifically, when the LRFCN model is trained, the loss function is the sum of the cross-entropy loss and the bounding-box regression loss, as shown by the following formula:

$$L(s, t) = L_{cls}(s_{c^*}) + \lambda\,[c^* > 0]\,L_{reg}(t, t^*)$$

where $s$ is the softmax response for each class, with $s_c = e^{r_c} / \sum_{c'} e^{r_{c'}}$ and $r_c$ the average pooling of the position-sensitive score of class $c$ over the spatial positions of the RoI; $t^*$ is the offset of the ground truth relative to the preset box, and $t$ is the offset of the prediction relative to the preset box; $c^* = 0$ indicates that the label of the RoI is background, and $[c^* > 0] = 1$ when $c^* > 0$, otherwise 0. $L_{reg}$ denotes the bounding-box loss, calculated as shown in the following formulas:

$$L_{reg}(t, t^*) = R(t - t^*)$$

$$t_x = (x - x_a)/w_a,\quad t_y = (y - y_a)/h_a,\quad t_w = \log(w/w_a),\quad t_h = \log(h/h_a)$$

$$t_x^* = (x^* - x_a)/w_a,\quad t_y^* = (y^* - y_a)/h_a,\quad t_w^* = \log(w^*/w_a),\quad t_h^* = \log(h^*/h_a)$$

where $R$ is the Smooth L1 loss function; $x$, $y$, $w$, $h$ are the center coordinates and the width and height of the predicted bounding box; subscript $a$ denotes the center coordinates and width and height of the preset box; and superscript $*$ denotes those of the ground truth.
From the above it can be seen that the loss function can be used to estimate the degree of inconsistency between the model's predicted values and the true values: the smaller the loss, the better the model's accuracy. Therefore, training to minimize the loss function guarantees the accuracy of the LRFCN model and thereby improves recognition accuracy.
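A minimal sketch of the regression side of this loss, under the assumption (consistent with the formulas above) that $R$ is the standard Smooth L1 applied per offset component; the function names and the example boxes are illustrative:

```python
import math

def smooth_l1(d):
    """Smooth L1: 0.5*d^2 for |d| < 1, |d| - 0.5 otherwise."""
    return 0.5 * d * d if abs(d) < 1 else abs(d) - 0.5

def box_offsets(box, anchor):
    """t_x, t_y, t_w, t_h of `box` relative to the preset (anchor) box;
    boxes are given as (center_x, center_y, width, height)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return [(x - xa) / wa, (y - ya) / ha,
            math.log(w / wa), math.log(h / ha)]

def l_reg(pred, gt, anchor):
    """Bounding-box regression loss L_reg(t, t*): Smooth L1 summed over
    the four offset components."""
    t = box_offsets(pred, anchor)
    t_star = box_offsets(gt, anchor)
    return sum(smooth_l1(a - b) for a, b in zip(t, t_star))

anchor = (100.0, 100.0, 50.0, 80.0)
gt = (110.0, 95.0, 55.0, 85.0)
print(l_reg(anchor, gt, anchor))  # small positive loss for an unmoved anchor
print(l_reg(gt, gt, anchor))      # -> 0.0 for a perfect prediction
```

The log parameterization of width and height keeps the regression targets scale-invariant, which is why the same Smooth L1 works for boxes of very different sizes.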
The training process of the LRFCN model in the present invention is illustrated below, taking the recognition of five categories of behavior in home surveillance video (eating, watching TV, playing with electronic equipment, falling down, and child abuse) as an example:
Step 201: establish a sample data set containing multiple categories of behavior.
Here, aimed at the problem posed above, a database containing the five behavior categories of eating, watching TV, playing with electronic equipment, falling down, and child abuse is first established. Each category contains about 2000 images, all of which come from real home surveillance video.
Next, two thirds of the images are randomly selected as training samples and put into the training library, and the remaining third serve as test samples.
Step 202: manually label the target regions, behavior categories, and local regions such as the target's head, body, and lower limbs in the images. Specifically:
Step 2021: manually annotate the ground truth for the targets in the images, marking out the target regions with bounding boxes together with behavior class labels, e.g. class labels 0, 1, 2, 3, 4;
Step 2022: calibrate the head, body, and lower-limb parts of the human targets in the sample images, and, according to the calibration result, calculate the average proportions occupied by the head, body, and lower limbs within each region;
Step 2023: calibrate the specific pixel positions covered by the head, body, and lower-limb parts of the human targets in the sample images, recording the pixel positions with image templates.
Step 203: obtain the pre-trained LRFCN network model.
Because the neural network in the LRFCN network model contains a large number of parameters while the self-built sample data set contains relatively few samples, training directly on the sample data set is prone to over-fitting. The LRFCN network model is therefore first pre-trained on the larger ImageNet database, and the pre-trained model is then trained on the training library.
Step 204: train the pre-trained LRFCN network model on the training library and fine-tune the parameters of the network model. This process can be divided into the following sub-steps:
Step 2041: normalize the size of the images in the training library so that the longest side of each image is less than 600 pixels;
Step 2042: randomly crop every image in the training library to expand the database;
Since the network has many parameters and the samples are few, in order to avoid over-fitting, images randomly cropped from the training images are added to the training library for network training, increasing the sample count.
Step 2043: optimize the above loss function to obtain the final LRFCN network model. During training, the initial learning rate is set to 0.000001, and 50% of the parameters are randomly dropped according to a dropout rate of 0.5. The optimization process, for example by least squares or gradient descent, is not introduced further here.
From the above it can be seen that the behavior recognition method based on a local-region fully convolutional network (LRFCN) proposed by the present invention can detect target behaviors in video highly accurately, filling to a certain degree the current technology gap in intelligent security.
Apparatus embodiments
According to an embodiment of the present invention, an image behavior recognition device is provided for implementing the above image behavior recognition method, as shown in Fig. 4. The device includes a processor 42 and a memory 41 storing instructions executable by the processor 42. Specifically, in the image behavior recognition device provided in an embodiment of the present invention, when the executable instructions in the memory 41 are executed by the processor 42, the image behavior recognition method provided in the method embodiment is implemented. For the specific implementation, reference may be made to the detailed description in the method embodiment; it is not repeated in this embodiment.
The processor 42 may be a general-purpose processor, such as a central processing unit (CPU); it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiment of the present invention.
The memory 41 is used to store program code and to transfer the program code to the CPU. The memory 41 may include volatile memory, such as random access memory (RAM); the memory 41 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 41 may also include a combination of the above kinds of memory.
Storage medium embodiment
The embodiment of the present invention also provides a computer-readable storage medium storing one or more programs. The computer-readable storage medium may include volatile memory, such as random access memory; it may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid-state drive; it may also include a combination of the above kinds of memory. The one or more programs in the computer-readable storage medium can be executed by one or more processors to implement all or part of the steps of the image behavior recognition method provided in the method embodiment. For the specific implementation of the steps, reference may be made to the detailed description in the method embodiment; it is not repeated in this embodiment.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above embodiment methods can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods.
Although the present application is described by way of embodiments, those skilled in the art will know that the present application admits many modifications and variations that do not depart from the spirit and scope of the present invention. If these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, then the present invention is also intended to include them.
Claims (10)
1. An image behavior recognition method, characterized by comprising:
dividing the region where the target is located in an image to be recognized into sub-components, and determining the region where each sub-component is located;
extracting the features of each sub-component from the region where it is located, and determining the behavior category of the target according to the features of each sub-component.
2. The image behavior recognition method according to claim 1, wherein dividing the region where the target is located in the image to be recognized into sub-components and determining the region where each sub-component is located comprises:
dividing the region where the target is located into sub-components according to preset sub-component average proportion values;
performing foreground-background segmentation on each divided sub-component region using a region segmentation algorithm to obtain a foreground segmentation result for each sub-component.
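The first step of claim 2, dividing the region where the target is located according to preset sub-component average proportion values, can be sketched as follows. This is a minimal illustration, not the patented implementation: the sub-component names, the proportion values, and the assumption of a vertical split along the target's height are all hypothetical.

```python
# Minimal sketch (not the patented implementation) of dividing the region
# where the target is located into sub-component regions according to
# preset sub-component average proportion values. Names, values, and the
# vertical-split assumption are hypothetical.

def divide_target_region(box, proportions):
    """Split a target bounding box (top, left, height, width) vertically,
    giving each sub-component its preset share of the target's height."""
    top, left, height, width = box
    regions = {}
    y = top
    for name, share in proportions.items():
        h = round(height * share)
        regions[name] = (y, left, h, width)
        y += h
    return regions

# Hypothetical preset average proportion values for a person target.
proportions = {"head": 0.2, "torso": 0.4, "legs": 0.4}
regions = divide_target_region((0, 10, 100, 50), proportions)
```

Each returned region could then be handed to a foreground-background segmentation algorithm, as the claim goes on to specify.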
3. The image behavior recognition method according to claim 2, further comprising, before dividing the region where the target is located into sub-components according to the preset sub-component average proportion values:
annotating the sub-components in the target-containing images of a sample data set;
determining, according to the annotated sub-component regions, the proportion of the image occupied by each sub-component;
summing the proportion values of the same sub-component across the sample data set, and determining the sub-component average proportion values according to the sums, wherein the sub-component average proportion values are the ratios between the sums of different sub-components.
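The computation described in claim 3 can be sketched as below. It assumes each annotation supplies the pixel size of a sub-component's bounding box; the sample data, the sub-component labels, and the choice to express each sum relative to the total of all sums are illustrative assumptions, not details from the patent.

```python
# Sketch of claim 3: sum the per-image proportion values of each
# sub-component over the sample data set, then derive average proportion
# values as ratios between the sums. Sample data are hypothetical.

def region_proportion(box_wh, image_wh):
    """Fraction of the image area occupied by an annotated sub-component."""
    (bw, bh), (iw, ih) = box_wh, image_wh
    return (bw * bh) / (iw * ih)

def average_proportion_values(samples):
    """Sum each sub-component's proportion values over the data set and
    normalize each sum by the total of all sums (one way to take the
    'ratio between sums of different sub-components')."""
    sums = {}
    for image_wh, annotations in samples:
        for label, box_wh in annotations:
            sums[label] = sums.get(label, 0.0) + region_proportion(box_wh, image_wh)
    total = sum(sums.values())
    return {label: s / total for label, s in sums.items()}

# Two hypothetical annotated images: (image size, [(sub-component, box size), ...]).
samples = [
    ((100, 200), [("head", (20, 20)), ("torso", (40, 80))]),
    ((100, 200), [("head", (22, 18)), ("torso", (38, 82))]),
]
ratios = average_proportion_values(samples)
```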
4. The image behavior recognition method according to claim 2, wherein the region segmentation algorithm includes at least one of the following: the GrabCut algorithm, the GraphCut algorithm, and the Random Walker algorithm.
5. The image behavior recognition method according to claim 1, wherein extracting the feature of each sub-component from the region where that sub-component is located and determining the behavior category of the target according to the features of the sub-components comprises:
performing feature extraction separately on the regions where the sub-components are located and on the region where the target is located;
cascading the features extracted from the sub-components with the feature extracted from the region where the target is located, the cascaded feature serving as the target feature;
determining the behavior category of the target from a preset classification model according to the target feature.
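The feature cascade of claim 5 is plain concatenation of per-region feature vectors. In the sketch below the learned feature extractor is replaced by a hypothetical stand-in that computes simple statistics; in practice a learned extractor (e.g. a convolutional network) would produce each per-region feature, and only the cascading step reflects the claim.

```python
# Sketch of the feature cascade in claim 5: features from each
# sub-component region and from the whole target region are concatenated
# into a single target feature. extract_feature is a hypothetical stand-in.

def extract_feature(region_pixels):
    """Hypothetical fixed-length feature: mean, variance, min, max."""
    n = len(region_pixels)
    mean = sum(region_pixels) / n
    var = sum((p - mean) ** 2 for p in region_pixels) / n
    return [mean, var, min(region_pixels), max(region_pixels)]

def cascade_features(subcomponent_regions, target_region):
    """Concatenate per-sub-component features with the whole-target feature."""
    feature = []
    for region in subcomponent_regions:
        feature.extend(extract_feature(region))
    feature.extend(extract_feature(target_region))
    return feature

# Hypothetical pixel lists for two sub-component regions and the target.
head, torso = [10, 12, 11], [40, 42, 44, 38]
target_feature = cascade_features([head, torso], head + torso)
```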
6. The image behavior recognition method according to claim 5, wherein determining the behavior category of the target from the preset classification model according to the target feature comprises:
determining, according to the target feature, the probability of each behavior category from the preset classification model;
selecting the behavior category with the highest probability as the behavior category of the target.
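Claim 6's selection of the highest-probability behavior category can be sketched with a softmax over class scores followed by an argmax; the category names and scores below are illustrative, not from the patent.

```python
import math

# Sketch of claim 6: the classification model yields a probability for
# each behavior category (here a softmax over hypothetical class scores),
# and the category with the highest probability is selected.

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, categories):
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return categories[best], probs[best]

categories = ["walking", "running", "waving"]  # hypothetical behavior classes
label, prob = classify([1.2, 3.4, 0.5], categories)
```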
7. The image behavior recognition method according to claim 5, further comprising, before determining the behavior category of the target from the preset classification model according to the target feature:
obtaining a pre-trained classification model;
establishing a sample data set containing multiple behavior categories, and annotating, in the sample data set, the region where the target is located, the behavior category, and the regions of the sub-components; and training the pre-trained classification model on the annotated sample data set to obtain the preset classification model.
8. The image behavior recognition method according to claim 7, wherein after obtaining the preset classification model, the method further comprises:
cropping the images in the sample data set to expand the sample data set;
optimizing the loss function according to the expanded sample data set to obtain an optimized preset classification model.
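The data-set expansion by cropping in claim 8 can be sketched as follows; the fixed crop offsets and the list-of-lists image representation are illustrative assumptions, since the claim does not specify how the crops are chosen.

```python
# Sketch of claim 8's data-set expansion: each sample image is cropped at
# several offsets so the expanded set contains multiple shifted views of
# each original. Offsets and image representation are hypothetical.

def crop(image, top, left, height, width):
    """Extract a height x width sub-image starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def expand_dataset(images, crop_h, crop_w,
                   offsets=((0, 0), (0, 1), (1, 0), (1, 1))):
    expanded = []
    for img in images:
        for top, left in offsets:
            expanded.append(crop(img, top, left, crop_h, crop_w))
    return expanded

# One hypothetical 3x3 "image" yields four 2x2 crops.
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
augmented = expand_dataset([img], 2, 2)
```

The expanded set would then be used for the further optimization of the classification model that the claim describes.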
9. An image behavior recognition device, comprising a memory and a processor, wherein the memory stores computer instructions which, when executed by the processor, implement the steps of the image behavior recognition method according to any one of claims 1 to 8.
10. A computer-readable storage medium storing one or more programs which, when executed by a processor, implement the steps of the image behavior recognition method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710780212.XA CN109426793A (en) | 2017-09-01 | 2017-09-01 | A kind of image behavior recognition methods, equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109426793A true CN109426793A (en) | 2019-03-05 |
Family
ID=65504993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710780212.XA Pending CN109426793A (en) | 2017-09-01 | 2017-09-01 | A kind of image behavior recognition methods, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109426793A (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110688936A (en) * | 2019-09-24 | 2020-01-14 | 深圳市银星智能科技股份有限公司 | Method, machine and storage medium for representing characteristics of environment image |
CN110929628A (en) * | 2019-11-18 | 2020-03-27 | 北京三快在线科技有限公司 | Human body identification method and device |
CN111985269A (en) * | 2019-05-21 | 2020-11-24 | 顺丰科技有限公司 | Detection model construction method, detection device, server and medium |
CN112183666A (en) * | 2020-10-28 | 2021-01-05 | 阳光保险集团股份有限公司 | Image description method and device, electronic equipment and storage medium |
CN112560767A (en) * | 2020-12-24 | 2021-03-26 | 南方电网深圳数字电网研究院有限公司 | Document signature identification method and device and computer readable storage medium |
CN112712133A (en) * | 2021-01-15 | 2021-04-27 | 北京华捷艾米科技有限公司 | Deep learning network model training method, related device and storage medium |
WO2021093344A1 (en) * | 2019-11-15 | 2021-05-20 | 五邑大学 | Semi-automatic image data labeling method, electronic apparatus, and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701467A (en) * | 2016-01-13 | 2016-06-22 | Hohai University Changzhou Campus | A multi-person abnormal behavior recognition method based on human shape features |
CN106570480A (en) * | 2016-11-07 | 2017-04-19 | Nanjing University of Posts and Telecommunications | A posture-recognition-based method for classifying human movements |
Non-Patent Citations (2)
Title |
---|
JIFENG DAI et al.: "R-FCN: Object Detection via Region-based Fully Convolutional Networks" *
TAO LING: "Research on Human Behavior Recognition Combining Global and Local Features", China Master's Theses Full-text Database, Information Science and Technology, no. 3, page 5 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109426793A (en) | An image behavior recognition method, device, and computer-readable storage medium | |
CN109753975B (en) | Training sample obtaining method and device, electronic equipment and storage medium | |
Kristan et al. | The visual object tracking vot2015 challenge results | |
US8401292B2 (en) | Identifying high saliency regions in digital images | |
Xu et al. | Learning-based shadow recognition and removal from monochromatic natural images | |
US20160180196A1 (en) | Object re-identification using self-dissimilarity | |
Kim et al. | Color–texture segmentation using unsupervised graph cuts | |
Sukanya et al. | A survey on object recognition methods | |
CN103577875B (en) | A FAST-based computer-aided (CAD) people-counting method | |
CN105469029A (en) | System and method for object re-identification | |
CN105303152B (en) | A person re-identification method | |
Ma et al. | Counting people crossing a line using integer programming and local features | |
CN106778635A (en) | A human region detection method based on visual saliency | |
Bouma et al. | Re-identification of persons in multi-camera surveillance under varying viewpoints and illumination | |
CN104732534B (en) | A method and system for extracting salient objects from an image | |
CN106157330A (en) | A visual tracking method based on a joint target appearance model | |
Chi | Self‐organizing map‐based color image segmentation with k‐means clustering and saliency map | |
Gündoğdu et al. | The visual object tracking VOT2016 challenge results | |
Kang et al. | A multiobjective piglet image segmentation method based on an improved noninteractive GrabCut algorithm | |
CN112241736A (en) | Text detection method and device | |
CN117037049B (en) | Image content detection method and system based on YOLOv5 deep learning | |
Fowlkes et al. | How much does globalization help segmentation? | |
Lobry et al. | Deep learning models to count buildings in high-resolution overhead images | |
Kalboussi et al. | Object proposals for salient object segmentation in videos | |
Xu et al. | Crowd density estimation of scenic spots based on multifeature ensemble learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||