CN111311603A - Method and apparatus for outputting target object number information

Info

Publication number
CN111311603A
Authority
CN
China
Prior art keywords
target object
frame image
image
regression model
foreground region
Legal status
Pending
Application number
CN201811519247.9A
Other languages
Chinese (zh)
Inventor
董博
李艺
Current Assignee
Beijing Jingdong Qianshi Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201811519247.9A
Publication of CN111311603A

Links

Images

Classifications

    • G06T 7/11 Image analysis: Segmentation; Edge detection: Region-based segmentation
    • G06T 7/194 Image analysis: Segmentation; Edge detection: involving foreground-background segmentation
    • G06T 2207/30196 Subject of image; Context of image processing: Human being; Person
    • G06T 2207/30242 Subject of image; Context of image processing: Counting objects in image
    (all within G Physics; G06 Computing, Calculating or Counting; G06T Image data processing or generation, in general; G06T 2207/00 is the indexing scheme for image analysis or image enhancement)

Abstract

The embodiment of the application discloses a method and a device for outputting target object number information. One embodiment of the method comprises: acquiring a frame image on which at least one target object is displayed, and performing super-pixel segmentation on the frame image; determining the distance between the super pixel in the frame image and the corresponding super pixel in the preset background image; carrying out motion detection on a target object displayed in the frame image to obtain motion information of the target object in the frame image; based on the distance and the motion information of the target object, carrying out image segmentation on the frame image to obtain a foreground area in the frame image, wherein the foreground area comprises an area where the target object displayed in the frame image is located; and extracting image characteristic values of the foreground region, and inputting the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region. This embodiment improves the accuracy of the predicted target object number information.

Description

Method and apparatus for outputting target object number information
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a method and a device for outputting target object number information.
Background
With the development of computer technology and image processing technology, video-based intelligent monitoring systems are widely used. They play a great role in ensuring public safety and traffic safety, protecting people's lives and property, and ensuring safe production and product inspection in the industrial control field and related commercial fields.
Counting target objects such as pedestrian flows and vehicle flows to obtain number information is important for industries such as supermarkets, shopping malls, and transportation. Taking pedestrians as the target object as an example, the output pedestrian number information can be used to assist management, allocating manpower and material resources reasonably so that limited resources are used efficiently; or the crowd density can be controlled reasonably according to the counted pedestrian number information to prevent safety accidents caused by overcrowding. Therefore, accurately obtaining the number information of target objects from images captured by monitoring equipment such as cameras plays an important role in daily production and life.
Disclosure of Invention
The embodiment of the application provides a method and a device for outputting target object number information.
In a first aspect, an embodiment of the present application provides a method for outputting information on the number of target objects, where the method includes: acquiring a frame image on which at least one target object is displayed, and performing super-pixel segmentation on the frame image; determining the distance between the super pixel in the frame image and the corresponding super pixel in the preset background image; carrying out motion detection on a target object displayed in the frame image to obtain motion information of the target object in the frame image; based on the distance and the motion information of the target object, carrying out image segmentation on the frame image to obtain a foreground area in the frame image, wherein the foreground area comprises an area where the target object displayed in the frame image is located; and extracting image characteristic values of the foreground region, and inputting the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region.
In some embodiments, determining a distance between a superpixel in the frame image and a corresponding superpixel in the preset background image comprises: extracting the characteristics of the superpixels in the frame image and the superpixels in the background image; based on the extracted features, Euclidean distances between the superpixels in the frame image and the corresponding superpixels in the preset background image are determined.
In some embodiments, the performing motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image includes: acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image; based on an optical flow method, detecting superpixels in a frame image and corresponding superpixels in a previous frame image, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
In some embodiments, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; and establishing a correlation vector machine regression model by adopting a sparse Bayesian learning algorithm, respectively taking image characteristic values extracted from foreground regions of sample images in training samples in a training sample set and target object number information contained in the foreground regions of the sample images as input and expected output of the correlation vector machine regression model, and training the correlation vector machine regression model to obtain a target object prediction regression model.
In some embodiments, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; replacing Gaussian distribution in a regression model of a correlation vector machine with Poisson distribution to obtain a sparse Bayesian Poisson regression model; and respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of a sparse Bayesian Poisson regression model, and training the sparse Bayesian Poisson regression model to obtain a target object prediction regression model.
In a second aspect, an embodiment of the present application provides an apparatus for outputting information on the number of target objects, where the apparatus includes: a superpixel segmentation unit configured to acquire a frame image on which at least one target object is displayed, and perform superpixel segmentation on the frame image; a determining unit configured to determine a distance between a super pixel in the frame image and a corresponding super pixel in a preset background image; the detection unit is configured to perform motion detection on a target object displayed in the frame image to obtain motion information of the target object in the frame image; the image segmentation unit is configured to perform image segmentation on the frame image to obtain a foreground area in the frame image based on the distance and the motion information of the target object, wherein the foreground area comprises an area where the target object displayed in the frame image is located; and the target object number information output unit is configured to extract image characteristic values of the foreground region, and input the extracted image characteristic values into a preset target object prediction regression model to output target object number information contained in the foreground region.
In some embodiments, the determining unit is further configured to: extracting the characteristics of the superpixels in the frame image and the superpixels in the background image; based on the extracted features, Euclidean distances between the superpixels in the frame image and the corresponding superpixels in the preset background image are determined.
In some embodiments, the detection unit is further configured to: acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image; based on an optical flow method, detecting superpixels in a frame image and corresponding superpixels in a previous frame image, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
In some embodiments, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; and establishing a correlation vector machine regression model by adopting a sparse Bayesian learning algorithm, respectively taking image characteristic values extracted from foreground regions of sample images in training samples in a training sample set and target object number information contained in the foreground regions of the sample images as input and expected output of the correlation vector machine regression model, and training the correlation vector machine regression model to obtain a target object prediction regression model.
In some embodiments, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; replacing Gaussian distribution in a regression model of a correlation vector machine with Poisson distribution to obtain a sparse Bayesian Poisson regression model; and respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of a sparse Bayesian Poisson regression model, and training the sparse Bayesian Poisson regression model to obtain a target object prediction regression model.
The method and the device for outputting the number information of the target objects, provided by the embodiment of the application, firstly obtain a frame image displaying at least one target object, perform superpixel segmentation on the frame image, then determine a distance between a superpixel in the frame image and a corresponding superpixel in a preset background image, perform motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image, then perform image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground region in the frame image, finally perform image characteristic value extraction on the foreground region, input the extracted image characteristic value into a preset target object prediction regression model and output the number information of the target object contained in the foreground region, and therefore improve the accuracy of the predicted number information of the target object.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for outputting target object count information according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a method for outputting target object number information according to the present application;
FIG. 4 is a flow chart of yet another embodiment of a method for outputting target object count information according to the present application;
FIG. 5 is a schematic diagram of an embodiment of an apparatus for outputting target object count information according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing an electronic device according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the present method for outputting target object number information or an apparatus for outputting target object number information may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as image viewing software, web browsers, search-type applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting image saving and browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. They may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module. This is not particularly limited herein.
The server 105 may be a server that provides various services, such as a background server that processes images transmitted on the terminal devices 101, 102, 103. The background server may perform processing such as segmentation and feature extraction on the received image, and feed back the processing result (e.g., output information on the number of target objects) to the terminal device.
It should be noted that the method for outputting the target object number information provided in the embodiment of the present application is generally performed by the server 105, and accordingly, the apparatus for outputting the target object number information is generally disposed in the server 105.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. This is not particularly limited herein. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
It should be further noted that the terminal devices 101, 102, and 103 may also be installed with an image processing application, and the terminal devices 101, 102, and 103 may also perform image segmentation and feature extraction on the image to be processed based on the image processing application, in this case, the method for outputting the target object number information may also be executed by the terminal devices 101, 102, and 103, and accordingly, the apparatus for outputting the target object number information may also be installed in the terminal devices 101, 102, and 103. At this point, the exemplary system architecture 100 may not have the server 105 and the network 104.
Further, the system architecture 100 may also include an image acquisition device (not shown), such as a camera, used for capturing images of areas such as supermarkets and intersections to obtain frame images and background images. The above-described terminal apparatuses 101, 102, 103 may acquire the captured frame images and background images from the image acquisition device and transmit the acquired images to the server 105.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for outputting target object count information in accordance with the present application is shown. The method for outputting the number information of the target objects comprises the following steps:
step 201, acquiring a frame image displaying at least one target object, and performing superpixel segmentation on the frame image.
In the present embodiment, an execution subject of the method for outputting the target object number information (for example, the server shown in fig. 1) may acquire a frame image on which at least one target object is displayed, by a wired connection or a wireless connection. Then, the execution subject may perform superpixel segmentation on the acquired frame image, thereby dividing it into a number of superpixels. It can be understood that the frame image may be captured by a camera and then stored in a terminal device, in which case the execution subject may acquire the frame image from the terminal device by a wired or wireless connection; alternatively, the execution subject may acquire the captured frame image directly from the camera by a wired or wireless connection. The target object may be a moving object such as a pedestrian, a vehicle, or an animal, and is not particularly limited herein. It should be noted that the wireless connection means may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (Ultra Wideband) connections, and other wireless connection means now known or developed in the future.
In general, a super-pixel may refer to an irregular block of pixels with some visual significance made up of adjacent pixels having similar texture, color, brightness, etc. The super-pixel segmentation technique is to group pixels by using the similarity of features between pixels, and to express image features by replacing a large number of pixels with a small number of super-pixels. Therefore, the superpixel segmentation technique can greatly reduce the complexity of image processing. Further, superpixel segmentation may avoid the occurrence of pixel holes and noise as compared to pixel-level segmentation of images.
Step 202, determining the distance between the super pixel in the frame image and the corresponding super pixel in the preset background image.
In this embodiment, the execution subject (e.g., the server shown in fig. 1) described above may acquire a background image in advance. The background image and the frame image may be images captured by the same camera, and the background image is different from the frame image in that the target object does not exist in the background image. Then, the executing body may perform superpixel segmentation on the acquired background image to obtain superpixels of the background image. Finally, based on the superpixels of the frame image obtained in step 201, the execution subject may determine the distance between the superpixels of the frame image and the corresponding superpixels in the background image. It will be appreciated that the resulting distances may be used to characterize the similarity between the superpixels of the frame image and the corresponding superpixels in the background image.
Specifically, the frame image and the background image may be divided into N corresponding superpixels, and the ith superpixel of the frame image may correspond to the ith superpixel of the background image, where i is greater than or equal to 1 and less than or equal to N, and i and N are positive integers. The execution subject may calculate a distance between an ith super pixel of the frame image and an ith super pixel of the background image, and thus the execution subject may determine a distance between each super pixel in the frame image and a corresponding super pixel in the background image. Alternatively, the distance between the super pixel in the frame image and the corresponding super pixel in the background image may be an euclidean distance, a cosine distance, a hamming distance, or the like, and there is no unique limitation here.
In some optional implementations of this embodiment, the distance between a superpixel of the frame image and the corresponding superpixel in the preset background image may be a Euclidean distance. The above-mentioned execution subject may determine the Euclidean distance by: extracting features of the superpixels in the frame image and of the superpixels in the background image; and determining, based on the extracted features, the Euclidean distances between the superpixels in the frame image and the corresponding superpixels in the preset background image. Specifically, for a frame image containing N superpixels, the execution subject may perform feature extraction on the ith superpixel in the frame image to obtain its feature vector $x_i = (x_{i1}, \ldots, x_{ij}, \ldots, x_{iJ})$, and may likewise perform feature extraction on the ith superpixel in the background image to obtain its feature vector $y_i = (y_{i1}, \ldots, y_{ij}, \ldots, y_{iJ})$. Here N, J, i and j are positive integers, i ranges from 1 to N, J denotes the number of features extracted from a superpixel, j ranges from 1 to J, $x_{ij}$ is the jth feature of the ith superpixel of the frame image, and $y_{ij}$ is the jth feature of the ith superpixel of the background image. Finally, the Euclidean distance $d(x_i, y_i)$ between the ith superpixel in the frame image and the ith superpixel in the background image can be calculated using the following formula:

$$d(x_i, y_i) = \sqrt{\sum_{j=1}^{J} \left(x_{ij} - y_{ij}\right)^2}$$
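This distance computation vectorizes naturally. Below is a minimal sketch, assuming the per-superpixel feature vectors have already been extracted into (N, J) arrays for the frame image and the background image; the function name and toy data are illustrative, not from the patent.

```python
# Per-superpixel Euclidean distances d(x_i, y_i), vectorized over all N
# superpixels. frame_feats and bg_feats are assumed to hold the J features of
# each of the N corresponding superpixels, row i for superpixel i.
import numpy as np

def superpixel_distances(frame_feats: np.ndarray, bg_feats: np.ndarray) -> np.ndarray:
    """frame_feats, bg_feats: shape (N, J); returns shape (N,) distances."""
    return np.sqrt(np.sum((frame_feats - bg_feats) ** 2, axis=1))

# Toy usage: N = 4 superpixels, J = 3 features each.
rng = np.random.default_rng(0)
d = superpixel_distances(rng.random((4, 3)), rng.random((4, 3)))
print(d)  # one distance per superpixel
```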
in some optional implementations of this embodiment, in the process of outputting the number information of the target objects acquired by a certain camera, the background image used in the process may be an image that is acquired by the camera most recently and does not have the target object. Therefore, in the method for outputting the number information of the target objects disclosed in this embodiment, the background image is not uniform, and when a new image without the target object is acquired by the camera, the original background image may be replaced with the newly acquired image without the target object.
Step 203, performing motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image.
In this embodiment, the executing entity may detect a moving target object in the frame image by using the target object in the frame image as a detection target, thereby obtaining motion information of the target object. Here, the execution subject described above may obtain the motion information of the target object in the frame image by determining the motion information present in each super pixel of the frame image. The motion information present in any superpixel in the frame image can be represented by the probability mean of the motion information of all pixels in the superpixel. Further, the execution subject may perform motion detection on the target object displayed in the frame image by using various means to obtain the target object motion information in the frame image. The executing agent may obtain the motion information of the target object by using different methods such as an inter-frame difference method, a background subtraction method, an optical flow method, or the like, which is not limited herein.
And 204, carrying out image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground area in the frame image.
In this embodiment, based on the distance between the super pixel in the frame image obtained in step 202 and the corresponding super pixel in the background image, and based on the motion information of the target object in the frame image obtained in step 203, the execution subject may determine the super pixel belonging to the foreground region in the frame image by combining the distance and the motion information of the target object, so as to segment the foreground region in the frame image to obtain the foreground region of the frame image. Therefore, the executing body can realize the purpose of segmenting the foreground area and the background area of the frame image to obtain the foreground area of the frame image. The foreground region of the frame image comprises a region where the target object is located in the frame image.
Step 205, extracting image characteristic values of the foreground region, and inputting the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region.
In this embodiment, a target object prediction regression model may be trained in advance, and the target object prediction regression model may be used to represent a correspondence between image features in a foreground region and target object number information included in the foreground region. After the execution subject obtains the foreground regions in the frame image, the execution subject may extract image feature values of the independent foreground regions, and then input the extracted image feature values of the independent foreground regions into the target object prediction regression model, and the target object prediction regression model may output information on the number of target objects displayed in the corresponding foreground regions.
In some optional implementations of the present embodiment, the target object prediction regression model may be obtained by training through the following steps:
in a first step, a set of training samples is obtained. The training sample set may include a plurality of training samples, and each training sample may include image feature values extracted from a foreground region of a sample image and target object number information included in the foreground region of the sample image;
and secondly, establishing a correlation vector machine regression model by adopting a sparse Bayesian learning algorithm, and respectively taking image features extracted from a foreground region of a sample image in training samples in a training sample set and target object number information contained in the foreground region of the sample image as input and expected output of the established correlation vector machine regression model so as to train the correlation vector machine regression model, wherein the trained correlation vector machine regression model is the target object prediction regression model.
It can be understood that the relevance vector machine regression model established by the sparse Bayesian learning algorithm has the characteristic of high calculation speed of the sparse Bayesian algorithm, so that the calculation cost of model training is reduced, and the calculation resources of the model training are saved. The execution subject inputs image feature values extracted from the foreground region of the frame image into a trained target object prediction regression model, which can output information on the number of target objects contained in the foreground region.
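As a concrete illustration, a minimal training sketch follows. scikit-learn ships no relevance vector machine, so ARDRegression, its sparse Bayesian linear regressor, stands in here for the patent's RVM regression model; the file names and array shapes are assumptions.

```python
# Training a sparse Bayesian regressor on (foreground features -> object count)
# pairs. ARDRegression is a stand-in for the relevance vector machine model
# described in the patent; load paths and shapes are hypothetical.
import numpy as np
from sklearn.linear_model import ARDRegression

X_train = np.load("foreground_features.npy")  # shape (n_samples, n_features)
y_train = np.load("object_counts.npy")        # shape (n_samples,)

model = ARDRegression()
model.fit(X_train, y_train)

# Predict the number of target objects for a new foreground region.
print("predicted count:", model.predict(X_train[:1])[0])
```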
In some optional implementations of the present embodiment, the target object prediction regression model may be further trained by:
in a first step, a set of training samples is obtained. The training sample set may include a plurality of training samples, and each training sample may include image feature values extracted from a foreground region of a sample image and target object number information included in the foreground region of the sample image;
secondly, substituting the Gaussian distribution in the established correlation vector machine regression model by Poisson distribution to obtain a sparse Bayesian Poisson regression model;
and thirdly, respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the obtained training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of a sparse Bayesian Poisson regression model so as to train the sparse Bayesian Poisson regression model, thereby obtaining a target object prediction regression model.
It can be understood that the target object prediction regression model obtained through the training in the above steps may be a sparse bayesian regression model based on a correlation vector machine, and the sparse processing of the model may reduce the calculation cost of the model training. Furthermore, the model adopts the characteristic that Poisson regression is an integer, so that the number of the target objects output by the target object prediction regression model is an integer, and the model is more in line with the actual requirement. The executing body inputs the image feature value extracted from the foreground region of the frame image into a trained target object prediction regression model, and the target object prediction regression model can output the number of target objects which are contained in the foreground region and are integers.
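No off-the-shelf sparse Bayesian Poisson regressor exists in scikit-learn, so the sketch below uses PoissonRegressor as a rough stand-in: it shares the count-valued output behavior but not the sparse Bayesian formulation; implementing the patent's exact model would require pairing an RVM prior with a Poisson likelihood.

```python
# A rough stand-in for the sparse Bayesian Poisson variant. PoissonRegressor
# models count-valued targets via a Poisson GLM; file names and shapes are
# hypothetical.
import numpy as np
from sklearn.linear_model import PoissonRegressor

X_train = np.load("foreground_features.npy")  # shape (n_samples, n_features)
y_train = np.load("object_counts.npy")        # shape (n_samples,)

model = PoissonRegressor(alpha=1.0)  # alpha: L2 regularization strength, illustrative
model.fit(X_train, y_train)

# Round the predicted Poisson mean to obtain an integer object count.
count = int(round(model.predict(X_train[:1])[0]))
print("predicted integer count:", count)
```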
In some optional implementations of this embodiment, the executing body may perform image feature value extraction on each independent foreground region, and the extracted image feature value may include at least one of: region size information, pixel information, region perimeter information, perimeter and region size ratio, boundary information, texture information, and shape information.
The region size information may include the number of pixels in the independent foreground region. The pixel information may include the mean, variance, histogram, etc. of the pixels in the independent foreground region. The region perimeter information may include the number of pixels on the perimeter of the independent foreground region. The perimeter-to-region-size ratio may be the ratio of the region perimeter information to the region size information of the independent foreground region. The boundary information may include the number of internal boundary pixels within the independent foreground region, where the internal boundaries may be extracted by a boundary detection algorithm. The texture information may be represented by the energy feature of the Gray-Level Co-occurrence Matrix (GLCM) of the independent foreground region. The shape information may characterize the irregularity (concavity) of the shape of the independent foreground region and may be represented, for example, by the ratio of the perimeter of the convex hull of the independent foreground region to the perimeter of the region itself.
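A few of these features can be computed with scikit-image, as in the hedged sketch below; which features to extract is given by the list above, while the concrete functions, parameters, and the zero-filled GLCM masking are illustrative assumptions.

```python
# Illustrative extraction of some listed features for one independent
# foreground region: region size, perimeter, pixel statistics, and GLCM
# texture energy. Masking the grayscale image by multiplication is a crude
# simplification that lets background zeros into the GLCM.
import numpy as np
from skimage.measure import label, regionprops
from skimage.feature import graycomatrix, graycoprops

def region_features(mask: np.ndarray, gray: np.ndarray) -> dict:
    """mask: boolean foreground mask; gray: uint8 grayscale frame image."""
    props = regionprops(label(mask))[0]
    pixels = gray[mask]
    glcm = graycomatrix(gray * mask, distances=[1], angles=[0], levels=256)
    return {
        "area": props.area,                               # region size (pixels)
        "perimeter": props.perimeter,                     # region perimeter
        "perimeter_area_ratio": props.perimeter / props.area,
        "pixel_mean": float(pixels.mean()),
        "pixel_var": float(pixels.var()),
        "glcm_energy": float(graycoprops(glcm, "energy")[0, 0]),
    }
```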
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for outputting target object number information according to the present embodiment. The application scenario, shown in figs. 3a-3e, counts the number of pedestrians in a frame image captured by a camera in a supermarket, where two independent pedestrians are displayed in the frame image. As shown in fig. 3a, a server may perform superpixel segmentation on the frame image to obtain a superpixel-segmented frame image, shown in fig. 3b, where the mesh in the frame image marks the edges of the superpixels. Thereafter, the server may determine the distance between each superpixel in the frame image and the corresponding superpixel in the preset background image, as shown in fig. 3c. Then, the server performs motion detection on the pedestrians displayed in the frame image to obtain the motion information of the pedestrians in the frame image. Next, the server performs image segmentation on the frame image based on the distances and the motion information of the pedestrians to obtain the two independent pedestrians in the frame image; as shown in fig. 3d, the two pedestrians form two independent foreground regions (foreground region 1 and foreground region 2) of the frame image. Finally, image feature values are extracted from foreground region 1 and foreground region 2 and input into a preset target object prediction regression model (in this application scenario, the pedestrian prediction regression model shown in the figure), which outputs the pedestrian number information contained in each foreground region; as shown in fig. 3e, the information "number of people: 1" is output in the boxes corresponding to the two foreground regions.
The method for outputting the number information of the target objects according to the embodiment of the present application obtains a frame image on which at least one target object is displayed, performs superpixel segmentation on the frame image, then determines a distance between a superpixel in the frame image and a corresponding superpixel in a preset background image, then performs motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image, performs image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground region of the frame image, and finally performs image feature extraction on the foreground region, and inputs an extracted image feature value into a preset target object prediction regression model to output the number information of the target object included in the foreground region. The scheme disclosed by the embodiment can avoid the problems of noise, holes and the like by performing super-pixel level segmentation on the frame image, and improves the accuracy of foreground region extraction, thereby improving the accuracy of the predicted target object number information.
With further reference to FIG. 4, a flow 400 of another embodiment of a method for outputting target object count information is shown. The process 400 of the method for outputting the information on the number of target objects includes the following steps:
step 401, acquiring a frame image displaying at least one target object, and performing superpixel segmentation on the frame image.
In the present embodiment, an execution subject of the method for outputting the target object number information (for example, the server shown in fig. 1) may acquire a frame image on which at least one target object is displayed, by a wired connection or a wireless connection. Then, the execution subject may perform superpixel segmentation on the acquired frame image, thereby dividing it into a number of superpixels. It can be understood that the frame image may be captured by a camera and then stored in a terminal device, in which case the execution subject may acquire the frame image from the terminal device by a wired or wireless connection; alternatively, the execution subject may acquire the captured frame image directly from the camera by a wired or wireless connection.
Among various superpixel segmentation algorithms, the SLIC (Simple Linear Iterative Clustering) algorithm has the advantages of low memory occupation, high speed, few parameters, and high accuracy of the extracted boundary information, so the execution subject may perform superpixel segmentation on the frame image by using the existing SLIC algorithm, as sketched below. It is understood that the execution subject may also perform superpixel segmentation on the frame image by using other methods, which are not limited herein.
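A minimal sketch of this step with scikit-image's SLIC implementation follows; the image path and the parameter values (n_segments, compactness) are illustrative, not values from the patent.

```python
# SLIC superpixel segmentation of a frame image. Each pixel receives the
# integer label of its superpixel; the label map drives all later
# per-superpixel computations.
from skimage.io import imread
from skimage.segmentation import slic

frame = imread("frame.png")  # hypothetical path to a captured frame image
labels = slic(frame, n_segments=300, compactness=10, start_label=0)
print("number of superpixels:", labels.max() + 1)
```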
Step 402, determining the distance between the super pixel in the frame image and the corresponding super pixel in the preset background image.
In this embodiment, the execution subject (e.g., the server shown in fig. 1) described above may acquire a background image in advance. It should be noted that the background image and the frame image may be images captured by the same camera, and the background image is different from the frame image in that there is no target object in the background image. Then, the executing body may perform superpixel segmentation on the acquired background image to obtain superpixels of the background image. Finally, based on the superpixels of the frame image obtained in step 401, the execution subject may determine the distance between the superpixels of the frame image and the corresponding superpixels in the background image.
And step 403, acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image.
In this embodiment, based on the frame image acquired in step 401, the executing body may determine a previous frame image adjacent to the frame image in the acquired image and acquire the previous frame image. Then, super-pixel segmentation is performed on the acquired previous frame image by using, for example, an SLIC algorithm, so as to obtain the super-pixels of the previous frame image.
Step 404, detecting superpixels in the frame image and corresponding superpixels in the previous frame image based on an optical flow method, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
In this embodiment, the execution subject may detect a superpixel in a frame image and a corresponding superpixel in a previous frame image by using an optical flow method so as to determine a dynamic feature of the superpixel in the frame image. It is understood that the execution subject can obtain the probability of motion information of each super pixel in the frame image by using the dynamic characteristics of the super pixels in the obtained frame image. As an example, the probability of motion information of the ith super-pixel block in the frame image may be represented by the mean of the probabilities of motion information of all pixels within the ith super-pixel. And finally, combining the probability of the motion information of each super pixel in the frame image to obtain the motion information of the target object in the frame image.
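A hedged sketch of this step follows, using OpenCV's dense Farneback optical flow on grayscale frames together with the SLIC label map from step 401; mapping flow magnitude to a per-pixel motion probability by max-normalization is an illustrative simplification, not the patent's exact formulation.

```python
# Per-superpixel motion probability: compute dense optical flow between the
# previous and current frames, turn flow magnitude into a crude per-pixel
# probability, then average it over the pixels of each superpixel.
import cv2
import numpy as np

def superpixel_motion_probability(prev_gray, curr_gray, labels):
    # Farneback parameters (positional): pyr_scale=0.5, levels=3, winsize=15,
    # iterations=3, poly_n=5, poly_sigma=1.2, flags=0 (illustrative defaults).
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, curr_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    prob = magnitude / (magnitude.max() + 1e-8)  # crude [0, 1] normalization
    n_superpixels = labels.max() + 1
    # Mean motion probability over all pixels inside each superpixel.
    return np.array([prob[labels == i].mean() for i in range(n_superpixels)])
```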
Step 405, based on the distance and the motion information of the target object, performing image segmentation on the frame image to obtain a foreground region in the frame image.
In this embodiment, based on the distance obtained in step 402 and the motion information of the target object obtained in step 404, the executing entity may determine, in the frame image, a superpixel belonging to the foreground region by combining the distance and the motion information of the target object, so as to segment the foreground region in the frame image to obtain the foreground region of the frame image. It can be understood that the optical flow method can only detect moving objects in general, and if the target object is detected only by the optical flow method, the detection of the moving features may be missed when the moving amplitude of the target object is not large or the target object is static. In the solution disclosed in this embodiment, the executing body may perform foreground region segmentation by combining the distance between the super pixel in the frame image and the corresponding super pixel in the background image and the target object motion information of the super pixel in the frame image, so as to improve the accuracy of the foreground region segmented in the frame image.
Specifically, the probability that the ith superpixel in the frame image belongs to the foreground region can be calculated by combining the two cues, for example as the weighted sum

$$p_i = \alpha \, p_i^{dist} + (1 - \alpha) \, p_i^{motion}$$

where $p_i$ is the probability that the ith superpixel in the frame image belongs to the foreground region, $p_i^{dist}$ is the distance between the ith superpixel in the frame image and the ith superpixel in the background image, $p_i^{motion}$ is the motion information of the target object of the ith superpixel in the frame image obtained based on the optical flow method, and $\alpha$ is a preset coefficient.

It can be understood that after the above-mentioned execution subject calculates the probability $p_i$ that the ith superpixel belongs to the foreground region, it may compare $p_i$ with a preset threshold, determine the ith superpixel to belong to the foreground region of the frame image if $p_i$ is larger than the preset threshold, and otherwise determine the ith superpixel to belong to the background region of the frame image.
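Putting the two cues together, a short sketch of the per-superpixel foreground decision follows, assuming the weighted-sum reading of the formula above and treating alpha and the threshold as illustrative preset values.

```python
# Foreground segmentation per superpixel: combine the (normalized) background
# distance and the motion probability with a preset coefficient alpha, then
# threshold. Returns a boolean pixel mask of the foreground region.
import numpy as np

def foreground_mask(p_dist, p_motion, labels, alpha=0.5, threshold=0.5):
    """p_dist, p_motion: shape (N,) per-superpixel scores; labels: label map."""
    p = alpha * p_dist + (1.0 - alpha) * p_motion
    fg_superpixels = np.flatnonzero(p > threshold)
    return np.isin(labels, fg_superpixels)
```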
In the prior art, a background subtraction method is usually adopted to obtain the motion information of a target object. That method determines the foreground region of an image based on the difference between the foreground region and the background region, so when part of the target object is similar in color to the background, for example when the clothes or hat of a pedestrian resembles the background, the segmentation of the image foreground region suffers. To address this problem, the scheme disclosed in this embodiment introduces the optical flow method to detect the motion information of the target object, thereby improving the accuracy of the obtained foreground region of the frame image.
And step 406, extracting image characteristic values of the foreground region, and inputting the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region.
In this embodiment, a target object prediction regression model may be trained in advance, and the target object prediction regression model may be used to represent a correspondence between image features in a foreground region and target object number information included in the foreground region. After the execution subject obtains the foreground regions in the frame image, the execution subject may extract image feature values of the independent foreground regions, and then input the extracted image feature values of the independent foreground regions into the target object prediction regression model, and the target object prediction regression model may output information on the number of target objects displayed in the corresponding foreground regions.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the method for outputting the target object number information in the present embodiment highlights the step of obtaining the motion information of the target object. Therefore, the scheme described in this embodiment can perform segmentation of the foreground region of the frame image by combining the characteristics of the superpixel and the optical flow method, and improves the accuracy of foreground region segmentation.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for outputting information on the number of target objects, which corresponds to the embodiment of the method shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for outputting the target object number information of the present embodiment includes: a superpixel segmentation unit 501, a determination unit 502, a detection unit 503, an image segmentation unit 504, and a target object number information output unit 505. The super-pixel segmentation unit 501 is configured to acquire a frame image on which at least one target object is displayed, and perform super-pixel segmentation on the frame image; the determining unit 502 is configured to determine a distance between a super pixel in the frame image and a corresponding super pixel in a preset background image; the detection unit 503 is configured to perform motion detection on the target object displayed in the frame image, and obtain motion information of the target object in the frame image; the image segmentation unit 504 is configured to perform image segmentation on the frame image to obtain a foreground region in the frame image based on the distance and the motion information of the target object, wherein the foreground region includes a region where the target object displayed in the frame image is located; the target object number information output unit 505 is configured to perform image feature value extraction on the foreground region, and input the extracted image feature values into a preset target object prediction regression model to output target object number information contained in the foreground region.
In some optional implementations of the present embodiment, the determining unit 502 is further configured to: extracting the characteristics of the superpixels in the frame image and the superpixels in the background image; based on the extracted features, Euclidean distances between the superpixels in the frame image and the corresponding superpixels in the preset background image are determined.
In some optional implementations of this embodiment, the detecting unit 503 is further configured to: acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image; based on an optical flow method, detecting superpixels in a frame image and corresponding superpixels in a previous frame image, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
In some optional implementations of this embodiment, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; and establishing a correlation vector machine regression model by adopting a sparse Bayesian learning algorithm, respectively taking image characteristic values extracted from foreground regions of sample images in training samples in a training sample set and target object number information contained in the foreground regions of the sample images as input and expected output of the correlation vector machine regression model, and training the correlation vector machine regression model to obtain a target object prediction regression model.
In some optional implementations of this embodiment, the preset target object prediction regression model is obtained by training through the following steps: acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image; replacing Gaussian distribution in a regression model of a correlation vector machine with Poisson distribution to obtain a sparse Bayesian Poisson regression model; and respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of a sparse Bayesian Poisson regression model, and training the sparse Bayesian Poisson regression model to obtain a target object prediction regression model.
The units recited in the apparatus 500 correspond to the various steps in the method described with reference to fig. 2 and 4. Thus, the operations and features described above for the method are equally applicable to the apparatus 500 and the units included therein, and are not described in detail here.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing the electronic device of an embodiment of the present application. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is mounted in the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601. It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor, which may be described as: a processor comprising a superpixel segmentation unit, a determination unit, a detection unit, an image segmentation unit, and a target object number information output unit. In some cases, the names of these units do not limit the units themselves; for example, the superpixel segmentation unit may also be described as "a unit that acquires a frame image on which at least one target object is displayed and performs superpixel segmentation on the frame image".
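Viewed as software, this unit structure can be sketched as a composition of callables. The sketch below is purely illustrative; every class, field, and method name is hypothetical and not part of the described apparatus:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TargetCountingProcessor:
    """The five units of the apparatus as composable callables."""
    superpixel_segmentation_unit: Callable  # frame -> superpixel label map
    determination_unit: Callable            # (frame, background, labels) -> distances
    detection_unit: Callable                # (frame, prev_frame, labels) -> motion info
    image_segmentation_unit: Callable       # (distances, motion, labels) -> foreground mask
    output_unit: Callable                   # (frame, foreground mask) -> count info

    def run(self, frame, prev_frame, background):
        labels = self.superpixel_segmentation_unit(frame)
        dist = self.determination_unit(frame, background, labels)
        motion = self.detection_unit(frame, prev_frame, labels)
        fg_mask = self.image_segmentation_unit(dist, motion, labels)
        return self.output_unit(frame, fg_mask)
```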
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire a frame image on which at least one target object is displayed, and perform superpixel segmentation on the frame image; determine the distance between each superpixel in the frame image and the corresponding superpixel in a preset background image; perform motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image; perform image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground region in the frame image, the foreground region comprising the region where the target object displayed in the frame image is located; and extract image characteristic values of the foreground region and input the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region.
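Purely by way of illustration, the flow such a program carries out can be sketched in code. This is a minimal reading of the steps, not the claimed implementation: it assumes scikit-image's `slic` for the superpixel segmentation and OpenCV's Farneback dense optical flow for the motion detection, it fuses the two cues with crude mean thresholds, and the names `count_targets`, `superpixel_features`, and `count_model` are hypothetical:

```python
import cv2
import numpy as np
from skimage.segmentation import slic


def superpixel_features(image, labels):
    """Mean Lab color per superpixel (a stand-in for richer features)."""
    lab = cv2.cvtColor(image, cv2.COLOR_BGR2LAB).astype(np.float32)
    return np.array([lab[labels == i].mean(axis=0)
                     for i in range(labels.max() + 1)])


def count_targets(frame, prev_frame, background, count_model, n_segments=400):
    # 1. Superpixel segmentation of the current frame.
    labels = slic(frame, n_segments=n_segments, start_label=0)

    # 2. Distance between each superpixel and the corresponding region
    #    of the preset background image (same label map on both images).
    dist = np.linalg.norm(superpixel_features(frame, labels)
                          - superpixel_features(background, labels), axis=1)

    # 3. Motion information: dense optical flow against the previous
    #    frame, averaged per superpixel.
    g0 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)
    motion = np.array([mag[labels == i].mean()
                       for i in range(labels.max() + 1)])

    # 4. Foreground region: superpixels that differ from the background
    #    or exhibit motion (crude thresholds, for illustration only).
    fg = (dist > dist.mean()) | (motion > motion.mean())
    fg_mask = fg[labels]

    # 5. Image characteristic values of the foreground region, fed to
    #    the pre-trained prediction regression model.
    edges = cv2.Canny(g1, 100, 200)
    features = np.array([[fg_mask.mean(),          # foreground area ratio
                          edges[fg_mask].mean(),   # edge density
                          mag[fg_mask].mean()]])   # mean motion magnitude
    return count_model.predict(features)[0]
```

A real system would extract richer image characteristic values from the foreground region (area, edge, and texture statistics are typical choices for counting tasks) and would fuse the distance and motion cues with a proper segmentation step rather than a mean threshold.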
The above description is only a preferred embodiment of the present application and an illustration of the technical principles employed. Those skilled in the art will appreciate that the scope of the invention disclosed herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention; for example, the above features may be interchanged with (but are not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A method for outputting target object number information, comprising:
acquiring a frame image on which at least one target object is displayed, and performing super-pixel segmentation on the frame image;
determining the distance between the super pixel in the frame image and the corresponding super pixel in a preset background image;
performing motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image;
performing image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground region in the frame image, wherein the foreground region comprises a region where the target object displayed in the frame image is located;
and extracting image characteristic values of the foreground region, and inputting the extracted image characteristic values into a preset target object prediction regression model to output the number information of the target objects contained in the foreground region.
2. The method of claim 1, wherein determining the distance between the superpixel in the frame image and the corresponding superpixel in the preset background image comprises:
performing feature extraction on the super pixels in the frame image and the super pixels in the background image;
determining the Euclidean distance between each superpixel in the frame image and the corresponding superpixel in the preset background image based on the extracted features.
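As a sketch of this distance computation, assuming the feature extraction step has already produced two aligned per-superpixel feature matrices (the array names and sizes are illustrative):

```python
import numpy as np

# feats_frame[i] and feats_bg[i] hold the extracted feature vector of
# superpixel i in the frame image and in the preset background image.
rng = np.random.default_rng(0)
feats_frame = rng.random((400, 16))   # placeholder feature matrices
feats_bg = rng.random((400, 16))

# Per-superpixel Euclidean distance ||f_frame_i - f_bg_i||_2.
distances = np.linalg.norm(feats_frame - feats_bg, axis=1)
```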
3. The method according to claim 1, wherein the performing motion detection on the target object displayed in the frame image to obtain motion information of the target object in the frame image comprises:
acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image;
detecting superpixels in the frame image and corresponding superpixels in the previous frame image based on an optical flow method, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
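One way to realize this step is sketched below, assuming OpenCV's Farneback algorithm as the optical flow method and a precomputed superpixel label map; the function name is hypothetical:

```python
import cv2
import numpy as np


def superpixel_motion(prev_gray, cur_gray, labels):
    """Average optical-flow magnitude per superpixel as its dynamic feature."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, cur_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    return np.array([magnitude[labels == i].mean()
                     for i in range(labels.max() + 1)])
```

The resulting per-superpixel flow magnitude serves as the dynamic characteristic that, together with the distance of claim 2, drives the foreground segmentation of claim 1.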
4. The method of claim 1, wherein the pre-set target object prediction regression model is trained by:
acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image;
establishing a relevance vector machine regression model by adopting a sparse Bayesian learning algorithm, taking the image characteristic values extracted from the foreground regions of the sample images in the training samples of the training sample set and the target object number information contained in the foreground regions of the sample images as the input and the expected output, respectively, of the relevance vector machine regression model, and training the relevance vector machine regression model to obtain the target object prediction regression model.
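The relevance vector machine recited here is conventionally trained with Tipping's sparse Bayesian learning. A toy evidence-maximization loop under that framework might look as follows; it omits basis pruning and the numerical safeguards a production trainer would need:

```python
import numpy as np


def train_rvm(Phi, t, n_iter=100):
    """Simplified relevance vector machine regression.

    Phi: (n_samples, n_basis) design matrix, e.g. a kernel matrix over
    the image characteristic values; t: target object counts.
    Returns the posterior mean weights.
    """
    n, d = Phi.shape
    alpha = np.ones(d)                 # per-weight prior precisions (ARD)
    beta = 1.0                         # noise precision
    for _ in range(n_iter):
        Sigma = np.linalg.inv(beta * Phi.T @ Phi + np.diag(alpha))
        mu = beta * Sigma @ Phi.T @ t               # posterior mean weights
        gamma = 1.0 - alpha * np.diag(Sigma)        # well-determined factors
        alpha = gamma / np.maximum(mu ** 2, 1e-12)  # ARD re-estimation
        alpha = np.minimum(alpha, 1e12)             # huge alpha ~= pruned weight
        resid = t - Phi @ mu
        beta = (n - gamma.sum()) / max(resid @ resid, 1e-12)
    return mu
```

Here `Phi` would typically be a kernel matrix built from the image characteristic values of the training foreground regions and `t` the corresponding counts; most `alpha` values diverge during training, driving their weights to zero and leaving a sparse set of relevance vectors.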
5. The method according to any one of claims 1 to 4, wherein the pre-set target object prediction regression model is trained by:
acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image;
replacing the Gaussian distribution in a relevance vector machine regression model with a Poisson distribution to obtain a sparse Bayesian Poisson regression model;
and respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of the sparse Bayesian Poisson regression model, and training the sparse Bayesian Poisson regression model to obtain the target object prediction regression model.
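Replacing the Gaussian likelihood with a Poisson likelihood removes the closed-form posterior of the relevance vector machine, so a common route is a Laplace approximation around the posterior mode. The sketch below takes that route under the same automatic-relevance prior; it illustrates the general technique, not the patented training procedure:

```python
import numpy as np


def train_sparse_poisson(Phi, t, n_outer=30, n_newton=10):
    """Sparse Bayesian Poisson regression via a Laplace approximation.

    The Gaussian likelihood of the RVM is replaced by a Poisson
    likelihood with a log link: t_i ~ Poisson(exp(Phi[i] @ w)).
    """
    n, d = Phi.shape
    alpha = np.ones(d)                     # ARD prior precisions
    w = np.zeros(d)
    for _ in range(n_outer):
        # Newton iterations for the posterior mode (MAP weights).
        for _ in range(n_newton):
            mu = np.exp(Phi @ w)           # Poisson means
            grad = Phi.T @ (t - mu) - alpha * w
            H = Phi.T @ (Phi * mu[:, None]) + np.diag(alpha)
            w = w + np.linalg.solve(H, grad)
        # Laplace approximation: posterior covariance ~= H^{-1} at the mode.
        Sigma = np.linalg.inv(H)
        gamma = 1.0 - alpha * np.diag(Sigma)
        alpha = gamma / np.maximum(w ** 2, 1e-12)   # ARD re-estimation
        alpha = np.minimum(alpha, 1e12)             # huge alpha ~= pruned weight
    return w
```

A Poisson likelihood constrains predictions to non-negative counts, which is the natural output space for target object number information.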
6. An apparatus for outputting target object number information, comprising:
a superpixel segmentation unit configured to acquire a frame image on which at least one target object is displayed, perform superpixel segmentation on the frame image;
a determining unit configured to determine a distance between a super pixel in the frame image and a corresponding super pixel in a preset background image;
the detection unit is configured to perform motion detection on a target object displayed in the frame image to obtain motion information of the target object in the frame image;
the image segmentation unit is configured to perform image segmentation on the frame image based on the distance and the motion information of the target object to obtain a foreground region in the frame image, wherein the foreground region comprises a region where the target object displayed in the frame image is located;
and the target object number information output unit is configured to extract image characteristic values of the foreground region, and input the extracted image characteristic values into a preset target object prediction regression model to output the target object number information contained in the foreground region.
7. The apparatus of claim 6, wherein the determination unit is further configured to:
performing feature extraction on the super pixels in the frame image and the super pixels in the background image;
determining the Euclidean distance between each superpixel in the frame image and the corresponding superpixel in the preset background image based on the extracted features.
8. The apparatus of claim 6, wherein the detection unit is further configured to:
acquiring a previous frame image adjacent to the frame image, and performing super-pixel segmentation on the previous frame image;
detecting superpixels in the frame image and corresponding superpixels in the previous frame image based on an optical flow method, and determining the dynamic characteristics of the superpixels in the frame image to obtain the motion information of the target object in the frame image.
9. The apparatus of claim 6, wherein the pre-set target object prediction regression model is trained by:
acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image;
establishing a relevance vector machine regression model by adopting a sparse Bayesian learning algorithm, taking the image characteristic values extracted from the foreground regions of the sample images in the training samples of the training sample set and the target object number information contained in the foreground regions of the sample images as the input and the expected output, respectively, of the relevance vector machine regression model, and training the relevance vector machine regression model to obtain the target object prediction regression model.
10. The apparatus according to any one of claims 6 to 9, wherein the preset target object prediction regression model is trained by:
acquiring a training sample set, wherein the training sample comprises image characteristic values extracted from a foreground region of a sample image and target object number information contained in the foreground region of the sample image;
replacing the Gaussian distribution in a relevance vector machine regression model with a Poisson distribution to obtain a sparse Bayesian Poisson regression model;
and respectively taking the image characteristic value extracted from the foreground region of the sample image in the training samples in the training sample set and the target object number information contained in the foreground region of the sample image as the input and the expected output of the sparse Bayesian Poisson regression model, and training the sparse Bayesian Poisson regression model to obtain the target object prediction regression model.
11. An electronic device, comprising:
one or more processors;
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
12. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-5.
CN201811519247.9A 2018-12-12 2018-12-12 Method and apparatus for outputting target object number information Pending CN111311603A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811519247.9A CN111311603A (en) 2018-12-12 2018-12-12 Method and apparatus for outputting target object number information

Publications (1)

Publication Number Publication Date
CN111311603A (en) 2020-06-19

Family

ID=71159808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811519247.9A Pending CN111311603A (en) 2018-12-12 2018-12-12 Method and apparatus for outputting target object number information

Country Status (1)

Country Link
CN (1) CN111311603A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049765A (en) * 2012-12-21 2013-04-17 武汉经纬视通科技有限公司 Method for judging crowd density and number of people based on fish eye camera
CN107016691A (en) * 2017-04-14 2017-08-04 南京信息工程大学 Moving target detecting method based on super-pixel feature
CN107392917A (en) * 2017-06-09 2017-11-24 深圳大学 A kind of saliency detection method and system based on space-time restriction
US10037610B1 (en) * 2017-10-03 2018-07-31 StradVision, Inc. Method for tracking and segmenting a target object in an image using Markov Chain, and device using the same

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zheng Yunfei; Zhang Xiongwei; Cao Tieyong; Sun Meng: "Research on semantic salient region detection based on fully convolutional networks", Acta Electronica Sinica, no. 11 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001884A (en) * 2020-07-14 2020-11-27 浙江大华技术股份有限公司 Training method, counting method, equipment and storage medium of quantity statistical model

Similar Documents

Publication Publication Date Title
US10936919B2 (en) Method and apparatus for detecting human face
US10902245B2 (en) Method and apparatus for facial recognition
US10796438B2 (en) Method and apparatus for tracking target profile in video
CN107220652B (en) Method and device for processing pictures
CN109344762B (en) Image processing method and device
CN109711508B (en) Image processing method and device
CN113436100B (en) Method, apparatus, device, medium, and article for repairing video
CN114943936B (en) Target behavior recognition method and device, electronic equipment and storage medium
US20230030431A1 (en) Method and apparatus for extracting feature, device, and storage medium
CN110941978B (en) Face clustering method and device for unidentified personnel and storage medium
CN112989987A (en) Method, apparatus, device and storage medium for identifying crowd behavior
CN114241358A (en) Equipment state display method, device and equipment based on digital twin transformer substation
Venkatesvara Rao et al. Real-time video object detection and classification using hybrid texture feature extraction
CN114973057A (en) Video image detection method based on artificial intelligence and related equipment
CN111292333A (en) Method and apparatus for segmenting an image
He et al. A double-region learning algorithm for counting the number of pedestrians in subway surveillance videos
CN113378790A (en) Viewpoint positioning method, apparatus, electronic device and computer-readable storage medium
CN110633597A (en) Driving region detection method and device
CN115083008A (en) Moving object detection method, device, equipment and storage medium
CN111311603A (en) Method and apparatus for outputting target object number information
CN114663980B (en) Behavior recognition method, and deep learning model training method and device
CN114445751A (en) Method and device for extracting video key frame image contour features
CN111311604A (en) Method and apparatus for segmenting an image
CN112700657B (en) Method and device for generating detection information, road side equipment and cloud control platform
CN115205555B (en) Method for determining similar images, training method, information determining method and equipment

Legal Events

Date Code Title Description
PB01 Publication
TA01 Transfer of patent application right

Effective date of registration: 20210302
Address after: 101, 1st floor, building 2, yard 20, Suzhou street, Haidian District, Beijing 100080
Applicant after: Beijing Jingbangda Trading Co.,Ltd.
Address before: 100086 8th Floor, 76 Zhichun Road, Haidian District, Beijing
Applicant before: BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY Co.,Ltd.
Applicant before: BEIJING JINGDONG CENTURY TRADING Co.,Ltd.

TA01 Transfer of patent application right

Effective date of registration: 20210302
Address after: Room a1905, 19/F, building 2, No. 18, Kechuang 11th Street, Daxing District, Beijing, 100176
Applicant after: Beijing Jingdong Qianshi Technology Co.,Ltd.
Address before: 101, 1st floor, building 2, yard 20, Suzhou street, Haidian District, Beijing 100080
Applicant before: Beijing Jingbangda Trading Co.,Ltd.

SE01 Entry into force of request for substantive examination