CN113963307A - Method and device for identifying content on target and acquiring video, storage medium and computer equipment

Info

Publication number: CN113963307A
Application number: CN202010628765.5A
Authority: CN
Inventors: 周斌; 张剑锋; 朱曦; 刘春雷
Original assignee: Shanghai G2link Network Technology Co ltd
Current assignee: Shanghai G2link Network Technology Co ltd
Priority date: 2020-07-02
Filing date: 2020-07-02
Publication date: 2022-01-21

Abstract

A method and a device for identifying content on a target, a video acquisition method and device, a storage medium, and computer equipment. The identification method comprises: acquiring some of the images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and performing target recognition on the initial images frame by frame in shooting-time order. Performing target recognition on each frame of initial image comprises: performing target detection on the frame of initial image to obtain an image containing a target to be identified; identifying, from the image containing the target to be identified, a target area where content is located, wherein the content is located on the target to be identified; and recognizing the content of the target area to obtain identification information. This scheme can improve identification efficiency while ensuring the accuracy of the identification result.

Description

Method and device for identifying content on target and acquiring video, storage medium and computer equipment

Technical Field

The invention relates to the field of image recognition, in particular to a method and a device for recognizing content on a target and acquiring a video, a storage medium and computer equipment.

Background

With the development of intelligent identification technology, target identification based on big-data training and camera monitoring is commonly applied to everyday scenes to achieve monitoring and target control. As one example of target identification, vehicle detection can be performed at a platform, a parking lot, a logistics park, and the like. At present, vehicles are usually photographed or video-recorded by a monitoring camera, and characteristics such as vehicle type and vehicle color are recognized from the acquired photos or videos to identify the vehicle.

However, existing target recognition technology performs poorly when recognizing a moving target, especially when recognizing content on the target. Moreover, because a captured video stream contains a large number of image frames, analyzing it frame by frame requires a large amount of computation and makes recognition inefficient.

Therefore, there is a need for a method for identifying content on a target, which can ensure the accuracy of the identification result and has high identification efficiency when identifying the content of a moving target.

Disclosure of Invention

The invention solves the technical problem of how to improve the identification efficiency of the moving target while ensuring the accuracy of the identification result.

In order to solve the above technical problem, an embodiment of the present invention provides a method for identifying content on a target, where the method includes: acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and performing target recognition on the initial images frame by frame in shooting-time order. Performing target recognition on each frame of initial image comprises: performing target detection on the frame of initial image to obtain an image containing a target to be identified; identifying, from the image containing the target to be identified, a target area where content is located, wherein the content is located on the target to be identified; and recognizing the content of the target area to obtain identification information.

Optionally, acquiring partial images in an initial video stream as initial images includes: acquiring an initial video stream; and extracting a plurality of frames of images from the initial video stream as the initial images according to the frame extraction interval.

Optionally, the initial image is extracted from the initial video stream according to a frame extraction interval by a video acquisition device that acquires the initial video stream.

Optionally, the method further includes: and sending the frame extraction interval to the video acquisition equipment.

Optionally, the frame extraction interval is determined according to a motion speed of the object to be recognized.

Optionally, after performing target recognition on the initial images frame by frame in shooting-time order, the method includes: calculating the motion speed of the target to be recognized by combining multiple frames of initial images that contain the target to be recognized.

Optionally, the method further includes: determining the frame extraction interval according to the motion speed of the target to be recognized by the following formula: N = Mod(a × exp(b × X) + c × exp(d × X)); wherein N is the frame extraction interval; a, b, c and d are preset constants; X is the motion speed of the target to be recognized; and Mod() denotes rounding the value in parentheses.

Optionally, the frame extraction interval includes a plurality of values, and extracting a plurality of frames of initial images from the initial video stream according to the frame extraction interval includes: when the target to be identified is not identified, extracting a plurality of frames of initial images from the initial video stream according to the maximum value of the frame extraction interval.

Optionally, after the target detection is performed on each frame of initial image to obtain an image containing a target to be identified, the method further includes: when the target to be identified is not detected, ending the target detection of the initial image of the current frame; and continuously acquiring the next frame of initial image, and carrying out target identification on the next frame of initial image.

The embodiment of the invention also provides a video acquisition method, which comprises the following steps: acquiring an initial video stream, wherein the initial video stream comprises a plurality of frames of images which are arranged according to a shooting time sequence; extracting a plurality of frame images from the initial video stream to be used as initial images; wherein the initial image is used for content recognition on the target.

Optionally, the extracting a plurality of frame images from the initial video stream as initial images includes: and extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval.

Optionally, before extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval, the method further includes: and receiving the frame extraction interval sent by the identification equipment.

An embodiment of the present invention further provides an apparatus for identifying content on a target, where the apparatus includes: an initial image acquisition module, used for acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and a target identification module, used for performing target identification on the initial images frame by frame in shooting-time order; wherein the target identification module comprises: a target detection unit, used for performing target detection on each frame of initial image to obtain an image containing a target to be identified; an area identification unit, used for identifying, from the image containing the target to be identified, a target area where content is located, wherein the content is located on the target to be identified; and a content identification unit, used for identifying the content of the target area to obtain identification information.

An embodiment of the present invention further provides a video capture device, where the device includes: a video acquisition module, used for acquiring an initial video stream, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and an image extraction module, used for extracting a plurality of frames of images from the initial video stream as initial images according to the frame extraction interval; wherein the initial images are provided to a recognition device so that the recognition device performs target recognition on the initial images frame by frame in shooting-time order.

An embodiment of the present invention further provides a storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above method for identifying content on a target or of the above video capture method.

An embodiment of the present invention further provides computer equipment, which includes a memory and a processor, where the memory stores a computer program, and the processor, when executing the computer program, implements the steps of any one of the above methods for identifying content on a target or of any one of the above video capture methods.

Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:

An embodiment of the invention provides a method for identifying content on a target, which comprises: acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and performing target identification on the initial images frame by frame in shooting-time order. Performing target identification on each frame of initial image comprises: performing target detection on the frame of initial image to obtain an image containing a target to be identified; identifying, from the image containing the target to be identified, a target area where content is located, wherein the content is located on the target to be identified; and identifying the content of the target area to obtain identification information. Compared with the prior art, the scheme selects only part of the images in the initial video stream as initial images, acquires from them an image of the target to be identified that carries the content to be identified, detects the target area in that image, and identifies the content in the target area. With this scheme, identification efficiency can be improved while the accuracy of the identification result is ensured.

Furthermore, in the identification process, target content identification is performed on the selected initial images frame by frame in shooting-time order, so the real motion state of the target can be accurately monitored. The initial images to be identified are then selected according to that motion state, which improves identification efficiency and reduces the amount of computation in the identification process.

Furthermore, the frame extraction interval can be set as required to extract, from the initial video stream, the initial images on which target identification is to be performed. The interval can be a fixed value, can be dynamically adjusted according to the condition of the initial video stream, or can be determined according to the motion speed of the target to be identified. In the last case the interval follows the real-time motion of the target, so that the effect of content identification is ensured while identification efficiency is maintained.

Further, when the set frame extraction interval includes a plurality of values, if the target to be identified is not detected in a plurality of consecutive frames, frames are extracted at the maximum of the set values. That is, fewer initial images are taken from the video stream, which reduces the amount of identification computation while monitoring for the target continues. When the target to be identified is detected in a frame, the frame extraction interval can be reduced appropriately.

Further, when target identification is performed on each frame of initial image, if the target to be identified is not detected in a frame, the subsequent identification steps are not performed on that frame, and target identification proceeds to the next frame of initial image, which saves computing power and improves identification efficiency.

Furthermore, the license plate content of the vehicle can be identified through the identification method of the content on the target, the license plate area of the moving vehicle can be accurately positioned, and the accuracy of the license plate content identification is effectively improved. And the frame extraction interval is adjusted according to the result of vehicle detection, so that the identification efficiency is improved while the identification result is ensured.

Drawings

FIG. 1 is a flow chart illustrating a method for identifying content on a target according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a method for recognizing license plate contents on a vehicle according to an embodiment of the present invention;

FIG. 3 is a schematic flow chart of a video capture method according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of an apparatus for identifying content on a target according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a video capture device according to an embodiment of the present invention.

Detailed Description

As described in the Background, the prior art performs poorly when identifying a moving target, especially when identifying content on the target. Moreover, because a captured video stream contains a large number of image frames, analyzing it frame by frame requires a large amount of computation and makes identification inefficient.

To solve the problem, an embodiment of the present invention provides a method for identifying content on a target, where the method includes: acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged in shooting-time order; and performing target recognition on the initial images frame by frame in shooting-time order. Performing target recognition on each frame of initial image comprises: performing target detection on the frame of initial image to obtain an image containing a target to be identified; identifying, from the image containing the target to be identified, a target area where content is located, wherein the content is located on the target to be identified; and recognizing the content of the target area to obtain identification information.

Through the scheme, when the content of the moving target is identified, the accuracy of the identification result can be ensured, and the identification efficiency is effectively improved.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.

Referring to fig. 1, an embodiment of the present invention provides a method for identifying content on a target, where the method includes:

Step S11, acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged according to the shooting time sequence;

Step S12, carrying out target recognition on the initial images frame by frame according to the shooting time sequence.

The initial video stream is a video stream obtained by continuously shooting a certain area or a certain person or object, and comprises a plurality of frame images which are arranged according to the shooting time sequence. The initial video stream comprises an image of a target to be recognized where the content to be recognized is located, and the content to be recognized is located on the target to be recognized.

When the content to be recognized on the target to be recognized in the initial video stream is recognized, the target to be recognized is recognized first, and in the recognition process, the target recognition is performed frame by frame according to the shooting time sequence.

The video capture device such as a camera may be the same device as the device (hereinafter referred to as a recognition device) for executing the method for recognizing the content on the target, for example, a mobile phone with a camera, a computer, or the like. The video capture device and the device for executing the method for identifying the content on the target can also be two devices in the same area or different areas, and the two devices can communicate through a network, Bluetooth and the like.

The identification device may directly acquire an initial video stream captured by the video capture device and extract part of the images from the initial video stream as the initial images. The moment at which the recognition device extracts an initial image is determined by the state of target recognition. For example, when people or objects other than the background are detected in the initial video stream, target recognition is performed on that frame of initial image; target recognition is not performed on an acquired initial image that contains only the background.

In addition, after the video capture device captures the initial video stream, a partial image may be extracted from the initial video stream by the video capture device as an initial image, and the initial image may be provided to the recognition device, so that the recognition device may perform the above steps S11 and S12.

Optionally, the video capture device may store the initial video stream or the initial image extracted from the initial video stream to an intermediate storage device such as a disk or a server (e.g., a cloud server), and the identification device reads the initial video stream or the initial image from the intermediate storage device.

Optionally, the video capture device or the recognition device may extract the initial image from the initial video stream according to a preset extraction rule. The extraction rule can be set according to needs, for example, the initial image can be extracted from the initial video stream at certain time intervals; or determining the frequency of extraction according to the specific shooting time of each frame image in the initial video stream to extract the initial image.
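As a non-limiting illustration of the time-interval extraction rule mentioned above, the following Python sketch keeps one frame each time a chosen period has elapsed. The function name `extract_by_time`, the `(timestamp, frame)` pair layout and the `period_seconds` parameter are assumptions made for this example only and are not defined by the embodiment.

```python
from typing import Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")


def extract_by_time(
    timestamped_frames: Iterable[Tuple[float, Frame]],
    period_seconds: float,
) -> Iterator[Tuple[float, Frame]]:
    """Yield one (timestamp, frame) pair per elapsed period (hypothetical helper)."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    # Shooting time at which the next initial image becomes due.
    next_due: Optional[float] = None
    for timestamp, frame in timestamped_frames:
        if next_due is None or timestamp >= next_due:
            yield timestamp, frame
            next_due = timestamp + period_seconds
```

The sketch relies only on the shooting time of each frame, so it could run either on the video capture device or on the recognition device.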

In step S12, the target recognition of each frame of initial image may specifically include the following steps:

Step S121, performing target detection on each frame of initial image to obtain an image containing a target to be identified;

Step S122, identifying a target area where content is located from an image containing a target to be identified, wherein the content is located on the target to be identified;

Step S123, identifying the content of the target area to obtain identification information.

The target to be identified can be any person or object, and the content to be identified is located on the target to be identified. Any case in which the relative position of the target to be identified and the content to be identified can be determined according to some identification logic falls within the scope of protection of the method for identifying content on a target.

Optionally, the image containing the target to be recognized may be one or more frames of images containing the target to be recognized in the initial video stream, or may be a partial image that is captured from one or more frames of images containing the target to be recognized and can represent the feature of the target to be recognized.

Optionally, a first deep neural network may be obtained through big data training, and target recognition is performed on the initial images of multiple frames to obtain an image including a target to be recognized.

After the image containing the target to be recognized is obtained, the area of that image where the content to be recognized is located, namely the target area, is identified. The target area may be identified from the image containing the target to be identified according to set identification logic. The identification logic may be set according to the shape and structure of the target to be identified, the relative position of the target area on the target to be identified, the characteristics of the target area (such as the shapes it contains and its color), and the like.

Optionally, when the initial video stream includes multiple frames of images of the same target to be recognized, the multiple frames of images corresponding to the same target to be recognized may be acquired as the images including the target to be recognized, and the target areas in the multiple frames of images are acquired respectively.

Optionally, a second deep neural network may be obtained through big data training, and a target region on the target to be recognized is recognized.

Optionally, when the first deep neural network and/or the second deep neural network are used for image recognition, a Graphics Processing Unit (GPU for short) may be used for acceleration operation.

In a non-limiting example, the target to be recognized is a vehicle, the content to be recognized is the content of a license plate, and the target area is the image area corresponding to the license plate. The initial video stream may include images of the vehicle whose license plate is to be recognized (i.e., the vehicle to be recognized). An image capturing device may be provided at an entrance or exit of a place where vehicles gather, such as a logistics park, a community, a platform, or a parking lot, to capture the initial video stream. The image area corresponding to the license plate of the vehicle to be recognized can be identified as the target area according to the position of the license plate relative to the vehicle body, the size, shape and other characteristics of the vehicle, or the characteristics that the license plate exhibits in the image.

When the content of the license plate region is identified, the content can be identified by adopting an Optical Character Recognition (OCR) and other identification technologies.

With the method for identifying content on a target provided by the embodiment of the invention, partial images can be extracted from the initial video stream as initial images, an image of the target to be identified that carries the content to be identified is obtained, the target area is detected in that image, and the content in the target area is identified. In the identification process, target content identification is performed on the selected initial images frame by frame in shooting-time order, so the real motion state of the target can be accurately monitored. The initial images to be identified are then selected according to that motion state, which improves identification efficiency and reduces the amount of computation in the identification process.

In one embodiment, acquiring partial images in the initial video stream as initial images in step S11 in FIG. 1 may include: acquiring an initial video stream; and extracting a plurality of frames of images from the initial video stream as the initial images according to the frame extraction interval.

That is, the initial images may be extracted from the initial video stream at a frame extraction interval by the recognition device or by the video capture device. The frame extraction interval is used for extracting, from the initial video stream, the initial images on which target recognition is to be performed, and can be set as needed.

The set frame extraction interval may be a fixed value. For example, if the set frame extraction interval is 5 frames, one frame out of every 5 frames of the initial video stream is extracted as an initial image for target recognition.
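Extraction at a fixed frame extraction interval can be sketched in Python as follows. The function name `extract_initial_images` is hypothetical, and the choice to keep the first frame of each group of `interval` frames is an assumption; the embodiment only states that one frame is taken per interval.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def extract_initial_images(
    frames: Iterable[Frame], interval: int
) -> Iterator[Tuple[int, Frame]]:
    """Yield (frame_index, frame) for one frame out of every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    for index, frame in enumerate(frames):
        # Assumption: the first frame of each group is the one that is kept.
        if index % interval == 0:
            yield index, frame
```

Returning the original frame index together with the frame keeps the shooting-time order available to later steps, for example when the motion speed of the target is estimated.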

The frame extraction interval can also be dynamically adjusted according to the condition of the initial video stream. That is, a plurality of fixed values are set for the frame extraction interval, and one of them is selected as the current frame extraction interval according to the condition of the initial video stream. The condition of the initial video stream may include, among others, the sharpness of the images in the initial video stream, the result of recognizing the information content of the initial images in the initial video stream, and so on.

When the video acquisition equipment executes the operation of extracting the initial image, the frame extracting interval can be set by the identification equipment according to the frame extracting requirement or the identification condition, and is sent to the video acquisition equipment as the frame extracting basis.

In a non-limiting example, the initial video stream acquired by a commonly used video capture device such as a camera is delivered over the Real Time Streaming Protocol (RTSP) and encoded with H.264 (a digital video compression codec standard), so the initial video stream or the initial images need to be decoded accordingly before being analyzed. All frames of the initial video stream or the initial images may be decoded first and then extracted at the frame extraction interval.

Optionally, the frame extraction interval is determined according to the motion speed of the target to be identified.

The motion speed of the target to be recognized can be obtained by recognition and calculation from the initial video stream, or by measuring the speed of the target directly. Further, the greater the motion speed of the target to be recognized, the smaller the frame extraction interval, so that the target can be tracked accurately and the accuracy of content recognition on the target is improved. In other words, the frame extraction interval changes dynamically with the motion speed of the target to be recognized. Further, to prevent the frame extraction interval from changing too frequently, a buffer period may be set: after the motion speed of the target changes, the frame extraction interval is adjusted only once the changed speed has persisted for longer than the buffer period. In fields such as license plate recognition, this prevents the frame extraction interval from becoming too large merely because a vehicle brakes temporarily.
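One possible way to realize such a buffer period is sketched below in Python. The class name `BufferedInterval`, its parameters, and the rule that a proposed interval must stay unchanged for the whole buffer period are assumptions made for illustration; the embodiment does not prescribe a particular mechanism.

```python
from typing import Optional


class BufferedInterval:
    """Adopt a new frame extraction interval only after it has persisted (sketch)."""

    def __init__(self, initial_interval: int, buffer_seconds: float) -> None:
        self.current = initial_interval
        self.buffer_seconds = buffer_seconds
        self._pending: Optional[int] = None  # proposed interval awaiting confirmation
        self._pending_since = 0.0            # time at which it was first proposed

    def update(self, proposed: int, now: float) -> int:
        """Feed the interval implied by the latest speed; return the interval to use."""
        if proposed == self.current:
            # The speed is back to its previous level: drop any pending change.
            self._pending = None
        elif proposed != self._pending:
            # A new candidate appears: start timing its buffer period.
            self._pending = proposed
            self._pending_since = now
        elif now - self._pending_since >= self.buffer_seconds:
            # The candidate has persisted for the whole buffer period: adopt it.
            self.current = proposed
            self._pending = None
        return self.current
```

With a buffer of one second, a vehicle that brakes for half a second and then resumes its speed leaves the interval unchanged, while a sustained slowdown changes it once the second has passed. Requiring exact equality between successive proposals is a simplification; a tolerance band could be used instead.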

Optionally, after performing target recognition on the initial images frame by frame in shooting-time order, the method may include: calculating the motion speed of the target to be recognized by combining multiple frames of initial images that contain the target to be recognized.

When the target recognition is performed on the extracted initial image in the initial video stream, the motion speed of the target to be recognized can be calculated by combining the position of the target in the multi-frame initial image containing the same target to be recognized, the time interval of the acquisition of each frame of image in the acquired initial video stream and the frame extraction interval, so as to determine the frame extraction interval for performing the target recognition subsequently. At the moment, the initial video stream can be acquired from the video acquisition equipment in real time, so that the frame extraction interval of the initial video stream in a later period of time is adjusted according to the motion speed of the target to be identified reflected in the initial video stream acquired in a previous period of time, the frame extraction interval can be dynamically adjusted according to the real-time motion condition of the target to be identified, and the identification efficiency is ensured while the content identification effect is ensured.
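The speed estimate described above can be sketched in Python as follows, assuming that the target position in each initial image is summarized by a center point in pixels and that the capture frame rate is known. The function name `motion_speed` and the use of only the first and last observations are choices made for this example, not requirements of the embodiment.

```python
import math
from typing import Sequence, Tuple


def motion_speed(
    centers: Sequence[Tuple[float, float]],
    frame_indices: Sequence[int],
    fps: float,
) -> float:
    """Estimate speed in pixels per second from target centers in initial images.

    `frame_indices` are positions in the initial video stream, so their spacing
    already reflects the frame extraction interval; `1 / fps` is the acquisition
    time between consecutive frames of the stream.
    """
    if len(centers) != len(frame_indices) or len(centers) < 2:
        raise ValueError("need at least two matching observations")
    if fps <= 0:
        raise ValueError("fps must be positive")
    elapsed = (frame_indices[-1] - frame_indices[0]) / fps
    if elapsed <= 0:
        raise ValueError("frame indices must increase")
    (x0, y0), (x1, y1) = centers[0], centers[-1]
    return math.hypot(x1 - x0, y1 - y0) / elapsed
```

For instance, a target whose center moves from (0, 0) to (30, 40) between frame 0 and frame 25 of a 25 frames-per-second stream has travelled 50 pixels in one second.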

Optionally, the method for identifying content on the target further includes: determining the frame extraction interval according to the movement speed of the target to be identified according to the following formula (1):

N＝Mod(a×exp(b×X)+c×exp(d×X)) (1)

wherein N is the frame extraction interval; a, b, c and d are preset constants; X is the motion speed of the target to be identified; and Mod() denotes rounding the value in parentheses.

In this case, the motion speed of the target to be recognized is calculated from a plurality of frames of initial images, its unit can be expressed as pixels per second, and the values of a, b, c and d can be adjusted according to the frame extraction effect. For example, the values may be a = 25.04; b = -1.451; c = 4.954; and d = -0.1702.
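Formula (1) with the example constants can be evaluated with a short Python sketch. The function name `frame_interval` and the lower bound `minimum` are assumptions added here so that the result stays usable as an interval at high speeds; formula (1) itself does not include such a bound. Python's built-in `round` is used for the rounding step, which rounds exact halves to the nearest even integer; the embodiment does not specify a rounding mode.

```python
import math

# Example constants given in the description.
A, B, C, D = 25.04, -1.451, 4.954, -0.1702


def frame_interval(
    speed: float,
    a: float = A,
    b: float = B,
    c: float = C,
    d: float = D,
    minimum: int = 1,
) -> int:
    """Evaluate N = Mod(a*exp(b*X) + c*exp(d*X)), clamped to at least `minimum`."""
    n = round(a * math.exp(b * speed) + c * math.exp(d * speed))
    # Assumption: never return an interval below `minimum` frames.
    return max(minimum, n)
```

With the example constants, a stationary target (X = 0) gives 25.04 + 4.954 = 29.994, which rounds to an interval of 30 frames, and the interval shrinks as the speed grows because b and d are negative.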

In one embodiment, the frame extraction interval comprises a plurality of values, and extracting the initial images from the initial video stream according to the frame extraction interval may include: when the target to be identified is not identified, extracting a plurality of frames of initial images from the initial video stream according to the maximum value of the frame extraction interval.

When the set frame extraction interval contains a plurality of values, if the target to be identified is not detected in a plurality of consecutive frames, frames are extracted at the maximum of the set values. That is, fewer initial images are taken from the video stream, which reduces the amount of identification computation while monitoring for the target continues. When the target to be identified is detected in a frame, the frame extraction interval can be reduced appropriately.
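The selection among several preset interval values can be sketched in Python as follows. The class name `IntervalSelector`, the `miss_threshold` parameter, and the choice to drop straight to the smallest value on a detection are assumptions for illustration; the embodiment only says that the interval may be reduced appropriately.

```python
from typing import Sequence


class IntervalSelector:
    """Pick a frame extraction interval from preset values (illustrative sketch)."""

    def __init__(self, intervals: Sequence[int], miss_threshold: int = 3) -> None:
        if not intervals:
            raise ValueError("at least one interval value is required")
        self.intervals = sorted(intervals)
        self.miss_threshold = miss_threshold  # consecutive misses before idling
        self.misses = 0
        # No target has been seen yet, so start at the maximum interval.
        self.current = self.intervals[-1]

    def update(self, detected: bool) -> int:
        """Record one detection result and return the interval to use next."""
        if detected:
            self.misses = 0
            # Assumption: use the smallest preset value while a target is present.
            self.current = self.intervals[0]
        else:
            self.misses += 1
            if self.misses >= self.miss_threshold:
                self.current = self.intervals[-1]
        return self.current
```

A single missed detection therefore does not immediately widen the interval; only a run of `miss_threshold` misses returns the selector to the maximum value.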

In an embodiment, with continuing reference to fig. 1, after performing the target detection on each frame of the initial image to obtain an image including the target to be recognized in step S121, the method may further include: when the target to be identified is not detected, ending the target detection of the initial image of the current frame; and continuously acquiring the next frame of initial image, and carrying out target identification on the next frame of initial image.

When target recognition is performed on each frame of initial image, if the target to be identified is not detected in a frame, steps S122 and S123 are not performed on that frame, and target recognition proceeds to the next frame of initial image, which saves computing power and improves identification efficiency.
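The per-frame flow of steps S121 to S123, including the early exit when no target is detected, can be sketched in Python as follows. The three callables stand in for the target detector, the target area locator and the content recognizer; their names and signatures are hypothetical, and skipping a frame when no target area is found is an additional assumption.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Image = TypeVar("Image")
Region = TypeVar("Region")


def recognize_frame(
    image: Image,
    detect_target: Callable[[Image], Optional[Image]],
    locate_region: Callable[[Image], Optional[Region]],
    read_content: Callable[[Region], str],
) -> Optional[str]:
    """Run steps S121 to S123 on one initial image; return None if nothing is found."""
    target_image = detect_target(image)   # step S121
    if target_image is None:
        return None                       # no target: skip S122 and S123
    region = locate_region(target_image)  # step S122
    if region is None:
        return None                       # assumption: also skip when no area is found
    return read_content(region)           # step S123


def recognize_stream(
    initial_images: Iterable[Image],
    detect_target: Callable[[Image], Optional[Image]],
    locate_region: Callable[[Image], Optional[Region]],
    read_content: Callable[[Region], str],
) -> List[str]:
    """Process initial images in shooting-time order and collect the results."""
    results: List[str] = []
    for image in initial_images:
        info = recognize_frame(image, detect_target, locate_region, read_content)
        if info is not None:
            results.append(info)
    return results
```

In a license plate setting, `detect_target` would wrap a vehicle detector, `locate_region` a license plate locator and `read_content` an OCR step; the sketch deliberately leaves these as parameters so that no particular model or library is implied.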

In a non-limiting example, according to the method for identifying content on a target provided by the present invention, the license plate content of a vehicle is identified, and the identification steps are shown in fig. 2:

Step S201, acquiring an initial video stream captured by a camera, and extracting partial images as initial images according to the frame extraction interval; the camera can be installed at the entrance or exit of a parking lot, a logistics park, or the like, captures an initial video stream of passing vehicles, and multiple frames of initial images are extracted from the initial video stream for target recognition.

Step S202, acquiring a current initial image, and carrying out vehicle detection on the current initial image through a pre-trained first deep neural network; the first deep neural network is used for identifying a target (namely a vehicle) of each frame of initial image so as to acquire an image containing the vehicle to be identified, wherein each identified initial image is called a current initial image.

In step S203, it is determined whether a vehicle is detected in step S202.

When the vehicle is not detected, executing step S204 to obtain a first frame extraction interval; and continuing to execute the step S201 by taking the first frame extraction interval as the frame extraction interval of the subsequent initial video stream, and ending the target identification of the current frame initial image.

When the vehicle is detected, executing step S205 to obtain a second frame extraction interval; the step S201 is continuously executed with the second decimation interval as the decimation interval of the subsequent initial video stream. Wherein the second frame extraction interval may be determined according to a moving speed of the vehicle.

For the initial image frame where the vehicle is detected, the steps of content recognition on the object, i.e., step S206 and step S207, are continuously performed.

Step S206, detecting the position of the license plate through a pre-trained second deep neural network; and the second deep neural network is used for identifying the license plate position of the image containing the vehicle to be identified so as to acquire the image of the license plate region.

Step S207, recognizing the license plate content; the content of the image in the license plate area can be identified by adopting an OCR (optical character recognition) technology so as to acquire the identification information of the license plate content. And finishing the target recognition of the initial image of the current frame.

After finishing the target recognition of the current frame initial image, step S208 is executed to continue to acquire the next frame initial image as the current initial image. And starting to perform target recognition on the initial image of the next frame.

Optionally, when the content of the image in the license plate region is identified, the gray level of the image in the license plate region may be adjusted to improve the identification effect.
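As one illustration of a gray-level adjustment, the sketch below applies a linear contrast stretch to a grayscale image held as nested lists. The choice of contrast stretching and the function name `stretch_gray_levels` are assumptions; the text does not specify which adjustment is used.

```python
def stretch_gray_levels(image, low=0, high=255):
    """Linearly rescale gray levels so the darkest pixel maps to low and the brightest to high.

    image: list of rows, each row a list of integer gray levels.
    Returns a new image; the input is not modified.
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        return [row[:] for row in image]
    darkest, brightest = min(pixels), max(pixels)
    if brightest == darkest:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in image]
    span = brightest - darkest
    return [
        [round(low + (p - darkest) * (high - low) / span) for p in row]
        for row in image
    ]


if __name__ == "__main__":
    plate = [[100, 150], [125, 200]]
    print(stretch_gray_levels(plate))
```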

Through the step of recognizing the license plate content of the vehicle in fig. 2, the license plate area of the moving vehicle can be accurately positioned, and the accuracy of recognizing the license plate content is effectively improved. And the frame extraction interval is adjusted according to the result of vehicle detection, so that the identification efficiency is improved while the identification result is ensured.
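The flow of steps S201 to S208 can be sketched as a single loop. The function name, the use of an in-memory list of frames, and the two fixed interval parameters are simplifying assumptions; in the described scheme the second interval may instead be derived from the vehicle's moving speed, and the detectors would be the pre-trained networks and an OCR engine.

```python
def run_plate_recognition(frames, detect_vehicle, locate_plate, read_plate,
                          first_interval, second_interval):
    """Walk a recorded stream, adapting the frame extraction interval to detections.

    frames: sequence of images in shooting order.
    detect_vehicle / locate_plate / read_plate: placeholder callables that
    return None when nothing is found.
    first_interval: interval used after a frame with no vehicle (S204).
    second_interval: interval used after a frame with a vehicle (S205).
    Returns a list of (frame index, plate text) pairs.
    """
    results = []
    index = 0
    while index < len(frames):                      # S201 / S208
        vehicle = detect_vehicle(frames[index])     # S202
        if vehicle is None:                         # S203: no vehicle
            interval = first_interval               # S204
        else:
            interval = second_interval              # S205
            plate = locate_plate(vehicle)           # S206
            if plate is not None:
                results.append((index, read_plate(plate)))  # S207
        index += max(1, interval)
    return results


if __name__ == "__main__":
    stream = [None] * 10 + ["ABC123"] * 4 + [None] * 6
    identity = lambda value: value
    print(run_plate_recognition(stream, identity, identity, identity, 5, 2))
```

In this toy run only six of the twenty frames are examined, and the plate is read on the two examined frames that contain the vehicle.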

Referring to fig. 3, an embodiment of the present invention further provides a video capture method, where the method includes:

step S31, collecting an initial video stream, wherein the initial video stream comprises a plurality of frames of images arranged according to the shooting time sequence;

step S32, extracting a plurality of frame images from the initial video stream as initial images;

wherein the initial image is used for content recognition on the target.

Optionally, the step S32 extracting a plurality of frame images from the initial video stream as initial images may include: and extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval.

In one embodiment, before extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval, the method further includes: and receiving the frame extraction interval sent by the identification equipment.

The video capture method illustrated in fig. 3 is executed by the video capture device; that is, the video capture device that captures the initial video stream may itself extract frames from the stream according to a frame extraction interval. The frame extraction interval may be set locally on the video capture device or sent to the video capture device by the identification device. The identification device may set the frame extraction interval according to an identification requirement or an identification result; for the specific setting manner, refer to the description of the frame extraction interval in the above method for identifying content on a target.

For more details on the working principle and working mode of the video capture method, reference may be made to the description of the video capture device side in the method for identifying content on a target, and details are not repeated here.
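Frame extraction on the capture device side, with an interval that the identification device can update, might look like the sketch below. The class name `FrameExtractor` and its methods are hypothetical, and the sketch assumes that a newly received interval takes effect from the next extracted frame; how the interval is transmitted is left out.

```python
class FrameExtractor:
    """Decides, frame by frame, which captured frames become initial images."""

    def __init__(self, interval=1):
        self.set_interval(interval)
        self._countdown = 0  # frames still to skip before the next extraction

    def set_interval(self, interval):
        """Store an interval set locally or received from the identification device."""
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = int(interval)

    def keep_next(self):
        """Return True if the next captured frame should be extracted."""
        if self._countdown == 0:
            # Extract this frame, then skip interval - 1 frames.
            self._countdown = self.interval - 1
            return True
        self._countdown -= 1
        return False


if __name__ == "__main__":
    extractor = FrameExtractor(interval=3)
    kept = [i for i in range(10) if extractor.keep_next()]
    print(kept)
```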

Referring to fig. 4, fig. 4 is a schematic structural diagram of an apparatus for identifying content on a target, the apparatus including:

an initial image obtaining module 41, configured to obtain, as initial images, a part of the images in an initial video stream, where the initial video stream includes a plurality of frames of images arranged in a shooting time order;

a target identification module 42, configured to perform target identification on the initial images frame by frame according to the shooting time sequence;

the object recognition module 42 may include:

the target detection unit 421 is configured to perform target detection on each frame of initial image to obtain an image including a target to be identified;

an area identification unit 422, configured to identify a target area where content is located in an image including a target to be identified, where the content is located on the target to be identified;

a content identification unit 423, configured to perform content identification on the target area to obtain identification information.

In one embodiment, the initial image acquisition module 41 may include:

an initial video stream acquisition unit for acquiring an initial video stream;

and the frame extracting unit is used for extracting a plurality of frame images from the initial video stream as the initial images according to the frame extracting interval.

In one embodiment, the initial image is extracted from the initial video stream at a frame extraction interval by a video acquisition device that acquires the initial video stream.

In one embodiment, the device for identifying content on the target may further include:

and the frame extraction interval sending module is used for sending the frame extraction interval to the video acquisition equipment.

In one embodiment, the object recognition module may further include:

a movement speed determining module, configured to calculate the movement speed of the target to be recognized by combining multiple frames of initial images containing the target to be recognized;

a frame extraction interval determining module, configured to determine the frame extraction interval from the movement speed of the target to be recognized according to the following formula:

N = Mod(a × exp(b × X) + c × exp(d × X))

In one embodiment, the frame extraction interval comprises a plurality of values, and the frame extraction unit is further configured to extract a plurality of frames of initial images from the initial video stream according to a maximum value of the frame extraction interval when the object to be identified is not identified.

the end detection module is used for ending the target detection of the initial image of the current frame when the target to be identified is not detected;

and the continuous detection module is used for continuously acquiring the next frame of initial image and carrying out target identification on the next frame of initial image.

For more details on the working principle and working mode of the device for identifying content on a target, reference may be made to the above-mentioned description of the method for identifying content on a target in fig. 1 and fig. 2, and details are not repeated here.

Referring to fig. 5, an embodiment of the present invention further provides a video capture device, where the video capture device may include:

a video collecting module 51, configured to collect an initial video stream, where the initial video stream includes multiple frames of images arranged in a shooting time sequence;

an image extraction module 52, configured to extract a plurality of frame images from the initial video stream as initial images according to a frame extraction interval;

wherein the initial image is provided to a recognition device so that the recognition device performs object recognition on the initial image frame by frame in the shooting time order.

In one embodiment, the image extraction module 52 is further configured to extract a plurality of frames of images from the initial video stream as initial images according to a frame extraction interval.

In one embodiment, the video capture device may further include:

and the frame extracting interval receiving module is used for receiving the frame extracting interval sent by the identification equipment.

For more details on the working principle and working mode of the video capture device, reference may be made to the description of the video capture method in fig. 3, which is not repeated here.

Further, the embodiment of the present invention further discloses a storage medium, on which a computer program is stored, and the computer program executes the method for identifying the content on the object shown in fig. 1 and fig. 2 or the technical solution of the method for capturing the video shown in fig. 3 when running.

Further, an embodiment of the present invention further discloses a computer device, that is, the identification device in the embodiments. The computer device includes a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor, when running the computer program, executes the technical solution of the method for identifying content on a target shown in fig. 1 and fig. 2. The computer device may be a terminal such as a mobile phone, a computer, or a smart camera. The terminal may directly obtain the video it captures as the initial video stream, or obtain the initial images produced by extracting frames from the initial video stream, and then execute the method for identifying content on a target.

Further, the embodiment of the present invention further discloses a computer device, that is, a video capture device, including a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor executes the technical scheme of the video capture method shown in fig. 3 when running the computer program. The computer equipment can refer to equipment with video acquisition functions, such as a camera, an intelligent camera and the like.

Specifically, in the embodiment of the present invention, the processor may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

It will also be appreciated that the memory in the embodiments of the present application can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. Volatile memory can be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).

It should be understood that the term "and/or" herein merely describes an association relationship between associated objects and means that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in this document indicates that the objects before and after it are in an "or" relationship.

The "plurality" appearing in the embodiments of the present application means two or more.

The descriptions of the first, second, etc. appearing in the embodiments of the present application are only for illustrating and differentiating the objects, and do not represent the order or the particular limitation of the number of the devices in the embodiments of the present application, and do not constitute any limitation to the embodiments of the present application.

The term "connect" in the embodiments of the present application refers to various connection manners, such as direct connection or indirect connection, to implement communication between devices, which is not limited in this embodiment of the present application.

Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method for identifying content on a target, the method comprising:

acquiring partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged according to a shooting time sequence;

carrying out target identification on the initial image frame by frame according to the shooting time sequence;

the target recognition is carried out on each frame of initial image, and the method comprises the following steps:

carrying out target detection on each frame of initial image to obtain an image containing a target to be identified;

identifying a target area where content is located from an image containing a target to be identified, wherein the content is located on the target to be identified;

and identifying the content of the target area to obtain identification information.

2. The method of claim 1, wherein the obtaining of the image of the portion in the initial video stream as the initial image comprises:

acquiring an initial video stream;

and extracting a plurality of frame images from the initial video stream as the initial images according to the frame extraction interval.

3. The method of claim 1, wherein the initial image is extracted from the initial video stream at a frame extraction interval by a video capture device that captures the initial video stream.

4. The method of claim 3, further comprising:

and sending the frame extraction interval to the video acquisition equipment.

5. The method according to any one of claims 2 to 4, wherein the frame-extracting interval is determined according to a moving speed of the object to be recognized.

6. The method of claim 5, wherein after performing object recognition on the initial images frame by frame in shooting time order, the method comprises:

and calculating the movement speed of the target to be recognized by combining multiple frames of initial images containing the target to be recognized.

7. The method of claim 5, further comprising:

determining the frame extraction interval according to the motion speed of the target to be recognized according to the following formula:

N = Mod(a × exp(b × X) + c × exp(d × X));

8. The method according to any one of claims 2 to 4, wherein the frame extraction interval comprises a plurality of values, and wherein extracting the plurality of frames of initial images from the initial video stream according to the frame extraction interval comprises:

and when the target to be identified is not identified, extracting a plurality of frames of initial images from the initial video stream according to the maximum value of the frame extraction interval.

9. The method according to claim 1, wherein after the target detection is performed on each frame of initial image to obtain an image containing the target to be identified, the method further comprises:

when the target to be identified is not detected, ending the target detection of the initial image of the current frame;

and continuously acquiring the next frame of initial image, and carrying out target identification on the next frame of initial image.

10. A method of video capture, the method comprising:

acquiring an initial video stream, wherein the initial video stream comprises a plurality of frames of images which are arranged according to a shooting time sequence;

extracting a plurality of frame images from the initial video stream to be used as initial images;

wherein the initial image is used for content recognition on the target.

11. The method according to claim 10, wherein said extracting a number of frame images from said initial video stream as initial images comprises:

and extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval.

12. The method of claim 11, wherein before extracting a number of frames of images from the initial video stream as initial images at a frame extraction interval, further comprising:

and receiving the frame extraction interval sent by the identification equipment.

13. An apparatus for identifying content on a target, the apparatus comprising:

an initial image acquisition module, configured to acquire partial images in an initial video stream as initial images, wherein the initial video stream comprises a plurality of frames of images arranged according to the shooting time sequence;

the target identification module is used for carrying out target identification on the initial image frame by frame according to the shooting time sequence;

wherein the object recognition module comprises:

the target detection unit is used for carrying out target detection on each frame of initial image to obtain an image containing a target to be identified;

the area identification unit is used for identifying a target area where content is located from an image containing a target to be identified, wherein the content is located on the target to be identified;

and the content identification unit is used for identifying the content of the target area to obtain identification information.

14. A video capture device, the device comprising:

the video acquisition module is used for acquiring an initial video stream, wherein the initial video stream comprises a plurality of frames of images which are arranged according to the shooting time sequence;

the image extraction module is used for extracting a plurality of frame images from the initial video stream as initial images according to the frame extraction interval.

15. A storage medium having stored thereon a computer program for implementing a method for identifying content on an object according to any one of claims 1 to 9 or the steps of a video capture method according to any one of claims 10 to 12 when executed by a processor.

16. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program performs the steps of the method for identifying content on an object according to any one of claims 1 to 9 or the method for capturing video according to any one of claims 10 to 12.