CN111445499B - Method and device for identifying target information - Google Patents

Info

Publication number
CN111445499B
CN111445499B (application CN202010219513.7A)
Authority
CN
China
Prior art keywords
processed
target object
information
video frame
sequence
Prior art date
Legal status
Active
Application number
CN202010219513.7A
Other languages
Chinese (zh)
Other versions
CN111445499A (en)
Inventor
庞荣
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010219513.7A
Publication of CN111445499A
Application granted
Publication of CN111445499B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/48 Matching video sequences
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present disclosure disclose a method and device for identifying target information, belonging to the technical field of cloud computing. One embodiment of the method comprises the following steps: first, decomposing a to-be-processed video into a sequence of to-be-processed video frames; next, marking a target object image in the first to-be-processed video frame of the sequence; then, marking the target object image in a preset number of initial to-be-processed video frames of the sequence to obtain motion feature information of the target object corresponding to the target object image; and finally, based on the motion feature information, identifying the target object image in the remaining to-be-identified video frames of the sequence. This embodiment can improve the accuracy and speed of tracking the target object.

Description

Method and device for identifying target information
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a method and apparatus for identifying target information.
Background
Target tracking means that, given the position and size of a target object (which may be a person or another moving object) in an initial video frame of a video, the position and size of the target are output for subsequent video frames; the position of the target can be obtained by a target detection algorithm. When a segment of to-be-processed video is acquired, the target to be tracked can be detected in the first video frame by a target detection algorithm, and the target in the other video frames can then be located by a target tracking algorithm.
Disclosure of Invention
The embodiment of the disclosure provides a method and a device for identifying target information.
In a first aspect, embodiments of the present disclosure provide a method for identifying target information, the method comprising: decomposing a to-be-processed video into a sequence of to-be-processed video frames, wherein the sequence comprises at least one to-be-processed video frame ordered in time sequence, and a to-be-processed video frame contains a target object image; marking the target object image in the first to-be-processed video frame of the sequence; marking the target object image in a preset number of initial to-be-processed video frames of the sequence to obtain motion feature information of the target object corresponding to the target object image; and identifying the target object image in the remaining to-be-identified video frames of the sequence based on the motion feature information.
In some embodiments, marking the target object image in the first to-be-processed video frame of the sequence includes: identifying at least one to-be-detected object contained in the first to-be-processed video frame and displaying an image of the at least one to-be-detected object; and in response to detecting a determination signal for an image of a to-be-detected object among the images of the at least one to-be-detected object, marking the image of the to-be-detected object corresponding to the determination signal as the target object image.
In some embodiments, marking the target object image in the preset initial to-be-processed video frames of the sequence to obtain the motion feature information of the target object corresponding to the target object image includes: setting, for each of the preset initial to-be-processed video frames, at least one type of designated marker point for the target object image; and determining the motion feature information of the target object from the motion trajectories of the same type of designated marker point across the preset initial to-be-processed video frames.
In some embodiments, determining the motion feature information of the target object from the motion trajectories of the same type of designated marker point across the preset initial to-be-processed video frames includes: for each type of designated marker point among the at least one type, acquiring the motion trajectory of that type of designated marker point across the preset initial to-be-processed video frames; and calculating relative position information between the at least one motion trajectory corresponding to the at least one type of designated marker point to determine the motion feature information of the target object.
In some embodiments, calculating the relative position information between the at least one motion trajectory corresponding to the at least one type of designated marker point and determining the motion feature information of the target object includes: calculating, within the same to-be-processed video frame, the relative position information of the at least one motion trajectory between the at least one type of designated marker point, obtaining a relative position information sequence corresponding to the at least one motion trajectory; obtaining a relative position change information sequence based on the relative position information sequence, wherein each item of relative position change information in the sequence is the distance difference between the later and the earlier of two adjacent items of relative position information; setting the designated marker points whose relative position change information in the sequence is smaller than a set distance threshold as reference marker points; and calculating target relative distance information between the reference marker points and the other types of designated marker points among the at least one type, and obtaining the motion feature information of the target object from the target relative distance information.
In some embodiments, identifying the target object image in the remaining to-be-identified video frames of the sequence based on the motion feature information includes: determining, based on the motion feature information, a predicted position of each type of designated marker point among the at least one type in a to-be-identified video frame; and in response to the to-be-identified video frame containing a target object image corresponding to the predicted position, updating the motion feature information with the position information of the target object image in that video frame.
In a second aspect, embodiments of the present disclosure provide an apparatus for identifying target information, the apparatus comprising: a video decomposition unit configured to decompose a to-be-processed video into a sequence of to-be-processed video frames, wherein the sequence comprises at least one to-be-processed video frame ordered in time sequence, and a to-be-processed video frame contains a target object image; a target object image marking unit configured to mark the target object image in the first to-be-processed video frame of the sequence; a motion feature information acquisition unit configured to mark the target object image in a preset number of initial to-be-processed video frames of the sequence to obtain motion feature information of the target object corresponding to the target object image; and a target recognition unit configured to identify the target object image in the remaining to-be-identified video frames of the sequence based on the motion feature information.
In some embodiments, the target object image marking unit includes: a to-be-detected object identifying subunit configured to identify at least one to-be-detected object contained in the first to-be-processed video frame and display an image of the at least one to-be-detected object; and a target object image marking subunit configured to, in response to detecting a determination signal for an image of a to-be-detected object among the images of the at least one to-be-detected object, mark the image of the to-be-detected object corresponding to the determination signal as the target object image.
In some embodiments, the motion feature information acquisition unit includes: a designated marker point setting subunit configured to set, for each of the preset initial to-be-processed video frames, at least one type of designated marker point for the target object image; and a motion feature information acquisition subunit configured to determine the motion feature information of the target object from the motion trajectories of the same type of designated marker point across the preset initial to-be-processed video frames.
In some embodiments, the motion feature information acquisition subunit includes: a motion trajectory acquisition module configured to acquire, for each type of designated marker point among the at least one type, the motion trajectory of that type of designated marker point across the preset initial to-be-processed video frames; and a motion feature information acquisition module configured to calculate relative position information between the at least one motion trajectory corresponding to the at least one type of designated marker point and determine the motion feature information of the target object.
In some embodiments, the motion feature information acquisition module includes: a relative position information calculation sub-module configured to calculate, within the same to-be-processed video frame, the relative position information of the at least one motion trajectory between the at least one type of designated marker point, obtaining a relative position information sequence corresponding to the at least one motion trajectory; a relative position change information sequence obtaining sub-module configured to obtain a relative position change information sequence based on the relative position information sequence, wherein each item of relative position change information in the sequence is the distance difference between the later and the earlier of two adjacent items of relative position information; a reference marker point setting sub-module configured to set the designated marker points whose relative position change information in the sequence is smaller than a set distance threshold as reference marker points; and a motion feature information acquisition sub-module configured to calculate target relative distance information between the reference marker points and the other types of designated marker points among the at least one type, and obtain the motion feature information of the target object from the target relative distance information.
In some embodiments, the target recognition unit includes: a predicted position acquisition subunit configured to determine, based on the motion feature information, a predicted position of each type of designated marker point among the at least one type in a to-be-identified video frame; and a target recognition subunit configured to, in response to the to-be-identified video frame containing a target object image corresponding to the predicted position, update the motion feature information with the position information of the target object image in that video frame.
In a third aspect, embodiments of the present disclosure provide an electronic device/terminal/server, comprising: one or more processors; and a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to perform the method for identifying target information of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer readable medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the method for identifying target information of the first aspect described above.
Embodiments of the present disclosure provide a method and device for identifying target information in which a to-be-processed video is first decomposed into a sequence of to-be-processed video frames; the target object image is then marked in the first to-be-processed video frame of the sequence; the target object image is further marked in a preset number of initial to-be-processed video frames to obtain motion feature information of the corresponding target object; and finally, based on the motion feature information, the target object image is identified in the remaining to-be-identified video frames of the sequence. The method and device can improve the accuracy and speed of tracking the target object.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure may be applied;
FIG. 2 is a flow chart of one embodiment of a method for identifying target information according to the present disclosure;
FIG. 3 is a schematic illustration of one application scenario of an apparatus for identifying target information according to the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for identifying target information according to the present disclosure;
FIG. 5 is a schematic structural diagram of one embodiment of a method and apparatus for identifying target information according to the present disclosure;
FIG. 6 is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which a method for identifying target information or an apparatus for identifying target information of an embodiment of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various image processing applications, such as an image acquisition application, a video editing application, a data transmission application, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices having a display screen and supporting image acquisition, including but not limited to surveillance cameras, smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When they are software, they may be installed in the electronic devices listed above, implemented either as multiple pieces of software or software modules (e.g., to provide distributed services) or as a single piece of software or software module; no specific limitation is made here.
The server 105 may be a server providing various services, for example a server that processes the to-be-processed video transmitted from the terminal devices 101, 102, 103. The server can analyze the received data, such as the to-be-processed video, and identify the target object image in it.
It should be noted that, the method for identifying target information provided by the embodiments of the present disclosure is generally performed by the server 105, and accordingly, the device for identifying target information is generally disposed in the server 105.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (for example, to provide a distributed service), or may be implemented as a single software or software module, which is not specifically limited herein.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a method for identifying target information according to the present disclosure is shown. The method for identifying target information includes the steps of:
in step 201, the video to be processed is decomposed into a sequence of video frames to be processed.
In the present embodiment, the execution body of the method for identifying target information (e.g., the server 105 shown in FIG. 1) may receive the to-be-processed video from the terminal devices 101, 102, 103 via a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G, WiFi, Bluetooth, WiMAX, ZigBee, and UWB (Ultra Wideband) connections, as well as other wireless connection means now known or developed in the future.
Existing approaches typically track the target using correlation filtering or deep learning. Correlation filtering cannot achieve both tracking accuracy and tracking efficiency; deep learning methods place high demands on the data set, which is difficult to acquire, making such models hard to obtain through training.
Therefore, after the to-be-processed video is acquired, it is first decomposed into a sequence of to-be-processed video frames. The sequence may include at least one to-be-processed video frame ordered in time sequence, and a to-be-processed video frame may contain the target object image.
It should be noted that, the video to be processed in the present application may be a video acquired in real time, or may be a video acquired in non-real time. When the video to be processed is a video acquired in real time, real-time tracking of the target object can be realized.
Step 202, marking a target object image in a first video frame to be processed in the sequence of video frames to be processed.
In order to achieve tracking of the target object, the executing body first needs to determine the target object as soon as possible. The target object can be identified by utilizing a target identification algorithm, can be manually marked, or can be determined in other modes according to actual needs. The target object image may be one or more.
In some optional implementations of this embodiment, the marking the target object image in the first to-be-processed video frame in the to-be-processed video frame sequence may include the following steps:
first, at least one object to be detected contained in the first video frame to be processed is identified, and an image of the at least one object to be detected is displayed.
The executing body may identify all possible objects to be detected contained in the first video frame to be processed and display an image of the objects to be detected, so that the user selects a target object from the image of at least one object to be detected.
Second, in response to detecting a determination signal for an image of a to-be-detected object among the images of the at least one to-be-detected object, the image of the to-be-detected object corresponding to the determination signal is marked as the target object image.
When the execution body detects the determination signal, this indicates that the user has designated the target object to be tracked. The execution body may determine which to-be-detected object image the determination signal corresponds to, and then mark that image as the target object image.
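The selection described above can be sketched in a few lines. This is an illustrative assumption, not code from the patent: the names `Candidate` and `mark_target` are hypothetical, and the determination signal is modeled simply as the index of the candidate the user confirmed.

```python
# Hedged sketch of step 202: detected candidates are shown to the user; the
# determination signal identifies one candidate, which is marked as the target.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Candidate:
    box: Tuple[int, int, int, int]  # (x, y, width, height) in the frame
    label: str
    is_target: bool = False

def mark_target(candidates: List[Candidate], confirmed_index: int) -> Candidate:
    """Mark the candidate referenced by the user's determination signal."""
    target = candidates[confirmed_index]
    target.is_target = True
    return target

candidates = [Candidate((10, 20, 50, 120), "person"),
              Candidate((200, 40, 60, 130), "person")]
target = mark_target(candidates, 1)  # user confirmed the second detection
```

In a real system the detections would come from an object detector and the confirmation from a UI event; only the marking logic is shown here.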
And 203, marking the target object image in the preset video frames in the video frame sequence to be processed to obtain the motion characteristic information of the target object corresponding to the target object image.
After the target object image is obtained, in order to achieve accurate tracking of the target object, the target object image may also be marked in a preset number of initial to-be-processed video frames of the sequence. By marking the motion of the target object in these initial frames, the execution body can obtain the motion feature information of the target object, that is, the characteristics the target object exhibits during motion. For example, if the target object is target person A, the corresponding motion feature information may include the swing of the arms, the size of the stride, and the inclination of the body as target person A walks.
In some optional implementations of this embodiment, the marking the target object image in the previous set of to-be-processed video frames in the to-be-processed video frame sequence to obtain the motion feature information of the target object corresponding to the target object image may include the following steps:
first, setting at least one type of designated mark point for the target object image for the video frame to be processed in the previously set video frames to be processed.
In order to acquire the movement characteristic information of the target object, the execution subject may set at least one type of designated mark point for the target object image. The specified mark point is a mark point of a specified position on the target object. For example, the designated mark points may be the hands, feet, shoulders, head, etc. of the target person a. Also, the number of mark points included in each type of designated mark point may be set according to actual needs.
And secondly, determining the motion characteristic information of the target object through the motion trail of the same type of designated mark point in the preset video frames to be processed.
The execution body can connect the positions of the same type of designated marker point across the preset initial to-be-processed video frames to obtain the motion trajectory of that type of designated marker point. Note that if a type of designated marker point contains only one marker point, the positions at which that marker point appears in the preset initial to-be-processed video frames are connected to obtain its motion trajectory. If a type contains multiple marker points, each marker point is numbered, and the positions of each numbered marker point across the preset initial to-be-processed video frames are connected to obtain its motion trajectory.
After the motion trail is obtained, the execution body can analyze the motion trail and obtain motion characteristic information from the motion trail. For example, the execution subject may acquire the motion feature information according to a change trend, a change amount, and the like of the motion trail, and may also acquire the motion feature information according to other manners, which are specifically determined according to actual needs.
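The trajectory construction just described can be sketched as follows. The data layout (a list of per-frame dictionaries keyed by a `(type, index)` marker id) is an assumption made for illustration; the patent does not prescribe a representation.

```python
# Hedged sketch: connect each designated marker point's positions across the
# preset initial frames, in time order, to form its motion trajectory.

def build_trajectories(frames):
    """frames: time-ordered list of dicts mapping (type, index) -> (x, y)."""
    trajectories = {}
    for frame in frames:
        for marker_id, pos in frame.items():
            trajectories.setdefault(marker_id, []).append(pos)
    return trajectories

frames = [
    {("hand", 0): (10, 50), ("head", 0): (12, 10)},
    {("hand", 0): (14, 48), ("head", 0): (13, 10)},
    {("hand", 0): (18, 52), ("head", 0): (14, 10)},
]
tracks = build_trajectories(frames)
```

Each value in `tracks` is one marker's trajectory; the subsequent steps (relative positions, change sequences) operate on these lists.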
In some optional implementations of this embodiment, the determining the motion feature information of the target object according to the motion trail of the same type of designated mark point in the preset video frames may include the following steps:
first, for the same type of specified mark points in the at least one type of specified mark points, the motion trail of the same type of specified mark points in the pre-set video frames to be processed is acquired.
The execution body may first acquire the motion trajectory of each type of designated marker point across the preset initial to-be-processed video frames. One way to do this is to plot the position information of the same type of designated marker point as the ordinate of a point on a coordinate axis, with the abscissa being the timestamp of the corresponding to-be-processed video frame or its sequence number in the to-be-processed video frame sequence.
And step two, calculating the relative position information between at least one motion trail corresponding to the at least one type of designated mark point, and determining the motion characteristic information of the target object.
The position of the target object in the video frame to be processed may change due to the shooting angle of the video capturing apparatus, etc., but the change may be regarded as disturbance. In order to acquire the motion characteristics of the target object itself, the execution subject may calculate relative position information between at least one motion trajectory corresponding to at least one type of the specified mark points. That is, the relative position information may characterize the motion characteristics of the target object itself. The executing body may then perform further data processing on the relative position information to obtain motion feature information.
In some optional implementations of this embodiment, the calculating the relative position information between at least one motion trail corresponding to the at least one type of designated mark point and determining the motion feature information of the target object may include the following steps:
and calculating the relative position information of the at least one motion trail between the at least one type of designated mark points in the same video frame to be processed, and obtaining a relative position information sequence corresponding to the at least one motion trail.
Each motion trajectory is constructed from designated marker points. The execution body can calculate, within the same to-be-processed video frame, the relative position information between the motion trajectories of different types of designated marker points, obtaining a relative position information sequence. The relative position information characterizes the relative movement between the corresponding positions on the target object.
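As one possible reading of "relative position information", the per-frame Euclidean distance between two marker trajectories could be computed as below; the patent does not fix the exact metric, so this is a hedged sketch.

```python
import math

def relative_position_sequence(traj_a, traj_b):
    """Per-frame Euclidean distance between two marker trajectories,
    i.e. one relative position value for each to-be-processed video frame."""
    return [math.dist(pa, pb) for pa, pb in zip(traj_a, traj_b)]

# Example: distance between a stationary marker and a moving one, per frame.
seq = relative_position_sequence([(0, 0), (0, 0), (0, 0)],
                                 [(3, 4), (6, 8), (3, 4)])
```

Computing this for every pair of marker types yields the relative position information sequences the later steps operate on.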
And a second step of obtaining a relative position change information sequence based on the relative position information sequence.
In order to obtain the motion relationship between different positions on the target object, the execution subject may calculate the change information of the relative position information, and obtain the relative position change information sequence. That is, the relative position change information included in the relative position change information sequence is a difference in distance between the next relative position information and the previous relative position information of the two adjacent relative position information. The relative position change information may characterize the magnitude of the change in the corresponding position on the target object as it moves.
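The change sequence defined above (each item being the later relative position minus the earlier one, for adjacent items) is a simple first difference:

```python
def relative_position_changes(rel_positions):
    """Distance difference between each pair of adjacent relative positions:
    next value minus previous value, as described in the text."""
    return [b - a for a, b in zip(rel_positions, rel_positions[1:])]

changes = relative_position_changes([5.0, 10.0, 5.0])
```

A sequence of n relative positions yields n - 1 change values; their magnitudes indicate how much the corresponding body position moves between frames.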
And a third step of setting a designated mark point corresponding to relative position change information smaller than a set distance threshold in the relative position change information sequence as a reference mark point.
Taking a person as an example, some of the plurality of marker points of the target object may be considered stationary relative to the body of the target object (i.e., having no positional change relative to the human body when moving), and some may be considered as moving relative to the body of the target object (e.g., the hand and foot may be considered as having positional change relative to the human body when moving). For this reason, the executing body may set a designated mark point corresponding to relative position change information smaller than the set distance threshold in the relative position change information sequence as a reference mark point.
And a fourth step of calculating target relative distance information between other types of specified mark points except the reference mark point in the at least one type of specified mark points and the reference mark point, and obtaining motion characteristic information of the target object according to the target relative distance information.
After the reference mark point is obtained, the executing body may calculate the target relative distance information between the reference mark point and the other types of specified mark points among the at least one type of specified mark points. Then, according to the target relative distance information, the characteristics exhibited during motion by the positions on the target object corresponding to the other types of designated mark points are determined, and the motion characteristic information is obtained.
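The third and fourth steps can be sketched together: select as reference a mark-point type whose relative position change stays below the threshold, then compute per-frame distances from the other mark points to it. The marker names ("torso", "hand") and data layout are hypothetical, and deriving motion characteristic information from these distance sequences is left abstract, as in the text:

```python
import math

def select_reference_point(change_sequences, distance_threshold):
    """Pick as reference mark point a mark-point type whose relative position
    change information stays below the set distance threshold (i.e. a point
    that is nearly stationary relative to the target object's body)."""
    for name, changes in change_sequences.items():
        if all(c < distance_threshold for c in changes):
            return name
    return None

def target_relative_distances(tracks, reference_name):
    """Per-frame distances between each other type of specified mark point
    and the reference mark point; the motion characteristic information is
    then derived from these target-relative-distance sequences."""
    ref = tracks[reference_name]
    return {name: [math.hypot(x - rx, y - ry)
                   for (x, y), (rx, ry) in zip(track, ref)]
            for name, track in tracks.items() if name != reference_name}

changes = {"torso": [0.1, 0.2], "hand": [3.0, 2.5]}
print(select_reference_point(changes, 1.0))  # torso

tracks = {"torso": [(0, 0), (1, 0)], "hand": [(0, 3), (1, 4)]}
print(target_relative_distances(tracks, "torso"))  # {'hand': [3.0, 4.0]}
```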
Step 204, identifying the target object image in the to-be-identified to-be-processed video frames except the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information.
After the motion characteristic information is obtained, the executing body can process other to-be-identified to-be-processed video frames in the to-be-processed video frame sequence through the motion characteristic information so as to identify the target object image. Thus, accurate tracking of the target object can be achieved.
In some optional implementations of this embodiment, the identifying, based on the motion characteristic information, the target object image in the to-be-identified to-be-processed video frame other than the pre-set to-be-processed video frame in the to-be-processed video frame sequence may include the steps of:
A first step of determining, based on the motion characteristic information, the predicted position of each type of specified mark point in the to-be-identified to-be-processed video frame.
As described above, the motion characteristic information is a characteristic that the target object exhibits during motion. The to-be-identified to-be-processed video frames belong to the to-be-processed video frame sequence, and the motion characteristic information is obtained from the pre-set to-be-processed video frames at the front of that sequence. Accordingly, the executing body can determine, based on the motion characteristic information, the predicted position of each of the at least one type of specified mark points in the to-be-identified to-be-processed video frame. For example, if the motion characteristic information is obtained from the first 10 to-be-processed video frames in the sequence, the position of the target object in the 11th to-be-processed video frame can be predicted from the motion characteristic information.
And a second step of updating the motion characteristic information through the position information of the target object image in the to-be-identified to-be-processed video frame in response to the existence of the target object image corresponding to the predicted position of the to-be-identified to-be-processed video frame.
When the executing body detects the target object image at the predicted position of the to-be-identified to-be-processed video frame, it indicates that the motion characteristic information can accurately predict the motion position of the target object. The executing body may then update the motion characteristic information with the position information of the target object image in the to-be-identified to-be-processed video frame. In practice, the motion state of the target object stays within a certain variation range (e.g., no abrupt change in speed occurs). Considering the frequency at which current terminal equipment collects video frames, the position of the target object image does not change abruptly between adjacent to-be-processed video frames in the to-be-processed video frame sequence. Thus, the target object image can generally be found at the predicted position. The motion characteristic information may be updated by adding new motion characteristic information according to the current position information of the target object. Then, the executing body may execute the above steps cyclically to sequentially acquire the predicted positions in the other to-be-identified to-be-processed video frames.
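The predict-then-update loop described above can be sketched with a constant-velocity model. This is an illustrative assumption; the patent does not specify how predicted positions are derived from the motion characteristic information:

```python
class MarkerTracker:
    """Minimal constant-velocity sketch of the predict-then-update loop:
    predict where a mark point should appear in the next frame, and when a
    detection is found at the predicted position, fold it back into the
    motion characteristic (here, a velocity estimate)."""

    def __init__(self, positions):
        # Initialize from the last two annotated (pre-set) video frames.
        (x0, y0), (x1, y1) = positions[-2], positions[-1]
        self.pos = (x1, y1)
        self.vel = (x1 - x0, y1 - y0)

    def predict(self):
        # Predicted position in the next to-be-identified video frame.
        return (self.pos[0] + self.vel[0], self.pos[1] + self.vel[1])

    def update(self, detected):
        # Detection found near the prediction: refresh position and velocity.
        self.vel = (detected[0] - self.pos[0], detected[1] - self.pos[1])
        self.pos = detected

tracker = MarkerTracker([(0, 0), (1, 1), (2, 2)])
print(tracker.predict())   # (3, 3)
tracker.update((3, 2))
print(tracker.predict())   # (4, 2)
```

Running `predict`/`update` once per remaining frame mirrors the cyclic execution described above.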
Therefore, each to-be-processed video frame is not identified in isolation; instead, the target object is identified by combining the independent identification of the to-be-processed video frames with the motion characteristic information of the target object, which greatly improves the accuracy of identifying the target object.
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for identifying target information according to the present embodiment. In the application scenario of fig. 3, the terminal device 103 may be a monitoring camera. The terminal device 103 may send the to-be-processed video containing the target object, acquired in real time, to the server 105. The server 105 decomposes the to-be-processed video into a to-be-processed video frame sequence; then marks the target object in the first to-be-processed video frame of the sequence; then obtains the motion characteristic information of the target object through a certain number of to-be-processed video frames in the sequence; and finally identifies the target object in the other to-be-processed video frames through the motion characteristic information. Therefore, each to-be-processed video frame is not identified in isolation; instead, the target object is identified by combining the independent identification of the to-be-processed video frames with the motion characteristic information of the target object, which greatly improves the accuracy of identifying the target object.
The method provided by the above-mentioned embodiment of the present disclosure first decomposes a video to be processed into a sequence of video frames to be processed; then marking a target object image in a first to-be-processed video frame in the to-be-processed video frame sequence; marking the target object image in the preset video frames to be processed in the video frame sequence to be processed, and obtaining the motion characteristic information of a target object corresponding to the target object image; and finally, identifying the target object image in the to-be-identified to-be-processed video frames except the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information. The method and the device can improve the accuracy and the speed of tracking the target object.
With further reference to fig. 4, a flow 400 of yet another embodiment of a method for identifying target information is shown. The process 400 of the method for identifying target information includes the steps of:
in step 401, the video to be processed is decomposed into a sequence of video frames to be processed.
The content of step 401 is the same as that of step 201, and will not be described in detail here.
Step 402, marking a target object image in a first video frame to be processed in the sequence of video frames to be processed.
The content of step 402 is the same as that of step 202 and will not be described in detail here.
And step 403, marking the target object image in the preset video frames in the video frame sequence to be processed, and obtaining the motion characteristic information of the target object corresponding to the target object image.
The content of step 403 is the same as that of step 203, and will not be described in detail here.
Step 404, identifying the target object image in the to-be-identified to-be-processed video frames except the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information.
The content of step 404 is the same as that of step 204 and will not be described in detail here.
Step 405, identifying the target object image in the to-be-identified to-be-processed video frame.
The executing body may mark the target object image with a mark frame or by recoloring, etc., so that a technician can observe the tracking effect of the target object and carry out subsequent processing on the target object image.
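Marking with a mark frame can be sketched by overwriting the border pixels of a bounding box in a raw pixel buffer. In practice an image library would be used for this; the pure-Python stand-in below (nested lists as a grayscale frame) is only illustrative:

```python
def draw_mark_box(frame, top, left, bottom, right, value=255):
    """Draw a rectangular mark frame around the identified target object
    image by overwriting the box's border pixels. `frame` is a mutable
    2-D list of pixel values, a stand-in for a real image buffer."""
    for x in range(left, right + 1):
        frame[top][x] = value      # top edge
        frame[bottom][x] = value   # bottom edge
    for y in range(top, bottom + 1):
        frame[y][left] = value     # left edge
        frame[y][right] = value    # right edge
    return frame

img = [[0] * 6 for _ in range(6)]
draw_mark_box(img, 1, 1, 4, 4)
print(img[1])  # [0, 255, 255, 255, 255, 0]
print(img[2])  # [0, 255, 0, 0, 255, 0]
```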
With further reference to fig. 5, as an implementation of the method shown in the foregoing figures, the present disclosure provides an embodiment of an apparatus for identifying target information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 5, the apparatus 500 for identifying target information of the present embodiment may include: a video decomposition unit 501, a target object image marking unit 502, a motion feature information acquisition unit 503, and a target recognition unit 504. Wherein the video decomposition unit 501 is configured to decompose the video to be processed into a sequence of video frames to be processed, where the sequence of video frames to be processed includes at least one video frame to be processed identified in time sequence, and the video frame to be processed includes the target object image; the target object image marking unit 502 is configured to mark a target object image in a first video frame to be processed in the above-described sequence of video frames to be processed; the motion feature information obtaining unit 503 is configured to mark the target object image in the pre-set video frame in the sequence of video frames to be processed, so as to obtain motion feature information of a target object corresponding to the target object image; the object recognition unit 504 is configured to recognize the object image in the to-be-recognized to-be-processed video frames other than the previously set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information.
In some optional implementations of this embodiment, the target object image marking unit 502 may include: an object to be detected recognition subunit (not shown) and a target object image marking subunit (not shown). Wherein the object to be detected identifying subunit is configured to identify at least one object to be detected contained in the first video frame to be processed, and display an image of the at least one object to be detected; and a target object image marking subunit, responsive to detecting a determination signal corresponding to an image of the object to be detected in the images of the at least one object to be detected, configured to mark the image of the object to be detected corresponding to the determination signal as a target object image.
In some optional implementations of this embodiment, the motion feature information obtaining unit 503 may include: a designated mark point setting subunit (not shown in the figure) and a motion feature information acquisition subunit (not shown in the figure). A specified mark point setting subunit configured to set at least one type of specified mark point for the target object image for a to-be-processed video frame of the previously set to-be-processed video frames; and a motion characteristic information acquisition subunit configured to determine motion characteristic information of the target object through the motion trail of the same type designated mark point in the preset video frames to be processed.
In some optional implementations of this embodiment, the motion feature information obtaining subunit may include: a motion trajectory acquisition module (not shown) and a motion feature information acquisition module (not shown). The motion trail acquisition module is configured to acquire motion trail of the same type of specified mark point in the preset video frames to be processed for the same type of specified mark point in the at least one type of specified mark point; and the motion characteristic information acquisition module is configured to calculate the relative position information between at least one motion trail corresponding to the at least one type of designated mark point and determine the motion characteristic information of the target object.
In some optional implementations of this embodiment, the motion feature information obtaining module may include: a relative position information calculation sub-module (not shown in the figure), a relative position change information sequence acquisition sub-module (not shown in the figure), a reference mark point setting sub-module (not shown in the figure), and a motion feature information acquisition sub-module (not shown in the figure). The relative position information calculation sub-module is configured to calculate the relative position information of the at least one motion trail between the at least one type of designated mark points in the same video frame to be processed, and obtain a relative position information sequence corresponding to the at least one motion trail; a relative position change information sequence obtaining sub-module configured to obtain a relative position change information sequence based on the relative position information sequence, where the relative position change information contained in the relative position change information sequence is a difference in distance between a next relative position information and a previous relative position information in two adjacent relative position information; a reference mark point setting sub-module configured to set a specified mark point corresponding to the relative position change information smaller than a set distance threshold value in the relative position change information sequence as a reference mark point; and the motion characteristic information acquisition sub-module is configured to calculate target relative distance information between other types of specified mark points except the reference mark point in the at least one type of specified mark points and the reference mark point, and acquire the motion characteristic information of the target object according to the target relative distance information.
In some optional implementations of this embodiment, the target identifying unit 504 may include: a predicted position acquisition subunit (not shown) and a target recognition subunit (not shown). Wherein the predicted position acquisition subunit is configured to determine, based on the motion feature information, a predicted position of each type of the at least one type of specified marker point in the to-be-identified to-be-processed video frame; and a target recognition subunit, responsive to the video frame to be recognized having the target object image corresponding to the predicted position, configured to update the motion feature information by the position information of the target object image in the video frame to be recognized.
The embodiment also provides an electronic device, including: one or more processors; and a memory having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to perform the method for identifying target information.
The present embodiment also provides a computer-readable medium having stored thereon a computer program which, when executed by a processor, implements the above-described method for identifying target information.
Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use with an electronic device (e.g., server 105 of FIG. 1) implementing embodiments of the present disclosure. The electronic device shown in fig. 6 is merely an example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 6, the electronic device 600 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the electronic device 600 are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
In general, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, magnetic tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device 600 to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 shows an electronic device 600 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 6 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 609, or from storage means 608, or from ROM 602. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing means 601.
It should be noted that, the above-mentioned computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the above-mentioned two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. Whereas in embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: decomposing a video to be processed into a video frame sequence to be processed, wherein the video frame sequence to be processed comprises at least one video frame to be processed, which is identified according to a time sequence, and the video frame to be processed comprises a target object image; marking a target object image in a first to-be-processed video frame in the to-be-processed video frame sequence; marking the target object image in the preset video frames to be processed in the video frame sequence to be processed to obtain the motion characteristic information of a target object corresponding to the target object image; and identifying the target object image in the to-be-identified to-be-processed video frames except the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a video decomposition unit, a target object image marking unit, a motion feature information acquisition unit, and a target recognition unit. The names of these units do not constitute limitations on the unit itself in some cases, and for example, the object recognition unit may also be described as "a unit that recognizes an image of an object by motion feature information".
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the invention referred to in this disclosure is not limited to the specific combination of features described above, and also encompasses other embodiments in which the features described above, or their equivalents, are combined in any way without departing from the spirit of the invention, for example, embodiments formed by mutually substituting the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (8)

1. A method for identifying target information, comprising:
decomposing a video to be processed into a video frame sequence to be processed, wherein the video frame sequence to be processed comprises at least one video frame to be processed, which is identified according to a time sequence, and the video frame to be processed comprises a target object image;
marking a target object image in a first to-be-processed video frame in the sequence of to-be-processed video frames;
setting at least one type of designated mark point for the target object image for a to-be-processed video frame of the pre-set to-be-processed video frames in the to-be-processed video frame sequence;
For the same type of specified mark points in the at least one type of specified mark points, acquiring the motion trail of the same type of specified mark points in the pre-set video frames to be processed;
calculating the relative position information of the at least one motion trail between the at least one type of designated mark points in the same video frame to be processed to obtain a relative position information sequence corresponding to the at least one motion trail;
obtaining a relative position change information sequence based on the relative position information sequence, wherein the relative position change information contained in the relative position change information sequence is the distance difference between the next relative position information and the previous relative position information in the two adjacent relative position information;
setting a designated mark point corresponding to the relative position change information smaller than a set distance threshold value in the relative position change information sequence as a reference mark point;
calculating target relative distance information between other types of specified mark points except the reference mark point in the at least one type of specified mark points and the reference mark point, and obtaining motion characteristic information of the target object according to the target relative distance information;
And identifying the target object image in the to-be-identified to-be-processed video frames except the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion characteristic information.
2. The method of claim 1, wherein said marking a target object image in a first of said sequence of video frames to be processed comprises:
identifying at least one object to be detected contained in the first video frame to be processed, and displaying an image of the at least one object to be detected;
and in response to detecting a determining signal corresponding to an image of the object to be detected in the images of the at least one object to be detected, marking the image of the object to be detected corresponding to the determining signal as a target object image.
3. The method of claim 1, wherein the identifying the target object image in the sequence of video frames to be processed, other than the pre-set number of video frames to be processed, based on the motion characteristic information comprises:
determining the predicted position of each type of specified mark point in the at least one type of specified mark point in the video frame to be identified and processed based on the motion characteristic information;
And in response to the target object image corresponding to the predicted position existing in the to-be-identified to-be-processed video frame, updating the motion characteristic information through the position information of the target object image in the to-be-identified to-be-processed video frame.
4. An apparatus for identifying target information, comprising:
a video decomposition unit configured to decompose a video to be processed into a sequence of video frames to be processed, wherein the sequence of video frames to be processed contains at least one video frame to be processed identified in a time sequence, the video frame to be processed contains a target object image;
a target object image marking unit configured to mark a target object image in a first video frame to be processed in the sequence of video frames to be processed;
a motion feature information acquisition unit configured to set at least one type of designated mark point for the target object image for a video frame to be processed of a preceding set of video frames to be processed in the sequence of video frames to be processed; for the same type of specified mark points in the at least one type of specified mark points, acquiring the motion trail of the same type of specified mark points in the pre-set video frames to be processed; calculating the relative position information of the at least one motion trail between the at least one type of designated mark points in the same video frame to be processed to obtain a relative position information sequence corresponding to the at least one motion trail; obtaining a relative position change information sequence based on the relative position information sequence, wherein the relative position change information contained in the relative position change information sequence is the distance difference between the next relative position information and the previous relative position information in the two adjacent relative position information; setting a designated mark point corresponding to the relative position change information smaller than a set distance threshold value in the relative position change information sequence as a reference mark point; calculating target relative distance information between other types of specified mark points except the reference mark point in the at least one type of specified mark points and the reference mark point, and obtaining motion characteristic information of the target object according to the target relative distance information;
And a target recognition unit configured to recognize the target object image in the to-be-recognized to-be-processed video frames other than the pre-set to-be-processed video frames in the to-be-processed video frame sequence based on the motion feature information.
5. The apparatus of claim 4, wherein the target object image marking unit comprises:
a to-be-detected object recognition subunit configured to recognize at least one object to be detected contained in the first video frame to be processed and to display an image of each of the at least one object to be detected;
and a target object image marking subunit configured to, in response to detecting a determination signal corresponding to one of the images of the at least one object to be detected, mark the image of the object to be detected corresponding to the determination signal as the target object image.
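The detect-display-confirm flow of claim 5 reduces to a small selection step; the sketch below stubs both the detector and the determination signal with plain callables, and every name in it is hypothetical rather than taken from the patent.

```python
def mark_target(first_frame, detect, confirm):
    """detect:  frame -> list of candidate object images (e.g. boxes/crops).
    confirm: list of candidates -> index of the confirmed candidate,
             standing in for the 'determination signal'."""
    candidates = detect(first_frame)   # recognize the objects to be detected
    chosen = confirm(candidates)       # user (or caller) confirms one of them
    return candidates[chosen]          # marked as the target object image

# Usage with a stubbed detector and a confirmation that picks index 1:
boxes = [(0, 0, 10, 10), (5, 5, 20, 20)]
target = mark_target("frame-0", lambda frame: boxes, lambda cands: 1)
```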
6. The apparatus of claim 4, wherein the target recognition unit comprises:
a predicted position acquisition subunit configured to determine, based on the motion feature information, a predicted position in the to-be-recognized video frame to be processed for each of the at least one type of designated mark point;
and a target recognition subunit configured to, in response to the to-be-recognized video frame to be processed containing a target object image corresponding to the predicted position, update the motion feature information with the position information of the target object image in the to-be-recognized video frame to be processed.
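One plausible reading of claim 6 is a predict-then-confirm loop: extrapolate each designated mark point's position into the next frame, and fold the observation back into the motion feature state only when the target object image actually appears near the prediction. The constant-velocity model and tolerance check below are assumptions, not taken from the patent.

```python
def predict_next(history):
    """history: the last two or more (x, y) positions of one mark-point type.
    Returns a constant-velocity extrapolation of the next position."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (2 * x1 - x0, 2 * y1 - y0)

def update(history, observed, tolerance):
    """Append the observation to the motion-feature state only if it
    confirms the prediction within the given per-axis tolerance."""
    px, py = predict_next(history)
    ox, oy = observed
    if abs(ox - px) <= tolerance and abs(oy - py) <= tolerance:
        history.append(observed)   # update the motion feature information
        return True                # target recognized at the predicted position
    return False
```

For a point seen at (0, 0) then (1, 0), the predicted next position is (2, 0); an observation at (2.1, 0.0) within tolerance 0.5 confirms the target and extends the history.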
7. An electronic device, comprising:
one or more processors;
a memory having one or more programs stored thereon,
the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-3.
8. A computer-readable medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 3.
CN202010219513.7A 2020-03-25 2020-03-25 Method and device for identifying target information Active CN111445499B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010219513.7A CN111445499B (en) 2020-03-25 2020-03-25 Method and device for identifying target information

Publications (2)

Publication Number Publication Date
CN111445499A CN111445499A (en) 2020-07-24
CN111445499B true CN111445499B (en) 2023-07-18

Family

ID=71650699

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010219513.7A Active CN111445499B (en) 2020-03-25 2020-03-25 Method and device for identifying target information

Country Status (1)

Country Link
CN (1) CN111445499B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528945B (en) * 2020-12-24 2024-04-26 上海寒武纪信息科技有限公司 Method and device for processing data stream
CN113850837B (en) * 2021-11-25 2022-02-08 腾讯科技(深圳)有限公司 Video processing method and device, electronic equipment, storage medium and computer product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147722A (en) * 2019-04-11 2019-08-20 平安科技(深圳)有限公司 A kind of method for processing video frequency, video process apparatus and terminal device
CN110188719A (en) * 2019-06-04 2019-08-30 北京字节跳动网络技术有限公司 Method for tracking target and device
CN110363814A (en) * 2019-07-25 2019-10-22 Oppo(重庆)智能科技有限公司 A kind of method for processing video frequency, device, electronic device and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147722A (en) * 2019-04-11 2019-08-20 平安科技(深圳)有限公司 A kind of method for processing video frequency, video process apparatus and terminal device
CN110188719A (en) * 2019-06-04 2019-08-30 北京字节跳动网络技术有限公司 Method for tracking target and device
CN110363814A (en) * 2019-07-25 2019-10-22 Oppo(重庆)智能科技有限公司 A kind of method for processing video frequency, device, electronic device and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A Method for Visualizing Pedestrian Traffic Flow Using SIFT Feature Point Tracking; Yuji Tsuduki; Pacific-Rim Symposium on Image and Video Technology (PSIVT 2009); Vol. 5414; pp. 2-11, Figs. 1-3, 5, 7, 9, 10 *
Combining local and global motion models for feature point tracking; Aeron Buchanan et al.; 2017 IEEE Conference on Computer Vision and Pattern Recognition; full text *
A method fusing improved Mean Shift and … (title truncated in source); Du Kai et al.; Journal of Guangxi University: Natural Science Edition; Vol. 39, No. 5; full text *


Similar Documents

Publication Publication Date Title
CN109584276B (en) Key point detection method, device, equipment and readable medium
WO2020062493A1 (en) Image processing method and apparatus
EP3885980A1 (en) Method and apparatus for processing information, device, medium and computer program product
CN110059623B (en) Method and apparatus for generating information
CN108646736A (en) Method for tracking target and device for tracking robot
CN110717918B (en) Pedestrian detection method and device
CN108509921B (en) Method and apparatus for generating information
CN111783626B (en) Image recognition method, device, electronic equipment and storage medium
EP3985610A1 (en) Audio collection device positioning method and apparatus, and speaker recognition method and system
CN111445499B (en) Method and device for identifying target information
CN113910224B (en) Robot following method and device and electronic equipment
JP2024502516A (en) Data annotation methods, apparatus, systems, devices and storage media
CN111310595B (en) Method and device for generating information
CN109903308B (en) Method and device for acquiring information
CN111382701B (en) Motion capture method, motion capture device, electronic equipment and computer readable storage medium
CN112307323B (en) Information pushing method and device
CN111401229B (en) Automatic labeling method and device for small visual targets and electronic equipment
CN110189364B (en) Method and device for generating information, and target tracking method and device
CN115900713A (en) Auxiliary voice navigation method and device, electronic equipment and storage medium
CN113703704B (en) Interface display method, head-mounted display device, and computer-readable medium
CN113034580B (en) Image information detection method and device and electronic equipment
CN111401182B (en) Image detection method and device for feeding rail
CN111694875B (en) Method and device for outputting information
US20220215648A1 (en) Object detection device, object detection system, object detection method, program, and recording medium
CN111586295A (en) Image generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant