CN109033394B - Client for picture video annotation data

Info

Publication number: CN109033394B
Application number: CN201810862044.3A
Authority: CN
Inventors: 钟博煊; 周礼; 许淞斐
Original assignee: Zhejiang Shenmou Technology Co ltd
Current assignee: Zhejiang Shenmou Technology Co ltd
Priority date: 2018-08-01
Filing date: 2018-08-01
Publication date: 2022-02-11
Anticipated expiration: 2038-08-01
Also published as: CN109033394A

Abstract

The invention discloses a client for picture video annotation data, which comprises a local picture calling module, a local video calling module, an online video calling module, a picture playing module, a video playing module, a label setting module and a label annotation module. The picture playing module plays the pictures to be annotated according to a preset picture switching time, and the video playing module plays the video to be annotated. The label annotation module annotates the picture or the video to be annotated. The beneficial effects of the disclosed client are that annotations can be added automatically to a video or a picture by setting the video frame number and similar parameters, and that the labels to be applied can be prepared in advance before annotation begins. This reduces workload and labor intensity and effectively overcomes the heavy workload, high repetition rate and long time consumption of the traditional annotation method.

Description

Client for picture video annotation data

Technical Field

The invention belongs to the technical field of computer vision based on deep learning, and particularly relates to a client for picture video annotation data.

Background

In the field of computer vision based on deep learning, a large amount of picture and video data is required to train the models that serve a deep learning algorithm. How to collect training data reasonably and efficiently is therefore one of the important topics in deep learning.

In annotated data, each picture corresponds to a label, which is the "explanation" of that picture. Fig. 1 shows an annotated data set in two parts: on the left a folder of pictures, and on the right a txt file. Fig. 2 shows the content of the txt file in Fig. 1; each line of data in the file corresponds one-to-one to a picture in the folder on the left.
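By way of illustration only, the one-to-one correspondence between pictures and lines of the txt file can be sketched in Python as follows. The set of picture suffixes, the pairing by sorted file name and the function name `pair_pictures_with_labels` are assumptions made for this sketch; the original text does not specify how the order of the pictures is determined.

```python
from pathlib import Path

# Assumed set of picture formats; the source does not list any.
PICTURE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def pair_pictures_with_labels(picture_dir, label_file):
    """Pair the i-th picture (sorted by file name) with the i-th line of the txt file."""
    pictures = sorted(
        p.name
        for p in Path(picture_dir).iterdir()
        if p.suffix.lower() in PICTURE_SUFFIXES
    )
    labels = Path(label_file).read_text(encoding="utf-8").splitlines()
    # The correspondence must be one-to-one, so the counts have to match.
    if len(pictures) != len(labels):
        raise ValueError(
            f"{len(pictures)} pictures but {len(labels)} label lines"
        )
    return list(zip(pictures, labels))
```

A mismatch between the number of pictures and the number of label lines raises an error, which reflects why keeping the two in step by hand is laborious.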

In the conventional annotation method, pictures must be annotated manually. That is, all pictures are put in a folder, and each picture is then manually matched to a label in the txt file. Because the correspondence must be one-to-one, the workload is huge. Video is more troublesome: a screenshot tool is first needed to split the video into pictures, possibly twenty or so frames (that is, about twenty pictures) per second, and the picture annotation procedure is then repeated for each of them.
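The scale of this workload can be estimated with a short calculation. The sketch below is illustrative only: the figure of twenty frames per second comes from the text above, while the video duration and the time spent per label are assumed values, and both function names are hypothetical.

```python
def pictures_from_video(duration_seconds, frames_per_second=20):
    """Number of pictures obtained when every frame of a video is extracted."""
    return round(duration_seconds * frames_per_second)


def manual_labeling_hours(picture_count, seconds_per_label):
    """Hours needed to label every picture by hand at a fixed pace."""
    return picture_count * seconds_per_label / 3600


# Assumed example: a 10-minute video, 2 seconds of manual work per picture.
pictures = pictures_from_video(10 * 60)
hours = manual_labeling_hours(pictures, seconds_per_label=2)
print(pictures, round(hours, 1))
```

Under these assumptions a ten-minute video yields 12000 pictures and between six and seven hours of manual work.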

Disclosure of Invention

In view of the state of the art, the present invention overcomes the above-mentioned drawbacks and provides a client for picture video annotation data.

The invention adopts the following technical scheme. The client for picture video annotation data comprises:

a local picture calling module and a local video calling module, wherein the local picture calling module is used for reading a picture to be annotated stored in a local memory, and the local video calling module is used for reading a video to be annotated stored in the local memory;

an online video calling module, used for acquiring a video to be annotated from an online live broadcast source;

a picture playing module and a video playing module, wherein the picture playing module plays the picture to be annotated according to a preset picture switching time, and the video playing module is used for playing the video to be annotated;

a label setting module, in which a plurality of annotation labels are set, each annotation label representing one scene;

and a label annotation module, used for annotating the picture to be annotated or the video to be annotated according to the annotation label.
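As a minimal sketch of the picture playing module described in the list above, the following Python function shows each picture in turn and waits for the preset picture switching time before moving on. The function name `play_pictures` and the `show` and `sleep` parameters are assumptions introduced for illustration; the original text does not describe how the module is implemented.

```python
import time


def play_pictures(pictures, switch_seconds, show, sleep=time.sleep):
    """Show each picture, then wait for the preset switching time."""
    for picture in pictures:
        show(picture)          # stands in for drawing the picture on screen
        sleep(switch_seconds)  # preset picture switching time


if __name__ == "__main__":
    # Print the file names at 0.1-second intervals instead of displaying them.
    play_pictures(["a.jpg", "b.jpg", "c.jpg"], 0.1, show=print)
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets the switching time be checked without actually waiting.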

According to the above technical scheme, the client for picture video annotation data further comprises a keyboard input module. The keyboard input module is provided with keyboard keys whose number equals the number of annotation labels in the label setting module, and each keyboard key corresponds uniquely to one annotation label.

According to the above technical scheme, the label annotation module matches the picture to be annotated or the video to be annotated with the annotation label corresponding to the keyboard key input through the keyboard input module.
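The unique correspondence between keyboard keys and annotation labels can be sketched as a simple lookup table. The code below is an illustration under assumptions: the key names, the example labels and the functions `build_key_map` and `annotate` are hypothetical and are not taken from the original disclosure.

```python
def build_key_map(keys, labels):
    """Map each keyboard key to exactly one annotation label."""
    if len(keys) != len(labels):
        raise ValueError("the number of keys must equal the number of labels")
    if len(set(keys)) != len(keys):
        raise ValueError("each key may be assigned only once")
    return dict(zip(keys, labels))


def annotate(item, key, key_map, annotations):
    """Record the label bound to the pressed key for the current item."""
    label = key_map.get(key)
    if label is None:
        return None  # key not bound to any label: ignore the key press
    annotations[item] = label
    return label


# Assumed example: two scenes bound to the keys "1" and "2".
key_map = build_key_map(["1", "2"], ["street", "indoor"])
annotations = {}
annotate("picture_001.jpg", "2", key_map, annotations)
```

With such a table, annotating the current picture or video takes a single key press instead of an edit to the txt file.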

According to the above technical scheme, the client for picture video annotation data further comprises a local picture input module, which is used for importing pictures acquired by an acquisition device into the local memory.

According to the above technical scheme, the client for picture video annotation data further comprises a local video input module, which is used for importing videos acquired by the acquisition device into the local memory.

According to the above technical scheme, the client for picture video annotation data further comprises a local picture conversion module, which is used for converting a picture temporarily stored in the local memory into a picture to be annotated in a unified picture format.

According to the above technical scheme, the client for picture video annotation data further comprises a local video conversion module, which is used for converting a video temporarily stored in the local memory into a video to be annotated in a unified video format.
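The following sketch illustrates only the selection step of such a conversion module, that is, deciding which temporarily stored files are not yet in the unified format. The choice of `.jpg` as the unified picture format and the function name `plan_conversions` are assumptions; the original text names no format, and the conversion itself would need an image or video library that is not shown here.

```python
from pathlib import PurePath


def plan_conversions(file_names, unified_suffix=".jpg"):
    """Return {source name: target name} for files not yet in the unified format."""
    plan = {}
    for name in file_names:
        path = PurePath(name)
        # Compare suffixes case-insensitively so "B.JPG" counts as already unified.
        if path.suffix.lower() != unified_suffix:
            plan[name] = str(path.with_suffix(unified_suffix))
    return plan


# Assumed example: only the PNG and BMP files need converting.
print(plan_conversions(["a.png", "b.JPG", "c.bmp"]))
```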

According to the above technical scheme, the client for picture video annotation data further comprises an online video recording module, which is used for recording a video to be annotated from an online live broadcast source and temporarily storing it in the local memory.

According to the above technical scheme, the client for picture video annotation data further comprises a multi-thread multi-task concurrent processing module, which is used for supporting the label annotation module, the picture playing module and the video playing module in running simultaneously.
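One possible way for a playing module and the label annotation module to run simultaneously is sketched below with two Python threads connected by a queue. This is an assumption about the design, not the implementation disclosed in the original text; the function `run_concurrently` and the `choose_label` callback, which stands in for the user's key press, are hypothetical.

```python
import queue
import threading


def run_concurrently(items, choose_label):
    """Run a player thread and an annotator thread at the same time."""
    displayed = queue.Queue()
    annotations = {}

    def player():
        for item in items:
            displayed.put(item)  # stands in for showing the item on screen
        displayed.put(None)      # sentinel: playback has finished

    def annotator():
        while True:
            item = displayed.get()
            if item is None:
                break
            annotations[item] = choose_label(item)

    threads = [threading.Thread(target=player), threading.Thread(target=annotator)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return annotations


# Assumed example: every displayed frame receives the same label.
result = run_concurrently(["frame_1", "frame_2"], lambda item: "street")
```

The queue decouples the two threads, so playback does not have to pause while a label is being recorded.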

The beneficial effects of the disclosed client for picture video annotation data are that annotations can be added automatically to a video or a picture by setting the video frame number and similar parameters, and that the labels to be applied can be prepared in advance before annotation begins. This reduces workload and labor intensity and effectively overcomes the heavy workload, high repetition rate and long time consumption of the traditional annotation method.
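The original text does not specify how setting the video frame number produces annotations. One plausible reading, given here purely as an assumption, is that a label is assigned to a range of frame numbers and then applied to every frame in that range. The function `expand_frame_ranges` and the example ranges are hypothetical.

```python
def expand_frame_ranges(ranges):
    """Expand (start_frame, end_frame, label) triples into one label per frame."""
    per_frame = {}
    for start, end, label in ranges:
        if start > end:
            raise ValueError(f"invalid frame range {start}-{end}")
        # Both ends of the range are inclusive.
        for frame in range(start, end + 1):
            per_frame[frame] = label
    return per_frame


# Assumed example: frames 0-2 show a street scene, frames 3-4 an indoor scene.
labels = expand_frame_ranges([(0, 2, "street"), (3, 4, "indoor")])
```

Under this reading, two range settings replace five separate manual entries, and the saving grows with the length of the video.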

Drawings

Fig. 1 is a schematic diagram of a conventional labeling method.

Fig. 2 is another schematic diagram of a conventional labeling method.

Fig. 3 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 4 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 5 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 6 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 7 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 8 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 9 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 10 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 11 is a schematic diagram of a preferred embodiment of the present invention.

Fig. 12 is a schematic diagram of a preferred embodiment of the present invention.

Detailed Description

The invention discloses a client for picture video annotation data, and the specific implementation of the invention is further described below with reference to the preferred embodiment.

Referring to Figs. 3 to 12 of the drawings, each of these figures shows a schematic diagram of the client for picture video annotation data.

Preferably, the client for picture video annotation data disclosed in the present patent application includes:

Further, the client for picture video annotation data comprises a keyboard input module. The keyboard input module is provided with keyboard keys whose number equals the number of annotation labels in the label setting module, and each keyboard key corresponds uniquely to one annotation label.

Further, when a user presses any keyboard key, the label annotation module matches the picture to be annotated or the video to be annotated with the annotation label corresponding to the keyboard key input through the keyboard input module.

Further, the client for picture video annotation data comprises a local picture input module, which is used for importing pictures acquired by an acquisition device such as a camera into the local memory.

Further, the client for picture video annotation data comprises a local video input module, which is used for importing videos acquired by an acquisition device such as a camera into the local memory.

Further, the client for picture video annotation data comprises a local picture conversion module, which is used for converting pictures that were acquired by an acquisition device such as a camera and temporarily stored in the local memory into pictures to be annotated in a unified picture format.

Further, the client for picture video annotation data comprises a local video conversion module, which is used for converting videos that were acquired by an acquisition device such as a camera and temporarily stored in the local memory into videos to be annotated in a unified video format.

Further, the client for picture video annotation data comprises an online video recording module, which is used for recording a video to be annotated from an online live broadcast source and temporarily storing it in the local memory.

Further, the client for picture video annotation data comprises a multi-thread multi-task concurrent processing module, which is used for supporting the label annotation module, the picture playing module and the video playing module in running simultaneously.

The online video calling module, the online video recording module and other functional modules of the client for picture video annotation data preferably adopt the streaming media server Red5 and the easyDarwin framework.

According to the preferred embodiment, and referring to Fig. 3 of the drawings, the client for picture video annotation data disclosed in the present patent application can automatically add annotations to videos or pictures by setting the video frame number and similar parameters, and the labels to be applied can be prepared in advance before annotation begins. This helps reduce workload and labor intensity and effectively overcomes the heavy workload, high repetition rate and long time consumption of the conventional annotation method.

It will be apparent to those skilled in the art that modifications and equivalents may be made in the embodiments and/or portions thereof without departing from the spirit and scope of the present invention.

Claims

1. A client for picture video annotation data, comprising:

an online video calling module, used for acquiring a video to be annotated from an online live broadcast source;

a label annotation module, used for annotating the picture to be annotated or the video to be annotated according to the annotation label;

the client for picture video annotation data further comprises a keyboard input module, the keyboard input module is provided with keyboard keys whose number equals the number of annotation labels in the label setting module, and each keyboard key corresponds uniquely to one annotation label;

and the label annotation module matches the picture to be annotated or the video to be annotated with the annotation label corresponding to the keyboard key input through the keyboard input module.

2. The client according to claim 1, wherein the client further comprises a local picture input module, and the local picture input module is configured to import the picture captured by the capturing device into a local storage.

3. The client according to claim 2, wherein the client further comprises a local video input module, and the local video input module is configured to import the video captured by the capturing device into the local storage.

4. The client according to claim 3, wherein the client further comprises a local picture conversion module, and the local picture conversion module is configured to convert the picture temporarily stored in the local storage into the picture to be annotated in the unified picture format.

5. The client according to claim 4, wherein the client further comprises a local video conversion module, and the local video conversion module is configured to convert the video temporarily stored in the local storage into the video to be annotated in the unified video format.

6. The client according to claim 1, further comprising an online video recording module, wherein the online video recording module is configured to record a video to be annotated of an online live source, and temporarily store the video to be annotated in a local storage.

7. The client for picture video annotation data according to any one of claims 1 to 6, wherein the client further comprises a multi-thread multi-task concurrent processing module, and the multi-thread multi-task concurrent processing module is configured to support the label annotation module, the picture playing module, and the video playing module in running simultaneously.