CN109977822B - Data supply method, model training method, device, system, equipment and medium - Google Patents

Data supply method, model training method, device, system, equipment and medium

Info

Publication number
CN109977822B
CN109977822B (granted publication of application CN201910197522.8A)
Authority
CN
China
Prior art keywords
data
training
video
model
video data
Prior art date
Legal status
Active
Application number
CN201910197522.8A
Other languages
Chinese (zh)
Other versions
CN109977822A (en
Inventor
梁柱锦
刘运
蒋德为
Current Assignee
Guangzhou Wangxing Information Technology Co Ltd
Original Assignee
Guangzhou Wangxing Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Wangxing Information Technology Co Ltd filed Critical Guangzhou Wangxing Information Technology Co Ltd
Priority to CN201910197522.8A
Publication of CN109977822A
Application granted
Publication of CN109977822B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/94 Hardware or software architectures specially adapted for image or video understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V 30/194 References adjustable by an adaptive method, e.g. learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses a data supply method, a model training method, a device, a system, equipment and a medium. The data supply method comprises the following steps: acquiring a training request for a video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the current training; acquiring matched target video data from a distributed storage data set according to the data identifier, wherein the distributed storage data set comprises various types of video data; and processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model. With the technical scheme provided by the embodiments of the invention, the model is trained directly on the video data, the occupied storage space is small, little time is spent reading the video data, and the training efficiency of the video model is improved.

Description

Data supply method, model training method, device, system, equipment and medium
Technical Field
The embodiment of the invention relates to the field of videos, in particular to a data supply method, a model training method, a device, a system, equipment and a medium.
Background
At present, to train a video model, video data is generally acquired first and parsed into corresponding pictures (namely, video frames) and audio information, which are stored in separate data files. During training of the video model, at least one of the following modes is then adopted: the pictures are read from the picture data file for training, and/or the audio information is read from the audio data file for training.
With this existing training-data supply mode, the picture data file and the audio data file occupy more storage space than the corresponding video data from which they were parsed, so the training device needs considerably more storage. In addition, because the data volume of the picture data file and the audio data file is very large, reading at least one of the pictures and the audio information during training takes a long time, so the training efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a data supply method, a model training method, a device, a system, equipment and a medium, which improve the training efficiency of a video model.
In a first aspect, an embodiment of the present invention provides a data supplying method, including:
acquiring a training request aiming at a video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training;
acquiring matched target video data from a distributed storage data set according to the data identifier, wherein the distributed storage data set comprises video data of various types;
and processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model.
In a second aspect, an embodiment of the present invention provides a model training method, including:
according to the data supply method in the first aspect, training data corresponding to the video model is obtained;
and inputting the training data into the video model to obtain a trained video model.
In a third aspect, an embodiment of the present invention provides a data supply apparatus, including:
the training request acquisition module is used for acquiring a training request aiming at the video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training;
the target data acquisition module is used for acquiring matched target video data from a distributed storage data set according to the data identification, wherein the distributed storage data set comprises video data of various types;
And the training data determining module is used for processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model.
In a fourth aspect, an embodiment of the present invention provides a model training apparatus, including:
the training data acquisition module is used for acquiring training data corresponding to the video model according to the data supply method in the first aspect;
and the video model training module is used for inputting the training data into the video model to obtain a trained video model.
In a fifth aspect, an embodiment of the present invention provides a data supply system, including a distributed data storage end, a batch loading end, and a data supply end connected to the distributed data storage end and the batch loading end respectively; the distributed data storage end stores a distributed storage data set; the batch loading end stores a batch processing mechanism and generates a training request; and the data supply end is provided with the data supply apparatus as in the third aspect.
In a sixth aspect, an embodiment of the present invention provides a model training system, including a distributed data storage end, a batch loading end, and a model training end connected to the distributed data storage end and the batch loading end respectively; the distributed data storage end stores a distributed storage data set; the batch loading end stores a batch processing mechanism and generates a training request; and the model training end is provided with the model training apparatus as in the fourth aspect.
In a seventh aspect, an embodiment of the present invention provides an apparatus, including:
one or more processors;
a storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data provision method described in the first aspect of the present invention or to implement the model training method described in the second aspect of the present invention.
In an eighth aspect, an embodiment of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data supply method described in the first aspect of the present invention or implements the model training method described in the second aspect of the present invention.
The embodiments of the invention provide a data supply method, a model training method, a device, a system, equipment and a medium. Matched target video data is acquired from a distributed storage data set according to the data identifier carried in the training request, and the target video data is processed according to a preset batch processing mechanism, so that no large amount of time needs to be spent writing data processing functions and the training data corresponding to the video model to be trained is obtained. Compared with the prior-art training mode in which a video model is trained on pictures or audio information, the technical scheme of the embodiments of the invention trains directly on the video data, occupies less storage space, spends less time reading the video data, and improves the training efficiency of the video model.
Drawings
Other features, objects and advantages of the present invention will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1A is a flowchart of a data supply method according to a first embodiment of the present invention;
FIG. 1B is a schematic block diagram of a data supply process according to the first embodiment of the present invention;
FIG. 2A is a flowchart of a data supply method according to a second embodiment of the present invention;
FIG. 2B is a schematic diagram of a data supply process according to the second embodiment of the present invention;
FIG. 3A is a flowchart of a model training method according to a third embodiment of the present invention;
FIG. 3B is a schematic diagram of a model training process according to the third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data supply device according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a model training device according to a fifth embodiment of the present invention;
FIG. 6 is a schematic diagram of a data supply system according to a sixth embodiment of the present invention;
FIG. 7 is a schematic diagram of a model training system according to a seventh embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a device according to an eighth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings. Furthermore, embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1
FIG. 1A is a flowchart of a data supply method according to a first embodiment of the present invention. The data supply method provided by the embodiment of the present invention may be performed by the data supply device provided by the embodiment of the present invention; the device may be implemented in software and/or hardware and integrated into the equipment that performs the method, where the equipment may be any intelligent terminal with the corresponding data processing capability.
Specifically, referring to fig. 1A, the method may include the steps of:
s110, acquiring a training request for a video model.
The training request comprises a preset batch processing mechanism and a data identifier corresponding to the current training. Specifically, with the wide application of neural network models in data processing, deep learning techniques can simulate the behavioural characteristics of a neural network to process information, so as to achieve the purpose of video processing and to construct video models with various self-learning and self-adaptive video processing functions. When such a video model is built, a large amount of training data is needed to iteratively train the initially configured neural network model, so that the trained video model can accurately achieve the video processing goal for any video. Therefore, the training request in this embodiment may be used to indicate the video model to be trained, and the training data needed for the subsequent training of that video model has to be acquired in advance.
Optionally, because the training requirements of video models differ, the training request for the video model includes a batch processing mechanism, preset for different training tasks, that the training data of the corresponding batch must satisfy during each iteration of training, together with a data identifier of the training data corresponding to the current training. The batch processing mechanism contains the data composition requirements of the training data in the corresponding batch of each iteration, set according to the training task of the video model to be trained. The data identifier is a mark that can uniquely indicate the training data required for the current training; in this embodiment the data identifier may be a uniform resource locator (Uniform Resource Locator, URL) of video data, where the URL can represent the file address of video data stored locally or on the network and also carries information such as the protocol and path to be satisfied when the video data is acquired.
Specifically, when a certain video model needs to be trained so that it acquires the corresponding video processing function, a user may perform a corresponding training operation to generate a training request. The training operation may be selecting the data identifiers of the training data that will participate in the current training, so as to generate a corresponding identifier list, and setting the batch processing mechanism that the current training must satisfy. A training request for the video model to be trained is then generated from the identifier list and the batch processing mechanism, so that the training data participating in the current training can be acquired later, as sketched below.
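To make the preceding paragraph concrete, the following minimal sketch shows one way a training request could be represented at the batch loading end; the class names, fields and example identifiers are hypothetical illustrations, not part of the claimed embodiment.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class BatchMechanism:
    # Hypothetical fields describing the batch composition requirements.
    batch_size: int = 32
    grouping: str = "balanced"   # e.g. "balanced", "pairwise", "triplet"
    shuffle: bool = True
    extra: Dict[str, str] = field(default_factory=dict)

@dataclass
class TrainingRequest:
    model_name: str
    data_ids: List[str]          # identifier list (URLs or package identifiers)
    mechanism: BatchMechanism

# A user selection at the batch loading end might yield, for example:
request = TrainingRequest(
    model_name="video_classifier",
    data_ids=[
        "hdfs://cluster/videos/pack_0001",
        "http://cdn.example.com/videos/123.mp4",
    ],
    mechanism=BatchMechanism(batch_size=64, grouping="balanced"),
)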
For example, "batch" in the batch processing mechanism in this embodiment refers to batch in machine learning training, that is, all training data that corresponds to participation in one iteration training; at this time, according to different training tasks, different requirements are set for the composition of training data in a batch; for common video classification training, the quantity of video data carrying various labels in the batch is required to be balanced as much as possible; for pair-wise video training, video data inside the batch is required to appear in pairs; training with a loss function (triple-loss), then requires that video data inside the batch appear in triples; meanwhile, some training tasks have requirements on the loading sequence of training data, other training tasks have requirements on difficult-to-find, and the composition of a training set can be dynamically adjusted according to training results; at this time, the different training requirements may be set in the batch processing mechanism in this embodiment.
In addition, FIG. 1B shows a schematic block diagram of data supply in this embodiment. The user may perform, at the batch loading end, a triggering operation corresponding to model training, select at the batch loading end the data identifiers of the video data participating in the current training, and generate a corresponding identifier list; the batch processing mechanism selected for the current training is acquired at the same time, and a training request is generated together with the identifier list, so that the data supply end can obtain the training request generated this time for the video model to be trained.
And S120, acquiring matched target video data from the distributed storage data set according to the data identification.
The distributed storage data set includes video data of various types. Specifically, in order to improve the flexibility of data supply, the distributed storage data set may support multiple ways of storing and reading video data. The distributed storage data set may include video data stored on a local disk as single files; video data packed into packages so that the video data participating in training can be stored together on a local disk; video data stored at multiple ports using a distributed data storage protocol, such as single video data or video data packages stored on the Hadoop Distributed File System (HDFS); and video data stored at arbitrary network addresses via a network data protocol, such as video data accessed through a URL. The latter may include video data published on the Internet, video data cached on a content distribution network (Content Distribution Network, CDN), video data uploaded to a Fast Distributed File System (FastDFS), video data shared by servers that expose a hypertext transfer protocol (Hyper Text Transfer Protocol, HTTP) service in the same local area network, and so on.
Alternatively, given that the training requirements of different video models differ, the format of the training data required when training a video model may also differ, and the distributed storage data set of this embodiment may therefore provide two different data storage strategies: one is access by single video sample (including single video data stored on a local disk or accessed through a URL on the network, and the like), and the other is packed access to multiple video samples (including video data packages stored on a local disk or on a distributed file system such as HDFS/FastDFS, and the like). Further, for single video data, the distributed storage data set can store video-associated content such as the URL, the file name, the data label and other additional information of that video data, so that all relevant information of the video data participating in training can be obtained when the corresponding training data is later supplied to the video model to be trained. This embodiment also provides a packing program for video data: if there is a packing requirement when the video data is stored, the packing program provided in this embodiment can be used to pack the various pieces of content information associated with the corresponding video data, which are then stored, in the form of a video data package, at the corresponding position in the distributed storage data set. The different storage strategies have different advantages: the single-video-sample access mode supports random access to video data and is suitable for occasions with high requirements on dynamic data generation or randomness of the data order, whereas the packed access mode offers a high data reading speed and can overcome the data input/output delay caused by random access to the distributed storage data set. In this embodiment, when a user generates a training request by performing the corresponding triggering operation at the batch loading end, video data under different storage strategies can be selected to suit the training task of the video model to be trained, which improves the flexibility of acquiring training data. The data identifier in this embodiment may be the identifier of a single video data item, or the package identifier of a video data package obtained after packing; the sketch below illustrates how such identifiers might be resolved.
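As a sketch of how the two storage strategies might be distinguished in practice, the helper below maps a data identifier onto a storage strategy and backend based on its scheme or suffix; the prefix and suffix conventions are assumptions made only for illustration.

from urllib.parse import urlparse

def resolve_identifier(data_id):
    # Returns (strategy, backend, path) for a single data identifier.
    parsed = urlparse(data_id)
    if parsed.scheme in ("http", "https"):
        # Single video sample accessed through a URL (internet or CDN).
        return ("single", "url", data_id)
    if parsed.scheme == "hdfs":
        # Packed access: a package of video samples stored on HDFS.
        return ("packed", "hdfs", parsed.path)
    if data_id.endswith(".pack"):
        # Packed access: a package stored on the local disk.
        return ("packed", "local", data_id)
    # Default: a single video file on the local disk.
    return ("single", "local", data_id)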
Specifically, when the training request for the video model to be trained is obtained, the training request can be parsed to obtain the preset batch processing mechanism suitable for the current training task and the corresponding data identifiers. According to the data identifiers of all video data participating in the training carried in the request, the matched target video data is acquired from the distributed storage data set. The target video data is the video data participating in the current training, together with its stored video-associated content such as the file name, the label and other additional information.
As shown in FIG. 1B, the storage locations in the distributed storage data set of this embodiment may include a local file system, a CDN cluster, an HDFS cluster and a FastDFS cluster, where the CDN cluster, the FastDFS cluster, the HDFS cluster and the training server cluster hosting the video models to be trained are deployed together. Each server has hundreds of gigabytes of memory, several graphics cards to support model training, a redundant array of independent disks (Redundant Array of Independent Disks, RAID) with a capacity of tens of terabytes, and a central processing unit (Central Processing Unit, CPU) with tens of cores, and the servers are interconnected by a 10-gigabit network. Frequently accessed data can then be cached in memory, which minimizes hard-disk input/output during training, increases the reading speed of video data, and makes effective use of the memory and hard-disk resources of the servers. Meanwhile, storing the video data centrally in a distributed storage mode means that distributed training does not require copying the training data in advance, which accelerates the data preparation stage of model training.
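The small sketch below illustrates the kind of in-memory caching of frequently accessed video data described above; the capacity, eviction policy and helper names are assumptions for illustration, not the cluster's actual configuration.

from collections import OrderedDict

class InMemoryVideoCache:
    # Least-recently-used cache that keeps compressed video bytes in memory
    # so that repeated reads avoid hard-disk input/output.
    def __init__(self, capacity_bytes=8 * 1024 ** 3):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()

    def get(self, key, load_from_disk):
        if key in self.items:
            self.items.move_to_end(key)        # mark as recently used
            return self.items[key]
        data = load_from_disk(key)             # fall back to the disk array
        self.items[key] = data
        self.used += len(data)
        while self.used > self.capacity and self.items:
            _, evicted = self.items.popitem(last=False)
            self.used -= len(evicted)
        return data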
S130, processing the target video data according to a batch processing mechanism to obtain training data corresponding to the video model.
Specifically, once the matched target video data has been acquired from the distributed storage data set according to the data identifier corresponding to the current training, the target video data can be batch-processed according to the preset batch processing mechanism carried in the training request. In this embodiment, the various pieces of video-associated content corresponding to the target video data of a batch may be received through an open Transmission Control Protocol (Transmission Control Protocol, TCP) port, and the target video data of the batch is grouped according to the grouping mode required by the training task and loaded into memory, thereby obtaining the training data corresponding to the video model to be trained. When the video model is subsequently trained, the training data can be decoded and preprocessed accordingly and converted into the format specified by the video model to be trained, to facilitate subsequent training.
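The following sketch shows one possible shape of this step: video-associated content for a batch is received over an open TCP port as length-prefixed JSON records and collected in memory. The framing and field layout are assumptions made for the sketch; the embodiment does not fix a wire format here.

import json
import socket
import struct

def receive_batch_records(port, batch_size):
    # Collect one batch of records, each carrying video bytes plus
    # associated content such as file name and label.
    records = []
    with socket.create_server(("", port)) as server:
        conn, _ = server.accept()
        with conn:
            while len(records) < batch_size:
                header = conn.recv(4)
                if len(header) < 4:
                    break
                (length,) = struct.unpack("!I", header)
                payload = b""
                while len(payload) < length:
                    chunk = conn.recv(length - len(payload))
                    if not chunk:
                        break
                    payload += chunk
                if len(payload) != length:
                    break
                records.append(json.loads(payload))
    return records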
When the scheme provided by this embodiment is used to supply training data to the video model to be trained, the corresponding target video data can be obtained simply from the data identifier of the current training; the video data does not need to be downloaded and packed in advance, and in distributed training the video data participating in the training does not need to be copied to every machine involved, which greatly shortens the training-data preparation time of model training. Meanwhile, the video data stored in the distributed storage data set is compressed, so downloading the target video data from the distributed storage data set into memory according to the data identifier occupies neither a large amount of the training equipment's bandwidth nor its own disk input/output resources. The subsequent decoding and preprocessing of the training data mainly occupy the CPU and can run in parallel with the training of the video model, which occupies the graphics processing unit (graphics processing unit, GPU), so no extra time has to be spent waiting for data preprocessing; the video model training time is correspondingly reduced, and the hardware utilization of the training equipment is greatly improved. At the same time, the time-consuming data preparation and data preprocessing are standardized, and a set of flexible, customizable batch generation interfaces is provided, so that an algorithm engineer can concentrate on improving the video model or the training method instead of spending a great deal of time processing data. The end-to-end training mode also lets the video model fit the characteristics of the service data more closely, so a better model can be trained and the flexibility of video model training is improved.
According to the technical scheme provided by this embodiment, the matched target video data is acquired from the distributed storage data set according to the data identifier in the training request, and the target video data is processed according to the preset batch processing mechanism, so that no large amount of time needs to be spent writing data processing functions and the training data corresponding to the video model to be trained is obtained. The model is trained directly on the video data, the occupied storage space is small, little time is spent reading the video data, and the training efficiency of the video model is improved.
Example two
FIG. 2A is a flowchart of a data supply method according to a second embodiment of the present invention, and FIG. 2B is a schematic diagram of a data supply process according to the second embodiment of the present invention. This embodiment is an optimization of the technical solution provided in the foregoing embodiment. Specifically, this embodiment mainly explains in detail the process of acquiring the target video data from the distributed storage data set.
Optionally, as shown in fig. 2A, the present embodiment may include the following steps:
s210, acquiring a training request aiming at a video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training.
S220, determining the type of the data identifier, and if the data identifier is a single video identifier, acquiring matched single video data from a distributed storage data set according to the single video identifier; and if the data identifier is a packaged video identifier, acquiring matched packaged video data from the distributed storage data set according to the packaged video identifier.
Specifically, the distributed storage data set offers two different data storage strategies, access by single video sample and packed access to video, and the data identifiers of video data stored under the different strategies can be chosen according to the training task. Therefore, when the training request is parsed to obtain the data identifiers of the video data participating in the current training, the type of each data identifier has to be judged first. If the data identifier is a single video identifier, the matched single video data is acquired directly from the distributed storage data set according to the single video identifier, where the single video data includes single video data stored on a local disk or anywhere on the network. If the data identifier is a packed video identifier, the matched packed video data is acquired directly from the distributed storage data set according to the packed video identifier, including video data packages stored on a local disk or on a distributed file system such as HDFS/FastDFS. The matched target video data participating in the current training is thereby obtained.
Optionally, the video data stored in the distributed storage data set includes internal video resources stored on a local disk or a distributed file system, as well as external video resources stored anywhere on the network. Acquiring the matched target video data from the distributed storage data set according to the data identifier therefore includes: if the target video data is internal video data, acquiring the matched target video data from the distributed storage data set according to the data identifier; and if the target video data is external video data, acquiring the matched target video data from the distributed storage data set according to the data identifier after the distributed storage data set has first acquired the matched target video data from the external network.
Specifically, when the target video data is acquired, it is first judged from the data identifier whether the matched target video data is internal video data, by querying whether the corresponding target video data exists in the local file system, the CDN cluster, the HDFS cluster and the FastDFS cluster of the distributed storage data set. If it exists, the target video data is internal video data, and the matched target video data is acquired directly from the distributed storage data set. If it does not exist, the target video data is external video data. In order to avoid the traffic cost of frequently accessing the external video and to improve the access speed, as shown in FIG. 2B, the distributed storage data set can acquire the matched target video data from the corresponding external network according to the data identifier, store it in the distributed storage data set, and send it to the corresponding data supply end for downloading, so that the data supply end acquires the matched target video data from the distributed storage data set according to the data identifier after the data set has acquired it from the external network. Specifically, the external video data in this embodiment is stored in a cache of the distributed storage data set.
When the target video data to be accessed is video data on the public network, the distributed storage data set obtains the target video data from its source address on the first access and stores it in the CDN cache; when the same external video data needs to be accessed again later, the target video data can be read directly from the CDN cache in the distributed storage data set, which saves download traffic and greatly increases the download speed of the target video data. In addition, video data that has already been downloaded to a local disk in this embodiment can be uploaded to the FastDFS dedicated to the distributed storage data set, so that every training machine can obtain the target video data uniformly from the FastDFS during distributed training, avoiding the overhead of copying a large amount of video data to each training machine.
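A compact sketch of the internal/external lookup and the cache-on-first-access behaviour described above might look as follows; the cache directory and helper names are assumptions, and a real deployment would rely on the CDN/FastDFS machinery rather than a local directory.

import os
import urllib.request

CACHE_DIR = "/data/dataset_cache"      # assumed cache location

def fetch_target_video(data_id, internal_index):
    # internal_index maps identifiers of internally stored videos to paths.
    if data_id in internal_index:       # internal video data
        return internal_index[data_id]

    # External video data: download once, then serve from the cache.
    cached = os.path.join(CACHE_DIR, os.path.basename(data_id))
    if not os.path.exists(cached):      # first access pulls from the source
        os.makedirs(CACHE_DIR, exist_ok=True)
        urllib.request.urlretrieve(data_id, cached)
    return cached                       # later accesses hit the cache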
Optionally, the data storage manner in this embodiment is not limited to HDFS, FastDFS and NFS; any distributed data storage protocol or network data protocol may be used. Likewise, the data caching network is not limited to a CDN; any protocol that provides data caching and load balancing may be used.
S230, processing the target video data according to a batch processing mechanism to obtain training data corresponding to the video model.
According to the technical scheme provided by this embodiment, the matched target video data is acquired from the distributed storage data set in two different ways, through single video identifiers and through packed video identifiers, which suits different training tasks and improves the flexibility of acquiring training data. Meanwhile, external video data is cached in the distributed storage data set, which increases the download speed of the target video data and improves the training efficiency of the video model.
Example III
FIG. 3A is a flowchart of a model training method according to a third embodiment of the present invention. This embodiment is applicable to any case in which a video model is trained. The model training method provided by the embodiment of the present invention may be performed by the model training device provided by the embodiment of the present invention; the device may be implemented in software and/or hardware and integrated into the equipment that performs the method, where the equipment may be any intelligent terminal with the corresponding data processing capability.
Optionally, the present embodiment may include the following steps:
s310, training data corresponding to the video model is obtained according to the data supply method.
Specifically, the data supply method here is the data supply method provided in any other embodiment of the present invention. In this embodiment, the training data corresponding to the video model to be trained can be obtained with the data supply method of the foregoing embodiments, with the same beneficial effects as described there.
S320, inputting training data into the video model to obtain a trained video model.
Optionally, after the training data corresponding to the video model is obtained, the training data can be input directly into the video model to be trained, and the video model is trained with an existing neural network training method to obtain a trained video model, so that the trained video model can accurately achieve the corresponding video processing purpose for any video data.
Illustratively, as shown in fig. 3B, the training data is input into the video model in this embodiment, which may specifically include: decoding training data by using multiple threads; preprocessing the decoded training data; and inputting the preprocessed training data into the video model.
Specifically, after the training data corresponding to the video model is obtained, the training data can be loaded into the memory of the training machine and decoded with multiple threads into the specified format matching the video model for subsequent training. The decoded training data is then preprocessed; the preprocessing may include applying data enhancement to the training data and converting it into the format required by the video model. The preprocessed training data is input into the video model to be trained, and the trained video model is obtained. The video data types supported for decoding in this embodiment include all video and audio file formats; decoding may be performed on the CPU or the GPU, decoding can be specified to start from any position in the video data, decoded frames can be output at a specified frame rate (Frames Per Second, FPS), and aligned video and audio streams can be output. In addition to the video frames and the audio stream, information such as the original FPS, the frame width, the frame height, the video playing duration, the code rate, whether the video data contains an audio stream, and the presentation time stamp (Presentation Time Stamp, PTS) of each frame can also be provided. The training data is preprocessed after decoding, so that the RGB images and the pulse code modulation (Pulse Code Modulation, PCM) audio stream contained in the decoded training data undergo data enhancement and are then converted into the format required by the video model. The preprocessing of video frames may include common random cropping, random brightness, random contrast, random scaling and the like; the preprocessing of the audio stream may include random gain transformation, logarithmic spectrum, Mel spectrum, random cropping, superimposing two pieces of audio at a specified energy ratio, and the like. A random FPS conversion function is supported during decoding, and after data enhancement the data can be transposed and output according to a specified dimension arrangement, so that training data satisfying the training requirements of the video model is obtained. Furthermore, after the training data is decoded and preprocessed, this embodiment can output the processed training data in NumPy array format, which meets the input requirements of mainstream video models. For video models (such as mxnet) that expose a GPU operation interface, this embodiment also supports loading the processed training data directly into the GPU in the format required by the GPU, with this operation running in parallel with training, which saves the time otherwise spent waiting for the training data to be loaded into the GPU during video model training. In this embodiment all computation and input/output run in parallel, so the resources of the training machine can be fully utilized and the processing speed is maximized.
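To show how the decode-and-preprocess pipeline described above could fit together, the sketch below decodes several videos in parallel threads, applies a random crop and brightness jitter, and stacks the result into a NumPy batch. The decoder is a placeholder (a real one would call FFmpeg or OpenCV on the CPU or GPU), and the shapes and parameters are illustrative assumptions.

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def decode_video(path, num_frames=16, height=256, width=256):
    # Placeholder decoder: random frames stand in for real decoded output so
    # the pipeline is runnable; a real implementation returns aligned video
    # and audio streams decoded from the file at `path`.
    return np.random.randint(0, 256, (num_frames, height, width, 3), dtype=np.uint8)

def preprocess(frames, crop=224):
    # Random crop and random brightness gain, then convert to a model layout.
    _, h, w, _ = frames.shape
    y = np.random.randint(0, h - crop + 1)
    x = np.random.randint(0, w - crop + 1)
    clip = frames[:, y:y + crop, x:x + crop, :].astype(np.float32)
    clip = np.clip(clip * np.random.uniform(0.8, 1.2), 0, 255) / 255.0
    return np.transpose(clip, (3, 0, 1, 2))      # (C, T, H, W) arrangement

def build_batch(paths, workers=4):
    # Decode and preprocess in parallel threads, then stack into one batch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        clips = list(pool.map(lambda p: preprocess(decode_video(p)), paths))
    return np.stack(clips)                        # shape (N, C, T, H, W)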
This embodiment can process the video frames and the audio streams contained in the video data at the same time, and a video model with better metrics and better performance can then be trained on the synchronized video frames and audio information, which simplifies the training of multi-modal video models. Meanwhile, the audio data processing in this embodiment supports a number of common audio preprocessing methods and is compatible with most publicly available audio processing approaches; by simply specifying a few parameters, the data supply module can output data that meets the input requirements of open-source models, which greatly simplifies the steps needed to verify the performance of an open-source model.
According to the technical scheme provided by this embodiment, the training data corresponding to the video model is obtained through the data supply method described above, and the training data is input into the video model to be trained for training, which guarantees the training efficiency of the video model and improves its performance.
Example IV
Fig. 4 is a schematic structural diagram of a data supply device according to a fourth embodiment of the present invention, and specifically, as shown in fig. 4, the device may include:
a training request obtaining module 410, configured to obtain a training request for a video model, where the training request includes a preset batch processing mechanism and a data identifier corresponding to the training;
A target data acquisition module 420, configured to acquire matching target video data from a distributed storage data set according to a data identifier, where the distributed storage data set includes video data of each type;
the training data determining module 430 is configured to process the target video data according to a batch processing mechanism to obtain training data corresponding to the video model.
According to the technical scheme provided by this embodiment, the matched target video data is acquired from the distributed storage data set according to the data identifier in the training request, and the target video data is processed according to the preset batch processing mechanism, so that no large amount of time needs to be spent writing data processing functions and the training data corresponding to the video model to be trained is obtained. The model is trained directly on the video data, the occupied storage space is small, little time is spent reading the video data, and the training efficiency of the video model is improved.
Further, the target data obtaining module 420 may be specifically configured to:
if the data identification is a single video identification, acquiring matched single video data from the distributed storage data set according to the single video identification;
and if the data identifier is a packaged video identifier, acquiring matched packaged video data from the distributed storage data set according to the packaged video identifier.
Further, the target data obtaining module 420 may be further specifically configured to:
if the target video data is internal video data, acquiring matched target video data from a distributed storage data set according to the data identification;
and if the target video data is the external video data, the distributed storage data set acquires the matched target video data from the distributed storage data set according to the data identification after the distributed storage data set acquires the matched target video data from the external network.
Further, the external video data is stored in a cache of the distributed storage data set.
Further, the batch processing mechanism includes a grouping mode for the target video data.
The data supply device provided in this embodiment is applicable to the data supply method provided in any of the above embodiments, and has corresponding functions and beneficial effects.
Example five
Fig. 5 is a schematic structural diagram of a model training device provided in a fifth embodiment of the present invention, and specifically, as shown in fig. 5, the device may include:
the training data obtaining module 510 is configured to obtain training data corresponding to the video model according to the data supplying method in any embodiment of the present invention;
the video model training module 520 is configured to input training data into a video model to obtain a trained video model.
According to the technical scheme provided by this embodiment, the training data corresponding to the video model is obtained through the data supply method described above, and the training data is input into the video model to be trained for training, which guarantees the training efficiency of the video model and improves its performance.
Further, the video model training module 520 may be specifically configured to:
decoding training data by using multiple threads;
preprocessing the decoded training data;
and inputting the preprocessed training data into the video model.
The model training device provided by the embodiment is applicable to the model training method provided by any embodiment, and has corresponding functions and beneficial effects.
Example six
Fig. 6 is a schematic diagram of a data supply system according to a sixth embodiment of the present invention. The present embodiment is mainly described in detail with respect to a training data supply process of a video model. Referring to fig. 6, the data supply system 60 of the present embodiment may include a distributed data store 610, a batch load 620, and a data supply 630 connected to the distributed data store 610 and the batch load 620, respectively.
Wherein the distributed data storage end 610 stores a distributed storage data set; the batch loading end 620 stores the batch processing mechanism and generates a training request; the data supply terminal 630 is provided with the data supply device provided in any of the embodiments of the present invention.
In particular, for the construction principles of the distributed data storage end 610, the batch loading end 620 and the data supply end 630 included in the data supply system 60, reference is made to the description of the data supply method provided in the embodiments of the present invention; they are not described in detail here again.
Example seven
Fig. 7 is a schematic diagram of a model training system according to a seventh embodiment of the present invention. The present embodiment is mainly described in detail with respect to a training data supply process of a video model. Referring to FIG. 7, the model training system 70 of the present embodiment may include a distributed data store 710, a batch load 720, and a model training 730 coupled to the distributed data store 710 and the batch load 720, respectively.
Wherein the distributed data storage end 710 stores a distributed storage data set; the batch loading end 720 stores a batch processing mechanism and generates a training request; the model training end 730 is provided with the model training device provided in any embodiment of the present invention.
In particular, for the construction principles of the distributed data storage end 710, the batch loading end 720 and the model training end 730 included in the model training system 70, reference is made to the description of the model training method provided in the embodiments of the present invention; they are not described in detail here again.
Example eight
FIG. 8 is a schematic structural diagram of a device according to an eighth embodiment of the present invention. As shown in FIG. 8, the device includes a processor 80, a storage device 81 and a communication device 82; the number of processors 80 in the device may be one or more, and one processor 80 is taken as an example in FIG. 8; the processor 80, the storage device 81 and the communication device 82 in the device may be connected by a bus or in another manner, and connection by a bus is taken as an example in FIG. 8.
The storage device 81 is a computer readable storage medium, and may be used to store a software program, a computer executable program, and a module, such as a program instruction/module corresponding to a data supply method or a model training method provided in an embodiment of the present invention. The processor 80 executes various functional applications of the apparatus and data processing, that is, implements the above-described data supply method or model training method, by running software programs, instructions, and modules stored in the storage 81.
The storage device 81 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functions; the storage data area may store data created according to the use of the terminal, etc. Further, the storage 81 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, storage 81 may further include memory located remotely from processor 80, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The communication means 82 may be used to enable a network connection or a mobile data connection between devices.
The device provided by the embodiment can be used for executing the data supply method or the model training method provided by any embodiment, and has corresponding functions and beneficial effects.
Example nine
The ninth embodiment of the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, can implement the data supply method in any of the above embodiments. The method specifically comprises the following steps:
acquiring a training request aiming at a video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training;
acquiring matched target video data from a distributed storage data set according to the data identification, wherein the distributed storage data set comprises video data of various types;
and processing the target video data according to a batch processing mechanism to obtain training data corresponding to the video model.
Alternatively, the method for training a model in any of the above embodiments may specifically include:
according to the data supply method in any embodiment of the invention, training data corresponding to the video model is obtained;
And inputting the training data into the video model to obtain a trained video model.
Of course, the storage medium containing the computer executable instructions provided in the embodiments of the present invention is not limited to the method operations described above, and may also perform the related operations in the data supply method or the model training method provided in any embodiment of the present invention.
From the above description of embodiments, it will be clear to a person skilled in the art that the present invention may be implemented by means of software and necessary general purpose hardware, but of course also by means of hardware, although in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, etc., and include several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data supply device or the model training device, each unit and module included are only divided according to the functional logic, but are not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, and various modifications and variations may be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A data supply method, comprising:
acquiring a training request aiming at a video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training;
acquiring matched target video data from a distributed storage data set according to the data identifier, wherein the distributed storage data set comprises video data of various types;
processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model;
The processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model includes:
and receiving various video associated contents corresponding to the target video data in the corresponding batch through an open transmission control protocol port, processing the target video data in the batch according to the batch processing mechanism, and loading the processed target video data into a memory to obtain training data corresponding to the video model.
2. The method of claim 1, wherein said obtaining matching target video data in a distributed stored dataset according to said data identification comprises:
if the data identifier is a single video identifier, acquiring matched single video data from a distributed storage data set according to the single video identifier;
and if the data identifier is a packaged video identifier, acquiring matched packaged video data from a distributed storage data set according to the packaged video identifier.
3. The method according to claim 1 or 2, wherein said obtaining matching target video data in a distributed storage dataset according to said data identification comprises:
if the target video data are internal video data, acquiring matched target video data from a distributed storage data set according to the data identification;
And if the target video data is external video data, the distributed storage data set acquires the matched target video data from the distributed storage data set according to the data identifier after the distributed storage data set acquires the matched target video data from the external network.
4. A method according to claim 3, wherein the external video data is stored in a cache of the distributed storage data set.
5. The method according to claim 1 or 2, wherein the batching mechanism comprises a grouping of the target video data.
6. A method of model training, comprising:
the data supply method according to any one of claims 1 to 5, obtaining training data corresponding to a video model;
and inputting the training data into the video model to obtain a trained video model.
7. The method of claim 6, wherein said inputting the training data into the video model comprises:
decoding the training data using multithreading;
preprocessing the decoded training data;
and inputting the preprocessed training data into the video model.
8. A data supply device, comprising:
The training request acquisition module is used for acquiring a training request aiming at the video model, wherein the training request comprises a preset batch processing mechanism and a data identifier corresponding to the training;
the target data acquisition module is used for acquiring matched target video data from a distributed storage data set according to the data identification, wherein the distributed storage data set comprises video data of various types;
the training data determining module is used for processing the target video data according to the batch processing mechanism to obtain training data corresponding to the video model;
the training data determining module is specifically configured to receive various video related contents corresponding to the target video data in a corresponding batch through an open transmission control protocol port, process the target video data in the batch according to the batch processing mechanism, and load the processed target video data into a memory to obtain training data corresponding to the video model.
9. A model training device, comprising:
a training data acquisition module, configured to obtain training data corresponding to the video model according to the data supply method of any one of claims 1 to 5;
and the video model training module is used for inputting the training data into the video model to obtain a trained video model.
10. A data supply system, comprising: the data supply end is respectively connected with the distributed data storage end and the batch loading end;
the distributed data storage end stores a distributed storage data set; the batch loading end stores a batch processing mechanism and generates a training request; the data supply terminal is provided with the data supply device as claimed in claim 8.
11. A model training system, comprising: the model training end is respectively connected with the distributed data storage end and the batch loading end;
the distributed data storage end stores a distributed storage data set; the batch loading end stores a batch processing mechanism and generates a training request; the model training end is provided with the model training apparatus according to claim 9.
12. An electronic device, the device comprising:
one or more processors;
a storage means for storing one or more programs;
when executed by the one or more processors, causes the one or more processors to implement the data supply method of any one of claims 1-5, or the model training method of claim 6 or 7.
13. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements a data supply method according to any one of claims 1-5 or a model training method according to claim 6 or 7.
CN201910197522.8A 2019-03-15 2019-03-15 Data supply method, model training method, device, system, equipment and medium Active CN109977822B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910197522.8A CN109977822B (en) 2019-03-15 2019-03-15 Data supply method, model training method, device, system, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910197522.8A CN109977822B (en) 2019-03-15 2019-03-15 Data supply method, model training method, device, system, equipment and medium

Publications (2)

Publication Number Publication Date
CN109977822A CN109977822A (en) 2019-07-05
CN109977822B true CN109977822B (en) 2023-05-09

Family

ID=67079035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910197522.8A Active CN109977822B (en) 2019-03-15 2019-03-15 Data supply method, model training method, device, system, equipment and medium

Country Status (1)

Country Link
CN (1) CN109977822B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427998A (en) * 2019-07-26 2019-11-08 上海商汤智能科技有限公司 Model training, object detection method and device, electronic equipment, storage medium
CN112395070A (en) * 2019-08-12 2021-02-23 阿里巴巴集团控股有限公司 Data processing system and method
CN110912926B (en) * 2019-12-04 2022-03-25 湖南快乐阳光互动娱乐传媒有限公司 Data resource back-source method and device
DE102020204033A1 (en) 2020-03-27 2021-09-30 Continental Automotive Gmbh Computer implemented method and distributed storage system for providing trusted data objects
CN114697682A (en) * 2020-12-29 2022-07-01 阿里巴巴集团控股有限公司 Video processing method and system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6940540B2 (en) * 2002-06-27 2005-09-06 Microsoft Corporation Speaker detection and tracking using audiovisual data
CN102222213B (en) * 2010-07-29 2013-08-07 郑文明 Distributed vision computing method based on open type Web Service framework
US8804815B2 (en) * 2011-07-29 2014-08-12 Dialogic (Us) Inc. Support vector regression based video quality prediction
US9204103B1 (en) * 2011-12-30 2015-12-01 Emc Corporation Technique for parallel, distributed video processing
US10402448B2 (en) * 2017-06-28 2019-09-03 Google Llc Image retrieval with deep local feature descriptors and attention-based keypoint descriptors
CN107741899B (en) * 2017-10-16 2021-07-27 北京小米移动软件有限公司 Method, device and system for processing terminal data
CN108108754B (en) * 2017-12-15 2022-07-22 北京迈格威科技有限公司 Training and re-recognition method, device and system for re-recognition network
CN108829518B (en) * 2018-05-31 2020-01-03 北京百度网讯科技有限公司 Method and device for pushing information
CN108876166A (en) * 2018-06-27 2018-11-23 平安科技(深圳)有限公司 Financial risk authentication processing method, device, computer equipment and storage medium
CN109284417B (en) * 2018-08-27 2022-11-22 广州飞磨科技有限公司 Video pushing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109977822A (en) 2019-07-05

Similar Documents

Publication Publication Date Title
CN109977822B (en) Data supply method, model training method, device, system, equipment and medium
US11374995B2 (en) Multimedia file processing
CN111277869B (en) Video playing method, device, equipment and storage medium
CN108933805A (en) A kind of document transmission method and system
CN110297944B (en) Distributed XML data processing method and system
CN111026982A (en) Intelligent contract processing method, computer equipment and storage medium
EP1864214A1 (en) Method and apparatus for configuring software resources for playing network programs
CN107193674B (en) Method and device for processing online push message
CN109634738A (en) Asynchronous processing method, server, storage medium and device based on micro services
CN101388892A (en) Method and apparatus for client-side aggregation of asynchronous fragmented requests
CN113312032B (en) Front-end project resource updating method and device, electronic equipment and storage medium
US20170289583A1 (en) Method and system for optimizing publication of live broadcasting message
CN113034629A (en) Image processing method, image processing device, computer equipment and storage medium
CN111240564A (en) Material display method and device, electronic equipment and storage medium
CN115292020A (en) Data processing method, device, equipment and medium
US9614900B1 (en) Multi-process architecture for a split browser
CN117041623A (en) Digital person live broadcasting method and device
CN118227343A (en) Data processing method, system, device, equipment, medium and product
CN115905061A (en) Data transfer device, DMA device, electronic apparatus, and data transfer method
CN107147706A (en) Data export method and device
CN116226045A (en) File data aggregation method, file data aggregation device and query system
CN112019689A (en) Incoming call show service processing system and method
CN110719303B (en) Containerization NRF method and system
CN112001156A (en) Form processing method and device and computer readable storage medium
CN108809900B (en) Framework and method for unified resource access

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant