WO2020093694A1 - Method for generating a video analysis model, and video analysis system

Method for generating a video analysis model, and video analysis system

Info

Publication number: WO2020093694A1
Authority: WIPO (PCT)
Application number: PCT/CN2019/090291
Other languages: English (en), Chinese (zh)
Prior art keywords: model, video analysis, training, subsystem, video
Inventors: 邓真渝, 唐朋成
Applicant / original assignee: Huawei Technologies Co., Ltd. (华为技术有限公司)

Classifications

    • G06F18/241 — Physics; computing; electric digital data processing; pattern recognition; analysing; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • H04N23/76 — Electricity; electric communication technique; pictorial communication, e.g. television; cameras or camera modules comprising electronic image sensors; circuitry for compensating brightness variation in the scene by influencing the image signals
    • G06V2201/08 — Image or video recognition or understanding; indexing scheme; detecting or categorising vehicles
    • G06V40/10 — Image or video recognition or understanding; recognition of biometric, human-related or animal-related patterns; human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/172 — Human faces, e.g. facial parts, sketches or expressions; classification, e.g. identification
    • G06V40/20 — Movements or behaviour, e.g. gesture recognition

Definitions

  • This application relates to the field of video, and in particular to a method for generating a video analysis model and a video analysis system.
  • Video analysis technology can be divided into the following four aspects.
  • Target detection mainly solves the problem of where the targets are located in an image.
  • Region of interest (ROI) bounding boxes are produced for all targets in the image.
  • Target tracking finds the target in an image sequence, establishes a one-to-one correspondence for the same target across different frames, and outputs the target trajectory.
  • Target attribute analysis uses structured analysis technology to extract the attribute details of a target in the image, such as a person's gender, age, and clothing, or a vehicle's model and brand.
  • Target recognition matches features extracted from the image against an object model library and outputs the category of the object.
  • A typical application is face recognition: based on a face feature extraction model, a face feature vector is extracted from the input face image and then matched by similarity against the face feature vectors in a database, implementing face watchlist control, face search, and face matching functions.
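As an illustration of the similarity-matching step just described, the following is a minimal Python sketch. The feature extractor, database layout, and threshold value are assumptions for the example, not specifics from this application:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two face feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_face(query_vec: np.ndarray, database: dict, threshold: float = 0.6):
    """Return the best-matching identity, or None if no score reaches the threshold.

    `database` maps identity -> stored face feature vector (assumed layout).
    """
    best_id, best_score = None, -1.0
    for identity, stored_vec in database.items():
        score = cosine_similarity(query_vec, stored_vec)
        if score > best_score:
            best_id, best_score = identity, score
    return (best_id, best_score) if best_score >= threshold else (None, best_score)
```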
  • The video analysis models that implement the above functions in a video analysis system are mostly obtained using machine learning methods for image processing and analysis.
  • Among them, the convolutional neural network (CNN) model from deep learning is widely used because of its better performance.
  • Video analysis models can be deployed both on cloud servers and on terminals.
  • Current video analysis models are trained offline and then deployed to the user's actual (online) scenario. Once a model is deployed online, its parameters are no longer updated, and inference is completed directly with the fixed parameters.
  • the embodiments of the present application provide a method and a video analysis system for generating a video analysis model, which can effectively use data collected in an actual scene for online training to improve the performance of the video analysis model.
  • the embodiments of the present application provide a method for generating a video analysis model, which is applied to a video analysis system.
  • Video stream data is first received and parsed to obtain unlabeled image data, which may include corresponding structured information; the low-quality images in the unlabeled image data are then filtered out to obtain images whose quality meets the model training requirements, and the retained images are annotated to produce a data set for model training.
  • the data set includes a training set and a test set.
  • an online training algorithm is used to train the training set in the data set online to obtain a video analysis model.
  • the video analysis model is released.
  • The method for generating a video analysis model provided by an embodiment of the present application can use video data collected by video surveillance front-end devices to automatically extract, clean, and annotate images to obtain a data set for model training, and then train on this data set with a model online training algorithm to obtain a video analysis model.
  • The method provided in the embodiments of the present application does not require excessive manual participation, automates the model training process, and can be deployed in a video analysis system to achieve online training and real-time optimization of the model, avoiding performance degradation of the video analysis model.
  • The training set may also be trained online based on an original video analysis model using transfer learning or incremental training algorithms, where the original video analysis model is an externally input video analysis model or a previous version of the video analysis model saved by the video analysis system.
  • In this way, the original video analysis model can be optimized according to the image data obtained from the actual video surveillance scene, maintaining the performance of the video analysis model.
  • The online training algorithm may be one or more of incremental training, transfer learning, knowledge distillation, and meta-learning.
  • The model training hyperparameters can be adjusted automatically, and the hyperparameters used for video analysis model training can be generated automatically according to the deployment scenario of the video analysis model, where the hyperparameters include one or more of the learning rate decay step size, total training step size, base learning rate, and batch size.
  • Publishing the video analysis model includes: testing the performance of the trained video analysis model, generating a model performance analysis report, and, when the performance of the trained model meets the usage standard, publishing the trained video analysis model to the video analysis system. Testing the performance of the model before publishing it allows further filtering for models that meet the standard, avoiding the situation where a poorly performing model is published to the video analysis system and video analysis performance declines.
  • Publishing the video analysis model obtained by online training specifically means that, when the existing video analysis model is in an updatable state, the model obtained by online training replaces the existing video analysis model in the system. The existing video analysis model can also be saved to facilitate version rollback.
  • Publishing the video analysis model also includes using the video analysis model obtained by online training for the online training of new video analysis models.
  • The performance of the video analysis model can be tested using a test set provided by the user or a test set drawn from the cleaned and annotated data set.
  • The performance analysis report of the video analysis model can include one or more of the inference delay, memory consumption, and accuracy measured on the test set.
  • an embodiment of the present application provides a video analysis system.
  • The video analysis system has the function of implementing the method described in the first aspect. This function can be realized by hardware, or by hardware executing corresponding software.
  • the hardware or software includes one or more subsystems corresponding to the above functions.
  • an embodiment of the present application provides a video analysis system, including: a processor, a memory, a bus, and a communication interface; the memory is used to store computer-executed instructions, and the processor and the memory are connected through the bus.
  • The processor executes the computer-executable instructions stored in the memory, so that the video analysis system executes the method for generating a video analysis model described in any one of the first aspect above.
  • An embodiment of the present application provides a computer-readable storage medium having instructions stored therein that, when run on a computer, enable the computer to perform the method for generating a video analysis model according to any one of the first aspect above.
  • An embodiment of the present application provides a computer program product containing instructions that, when run on a computer, enable the computer to execute the method for generating a video analysis model according to any one of the first aspect.
  • FIG. 1 is a schematic structural diagram of a video analysis system provided by an embodiment of this application.
  • FIG. 2 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • FIG. 3 is another schematic structural diagram of a video analysis system provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of a video analysis method for online optimization of a video analysis model provided by an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of an online optimization of a target recognition model provided by an embodiment of this application.
  • FIG. 6 is a schematic flowchart of online optimization of a target attribute model provided by an embodiment of the present application.
  • Model training in current video analysis technology is completed offline, after which the model is deployed to the user's actual scene. Once the model is deployed online, its parameters are no longer updated, and video analysis must be completed using the fixed model parameters. To update the model parameters, incremental training must be performed offline on a labeled data set to obtain a new model, which is then uploaded to replace the online model.
  • In view of this, the embodiments of the present application provide a video analysis method and system that support online optimization, which can automatically clean and label the video data obtained in the actual scene after the model is deployed there, and use the labeled data for online model training to alleviate the performance degradation of the model in the actual scene.
  • a possible video analysis system that supports online optimization includes a video analysis subsystem, a data management subsystem, an online training subsystem, and a model evaluation and publishing subsystem.
  • the video analysis subsystem can support the basic functions of decoding the video stream, as well as the storage and detection of video data and its decoded images. It can also include a video analysis model for various complex functions such as target detection, tracking, and recognition.
  • the video data acquired by the front-end in real time is transmitted to the video analysis system.
  • The video analysis subsystem decodes the received video data into an image sequence through its video decoding tools; the images can be stored through a storage area network (SAN) or similar, and image retrieval based on information such as time and video source can be achieved with algorithms such as depth-first search (DFS).
  • The video analysis subsystem can also support intelligent analysis of the decoded original image sequence; for example, the video analysis model is used to extract target cutouts, target trajectories, or target structured information (such as attributes and feature vectors). The video analysis subsystem can store and manage the intelligent analysis results in a unified manner, and also supports retrieving the stored images, analysis results, and other information when needed.
  • The function of the video analysis subsystem in this solution can be realized by an original video analysis system, a new video analysis function can be added on the basis of an original video analysis system, or a separate new video analysis system can be provided as the video analysis subsystem in the embodiment of the present application.
  • the data management subsystem is used to manage the data set used for online training of the video analysis model, for example, to automatically clean, mark, and store unlabeled image data acquired by the video subsystem.
  • the image data obtained by the video analysis subsystem parsing the video stream cannot be directly used for online training of the video analysis model.
  • the data management subsystem needs to automatically clean and annotate the image data generated by the video analysis subsystem and its structured information.
  • The data management subsystem cleans the unlabeled images to filter out those whose quality does not meet the training requirements, selecting the image data whose clarity and target shooting angle meet the conditions for online training of the video analysis model.
  • Image cleaning usually filters out blurred images, images with inappropriate shooting angles, and images with occlusion, as sketched below.
  • Evaluation indicators such as the Laplace operator can be used to filter out blurred images.
  • The Euler angle detection model can be used to output the angle of the target in the image.
  • A target angle threshold can then be set to filter out large-angle images, and a trained occlusion classifier can be used to filter out images in which the target is occluded.
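The following is a minimal Python sketch of such a cleaning filter, assuming OpenCV for the Laplacian sharpness check; the Euler-angle model and occlusion classifier are represented only by their outputs, and all thresholds are illustrative assumptions:

```python
import cv2

def is_blurred(image_bgr, blur_threshold: float = 100.0) -> bool:
    # Variance of the Laplacian is a common sharpness indicator:
    # low variance means few edges, i.e. a blurred image.
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() < blur_threshold

def keep_for_training(image_bgr, yaw_deg: float, pitch_deg: float,
                      occluded: bool, max_angle: float = 30.0) -> bool:
    # yaw/pitch would come from an Euler-angle detection model and `occluded`
    # from a trained occlusion classifier; both are assumed upstream components.
    if is_blurred(image_bgr):
        return False
    if abs(yaw_deg) > max_angle or abs(pitch_deg) > max_angle:
        return False
    return not occluded
```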
  • Labeling marks the corresponding attribute information (such as identity and external features) on the image data whose quality has been found to meet the requirements of online training.
  • The data management subsystem can use its own integrated series of algorithms and strategies, and can also call other video analysis models in the video analysis system, to clean and label the data and generate a data set for online training of the model.
  • The data set can be divided into a training set and a test set.
  • Multi-model fusion evaluation can be used for pre-annotation, and the structured information of the image can be used to verify and correct the annotation results, as follows (see the sketch after this list): 1. use multiple different models with the same function to vote on the input image data and determine the pre-annotation result; 2. use the structured information of the image to verify and correct the pre-annotation result.
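A minimal sketch of the two-step pre-annotation just described, assuming each model is a callable mapping an image to a label and that the structured information is a plain dictionary with an illustrative field name:

```python
from collections import Counter

def vote_pre_annotation(image, models, min_agreement: int = 2):
    # Step 1: several same-function models vote on the image; the majority
    # label wins if it reaches `min_agreement`, otherwise the image is
    # routed to manual labeling or discarded (returns None).
    votes = Counter(model(image) for model in models)
    label, count = votes.most_common(1)[0]
    return label if count >= min_agreement else None

def verify_annotation(label, structured_info: dict) -> bool:
    # Step 2: cross-check the voted label against the structured information
    # parsed from the video stream ("attribute" is an assumed field name).
    return structured_info.get("attribute", label) == label
```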
  • The data management subsystem can automatically clean and label the image data. If it cannot determine whether some image data is suitable for online model training, it can hand these data over for manual identification and labeling, where a human decides whether to discard them. Optionally, to improve cleaning and labeling efficiency, the data management subsystem can also choose to discard such image data directly; the user can set the corresponding cleaning and labeling strategies according to actual needs.
  • The online training subsystem is used to perform online training on the model training data generated by the data management subsystem and to output the video analysis model obtained by the online training.
  • Online training algorithms include incremental training, transfer learning, knowledge distillation, and meta-learning algorithms.
  • Model training requires hyperparameters to be configured, such as the learning rate decay step size, total training step size, base learning rate, and batch size.
  • the hyperparameters for model training can be automatically generated according to the deployment environment of the video analysis model, and there are various methods for automatically generating and adjusting hyperparameters.
  • For example, the learning rate can be automatically multiplied by an attenuation coefficient based on the change in the accuracy of the intermediate model on the test set and the current loss; similarly, the initial learning rate for training can be generated automatically based on the accuracy of the original video analysis model on the test set. If there is no original video analysis model, a larger initial learning rate can be set; the batch size depends on the data volume, and a smaller batch size can be set for small samples.
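A sketch of how such heuristics might be encoded; every constant here is an illustrative assumption rather than a value prescribed by this application:

```python
def auto_hyperparams(num_samples: int, base_model_test_acc=None) -> dict:
    """Heuristically generate training hyperparameters (illustrative values only).

    `base_model_test_acc` is the accuracy of the original model on the test
    set, or None when training without an original model.
    """
    if base_model_test_acc is None:
        base_lr = 0.1  # no original model: larger initial learning rate
    else:
        base_lr = 0.01 * (1.0 - base_model_test_acc)  # accurate base -> smaller LR
    batch_size = 8 if num_samples < 1000 else 64      # small batch for small samples
    total_steps = max(1000, 20 * num_samples // batch_size)
    return {"base_lr": base_lr, "batch_size": batch_size,
            "total_steps": total_steps, "lr_decay_step": total_steps // 3}

def decay_lr(lr: float, prev_test_acc: float, curr_test_acc: float,
             factor: float = 0.5) -> float:
    # Multiply the learning rate by an attenuation coefficient when the
    # intermediate model stops improving on the test set.
    return lr * factor if curr_test_acc <= prev_test_acc else lr
```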
  • the online training subsystem can perform online training based on the original video analysis model and output the online video analysis model.
  • The original video analysis model can be an externally input video analysis model, an intermediate model saved when the training process was interrupted, or a previous version of the model saved in the video analysis system.
  • The incremental image data collected in the actual scene is used to tune the internal parameters of the original video analysis model through knowledge distillation or transfer learning.
  • The optimization strategy for different scenarios can be set manually, or it can be adjusted adaptively by the internal framework.
  • the online training subsystem also supports the management and storage of the original video analysis model and the different versions of the video analysis model obtained after online training.
  • A model training algorithm can be selected in the online training subsystem according to the scene needs, data set size, original model category, or performance requirements; for example, the stochastic gradient descent (SGD) algorithm can be used for neural network models, boosting algorithms for decision tree models, and incremental training for support vector machine models to obtain new support vectors. The embodiment of the present application does not limit the specific model training algorithm used by the online training subsystem.
  • The model evaluation and release subsystem is used for performance evaluation and release decisions for the models generated by the online training subsystem.
  • Specifically, the model evaluation and publishing subsystem performs performance tests on a model and outputs a detailed performance test report covering, for example, delay, resource overhead, and accuracy.
  • The user can decide whether to release the video analysis model generated by online training according to the performance test report, or can pre-set a release strategy so that the system automatically decides whether to release the model according to the report.
  • the video analysis system also includes a resource scheduling subsystem, which is used to implement the allocation and scheduling of computing and storage resources.
  • The resource scheduling subsystem can automatically suspend a model training task at an appropriate time according to the resources occupied by training and the currently available resources: it can quickly interrupt model training when the video analysis service needs resources, save the intermediate model, and restart training once computing resources are sufficient again.
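A minimal sketch of this interrupt-and-resume mechanism, using PyTorch checkpointing as an illustrative implementation (the application does not prescribe a framework):

```python
import torch

def save_checkpoint(model, optimizer, step: int, path: str = "checkpoint.pt"):
    # Persist everything needed to resume training after an interruption.
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step}, path)

def resume_checkpoint(model, optimizer, path: str = "checkpoint.pt") -> int:
    # Restore the intermediate model when computing resources become idle.
    state = torch.load(path)
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]  # continue the training loop from this step
```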
  • The video analysis system provided by the embodiment of the present application can decode the video stream acquired by the front end into an image sequence, clean and label the image sequence to obtain an image data set usable for model training, and then perform model training based on this image data set to obtain a video analysis model, continuously improving the performance of the video analysis system.
  • the embodiments of the present application can make full use of the video data collected by the front end to generate a model for video analysis, and avoid the performance degradation of the model generated by offline training after being deployed online.
  • The method provided in the embodiments of the present application uses image data collected in the actual monitoring scene to generate the video analysis model. Even if two different video monitoring scenes start from the same initial training model, the collected image data differ because the monitoring scenes differ, so after a period of time the video analysis models generated through online training will also differ.
  • the video analysis system in the embodiments of the present application can be deployed on a cloud platform, and the front end sends the collected video data to the video analysis system deployed on the cloud platform.
  • The video analysis system processes the video data collected by the front end and trains on it to generate a video analysis model.
  • the video analysis system in the embodiments of the present application may also exist in the form of software and be provided to users.
  • Users install the video analysis system provided in the embodiments of the present application in the back-end analysis system of their video surveillance system and run it to obtain an online-optimized video analysis model.
  • the original video analysis system can also be improved and upgraded to obtain the video analysis system provided by the embodiment of the present application, that is, the function of the online optimization model can be expanded on the basis of the original video analysis system.
  • the original video analysis system can use the generated video analysis model for video analysis.
  • the video analysis system provided in the embodiments of the present application can be provided to users as an independent device, and can support online update of video analysis models.
  • the video analysis system in FIG. 1 can be implemented by the computer device in FIG. 2.
  • FIG. 2 is a schematic diagram of the hardware structure of a computer device provided by an embodiment of the present application.
  • the computer device 200 includes at least one processor 201, a communication bus 202, a memory 203, and at least one communication interface 204.
  • The processor 201 may be a general-purpose central processing unit (CPU), a graphics processing unit (GPU), a neural-network processing unit (NPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits used to control the execution of the program of this application.
  • the communication bus 202 may include a path to transfer information between the aforementioned components.
  • The communication interface 204 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • the communication method can be selected according to the actual application scenario, which is not limited in this application.
  • The memory 203 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • The memory may exist independently and be connected to the processor through the bus, or it may be integrated with the processor.
  • the memory 203 is used to store application program code for executing the solution of the present application, and is controlled and executed by the processor 201.
  • the processor 201 is used to execute the application program code stored in the memory 203, so as to implement the method for generating a video analysis model provided by the following embodiments of the present application.
  • the processor 201 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 2.
  • the computer device 200 may include multiple processors, such as the processor 201 and the processor 207 in FIG. 2. Each of these processors may be a single-core processor or a multi-core processor.
  • the processor here may refer to one or more devices, circuits, and / or processing cores for processing data (eg, computer program instructions).
  • the computer device 200 may further include an output device 205 and an input device 206.
  • the output device 205 communicates with the processor 201 and can display information in various ways.
  • The output device 205 may be a liquid crystal display (LCD), a light-emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like.
  • the input device 206 communicates with the processor 201 and can accept user input in a variety of ways.
  • the input device 206 may be a mouse, keyboard, touch screen device, or sensor device.
  • the aforementioned computer device 200 may be a general-purpose computer device or a dedicated computer device.
  • the computer device 200 may be a desktop computer, a portable computer, a network server, a wireless terminal device, an embedded device, or a device with a similar structure as shown in FIG. 3.
  • the embodiment of the present application does not limit the type of the computer device 200.
  • The following further describes the video analysis system and method with reference to FIGS. 1-2, from the aspects of the data flow in the video analysis system and the steps of online optimization, taking the online optimization of a target recognition model and a target attribute model as examples.
  • FIG. 3 shows the data flow direction in the system when the video analysis model is generated in the embodiment of the present application.
  • FIG. 4 shows the process of online optimization of the model in this scheme. The scheme will be described below in conjunction with FIGS. 3 and 4.
  • the process of generating a video analysis model in this solution includes the following steps:
  • S401 Receive and parse the video stream to generate unlabeled image data.
  • the video analysis system can receive the video stream data collected by the front-end camera, and can also receive the video stream data saved by other devices.
  • Various video parsing and decoding tools are integrated in the video analysis subsystem, and the subsystem decodes the received video stream data to obtain unlabeled image sequences, as sketched below.
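A minimal decoding sketch using OpenCV as an illustrative decoder; the frame-sampling interval is an assumption, and `source` may be a file path or a stream URL from a front-end camera:

```python
import cv2

def decode_stream(source, sample_every: int = 25):
    """Decode a video stream or file into an unlabeled image sequence."""
    cap = cv2.VideoCapture(source)
    frames = []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % sample_every == 0:  # e.g. one frame per second at 25 fps
            frames.append(frame)
        index += 1
    cap.release()
    return frames
```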
  • the video analysis subsystem also stores a video analysis model, which can analyze the decoded image sequence, and can support the extraction of target cutouts, target trajectories, and image structured information (such as target attributes, feature vectors, etc.) through the video analysis model.
  • the video analysis subsystem can also store the analysis results in a data warehouse (Data Bank, DB), and the DB is responsible for unified storage management of the analysis results.
  • the video analysis subsystem can also retrieve and store decoded image sequences.
  • Typical retrieval methods can be based on DFS and other algorithms to retrieve information based on time and video sources.
  • Typical storage methods can be SAN and other methods.
  • The video analysis subsystem can also classify and retrieve the stored images and analysis results through a multiple classifier system (MCS).
  • the image sequence obtained by decoding the front-end video cannot be directly used for model training, and needs to be cleaned and labeled.
  • Image cleaning is mainly to filter low-quality images to obtain image data with image quality that meets training requirements.
  • Low-quality images include blurred images, large-angle images, and images with occlusion.
  • Evaluation indicators such as the Laplace operator can be used to filter out blurred images.
  • The Euler angle detection model can be used to output the angle of the target in the image.
  • A target angle threshold can then be set to filter out large-angle images.
  • A trained occlusion classifier can also be used to filter out images in which the target is occluded.
  • Image labeling is to mark the corresponding attribute information (such as identity identification, external features, etc.) on the image data whose image quality meets the requirements of online training.
  • Different model training tasks can specify different labeling strategies.
  • the image cleaning algorithm and labeling strategy are stored in the data management subsystem of the video analysis system, which is used to automatically clean and label the image sequence.
  • the data management subsystem can use its own integrated series of algorithms and strategies, and can also call other video analysis models in the video analysis system to clean and label the data to generate a data set for online training of the model.
  • The data set can be divided into a training set and a test set. For example, multi-model fusion evaluation can be used for pre-annotation, and the structured information of the image can be used to verify and correct the annotation results, as described above.
  • If the data management subsystem cannot determine whether some image data is suitable, it can hand these data over for manual identification and labeling, and a human decides whether to discard the image data or the labeled information.
  • To improve efficiency, the data management subsystem can also choose to discard such image data, and the user can set the corresponding cleaning and labeling strategies according to actual needs.
  • the data set for online training is finally obtained.
  • the data set can be divided into a training set and a test set.
  • the model training for video analysis requires sufficient image data.
  • a model training request can be initiated.
  • the model online training process can be initiated.
  • Online training can be based on the original video analysis model and the training set.
  • The original video analysis model can be an externally called model, an intermediate model from the training process, or any internally retained model version specified by the user.
  • the video analysis model is a model parameter file composed of some trained model parameters.
  • the online training algorithm may use one or more of incremental training, transfer learning, knowledge distillation, and meta-learning algorithms.
  • The hyperparameters used for model training can be adjusted automatically, where the hyperparameters include the learning rate decay step size, the total training step size, the base learning rate, and the batch size.
  • Different tuning strategies can be set for different monitoring scenarios and video analysis needs, and the internal framework can also adapt the parameters during training.
  • Version management and storage of the model can be implemented internally by the online training subsystem or by external devices.
  • The online model training algorithm in the embodiments of the present application has multiple implementations. A typical one is based on traditional stochastic gradient descent (SGD): an input image from the training set is passed through the video analysis model to obtain the corresponding output, the objective function between the output and the label is computed, the gradient with respect to the weights is derived, multiplied by an adaptive learning rate, and fed back to update the original weights, repeating until the parameters of the entire model converge. Another typical implementation is the support vector machine (SVM) method, which updates the support vectors through incremental training on the new training set. This application does not limit the specific model training method used here.
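A sketch of the SGD-based update loop described above, using PyTorch as an illustrative framework; the loss function, schedule, and step counts are assumptions:

```python
import torch
import torch.nn.functional as F

def sgd_online_update(model, loader, base_lr=0.01, steps=100, device="cpu"):
    """One round of SGD-style online training (a sketch).

    `loader` yields (image_batch, label_batch); `model` is an existing
    video analysis model whose weights are tuned in place.
    """
    model.train().to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr)
    # Step-wise learning-rate decay stands in for the adaptive rate.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    step = 0
    while step < steps:
        for images, labels in loader:
            outputs = model(images.to(device))
            loss = F.cross_entropy(outputs, labels.to(device))  # objective vs. labels
            optimizer.zero_grad()
            loss.backward()   # derive gradients with respect to the weights
            optimizer.step()  # feed gradients back into the weight update
            scheduler.step()
            step += 1
            if step >= steps:
                break
    return model
```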
  • The video analysis model obtained by online training cannot be used directly for video analysis; its performance must be tested first to evaluate the effectiveness of the model.
  • the video analysis model is released to the video analysis system.
  • the model evaluation and release subsystem in the video analysis system is used to evaluate and release the video analysis model.
  • Model testing is usually based on test sets. Performance testing is performed through a test framework or model performance testing tool integrated within the model evaluation and publishing subsystem, and a detailed, multi-dimensional model performance report is output to assist decisions on model release.
  • the report can include information such as delay, resource overhead, and accuracy.
  • the model test can use the test set in the data management subsystem data set, or can be provided separately by the user, such as a standard data set dedicated to performance acceptance.
  • The test framework or test tool uses the input model and test set to obtain the model output and calculate the performance indicators, and then evaluates the corresponding model performance according to different task goals or video analysis needs.
  • The user can decide whether to release the obtained model according to the performance analysis report, or can predefine a release strategy, writing the customized strategy into a script that the system executes automatically to decide, based on the model performance analysis report, whether to release the model; a sketch follows.
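A minimal sketch of such a scripted release decision; the report and policy field names are illustrative assumptions:

```python
def should_release(report: dict, baseline: dict, policy: dict) -> bool:
    """Automated release decision from a model performance report.

    `baseline` is the report of the currently deployed model; field names
    (`accuracy`, `latency_ms`, `memory_mb`) are assumed for the example.
    """
    return (report["accuracy"] >= baseline["accuracy"] + policy.get("min_gain", 0.0)
            and report["latency_ms"] <= policy.get("max_latency_ms", float("inf"))
            and report["memory_mb"] <= policy.get("max_memory_mb", float("inf")))

# Example policy: release only if accuracy improves by at least 1 point
# and inference latency stays under 50 ms.
# policy = {"min_gain": 0.01, "max_latency_ms": 50.0}
```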
  • Releasing the model means providing the trained model to the video analysis subsystem for video analysis or to the online training subsystem for future online model training; it can also be published to the network for use by other users.
  • The video analysis method with online model optimization can automatically generate an online training data set from front-end video stream data, realizing online optimization of the intelligent video analysis model and alleviating the problem of performance degradation of the video analysis model in actual scenes.
  • The method provided by the embodiment of the present application can make full use of the video data in the monitoring scene without backhaul and does not violate user privacy.
  • The cleaning and labeling of front-end data can be greatly reduced or even freed from manual participation, which can greatly reduce the cost of labeling.
  • Because online training uses video data from specific scenes, the resulting model is more targeted; different models are obtained for different scenes, giving the system adaptive capabilities.
  • FIGS. 5 and 6 are examples of online training of a target recognition model and a target attribute model, which further illustrate the system and method provided by the present application.
  • FIG. 5 shows the online optimization of the target recognition model; typical targets include faces, human bodies, and vehicles. The method includes the following steps:
  • the video analysis subsystem receives the video data inflow request collected by the front-end device.
  • the front-end video data inflow request can carry information about the video stream, such as the video stream's code rate, time, and source identification fields.
  • the video analysis subsystem applies to the computing and storage resource scheduling subsystem for storage space and computing resources for intelligent analysis.
  • the video analysis subsystem may also choose to carry video-related information when sending a request to the computing and storage resource scheduling subsystem.
  • the computing and storage resource scheduling subsystem responds to the request of the video analysis subsystem.
  • The computing and storage resource scheduling subsystem evaluates the currently available computing and storage resources. If there is enough storage space, it plans the storage address and memory unit for the video data and feeds the result back to the video analysis subsystem, which then decodes, stores, and analyzes the video data.
  • the decoded image sequence of the video stream sent by the front end is processed by the target detection and tracking algorithm integrated in the video analysis subsystem to obtain the target trajectory.
  • the target track is an image sequence composed of target cutouts.
  • each target image carries track information, such as the original video ID, the target track ID, and the cutout ID of the target in the track.
  • the above ID can be used to assist subsequent image annotation.
  • the video analysis subsystem uses the obtained target cutout to obtain its series of structured information through an internally integrated intelligent analysis algorithm, and sends an inflow request for the target cutout and structured information to the data management subsystem.
  • the data management subsystem applies to the computing and storage resource scheduling subsystem for computing and storage resources.
  • Before processing the target cutouts sent by the video analysis subsystem, the data management subsystem also needs to apply for computing and storage resources from the computing and storage resource scheduling subsystem.
  • the computing and storage resource scheduling subsystem responds to the data management subsystem request.
  • the computing and storage resource scheduling subsystem allocates storage space and computing resources for cleaning and labeling the incoming target image. If the computing resources are temporarily insufficient, the image can be stored first, and then the cleaning and labeling can be performed when the computing resources are sufficient.
  • The computing and storage resource scheduling subsystem can choose to return only the storage location of the target image to the data management subsystem, without allocating separate storage space for the incoming target image.
  • the data management subsystem uses allocated computing resources to automatically clean and mark the input target data and structured information to obtain a data set.
  • the data cleaning of target identification is mainly to ensure that the angle and clarity of the target meet the requirements.
  • the labeling requires that different images of the same target only correspond to the same ID, and there are no images of different targets under the same ID.
  • Evaluation indicators such as the Laplace operator can be used to filter out blurred images, and the Euler angle detection model outputs the target angle so that an angle threshold can filter out large-angle images.
  • The labeling scheme differs between scenes. For example, in a face recognition scene with person-ID comparison (face-to-ID verification), the targets in the collected video only need to be aggregated using the ID information to achieve higher accuracy.
  • Alternatively, a detection and tracking algorithm can extract target trajectories, several optimal frames are selected under each trajectory based on quality scores, multiple face feature extraction models are used to calculate the similarity between different trajectories, and similar trajectories are merged (a sketch follows). If the images carry time and location information, image data appearing at the same time but in different locations under the same ID is recognized as a wrong case. Finally, the data management subsystem stores the annotated data in the data set.
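A simplified sketch of merging trajectories by feature similarity; the feature extractors are assumed callables returning numpy vectors, and the greedy merge and threshold are illustrative simplifications of what a production system would do:

```python
import numpy as np

def track_feature(track_frames, extractors):
    # Average the feature vectors of a track's best frames over several
    # face feature extraction models (assumed callables), then normalize.
    feats = [ex(frame) for frame in track_frames for ex in extractors]
    mean = np.mean(feats, axis=0)
    return mean / np.linalg.norm(mean)

def merge_tracks(tracks, extractors, sim_threshold: float = 0.7):
    """Greedily merge target tracks whose mean features are similar.

    `tracks` maps track_id -> list of selected high-quality frames.
    Returns a mapping track_id -> merged identity id.
    """
    ids = list(tracks)
    feats = {t: track_feature(tracks[t], extractors) for t in ids}
    identity = {t: t for t in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if float(feats[a] @ feats[b]) >= sim_threshold:
                identity[b] = identity[a]  # same target, same ID
    return identity
```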
  • the data set can be divided into a training set for model training and a test set for testing the model obtained by training.
  • the data after cleaning and labeling is managed in a unified manner within the data management subsystem.
  • The newly added target image data needs to be ID-fused with the previously stored target images to satisfy the target identification labeling principle.
  • The system counts the amount of image data currently collected; when the model training requirement is met, it sends an online training request to the computing and storage resource scheduling subsystem.
  • the online training subsystem applies to the computing and storage resource scheduling subsystem for computing and storage resources for online training.
  • The computing and storage resource scheduling subsystem responds to the online training subsystem's request according to the currently available resources, allocating computing and storage resources for online training of the target recognition model when resources are sufficient.
  • the online training subsystem performs online model training.
  • The online training subsystem uses the resources allocated by the computing and storage resource scheduling subsystem and, based on the training data set provided by the data management subsystem, performs online incremental training with an internally integrated online training framework or tool, obtaining the video analysis model and completing model training online.
  • There are also many other methods for model training, such as transfer learning, knowledge distillation, and meta-learning.
  • The online training subsystem can interrupt model training and save the intermediate results, and it supports continuing the interrupted training when computing resources are idle.
  • the online training subsystem requests model evaluation.
  • After obtaining the video analysis model, the online training subsystem requests the model evaluation and publishing subsystem to evaluate the model's performance.
  • the model evaluation and publishing subsystem applies to the computing and storage resource scheduling subsystem for computing resources for model testing and evaluation.
  • The computing and storage resource scheduling subsystem responds to the model evaluation and publishing subsystem's request.
  • Based on the currently available resources, it allocates, when sufficient resources are available, the computing and storage resources for the test evaluation of the online-trained video analysis model.
  • The model evaluation and release subsystem uses the allocated resources to automatically measure the performance of the online-trained model, according to the user's decision and based on the test set provided by the user or generated by the data management subsystem, and obtains and records a multi-dimensional analysis report of the model, including the model's inference delay, memory footprint, model size, and the false alarm rate and recall rate on the test set.
  • the model evaluation and release subsystem can also support the performance comparison between the new model and the previous version of the model under the same test set.
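A minimal sketch of how such a test tool could compute the accuracy and latency entries of the report (memory footprint and model size are omitted); the interfaces are assumptions for the example:

```python
import time
import numpy as np

def measure_model(model_fn, test_set):
    """Produce a simple multi-dimensional performance report (a sketch).

    `model_fn` maps an image to a predicted label; `test_set` is a list of
    (image, label) pairs. The same routine can be run on the old and the new
    model to support version-to-version comparison on the same test set.
    """
    correct, latencies = 0, []
    for image, label in test_set:
        t0 = time.perf_counter()
        pred = model_fn(image)
        latencies.append((time.perf_counter() - t0) * 1000.0)  # ms
        correct += int(pred == label)
    return {"accuracy": correct / len(test_set),
            "latency_ms_p50": float(np.percentile(latencies, 50)),
            "latency_ms_p99": float(np.percentile(latencies, 99))}
```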
  • The model evaluation and release subsystem feeds back the model's performance analysis report to the user and decides on releasing the video analysis model based on the user's or an automated decision. a) If the video analysis model does not meet the comprehensive performance requirements, it can be retrained online. b) If the overall performance of the video analysis model is improved, a model release request can be sent to the computing and storage resource scheduling subsystem.
  • the computing and storage resource scheduling subsystem responds to the model evaluation and release subsystem request.
  • the computing and storage resource scheduling subsystem will allocate storage space for the video analysis model in the video analysis subsystem and the online training subsystem according to the request.
  • the model evaluation and release subsystem stores the video analysis model in the video analysis subsystem.
  • When the original target recognition model in the video analysis subsystem is in an updateable idle state, the original model is replaced with the new model.
  • the model version confirmed to be replaced can also be archived in the data management subsystem to facilitate version rollback and complete online optimization of the model.
  • the model evaluation and publishing subsystem can also store the video analysis model in the online training subsystem for subsequent online training process of the target recognition model.
  • model evaluation and publishing subsystem can also send the video analysis model to the data management subsystem for auxiliary image annotation.
  • The method shown in FIG. 5 includes a computing and storage resource scheduling subsystem.
  • the computing and storage resource scheduling subsystem can realize the reasonable allocation and scheduling of computing and storage resources among various subsystems and improve the efficiency of resource utilization.
  • FIG. 6 shows the online optimization method for the target attribute model.
  • The online optimization processes of the target attribute model and the target recognition model are similar.
  • The main differences are the labeling strategy and the model testing method. The method includes the following steps:
  • the video analysis subsystem receives the video data inflow request collected by the front end.
  • the front-end video data inflow request can carry information about the video stream, such as the video stream's code rate, time, and source identification fields.
  • The video analysis subsystem responds to the front end's request.
  • the video analysis subsystem evaluates the computing resources and storage resources available in the system. If there is enough storage space, it plans the storage address and memory unit for the input video stream data, and then performs decoding, storage, and analysis of the video stream data.
  • the decoded image sequence of the video stream sent by the front end is processed by the target detection and tracking algorithm integrated in the video analysis subsystem to obtain the target trajectory.
  • the target track is an image sequence composed of target cutouts.
  • each target image carries track information, such as the original video ID, the target track ID, and the cutout ID of the target in the track.
  • the above ID can be used to assist subsequent image annotation.
  • the video analysis subsystem uses the obtained target cutout to obtain its series of structured information through an internally integrated intelligent analysis algorithm, and sends an inflow request to the data management subsystem.
  • the data management subsystem allocates storage space and computing resources for cleaning and labeling the incoming image. If the computing resources are temporarily insufficient, the image can be stored first, and the cleaning and labeling can be performed when the computing resources are sufficient.
  • The video analysis subsystem may choose to send only the storage location of the target image to the data management subsystem; the data management subsystem then reads the target image from that storage location.
  • the data management subsystem cleans and marks the target.
  • the data management subsystem uses the allocated computing resources to automatically clean and annotate the input data and structured information, and divide the annotated data into a training set and a test set according to a certain proportion.
  • The image data of the training set and the test set are mutually independent; no image appears in both.
  • Data cleaning in target attribute model training mainly ensures that the angle and clarity of the target meet the requirements, and the annotation requires the target image to carry the correct attributes. Different monitoring scenarios correspond to different attribute labeling schemes.
  • For example, face/human attribute labeling in a person-ID verification scenario can directly label some attributes based on the identity information, such as obtaining gender, age, and ethnicity from identity card information.
  • Trajectories can also be extracted with the target detection and tracking algorithm: different frames under the same trajectory are predicted by a basic human attribute model and voted on. If a few images disagree with the attributes of the other images, they are wrong cases; if an image's attribute confidence is low, it is a hard example. The wrong and hard examples can then be handed over to multiple human attribute models for voting prediction and labeling, as sketched below.
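A minimal sketch of this per-trajectory voting with wrong-case and hard-example mining; the (label, confidence) interface of the basic attribute model and the confidence threshold are assumptions:

```python
from collections import Counter

def label_track_attribute(frame_predictions, conf_threshold: float = 0.8):
    """Vote an attribute label across the frames of one trajectory (a sketch).

    `frame_predictions` is a list of (label, confidence) pairs produced by a
    basic attribute model, one per frame. Frames disagreeing with the majority
    are flagged as wrong cases; low-confidence frames are flagged as hard
    examples for re-labeling by multiple attribute models.
    """
    majority, _ = Counter(lbl for lbl, _ in frame_predictions).most_common(1)[0]
    wrong = [i for i, (lbl, _) in enumerate(frame_predictions) if lbl != majority]
    hard = [i for i, (_, conf) in enumerate(frame_predictions) if conf < conf_threshold]
    return majority, wrong, hard
```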
  • the data management subsystem stores the annotated data in the data set.
  • the data set can be divided into a training set for model training and a test set for testing the model obtained by training.
  • The cleaned and labeled data are managed in a unified manner within the data management subsystem.
  • The newly added target image data are managed in a unified manner together with the previously stored target images; no ID fusion is required.
  • The system counts the amount of data currently collected and, when the model training requirement is met, sends an online training request to the online training subsystem.
  • The online training subsystem responds to the online training request according to the currently available resources, allocating computing and storage resources for online training of the target attribute model when resources are sufficient.
  • the online training subsystem performs online model training.
  • The online training subsystem uses internal computing and storage resources and, based on the training data set provided by the data management subsystem, performs online training with an internally integrated online training framework or tool, obtaining the video analysis model and completing the online generation of the video analysis model.
  • There are also other model training methods, such as transfer learning, knowledge distillation, and meta-learning; the appropriate method can be selected according to the actual scene needs, which is not limited in the embodiments of the present application.
  • the online training subsystem can interrupt model training and save the intermediate results of model training, and support the continuation of interrupted training when the computing resources are idle.
  • the online training subsystem requests model evaluation.
  • After the online training subsystem completes the online training of the model and obtains a new model, it requests the model evaluation and publishing subsystem to evaluate the performance of the model.
  • the model evaluation and publishing subsystem responds to the request of the online training subsystem.
  • The model evaluation and release subsystem responds to the online training subsystem's request based on the currently available resources. When resources are sufficient, it allocates computing and storage resources for the test evaluation of the new model obtained by online training.
  • The model evaluation and release subsystem uses the allocated resources to automatically test the performance of the online-trained model, according to the user's decision and based on the test set provided by the user or generated by the data management subsystem, obtaining a multi-dimensional analysis report of the model, including the model's inference delay, memory footprint, model size, rejection rate on the test set, and the accuracy of each attribute label.
  • the model evaluation and release subsystem can also support the performance comparison between the new model and the previous version of the model under the same test set.
  • the model evaluation and release subsystem determines the release strategy of the model.
  • The model evaluation and release subsystem feeds back the model's performance analysis report to the user and releases the model based on the user's or an automated decision. a) If the model generated by training does not meet the comprehensive performance requirements, online training can be performed again. b) If the comprehensive performance of the new model is improved, the model generated by training is published to the video analysis subsystem; when the original target attribute model in the video analysis subsystem is in an updateable idle state, the original model is replaced with the new model.
  • model version confirmed to be replaced can also be archived in the data management subsystem to facilitate version rollback and complete online optimization of the model.
  • The model evaluation and publishing subsystem can also publish the generated model to the online training subsystem for the subsequent online training process of the target attribute model.
  • model evaluation and publishing subsystem can also send the video analysis model obtained by online training to the data management subsystem, which is used to assist the image annotation.
  • the method shown in FIG. 6 does not set a computing and storage resource scheduling subsystem to manage the computing and storage resources in the system, but each subsystem manages the computing and storage resources by itself.
  • an online training data set can be automatically generated based on the front-end video stream data, which supports online optimization of the video analysis model, which can relieve the video analysis model in actual scenarios The problem of degraded performance; on the other hand, you can make full use of the actual monitoring scene data collected by the front end without backhaul, saving bandwidth and avoiding infringement of user privacy.
  • monitoring data from specific monitoring scenes are used to train the model, so that different scenes yield different trained models and the entire video analysis system gains adaptive capability.
  • the above-mentioned video analysis system includes a hardware structure and/or a software module corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
  • the embodiments of the present application may divide the video analysis system into various subsystems according to the above method embodiments; for example, each subsystem may correspond to one function, or two or more functions may be integrated into one subsystem.
  • the integrated subsystem can be implemented in the form of hardware or software function modules. It should be noted that the division of the subsystems in the embodiments of the present application is schematic, and is only a division of logical functions. In actual implementation, there may be another division manner.
  • FIG. 1 shows a possible structural schematic diagram of the video analysis system involved in the above embodiment.
  • the video analysis system 100 includes a video analysis subsystem 110, a data management subsystem 120, an online training subsystem 130, and a model evaluation and publishing subsystem 140.
  • the video analysis subsystem 110 is used to receive and parse video stream data to obtain unlabeled image data, where the unlabeled image data includes structured information.
  • the data management subsystem 120 is used for cleaning and labeling the unlabeled image data: it filters out low-quality images to obtain image data whose quality meets the training requirements, pre-annotates the qualifying image data using multi-model fusion evaluation, then verifies the pre-annotated images against the structured data, and finally generates a data set for online training of the model, where the data set includes a training set and a test set.
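  • a minimal sketch of this cleaning, pre-annotation, verification, and splitting flow; quality_ok, pre_annotate, and verify are hypothetical stand-ins for the subsystem's internal tools:

```python
import random

def build_dataset(unlabeled_images, quality_ok, pre_annotate, verify,
                  test_ratio=0.2):
    # 1) Clean: keep only images whose quality meets training needs.
    cleaned = [img for img in unlabeled_images if quality_ok(img)]
    # 2) Pre-annotate the survivors (e.g. multi-model fusion evaluation).
    labeled = [(img, pre_annotate(img)) for img in cleaned]
    # 3) Verify pre-annotations against the structured information.
    verified = [(img, lab) for img, lab in labeled if verify(img, lab)]
    # 4) Split into a training set and a test set.
    random.shuffle(verified)
    cut = int(len(verified) * (1 - test_ratio))
    return verified[:cut], verified[cut:]
```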
  • the online training subsystem 130 is used to perform online training on the training set using an online training algorithm to obtain the parameters of the video analysis model.
  • the model evaluation and publishing subsystem 140 is used to publish the video analysis model.
  • the low-quality image includes one or more of a blurred image, a large-angle image, and an image with occlusion.
  • the video analysis subsystem 110 can receive the video stream data collected by the front-end camera, and can also receive the video stream data saved by other devices.
  • the video stream data received by the video analysis subsystem 110 can include attribute information of the video stream, such as the bit rate, shooting time, and source identifier.
  • Various video analysis and decoding tools are integrated in the video analysis subsystem 110 to decode the video stream data to obtain unlabeled image data.
  • the unlabeled image data may include structured information of the image, such as target attributes and feature vectors.
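  • as an illustration, frames could be pulled out of a decoded stream with OpenCV as follows; sampling one frame per every_n is an assumed policy, not part of the embodiments:

```python
import cv2

def decode_stream(source, every_n=25):
    # source may be a file path or a camera/RTSP URL.
    cap = cv2.VideoCapture(source)
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break                      # end of stream
        if idx % every_n == 0:
            frames.append(frame)       # unlabeled image data
        idx += 1
    cap.release()
    return frames
```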
  • the data management subsystem 120 may clean and mark unmarked image data in various ways.
  • the data management subsystem 120 can use the Laplacian operator and other evaluation indicators to filter out blurred images, and can use a Euler-angle detection model to output the angle of the target in the image and apply a target-angle threshold to filter out large-angle images.
  • an occlusion classifier filters out images in which the target is occluded.
  • the data management subsystem 120 can also use a trained image quality sub-model to score image quality automatically, with preset quality thresholds used to filter out image data whose quality does not meet the model training requirements.
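  • a minimal sketch of the Laplacian-based blur filter mentioned above; the threshold is an assumed value that would need tuning per camera and scene:

```python
import cv2

def is_blurred(image_path, threshold=100.0):
    # Variance of the Laplacian as a sharpness proxy: blurred images
    # have few strong edges, so the variance is low.
    image = cv2.imread(image_path)
    if image is None:
        return True                # unreadable files count as low quality
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() < threshold
```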
  • the data management subsystem 120 may use a feature-vector clustering algorithm or a multi-model fusion evaluation algorithm to label the unlabeled image data obtained by screening, producing a data set for online model training. When the amount of labeled image data in the training set meets the requirements for model training, a model training request can be initiated.
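  • one possible form of feature-vector clustering for pseudo-labeling is sketched below; DBSCAN and its parameters are illustrative choices, not the algorithm mandated by the embodiments:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_label(feature_vectors, eps=0.5, min_samples=5):
    # Group per-target feature vectors; each cluster is treated as one
    # identity and its members share a pseudo-label. Noise points get
    # label -1 and are left out of the training set.
    features = np.asarray(feature_vectors)
    return DBSCAN(eps=eps, min_samples=min_samples,
                  metric="cosine").fit_predict(features)
```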
  • the online training subsystem 130 can use an internally integrated online training algorithm or tool to obtain a new video analysis model, or it can start from the original video analysis model and apply transfer learning or incremental training algorithms. Online training is performed on the training set to obtain a video analysis model that optimizes the original model for the current monitoring scene.
  • the online training algorithm adopted by the online training subsystem 130 may be one or more of incremental training, transfer learning, knowledge distillation, or meta-learning.
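  • a minimal sketch of transfer learning in this spirit, with a public pretrained backbone standing in for the original video analysis model; all names and values are illustrative:

```python
import torch
import torchvision

def build_finetune_model(num_classes, lr=1e-4):
    # Freeze the pretrained backbone and train only a new classification
    # head, adapting the model to the current monitoring scene.
    model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
    for param in model.parameters():
        param.requires_grad = False
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=lr)
    return model, optimizer
```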
  • the hyperparameters used in online training can be automatically adjusted, for example one or more of the learning-rate decay step size, the total number of training steps, the base learning rate, and the batch size.
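  • one simple automatic adjustment policy is sketched below; the scaling rules are assumptions rather than the strategy defined by the embodiments:

```python
def auto_hyperparams(num_samples, base_lr=0.01):
    # Scale the training schedule with the size of the online data set.
    batch_size = 32 if num_samples < 10_000 else 64
    steps_per_epoch = max(1, num_samples // batch_size)
    total_steps = steps_per_epoch * 20          # roughly 20 epochs
    return {
        "base_learning_rate": base_lr,
        "batch_size": batch_size,
        "total_training_steps": total_steps,
        "lr_decay_step": total_steps // 3,      # decay three times
    }
```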
  • the model evaluation and publishing subsystem 140 first tests the performance of the video analysis model and generates a model performance analysis report.
  • when the performance meets the requirement, the video analysis model is released to the video analysis system 100.
  • the video analysis subsystem 110 in the video analysis system 100 may use the video analysis model obtained by online training for video analysis and processing, and the online training subsystem 130 may also use it as the basis for subsequent online training.
  • the model evaluation and publishing subsystem 140 may test the performance of the video analysis model using the test set provided by the user or the test set in the data set.
  • the resulting model performance analysis report may include one or more of the video analysis model's inference latency, memory footprint, and test-set accuracy, but is not limited to these parameters.
  • the video analysis system 100 may further include a resource scheduling subsystem 150, which is used to allocate and schedule computing and storage resources of the video analysis system.
  • the video analysis system may be presented with each subsystem divided according to a corresponding function, or with the subsystems divided in an integrated manner.
  • "subsystem" here can refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or another device that can provide the above functions.
  • the video analysis system 100 may take the form shown in FIG. 2.
  • the video analysis subsystem 110, the data management subsystem 120, the online training subsystem 130, the model evaluation and publishing subsystem 140, and the resource scheduling subsystem 150 in FIG. 1 can be implemented by the processor 201 and the memory 203 of FIG. 2.
  • the video analysis subsystem 110, the data management subsystem 120, the online training subsystem 130, the model evaluation and publishing subsystem 140, and the resource scheduling subsystem 150 can be implemented by the processor 201 calling the application program code stored in the memory 203; the embodiments of the present application do not limit this.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a dedicated computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid-state drive (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention relate to a method for generating a video analysis model, and a video analysis system. The method comprises: acquiring video stream data, parsing the video stream, and extracting unlabeled image data from the video stream by a video analysis system; using an automatic labeling policy to process the image data and complete the labeling and cleaning of the image data; performing online model training according to the labeled image data and generating a video analysis model; testing the newly generated model and producing a performance analysis report of the model; and, when the performance meets a requirement, releasing the video analysis model to the video analysis system. The method of the embodiments of the present invention makes full use of a video stream acquired in a real scenario and performs online training of a video analysis model, making it possible to continuously optimize the performance of the video analysis model while ensuring data security, and to mitigate the degradation of model performance in real scenarios.
PCT/CN2019/090291 2018-11-07 2019-06-06 Method for generating video analysis model, and video analysis system WO2020093694A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811321394.5 2018-11-07
CN201811321394.5A CN111160380A (zh) 2018-11-07 Method for generating video analysis model, and video analysis system

Publications (1)

Publication Number Publication Date
WO2020093694A1 (fr)

Family

ID=70554626

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/090291 WO2020093694A1 (fr) Method for generating video analysis model, and video analysis system

Country Status (2)

Country Link
CN (1) CN111160380A (fr)
WO (1) WO2020093694A1 (fr)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111695699B (zh) * 2020-06-12 2023-09-08 北京百度网讯科技有限公司 Method and apparatus for model distillation, electronic device, and readable storage medium
CN111783997B (zh) * 2020-06-29 2024-04-23 杭州海康威视数字技术股份有限公司 Data processing method, apparatus, and device
CN113706372A (zh) * 2020-06-30 2021-11-26 稿定(厦门)科技有限公司 Automatic image-matting model establishment method and system
CN112016682B (zh) * 2020-08-04 2024-01-26 杰创智能科技股份有限公司 Video representation learning and pre-training method and apparatus, electronic device, and storage medium
CN113422751B (zh) * 2020-08-27 2023-12-05 阿里巴巴集团控股有限公司 Streaming media processing method and apparatus based on online reinforcement learning, and electronic device
CN112115993B (zh) * 2020-09-11 2023-04-07 昆明理工大学 Zero-shot and few-shot ID-photo anomaly detection method based on meta-learning
CN112231668A (zh) * 2020-09-18 2021-01-15 同盾控股有限公司 User identity authentication method based on keystroke behavior, electronic device, and storage medium
CN112214638B (zh) * 2020-10-20 2023-04-21 湖南快乐阳光互动娱乐传媒有限公司 Video content deconstruction method and system based on big-data mining
CN112529845A (zh) * 2020-11-24 2021-03-19 浙江大华技术股份有限公司 Image quality value determination method and apparatus, storage medium, and electronic apparatus
CN112365107B (zh) * 2020-12-16 2024-01-23 北京易华录信息技术股份有限公司 Myopia risk assessment method, apparatus, and system based on artificial intelligence
CN113645415A (zh) * 2021-08-13 2021-11-12 京东科技信息技术有限公司 Method and apparatus for generating video, electronic device, and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077236B (zh) * 2013-01-09 2015-11-18 公安部第三研究所 System and method for a portable device to implement video knowledge collection and annotation
CN104539895B (zh) * 2014-12-25 2017-12-05 桂林远望智能通信科技有限公司 Video hierarchical storage system and processing method
CN107886512A (zh) * 2016-09-29 2018-04-06 法乐第(北京)网络科技有限公司 Method for determining training samples
CN108694718A (zh) * 2018-05-28 2018-10-23 中山大学附属第六医院 System and method for evaluating the efficacy of preoperative concurrent neoadjuvant chemoradiotherapy for rectal cancer

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103747189A (zh) * 2013-11-27 2014-04-23 杨新锋 Digital image processing method
US20170308909A1 (en) * 2016-04-20 2017-10-26 OA Labs LLC Systems and methods for sensor data analysis through machine learning
CN108229321A (zh) * 2017-11-30 2018-06-29 北京市商汤科技开发有限公司 Face recognition model and training method and apparatus, device, program, and medium therefor
CN108154118A (zh) * 2017-12-25 2018-06-12 北京航空航天大学 Target detection system and method based on adaptive combined filtering and multi-level detection

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113705593A (zh) * 2020-05-21 2021-11-26 孙民 Method for generating training data, and electronic device
CN111859872A (zh) * 2020-07-07 2020-10-30 中国建设银行股份有限公司 Text labeling method and apparatus
CN112182297A (zh) * 2020-09-30 2021-01-05 北京百度网讯科技有限公司 Method and apparatus for training an information fusion model and generating highlight videos
CN112381114A (zh) * 2020-10-20 2021-02-19 广东电网有限责任公司中山供电局 Deep-learning image annotation system and method
CN112418335A (zh) * 2020-11-27 2021-02-26 北京云聚智慧科技有限公司 Model training method based on continuous image-frame tracking annotation, and electronic device
CN112418335B (zh) * 2020-11-27 2024-04-05 北京云聚智慧科技有限公司 Model training method based on continuous image-frame tracking annotation, and electronic device
CN112686864A (zh) * 2020-12-30 2021-04-20 中山嘉明电力有限公司 Image recognition-based method and system for identifying liquid and gas leaks
CN112398875B (zh) * 2021-01-18 2021-04-09 北京电信易通信息技术股份有限公司 Machine learning-based streaming-data security vulnerability detection method for video conference scenarios
CN112398875A (zh) * 2021-01-18 2021-02-23 北京电信易通信息技术股份有限公司 Machine learning-based streaming-data security vulnerability detection method for video conference scenarios
CN112783650A (zh) * 2021-01-20 2021-05-11 之江实验室 Multi-model parallel inference method based on an AI chip
CN112783650B (zh) * 2021-01-20 2024-01-16 之江实验室 Multi-model parallel inference method based on an AI chip
CN113822957A (zh) * 2021-02-26 2021-12-21 北京沃东天骏信息技术有限公司 Method and apparatus for synthesizing images
RU2787224C1 (ru) * 2021-11-13 2022-12-30 Иван Борисович Сиваченко Method for assessing functional state from gait and/or movement-kinematics indicators, and application of the method in the promotion of goods and services
CN114863364A (zh) * 2022-05-20 2022-08-05 碧桂园生活服务集团股份有限公司 Security detection method and system based on intelligent video surveillance
CN115359341A (zh) * 2022-08-19 2022-11-18 无锡物联网创新中心有限公司 Model updating method, apparatus, device, and medium
CN115359341B (zh) * 2022-08-19 2023-11-17 无锡物联网创新中心有限公司 Model updating method, apparatus, device, and medium

Also Published As

Publication number Publication date
CN111160380A (zh) 2020-05-15

Similar Documents

Publication Publication Date Title
WO2020093694A1 (fr) Method for generating video analysis model, and video analysis system
US11989597B2 Dataset connector and crawler to identify data lineage and segment data
US10565442B2 Picture recognition method and apparatus, computer device and computer-readable medium
JP6445716B2 (ja) Entity-based temporal segmentation of video streams
WO2021217934A1 (fr) Livestock quantity monitoring method and apparatus, computer device, and storage medium
WO2020010694A1 (fr) Animal health monitoring method and apparatus, and computer-readable storage medium
US20160098636A1 Data processing apparatus, data processing method, and recording medium that stores computer program
US20220179884A1 Label Determining Method, Apparatus, and System
US20220138893A9 Distributed image analysis method and system, and storage medium
US20200394448A1 Methods for more effectively moderating one or more images and devices thereof
CN112732949A (zh) Business data labeling method and apparatus, computer device, and storage medium
CN114997414B (zh) Data processing method and apparatus, electronic device, and storage medium
US11645579B2 Automated machine learning tagging and optimization of review procedures
CN117649515A (zh) Semi-supervised 3D object detection method, system, and device based on digital twins
US20210049499A1 Systems and methods for diagnosing computer vision model performance issues
US11762895B2 Method and system for generating and assigning soft labels for data node data
CN114898182A (zh) Image data screening method and system based on an object detection learning algorithm
US20190138931A1 Apparatus and method of introducing probability and uncertainty via order statistics to unsupervised data classification via clustering
US20200356825A1 Model for health record classification
US12020786B2 Model for health record classification
US11818227B1 Application usage analysis-based experiment generation
US11983085B1 Dynamic usage-based segmentation
CN113326805B (zh) Human body cover updating method and apparatus, electronic device, and storage medium
JP2021196912A (ja) Execution device, execution method, and execution program
CN113014412B (zh) Method and system for predicting service delay time caused by downtime failures

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19881660

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19881660

Country of ref document: EP

Kind code of ref document: A1