WO2022121196A1 - A Scalable Visual Computing System - Google Patents

A Scalable Visual Computing System

Info

Publication number
WO2022121196A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
module
stream
instructions
information
Prior art date
Application number
PCT/CN2021/087017
Other languages
English (en)
French (fr)
Inventor
高文
王耀威
白鑫贝
纪雯
田永鸿
Original Assignee
鹏城实验室 (Peng Cheng Laboratory)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 鹏城实验室 (Peng Cheng Laboratory)
Priority to US 18/037,408 (published as US20230412769A1)
Publication of WO2022121196A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/18Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/02Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/10Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/70Media network packetisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • H04L65/75Media network packet handling
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/149Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model

Definitions

  • the invention relates to the field of visual sensor applications, and in particular to a scalable visual computing system.
  • Video surveillance systems have unique advantages such as intuitiveness, accuracy and rich information content. With the development of computing, image processing and transmission technology, video surveillance systems are widely used in security, policing, transportation, production and other fields. At present, the number of cameras installed on roads and in communities, airports, train stations, large venues and other places keeps growing, and camera resolution keeps increasing, causing a sharp rise in the amount of video and image data generated, which presents great challenges to processing and transmission.
  • the technical problem to be solved by the present invention is to provide, in view of the deficiencies of the prior art, a scalable visual computing system that addresses the inability of existing video surveillance systems to meet the real-time processing requirements of massive data.
  • a scalable visual computing system, characterized in that it includes front-end devices, edge services and cloud services that establish communication connections in sequence;
  • the front-end device is used to perceive and collect visual information of the scene to obtain image data; to perform video image processing, feature coding and intelligent analysis on the image data; and to encapsulate the processing results together with the front-end device's identification, time and space information to obtain a compressed video stream, a feature encoding stream and a structured result stream, which are output according to the configuration. It is also used to report its own state information to the edge service, receive the control instructions and model stream issued by the edge service, and complete the configuration of its own working parameters and the model update, where the control instructions include device control instructions and function definition instructions;
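As an illustrative sketch of the encapsulation step above, the record below packs an analysis result together with the device's identification, time and space information. The field names and the JSON wire encoding are assumptions for illustration, not formats defined by this disclosure.

```python
import json
import time

def encapsulate_result(device_id, result, position, attitude):
    """Pack one analysis result with device identification, time and
    space information into a structured-result-stream record.
    Field names are illustrative, not defined by the disclosure."""
    return {
        "device_id": device_id,    # globally unique device identifier
        "timestamp": time.time(),  # unified (synchronized) time
        "position": position,      # e.g. (longitude, latitude, altitude)
        "attitude": attitude,      # e.g. (yaw, pitch, roll)
        "result": result,          # structured analysis output
    }

record = encapsulate_result("cam-0001",
                            {"label": "vehicle", "score": 0.92},
                            (113.93, 22.53, 15.0), (90.0, -10.0, 0.0))
packet = json.dumps(record).encode("utf-8")  # one possible wire encoding
```

The same pattern would apply to the feature encoding stream, with the feature payload replacing the structured result.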
  • the edge service is used to receive and store the compressed video stream, feature encoding stream and structured result stream sent by the front-end device; to output the feature encoding stream and structured result stream in real time and aggregate them to the cloud service; and to output the compressed video stream to the cloud service on demand according to the cloud service's data retrieval instructions. It is also used to receive and process node access management instructions reported by front-end devices and update the device management list; to report the status information of front-end devices and the edge service to the cloud service; to forward model query instructions received from front-end devices to the cloud service; and to receive the model stream and control instructions issued by the cloud service and issue them to the front-end devices, where the control instructions include device control instructions and function definition instructions. It is also used to complete, according to the defined functional tasks, multi-node linkage organization plan generation, data information configuration planning, collaborative work scheduling, image data processing and analysis, collaborative data analysis, and joint optimization;
  • the cloud service is used to receive, store and aggregate in real time the feature encoding stream and structured result stream output by the edge service, and to retrieve the compressed video stream on demand from the edge service. It is also used to store the algorithm models that support various applications; to manage the algorithm life cycle and update process; to receive model query commands sent by the edge service or front-end devices and return the model query results or model stream accordingly; to send control commands as triggered; and to receive and respond to third-party user demand instructions. It is also used for big data analysis, mining and simulation calculations, and for performing multilateral collaborative tasks; and it is further used to receive device status information reported by the edge service, and to perform configuration management, function definition and coordinated resource scheduling for all nodes.
  • the front-end device includes a spatiotemporal determination module, an image acquisition module, an image processing module, an intelligent computing module, and a device control and data interaction module; the spatiotemporal determination module is used to obtain unified time information for the front-end device, maintain time synchronization between front-end devices, determine the position, speed and attitude information of the front-end device, provide this spatiotemporal information to the other modules of the front-end device in real time for calculation and transmission, and receive the control instructions sent by the device control and data interaction module;
  • the image acquisition module is used for the acquisition and conversion of image data, and sends the image data to the image processing module;
  • the image processing module is used to pre-process, compress, encode and transcode the image data, output the compressed video stream carrying timestamp information to the device control and data interaction module, output the pre-processed image data to the intelligent computing module, and receive the control instructions sent by the device control and data interaction module to complete the configuration of processing parameters;
  • the intelligent computing module is used to perform structured analysis, feature extraction and feature encoding on the image data, and to output the encoded feature stream and structured result stream to the device control and data interaction module; it also receives control instructions, which include parameter configuration instructions and function definition instructions, and receives the model stream to dynamically update the algorithm model;
  • the device control and data interaction module is used to package and encapsulate the received time information, spatial information, compressed video streams, pictures, encoded feature streams and structured result streams and send them to the edge service; to receive and parse the model stream and control instructions issued by the edge service or cloud service and forward them to the corresponding processing modules; and to complete workflow control, device control, status monitoring, model update and transmission control of the front-end device, and obtain the device's working state and identification information.
  • the scalable visual computing system, wherein the edge service includes an integrated control module, a streaming media module, a data storage module and a computing processing module. The integrated control module is used to receive data or instructions reported by front-end devices and control the response process; to push the encoded feature stream and structured result stream to the cloud service in real time; to receive and forward the control instructions or model stream issued by the cloud service; to manage the access process and status of front-end devices and monitor their status; and to schedule the cooperative working mode among multiple front-end devices. The streaming media module is used to receive the compressed video stream and perform transcoding, interception and packaging on it. The data storage module is used to receive the compressed video stream from the streaming media module and the encoded feature stream and structured result stream reported by front-end devices; to classify, store and manage them; and, upon receiving a compressed video stream or picture retrieval command from the cloud service, to perform conditional retrieval and return the result to the cloud service. The computing processing module is used for image data processing and analysis, collaborative data analysis, and joint optimization.
  • the scalable visual computing system, wherein the data storage module includes an access management sub-module, a data retrieval sub-module, and a database sub-module; the access management sub-module is used to support data input, storage and retrieval;
  • the data retrieval sub-module is used for data query and retrieval operations;
  • the database sub-module is used to store structured data or unstructured data.
  • the scalable visual computing system, wherein the database sub-module includes a video file library, a picture file library, a feature file library, a structured feature library and a structured result library; the video file library is used to store video stream data and its summary information; the picture file library is used to store image-format file data and its summary information; the feature file library is used to store unstructured feature stream data and its summary information; the structured feature library is used to store structured feature data; and the structured result library is used to store structured result data.
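The five-library split described above amounts to routing each incoming data item to the library matching its type. A minimal sketch, with type and library names as illustrative assumptions:

```python
def route_to_library(item_type):
    """Map an incoming data item to its storage library, following the
    five-library split described above. Names are illustrative."""
    routing = {
        "video_stream": "video_file_library",
        "picture": "picture_file_library",
        "feature_stream": "feature_file_library",          # unstructured features
        "structured_feature": "structured_feature_library",
        "structured_result": "structured_result_library",
    }
    try:
        return routing[item_type]
    except KeyError:
        raise ValueError(f"unknown data type: {item_type}")
```

In practice the summary information stored alongside each item would index it for the data retrieval sub-module's queries.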
  • the scalable visual computing system, wherein the cloud service includes a central control module, a computational simulation module, a data analysis module, a data center module, an algorithm model warehouse and a user interaction module; the central control module is used to perform configuration management and resource scheduling for all nodes in the system, uniformly manage the transmission control process of the data stream, control stream and model stream, issue device control, function definition and model update commands to front-end devices, and issue tasks to edge services.
  • the computational simulation module is used for structured analysis and processing, simulation prediction, model training, joint model optimization and collaborative strategy generation, and outputs the calculation results
  • the data analysis module is used to receive encoded feature streams and structured result streams, or to retrieve data from the data center module according to user instructions, aggregate big data information, perform analysis and mining, and extract high-level semantic information to return to the user
  • the data center module is used to retrieve compressed video streams or pictures from edge services on demand, and to store, query and output the encoded feature streams, structured result streams, and compressed video streams or pictures retrieved on demand
  • the algorithm model warehouse is used for the storage, query, delivery and life-cycle management of algorithm models
  • the user interaction module is used to receive user-related instructions and return processing results.
  • the scalable visual computing system wherein the data stream includes multimedia data, feature data, result information, spatiotemporal information, environmental data, device data, and algorithm models.
  • the control stream refers to instruction data related to system operation; the instruction data includes device registration instructions, login instructions, logout instructions, device control instructions, function definition instructions, parameter configuration instructions, and data query/recall instructions.
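The instruction types above suggest a simple dispatch by kind on each receiving node. The sketch below is a hedged illustration; the instruction kinds shown and the handler behavior are assumptions, not a format defined by this disclosure.

```python
# Hypothetical handlers for a few of the control-stream instruction
# kinds listed above; the real system's handling is not specified here.
def on_device_control(payload):
    return f"device control: {payload}"

def on_function_definition(payload):
    return f"function definition: {payload}"

def on_parameter_configuration(payload):
    return f"parameter configuration: {payload}"

HANDLERS = {
    "device_control": on_device_control,
    "function_definition": on_function_definition,
    "parameter_configuration": on_parameter_configuration,
}

def dispatch(instruction):
    """Route one parsed instruction to its handler by kind."""
    kind = instruction["kind"]
    if kind not in HANDLERS:
        raise ValueError(f"unsupported instruction kind: {kind}")
    return HANDLERS[kind](instruction.get("payload"))

reply = dispatch({"kind": "device_control", "payload": "restart"})
```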
  • the scalable visual computing system, wherein the central control module includes a configuration management sub-module, a cooperative scheduling sub-module, and an instruction processing sub-module; the configuration management sub-module is used to perform security authentication, configuration management and status monitoring for all front-end device, edge service and cloud service nodes; the cooperative scheduling sub-module is used to issue device control instructions, function definition instructions and working parameter instructions to front-end devices according to the scheduling strategy; the instruction processing sub-module is responsible for receiving, parsing and processing the reporting and query commands of edge services, issuing model stream data, and responding to the user interaction module.
  • the scalable visual computing system, wherein the data analysis module includes a data retrieval sub-module, a statistical analysis sub-module and a data mining sub-module; the data retrieval sub-module is used to receive or initiate data retrieval instructions to the data center module to obtain the data required for an analysis task; the statistical analysis sub-module is used for multi-dimensional analysis of the aggregated feature and result data using classification, regression and correlation analysis methods; the data mining sub-module uses artificial intelligence, machine learning and statistical methods to automatically extract hidden useful information and knowledge from large amounts of historical or real-time data.
  • the core of the scalable visual computing system proposed by the present invention lies in the parallel data transmission architecture of three types of data streams: compressed video stream, encoded feature stream and model stream.
  • the transmission of the compressed video stream and the encoded feature stream is real-time, and the front-end device can report both simultaneously according to the configuration, which not only relieves the pressure of data transmission but also aggregates effective information in real time for joint data analysis; the transmission of the model stream is occasional: when the model needs to be updated, the model stream is transmitted directly or indirectly by the cloud service to the front-end device to realize dynamic deployment and update of the model, thereby supporting different application tasks by defining the functions and algorithms of the front-end device;
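The transmission modes of the parallel streams described above can be summarized in a small policy table. This is a simplified sketch under the assumption that the compressed video stream normally stays at the edge and only leaves on an authorized, on-demand request; names are illustrative.

```python
from enum import Enum

class Transmission(Enum):
    REAL_TIME = "real-time"   # pushed continuously as produced
    ON_DEMAND = "on-demand"   # fetched only on an authorized retrieval request
    SPORADIC = "sporadic"     # sent only when a model update is triggered

# Simplified transmission policy of the parallel streams described above.
STREAM_POLICY = {
    "feature_encoding_stream":  Transmission.REAL_TIME,  # front end -> edge -> cloud
    "structured_result_stream": Transmission.REAL_TIME,
    "compressed_video_stream":  Transmission.ON_DEMAND,  # stored at edge, fetched by cloud
    "model_stream":             Transmission.SPORADIC,   # cloud -> (edge) -> front end
}
```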
  • the scalable visual computing system proposed by the present invention also has scalability: compressed video streams can be stored in the edge service, encoded feature streams can be aggregated to the cloud service in real time, and the cloud service uses the feature information for subsequent tasks such as analysis, identification and retrieval; when the cloud service must use the original image data for business needs, the compressed video stream can be retrieved from the edge service only after the user's authorization;
  • all front-end device, edge service and cloud service nodes in the scalable visual computing system provided by the present invention have a globally unified spatiotemporal identification; that is, all nodes have a unified time representation and synchronized time information, a unified spatial information representation and reference system, and a globally unique device identifier, where the spatial information includes position, speed, attitude and their accuracy information;
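The globally unified spatiotemporal identification above can be sketched as a small record each node carries. The units and field names below are illustrative assumptions; the disclosure only requires that the representations and reference system be unified across nodes.

```python
from dataclasses import dataclass

@dataclass
class SpatioTemporalID:
    """Globally unified identification carried by a node, per the
    description above. Units and field names are assumptions."""
    device_id: str      # globally unique device identifier
    utc_time: float     # synchronized time, seconds since epoch
    position: tuple     # (lon_deg, lat_deg, alt_m) in a shared reference system
    velocity: tuple     # (vx, vy, vz) in m/s
    attitude: tuple     # (yaw, pitch, roll) in degrees
    accuracy_m: float   # position accuracy estimate

node = SpatioTemporalID("edge-07", 1700000000.0,
                        (113.93, 22.53, 15.0), (0.0, 0.0, 0.0),
                        (90.0, -10.0, 0.0), 2.5)
```

Because every stream record carries such an identification, data from different nodes can be aligned in time and space for the multi-machine collaborative tasks described elsewhere in this disclosure.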
  • the scalable visual computing system has an autonomous decision-making mechanism for event response: based on the definable nature of front-end device functions, the system can dynamically configure node working states, working parameters, algorithm models and output data streams, thereby automatically completing tasks that rely on substantial human labor in traditional video surveillance systems.
  • FIG. 1 is a schematic block diagram of a preferred embodiment of a scalable visual computing system of the present invention.
  • FIG. 2 is a schematic block diagram of an intelligent computing module of the present invention.
  • FIG. 3 is a schematic block diagram of a computing processing module of the present invention.
  • FIG. 4 is a schematic block diagram of the data analysis module of the present invention.
  • the present invention provides a scalable visual computing system.
  • the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.
  • the existing video surveillance system consists of two parts, in an end-edge or end-cloud arrangement.
  • the camera is responsible for video/image acquisition and compression, and only outputs one compressed video stream.
  • the edge/cloud part is responsible for the analysis and processing of large-scale video/image data.
  • Due to the restriction of system bandwidth, a large amount of raw data is difficult to aggregate, and the centralized analysis of large amounts of data also brings huge processing pressure to the edge/cloud; since the edge/cloud can only process the decoded video, a certain performance loss results, and the single function and purpose of the camera also causes repeated camera deployment and wasted resources.
  • these defects of the system architecture seriously restrict the industrial application of video big data.
  • since the original video is directly uploaded to the cloud service for calculation and storage, there is still a problem that user privacy cannot be effectively protected.
  • the present invention provides a scalable visual computing system, as shown in FIG. 1, which includes a front-end device 10, an edge service 20 and a cloud service 30 that establish communication connections in sequence.
  • the edge service 20 is used to receive and store the compressed video stream, feature encoding stream and structured result stream sent by the front-end device; to output the feature encoding stream and structured result stream in real time and aggregate them to the cloud service; and to output the compressed video stream to the cloud service on demand according to the cloud service's data retrieval instructions. It is also used to receive and process node access management instructions reported by front-end devices and update the device management list.
  • the edge service also receives the model stream and control instructions issued by the cloud service and delivers them to the front-end device, where the control instructions include device control instructions and function definition instructions; it is further used to complete, according to the defined functional tasks, multi-node linkage organization plan generation, data information configuration planning, collaborative work scheduling, image data processing and analysis, collaborative data analysis and joint optimization. The cloud service 30 is used to receive, store and aggregate in real time the feature code streams and structured result streams output by edge services, and to retrieve compressed video streams on demand from the edge service; it is also used to store the algorithm models that support various applications, manage the algorithm life cycle and update process, receive model query commands sent by the edge service or front-end device and return the model query results or model stream accordingly, send control commands as triggered, and receive and respond to third-party user demand instructions.
  • this embodiment draws on the biological principle that the information transmitted by the human eye to the brain has been extracted and reduced, and proposes a scalable visual computing system that includes three subsystems: front-end devices, edge services and cloud services (end, edge, cloud). The core of the scalable visual computing system lies in the parallel transmission architecture of the compressed video stream, the encoded feature stream and the model stream.
  • the transmission of the compressed video stream and the encoded feature stream is real-time, so the front-end device can report both simultaneously according to the configuration, which not only relieves the pressure of data transmission but also gathers valid information in real time for joint data analysis;
  • the transmission of the model stream is sporadic: when the model needs to be updated, the model stream is transmitted directly or indirectly by the cloud service to the front-end device to realize dynamic deployment and update of the model, thereby supporting different application tasks by defining the functions and algorithms of the front-end device;
  • the scalable visual computing system proposed by the present invention also has scalability: compressed video streams can be stored in the edge service, and encoded feature streams can be aggregated to the cloud service in real time, where the cloud service uses the feature information for subsequent tasks such as analysis, identification and retrieval.
  • when the cloud service must use the original image data for business needs, the compressed video stream can be retrieved from the edge service only after the user's authorization. Therefore, this embodiment redefines the computing architecture, transmission architecture and functional characteristics of each subsystem of the video surveillance system, and achieves full utilization of resources through optimized division of labor and organic cooperation between subsystems, thereby realizing real-time processing and efficient use of massive video data.
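The authorization-gated, on-demand retrieval described above can be sketched as follows. This is a minimal illustration; the disclosure does not specify the authorization mechanism or storage layout, so the keying by device and time window and the user-set check are assumptions.

```python
def retrieve_video(edge_store, request, authorized_users):
    """On-demand retrieval of a compressed video segment from the edge,
    gated by user authorization as described above. A sketch only; the
    real authorization mechanism is not specified by the disclosure."""
    if request["user"] not in authorized_users:
        raise PermissionError("original video requires user authorization")
    key = (request["device_id"], request["start"], request["end"])
    if key not in edge_store:
        return None                 # nothing recorded for this window
    return edge_store[key]          # the compressed video segment

store = {("cam-0001", 0, 60): b"\x00compressed-segment"}
clip = retrieve_video(store,
                      {"user": "analyst", "device_id": "cam-0001",
                       "start": 0, "end": 60},
                      {"analyst"})
```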
  • the specific functions of the front-end device can be flexibly defined in software, and the algorithm model can be updated dynamically, so that one machine serves multiple purposes; the front-end device also carries globally unified spatiotemporal identification information, which facilitates multi-machine collaborative tasks.
  • the scalable visual computing system has a multi-stream parallel transmission architecture of compressed video streams, encoded feature streams and model streams, with distributed storage and meso-scale data analysis at the edge services and macro big data analysis and mining in the cloud services. This not only reduces the pressure of data transmission in the system, but also relieves the pressure of centralized computing in the cloud service, and addresses the difficulty of aggregating, storing and analyzing video big data.
  • Edge services and cloud services also have an automatic decision-making event response mechanism, which can automatically configure front-end functions, models and output content according to the tasks to be executed, complete scheduling control of the front end and edge during task execution, and reduce the dependence of various applications on human labor.
  • All front-end device, edge service and cloud service nodes in the scalable visual computing system provided by this embodiment have globally unified spatiotemporal identifiers; that is, all nodes have a unified time representation and synchronized time information, a unified spatial information representation and reference system, and a globally unique device identifier, where the spatial information includes position, speed, attitude and their accuracy information. The scalable visual computing system has an autonomous decision-making mechanism for event response: based on the definable nature of front-end device functions, the system can dynamically configure node working states, working parameters, algorithm models, output data streams and other content, so as to automatically complete tasks that rely on substantial human labor in traditional video surveillance systems.
  • the front-end device in this embodiment may be a digital retina front-end.
  • the so-called digital retina front end is analogous to the human retina: it evolves and innovates the traditional camera and even the visual computing architecture, so as to support the city brain more intelligently and serve intelligent applications such as security and fine-grained urban management. More specifically, a traditional camera only compresses the captured video data and uploads it to the cloud for storage, after which analysis and identification are performed; the digital retina front end in this embodiment, by contrast, can complete efficient video encoding and compact feature expression, and output compressed video data and feature data in real time, where the video data is stored in the edge service, the feature data is aggregated to the cloud service in real time, and the cloud service can retrieve the original data according to business needs and authorization mechanisms.
  • the front-end device determines its own spatiotemporal information in real time, perceives and collects visual information of the scene, and obtains image data, which includes videos and pictures. It then performs image processing and intelligent analysis on the image data, including video/picture pre-processing, video compression coding and transcoding, feature extraction and feature coding, structural analysis, etc., and finally encapsulates the processing results together with time information, spatial information and device identification information to generate the compressed video stream, feature encoding stream and structured result stream, which are optionally output according to the configuration; it also reports device status information at a fixed cadence.
  • the front-end device receives the device control instructions issued by the edge service and completes power on/off control and adjustment of working parameters; receives the function definition instructions issued by the edge service and completes the configuration of front-end functions, output data, etc.; and receives the model update instructions issued by the edge service and completes the loading and full or incremental update of the algorithm model.
  • the video/picture pre-processing includes performing operations such as noise reduction, dehazing, and white balance adjustment on the original video/picture to improve its quality; video compression encoding and transcoding adopt codec algorithms based on the orthogonal transform principle, background modeling techniques, etc., to eliminate redundant information in the original video data and generate a more efficient video stream according to the configured encoding format.
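The white-balance adjustment mentioned above can be illustrated with a minimal gray-world sketch. The patent does not specify an algorithm, so this numpy implementation is only one plausible choice, assuming an 8-bit RGB image:

```python
import numpy as np

def gray_world_white_balance(img: np.ndarray) -> np.ndarray:
    """Gray-world assumption: scale each color channel so that its mean
    matches the global mean of all channels."""
    imgf = img.astype(np.float64)
    channel_means = imgf.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / channel_means   # per-channel correction gain
    return np.clip(imgf * gain, 0, 255).astype(np.uint8)
```

After correction, a uniformly color-cast image ends up with roughly equal channel means, which is the intended effect of white balancing.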
  • the front-end device 10 includes a spatiotemporal determination module, an image acquisition module, an image processing module, an intelligent computing module, and a device control and data interaction module.
  • the spatiotemporal determination module is used to obtain unified time information, which is used to realize and maintain time synchronization between nodes in the system; it is also used to obtain the spatiotemporal information of the front-end device and provide it in real time to the other modules of the front-end device for computation and transmission, where the spatiotemporal information includes position, speed, attitude, and other information; it is also used to receive the control instructions sent by the device control and data interaction module and complete the configuration of its own working parameters.
  • the image acquisition module is used for acquisition and conversion of image data, and sends the image data to the image processing module.
  • the image processing module is used to pre-process, compress, encode, and transcode image data, output a compressed video stream carrying timestamp information to the device control and data interaction module, and output the pre-processed image data to the intelligent computing module; the image processing module is further configured to receive the control instructions sent by the device control and data interaction module and complete the configuration of the processing parameters.
  • the image data includes video data and picture data, and the step of pre-processing the image data includes: performing operations such as noise reduction, dehazing, and white balance adjustment on the image data to improve video/picture quality.
  • the steps of compressing, encoding, and transcoding the image data include: adopting codec algorithms based on the orthogonal transform principle, background modeling techniques, etc., to eliminate redundant information in the original image data and generate a more efficient video stream according to the configured encoding format.
  • the intelligent computing module is used to perform structured analysis, feature extraction, and feature encoding on the image data and output the encoded feature stream and structured result stream to the device control and data interaction module; the intelligent computing module is also used to receive control instructions, which include parameter configuration instructions and function definition instructions, and to receive the model stream and dynamically update the algorithm model.
  • the step of performing feature extraction on the image data includes: performing feature extraction operations on the image data or on objects or regions of interest therein, covering both traditional handcrafted features and deep learning features, which are aggregated to the edge service or cloud service for feature retrieval and collaborative data analysis.
  • the step of feature encoding the image data includes: encoding and compressing traditional handcrafted features and a large number of deep learning features extracted from the image data to obtain a compact feature encoding stream.
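Compact feature coding standards such as CDVS/CDVA are far more elaborate than this, but the basic idea of compressing floating-point deep features into a compact stream can be sketched with simple scalar quantization. The 8-bit symmetric scheme below is an assumption for illustration, not the actual coding used by those standards:

```python
import numpy as np

def quantize_features(feat: np.ndarray, num_bits: int = 8):
    """Uniform symmetric quantization of a float feature vector to int8.
    Returns the quantized codes and the scale needed to reconstruct them."""
    scale = np.max(np.abs(feat)) / (2 ** (num_bits - 1) - 1)
    codes = np.round(feat / scale).astype(np.int8)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Approximate reconstruction of the original features."""
    return codes.astype(np.float32) * scale
```

For a 512-dimensional float32 vector this already shrinks the payload 4x; real feature codecs add entropy coding and dimensionality reduction on top.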
  • the feature encoding standards include but are not limited to CDVS and CDVA.
  • the step of performing structured analysis on the image data includes: detecting, tracking, identifying, segmenting, and counting targets or events of interest in the image data to obtain target structured information, such as face recognition information, driver behavior analysis, traffic flow statistics, vehicle/pedestrian counting, license plate recognition information, road structure information, scene information, etc., and then encapsulating it into a structured result stream according to a certain format.
  • the intelligent computing module includes a feature extraction sub-module, a feature coding sub-module and a structural analysis sub-module;
  • the feature extraction sub-module is used to perform feature extraction on the image data in combination with the status information of the front-end device, and to send the extracted feature information to the feature encoding sub-module;
  • the feature encoding sub-module is used to encode the feature information and output the encoded feature stream;
  • the structured analysis sub-module is used to analyze and process the image data in combination with the status information of the front-end device, and to output a structured result stream.
  • the device control and data interaction module is used to complete process control, device control, status monitoring, model updating, and transmission control of digital retina front-end data processing, and to obtain the device working status and device identification information; it receives the spatiotemporal information, compressed video streams/pictures, encoded feature streams, and structured result streams, packages and encapsulates them, and sends them to the edge service; it also receives and parses the model streams and control instructions issued by the edge service or cloud service, and sends the instructions to the corresponding processing modules of the front-end device.
  • the edge service receives and processes the node access management instructions reported by the front end, specifically including: receiving and processing the device registration, login, and logout instructions sent by the front-end device, updating the device management list, and reporting the access information of the front-end device to the cloud service; receiving and monitoring the device status information reported by the front-end device, and reporting the status information of the front-end device to the cloud service if the device's working state changes or an abnormal situation occurs, for example the device enters an abnormal working state or the front-end position, attitude, or other spatial information changes; and receiving the model query instructions reported by the front-end device and forwarding them to the cloud service.
  • the edge service receives and aggregates the data streams reported by the front end, processes and saves them, and forwards part of the data to the cloud service. Specifically, it receives the compressed video stream, picture data, encoded feature stream, and structured result stream reported by the front-end device; intercepts, packages, and transcodes the compressed video stream, saves the video file, and updates the video file database; parses and repackages the picture data, saves it as a picture file in the specified format, and updates the picture file database; parses and unpacks the encoded feature stream, saves the feature file, and updates the structured feature database and feature file database; parses the structured result stream and stores it in the structured result database; and forwards the encoded feature stream and structured result stream directly to the cloud service.
  • the edge service receives the device control instructions and function definition instructions issued by the cloud service and forwards them to the front-end device; receives the video/picture data retrieval instructions issued by the cloud service, retrieves the required videos/pictures from the database according to the query conditions, and returns the video/picture information to the cloud service; and receives the model update instructions issued by the cloud service and forwards them to the front-end device.
  • the edge service completes the work of multi-node linkage system organization plan generation, data information configuration plan planning, collaborative work scheduling, video/picture processing analysis, data collaborative analysis, and joint optimization according to defined functional tasks. Specifically, the edge service independently determines the number of front-end nodes, detailed node information and data requirements required to complete the task according to the functional tasks of the actual application system, combined with predefined algorithms, configures the relevant front-end nodes, and schedules multiple front-ends to work together.
  • the video/picture processing and analysis uses visual information processing technology to re-analyze and process the videos/pictures and extract useful information for post-processing or new tasks.
  • the edge service includes an integrated control module, a streaming media module, a data storage module, and a computing processing module.
  • the integrated control module is used to receive the reported data or instructions of the front-end device and control its response process, push the encoded feature stream and structured result stream to the cloud service in real time, receive and forward the control instructions or model streams issued by the cloud service, manage the access process and status of front-end devices, monitor the status of front-end devices, and schedule the collaborative working mode among multiple front-end devices.
  • the streaming media module is configured to receive the compressed video stream, and perform transcoding, interception and packaging on the compressed video stream.
  • the data storage module is configured to receive the compressed video stream from the streaming media module and the encoded feature stream and structured result stream reported by the front-end device, classify, store, and manage these streams, receive the compressed video stream or picture retrieval instructions issued by the cloud service, and retrieve and return the compressed video stream or pictures to the cloud service according to the conditions.
  • the computing and processing module is used to complete the work of generating the organization plan of the multi-node linkage system, processing and analyzing the image data, collaboratively analyzing the multi-node data, and jointly optimizing the multi-node linkage system according to the defined functional tasks.
  • the computing processing module includes an intelligent video analysis sub-module, a data association analysis sub-module and a joint optimization sub-module.
  • the intelligent video analysis sub-module, data association analysis sub-module, and joint optimization sub-module jointly perform collaborative data processing, joint analysis, and optimization on the compressed video streams, encoded feature streams, and structured result streams output by several front-end devices, and complete specific tasks of concern in smart city construction, for example by monitoring a certain area.
  • the computing processing module is also used for video/picture processing and analysis, using visual information processing technology to re-analyze and process videos/pictures and extract useful information for post-processing or new tasks.
  • the data storage module includes an access management sub-module, a data retrieval sub-module, and a database sub-module; wherein, the access management sub-module is used to support data input, storage, and retrieval operations; the The data retrieval sub-module is used for data query and retrieval operations; the database sub-module is used to store structured data or unstructured data.
  • the database submodule includes a video file library, a picture file library, a feature file library, a structured feature library, and a structured result library; wherein, the video file library is used to store video stream data and Its summary information; the picture file library is used to store image format file data and its summary information; the feature file library is used to store unstructured feature stream data and its summary information; the structured feature library is used to store structure The structured result database is used to store the structured result data.
  • the cloud service performs configuration management and function definition for all nodes, and receives and responds to third-party user demand instructions; it aggregates visual big data information, retrieves video/picture data from edge services on demand, and organizes the storage of various types of structured/unstructured data; it stores the algorithm models used to support various applications, manages the algorithm life cycle and update process, and responds to front-end model query instructions; and it performs big data analysis, mining, and simulation computation, and executes more macroscopic multilateral collaborative tasks.
  • the cloud service receives the status information of the front-end equipment reported by the edge service, monitors and maintains the working status and space-time information of the equipment, and updates the equipment management list.
  • the cloud service receives and aggregates in real time the encoded feature stream and structured result stream reported by the edge service. After parsing the encoded feature stream, it saves the unstructured features with a large data volume as feature files, extracts the corresponding data summary and keyword information into the unstructured feature database, and stores the structured feature information with a small data volume in the structured feature database; after parsing the structured result stream, it stores the results in the corresponding structured result database;
  • the cloud service receives the compressed video stream/picture reported by the front-end device in real time, stores the video data/picture as a video/picture file, and stores the video/picture summary information in the video/picture database; After the user's authorization, the cloud service retrieves video data/pictures from the edge service on demand.
  • the cloud service's algorithm model repository stores algorithm model files that can support different software and hardware platforms, different applications, and different performance levels; it manages the algorithm life cycle through the algorithm model database and its update mechanism, receives and responds to front-end algorithm model query and pull instructions, performs the required model retrieval and software/hardware compatibility checks, and can also actively issue update instructions and model stream data for the latest version of an algorithm model.
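A toy sketch of such a model repository is shown below. The `(name, platform)` keying, integer versions, and in-memory storage are assumptions for illustration only, not the patent's actual data model:

```python
class ModelRepository:
    """Toy registry: versioned model entries keyed by (name, platform),
    supporting queries with a minimum-version compatibility check."""

    def __init__(self):
        self._models = {}  # (name, platform) -> sorted list of (version, blob)

    def register(self, name: str, platform: str, version: int, blob: bytes):
        entries = self._models.setdefault((name, platform), [])
        entries.append((version, blob))
        entries.sort()  # keep versions ordered so the last entry is newest

    def query(self, name: str, platform: str, min_version: int = 0):
        """Return the newest (version, blob) compatible with the request,
        or None if no model fits the platform/version constraints."""
        entries = self._models.get((name, platform), [])
        compatible = [e for e in entries if e[0] >= min_version]
        return compatible[-1] if compatible else None
```

A front-end pull instruction then maps naturally onto `query(...)`, and an active push of the latest version onto iterating over registered front ends.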
  • the cloud service receives user demand instructions and uses video and feature big data information for data analysis and mining, simulation prediction, joint optimization, etc., to complete spatially and temporally large-scale multilateral collaborative tasks such as traffic efficiency analysis and traffic light optimization.
  • the cloud service 30 includes a central control module, a computational simulation module, a data analysis module, a data center module, an algorithm model repository, and a user interaction module.
  • the central control module is used to perform configuration management and resource scheduling on all nodes in the system, uniformly manage the transmission control process of the data stream, control stream, and model stream, issue device control, function definition, and model update commands, issue tasks to edge services, receive and process the data reporting and status reporting instructions of edge services, and realize secure connection and coordinated scheduling of devices in the system.
  • the calculation simulation module is used for structural analysis and processing, simulation prediction, model training, model joint optimization, collaborative strategy generation, and outputting calculation results.
  • the data analysis module is used to receive encoded feature streams and structured result streams, or to retrieve and fetch data from the data center module according to user instructions, aggregate big data information, perform analysis and mining, and extract high-level semantic information to return to the user.
  • the data analysis module includes a data retrieval sub-module, a statistical analysis sub-module, and a data mining sub-module. The data retrieval sub-module is used to receive, or initiate to the data center module, data retrieval and fetch instructions to obtain the data required for the analysis task; the statistical analysis sub-module is used to perform multi-dimensional analysis on the aggregated feature and result data using methods such as classification, regression, and correlation analysis; the data mining sub-module is used to automatically extract hidden useful information and knowledge from large amounts of historical or real-time data by means of artificial intelligence, machine learning, statistics, etc.
  • the data center module is used to retrieve compressed video streams from edge services on demand, and for the storage, retrieval, and output of encoded feature streams, structured result streams, and compressed video streams retrieved on demand.
  • the algorithm model warehouse is used for the storage, query, delivery process and life cycle management of the algorithm model.
  • the user interaction module is configured to receive user-related instructions and return a processing result.
  • the data flow includes multimedia data, feature data, result information, spatiotemporal information, environmental data, device data and algorithm models; the control flow refers to instruction data related to system operation.
  • the central control module includes a configuration management sub-module, a cooperative scheduling sub-module, and an instruction processing sub-module.
  • the configuration management submodule is used to perform security authentication, configuration management and status monitoring on all nodes of front-end devices, edge services, and cloud services.
  • the cooperative scheduling sub-module is configured to issue device control instructions, function definition instructions, and work parameter instructions to the front-end device according to the scheduling policy.
  • the instruction processing sub-module is responsible for receiving, parsing and processing the reporting instructions and query instructions of the edge service, delivering model stream data, and being responsible for responding to the user interaction module.
  • the scalable visual computing system includes a large number of digital retina front-ends deployed in public places such as transportation hubs, important checkpoints, and community streets, several edge services scattered in different locations, and cloud services responsible for overall monitoring and command.
  • the object data to be tracked is input through the user interaction interface of the cloud service, including multimedia data, spatiotemporal identification data, electronic data, trace data, social data, environmental data, device data, etc.
  • Objects may include criminals, problem vehicles, unusual events, and the like.
  • the algorithms involved include a face detection algorithm, a key point detection algorithm, and a face feature extraction algorithm; face detection models include but are not limited to YoloV4 and SSD; key point detection models include but are not limited to Resnet18 and MobileNetV2; feature extraction algorithms include but are not limited to sphereface and arcface.
  • the digital retina front-end first uses the face detection model to detect the position and size of the face, then crops the face area from the original image and uses the key point detection model to extract key point information, returning 68 key points, from which five key points of the facial features are further screened. The digital retina front-end filters the detected face areas, and the filtering conditions include: according to the occlusion of the face, when the occlusion ratio is greater than a certain threshold, the face picture is filtered out; according to the face image quality, when the blur is greater than a certain threshold, the face picture is filtered out; and the face pose and pupil distance are calculated from the key points, and when the pose is greater than a certain threshold or the pupil distance is smaller than a certain threshold, the face picture is filtered out;
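The filtering conditions above can be sketched as follows. The specific threshold values and the 5-point landmark layout are assumptions, and the face pose is approximated here by the roll of the eye line only, which is a simplification of a full pose estimate:

```python
import numpy as np

def keep_face(landmarks5: np.ndarray, occlusion_ratio: float, blur_score: float,
              occ_thresh: float = 0.3, blur_thresh: float = 0.5,
              min_pupil_dist: float = 40.0, max_roll_deg: float = 30.0) -> bool:
    """landmarks5: 5x2 array [left_eye, right_eye, nose, mouth_l, mouth_r].
    Returns True if the face picture passes all filtering conditions."""
    if occlusion_ratio > occ_thresh or blur_score > blur_thresh:
        return False                              # too occluded or too blurry
    le, re = landmarks5[0], landmarks5[1]
    pupil_dist = np.linalg.norm(re - le)          # interpupillary distance (px)
    roll = abs(np.degrees(np.arctan2(re[1] - le[1], re[0] - le[0])))
    return pupil_dist >= min_pupil_dist and roll <= max_roll_deg
```

Faces that are too small (short pupil distance) or too tilted are rejected before the more expensive alignment and feature extraction stages run.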
  • the digital retina front-end uses affine transformation or similarity transformation to perform face calibration on the screened face pictures according to 5 key points;
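The similarity transform used for face calibration can be estimated from point correspondences with the Umeyama least-squares method. This sketch assumes 2-D landmarks and returns a 2x3 matrix in the convention used by common image-warping APIs; it is an illustration of the technique, not the patent's implementation:

```python
import numpy as np

def similarity_transform(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares similarity transform (Umeyama) mapping src -> dst.
    src, dst: Nx2 arrays of corresponding points. Returns a 2x3 matrix M
    such that dst ~= src @ M[:, :2].T + M[:, 2]."""
    src_mean, dst_mean = src.mean(0), dst.mean(0)
    src_c, dst_c = src - src_mean, dst - dst_mean
    cov = dst_c.T @ src_c / len(src)              # cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(U @ Vt))            # guard against reflections
    D = np.diag([1.0, d])
    R = U @ D @ Vt                                # optimal rotation
    scale = np.trace(np.diag(S) @ D) / (src_c ** 2).sum() * len(src)
    t = dst_mean - scale * R @ src_mean           # optimal translation
    return np.hstack([scale * R, t.reshape(2, 1)])
```

In a face pipeline, `src` would be the 5 detected key points and `dst` the canonical template positions; the resulting matrix is then applied to the whole face crop.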
  • the digital retina front-end extracts facial feature information from the calibrated face image, obtains a feature vector (for example, 1024 dimensions), and reports the feature stream together with the time, space, and device identification information of the front-end node to the edge service, where it is finally aggregated to the cloud service; the compressed video stream is reported to and saved at the edge service;
  • the cloud service uses the same feature extraction model as the front end to extract the features of the face images to be tracked and compares them with the reported features, using the cosine distance as the similarity measure, to roughly identify the specific person and record the front-end information to which the person belongs and the face structured information;
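The cosine-similarity comparison of feature vectors can be sketched as below; the threshold value and the return convention (`-1` for no match) are assumptions for illustration:

```python
import numpy as np

def cosine_match(query: np.ndarray, gallery: np.ndarray, threshold: float = 0.5) -> int:
    """Return the index of the best gallery match by cosine similarity,
    or -1 if even the best match falls below the threshold."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # cosine similarity against every entry
    best = int(np.argmax(sims))
    return best if sims[best] >= threshold else -1
```

Because the front end and cloud use the same extraction model, features reported from different cameras live in the same embedding space and can be compared directly this way.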
  • the cloud service automatically generates target tracking control instructions, which are forwarded by the edge service to the front-end node that finds a specific person to track suspicious faces.
  • the tracking algorithms include but are not limited to KCF and deepsort; the cloud service automatically generates front-end parameter control instructions, which are delivered by the edge service.
  • the face image and the time, space and device identification information of the front-end node are reported to the edge service, and then aggregated to the cloud service in real time for secondary identification, and the compressed video stream is reported and saved to the edge service in real time;
  • the cloud service uses a more complex and accurate network model (for example, Resnet50) to extract and identify the facial features of the aggregated images of specific persons, further confirms the specific persons, records the spatiotemporal identification of the front-end to which they belong, and sends an alarm signal;
  • the cloud service displays the confirmation result of the criminal in the previous step, and retrieves and displays the front-end raw video data of the specific person found in real time from the edge service, and further confirms it manually to lock the criminal;
  • the terminal, edge, and cloud automatically select appropriate algorithms for computation, and the different types of output data are effectively connected through software configuration and algorithmic processing, which not only makes reasonable use of resources but also reflects the flexibility of the terminal-edge-cloud collaborative computing architecture based on the digital retina.
  • the core of the scalable visual computing system proposed by the present invention lies in the parallel data transmission architecture of the compressed video stream, the encoded feature stream and the model stream.
  • the transmission of the compressed video stream and the encoded feature stream is real-time: the front-end device can simultaneously report the compressed video stream and the encoded feature stream according to the configuration, which both relieves data transmission pressure and aggregates effective information in real time for joint data analysis. The transmission of the model stream is sporadic: when a model needs to be updated, the model stream is transmitted directly or indirectly from the cloud service to the front-end device to realize dynamic deployment and updating of the model, thereby supporting the completion of different application tasks by defining the functions and algorithms of the front-end device. All nodes in the scalable visual computing system proposed by the present invention have globally unified spatiotemporal information and device identifiers, which not only facilitates joint analysis and target optimization of visual data between different nodes, but also facilitates data fusion with other sensing systems. The scalable visual computing system is also scalable: the compressed video stream can be stored at the edge service while the encoded feature stream is aggregated to the cloud service in real time, and the cloud service retrieves the compressed video stream from the edge service only when the original image data is required for business needs and user authorization has been obtained.

Abstract

A scalable visual computing system is provided, characterized by comprising a front-end device, an edge service, and a cloud service that establish communication connections in sequence. The front-end device performs intelligent analysis and processing on collected image data and correspondingly outputs a compressed video stream, an encoded feature stream, and a structured result stream according to the configuration. The edge service receives and stores the compressed video stream, encoded feature stream, and structured result stream sent by the front-end device, outputs the encoded feature stream and structured result stream to the cloud service in real time, and outputs the compressed video stream to the cloud service on demand according to the cloud service's data retrieval instructions. The cloud service receives and aggregates in real time the encoded feature stream and structured result stream output by the edge service, and retrieves the compressed video stream from the edge service on demand. Through optimized division of labor and organic collaboration among the terminal, edge, and cloud, the present invention realizes real-time processing and effective utilization of massive video data.

Description

A Scalable Visual Computing System — Technical Field
The present invention relates to the field of visual sensor applications, and in particular to a scalable visual computing system.
Background Art
Video surveillance systems have unique advantages such as intuitiveness, accuracy, and rich information content. With the development of computer, image processing, and transmission technologies, video surveillance systems are increasingly widely used in security, policing, traffic, production, and other fields. At present, more and more cameras with ever-higher resolution are installed in roads, communities, airports/railway stations, large venues, and other places, causing the amount of generated video or image data to increase sharply, which brings great challenges to data processing and transmission.
Traditional video surveillance systems generally adopt a division of labor in which cameras perform image or video capture and compression while back-end servers perform data processing, analysis, and recognition. On the one hand, transmitting large amounts of video data puts great pressure on system bandwidth, and because back-end processing capacity is limited, concentrating large amounts of video data at the back end leads to video backlog and untimely information processing; much data is overwritten by new data before it can be analyzed. On the other hand, cameras serve a single purpose and their functions cannot be flexibly configured; when application requirements change, new cameras often need to be installed, resulting in a great waste of resources. Moreover, the entire process requires a great deal of manual participation, and the degree of intelligence is not high.
The problems of traditional video surveillance systems ultimately stem from the rapid expansion of data while the system architecture fails to keep pace with the development of stand-alone devices such as cameras. Only by improving the system architecture and its internal data interaction methods, and optimizing the system design in combination with actual application requirements, can the real-time processing requirements of massive data be met.
Therefore, the prior art still needs improvement and development.
Summary of the Invention
The technical problem to be solved by the present invention is, in view of the deficiencies of the prior art, to provide a scalable visual computing system aimed at solving the problem that existing video surveillance systems cannot meet the real-time processing requirements of massive data.
To solve the above technical problem, the technical solution adopted by the present invention is as follows:
A scalable visual computing system, characterized by comprising a front-end device, an edge service, and a cloud service that establish communication connections in sequence;
the front-end device is used to perceive and collect scene visual information to obtain image data, perform video image processing, feature encoding, and intelligent analysis on the image data, encapsulate the processing results together with front-end device identification information, time information, and spatial information to obtain a compressed video stream, an encoded feature stream, and a structured result stream, and output these streams correspondingly according to the configuration; it is also used to report its own status information to the edge service at a fixed rhythm, receive the control instructions and model stream issued by the edge service, and complete the configuration of its own working parameters and model updates, where the control instructions include device control instructions and function definition instructions;
the edge service is used to receive and store the compressed video stream, encoded feature stream, and structured result stream sent by the front-end device, output the encoded feature stream and structured result stream to the cloud service in real time, and output the compressed video stream to the cloud service on demand according to the cloud service's data retrieval instructions; it is also used to receive and process the node access management instructions reported by the front-end device and update the device management list, report the status information of the front-end device and edge service to the cloud service, receive the model query instructions of the front-end device and forward them to the cloud service, and receive the model stream and control instructions issued by the cloud service and deliver them to the front-end device, where the control instructions include device control instructions and function definition instructions; it is also used to complete, according to the defined functional tasks, multi-node linkage system organization plan generation, data information configuration planning, collaborative work scheduling, image data processing and analysis, collaborative data analysis, and joint optimization;
the cloud service is used to receive, store, and aggregate in real time the encoded feature stream and structured result stream output by the edge service, and to retrieve the compressed video stream from the edge service on demand; it is also used to store the algorithm models supporting various applications, manage the algorithm life cycle and update process, receive the model query instructions sent by the edge service or front-end device and return the model query results or model stream accordingly, send control instructions as triggered, and receive and respond to third-party user demand instructions; it is also used to perform big data analysis, mining, and simulation computation and execute multilateral collaborative tasks; it is also used to receive the device status information reported by the edge service and perform configuration management, function definition, and coordinated resource scheduling for all nodes.
In the scalable visual computing system, the front-end device comprises a spatiotemporal determination module, an image acquisition module, an image processing module, an intelligent computing module, and a device control and data interaction module; wherein the spatiotemporal determination module is used to obtain unified time information for the front-end device, maintain time synchronization between front-end devices, determine the position, speed, and attitude information of the front-end device, provide the spatiotemporal information in real time to the other modules of the front-end device for computation and transmission, and receive the control instructions sent by the device control and data interaction module to complete the configuration of its own working parameters; the image acquisition module is used for the acquisition and conversion of image data and sends the image data to the image processing module; the image processing module is used to pre-process, compress, encode, and transcode the image data, output a compressed video stream carrying timestamp information to the device control and data interaction module, output the pre-processed image data to the intelligent computing module, and receive the control instructions sent by the device control and data interaction module to complete the configuration of processing parameters; the intelligent computing module is used to perform structured analysis, feature extraction, and feature encoding on the image data and output the encoded feature stream and structured result stream to the device control and data interaction module; the intelligent computing module is also used to receive control instructions, which include parameter configuration instructions and function definition instructions, and to receive the model stream and dynamically update the algorithm model; the device control and data interaction module is used to package and encapsulate the received time information, spatial information, compressed video stream, pictures, encoded feature stream, and structured result stream and send them to the edge service, receive and parse the model stream and control instructions issued by the edge service or cloud service, and send the model stream and control instructions to the corresponding processing modules; it is also used to complete workflow control, device control, status monitoring, model updating, and transmission control of the front-end device, and to obtain the working status and identification information of the device.
In the scalable visual computing system, the edge service comprises an integrated control module, a streaming media module, a data storage module, and a computing processing module; wherein the integrated control module is used to receive the reported data or instructions of the front-end device, control its response process, push the encoded feature stream and structured result stream to the cloud service in real time, receive and forward the control instructions or model stream issued by the cloud service, manage the access process and status of front-end devices, monitor the status of front-end devices, and schedule the collaborative working mode among multiple front-end devices; the streaming media module is used to receive the compressed video stream and transcode, intercept, and package it; the data storage module is used to receive the compressed video stream of the streaming media module and the encoded feature stream and structured result stream reported by the front-end device, classify, store, and manage them, receive the compressed video stream or picture retrieval instructions issued by the cloud service, and retrieve and return the compressed video stream or pictures to the cloud service according to the conditions; the computing processing module is used to complete, according to the defined functional tasks, multi-node linkage system organization plan generation, image data processing and analysis, multi-node collaborative data analysis, and joint optimization.
In the scalable visual computing system, the data storage module comprises an access management sub-module, a data retrieval sub-module, and a database sub-module; wherein the access management sub-module is used to support data input, saving, and retrieval operations; the data retrieval sub-module is used for data query and retrieval operations; the database sub-module is used to store structured data or unstructured data.
In the scalable visual computing system, the database sub-module comprises a video file library, a picture file library, a feature file library, a structured feature library, and a structured result library; wherein the video file library is used to store video stream data and its summary information; the picture file library is used to store image-format file data and its summary information; the feature file library is used to store unstructured feature stream data and its summary information; the structured feature library is used to store structured feature data; the structured result library is used to store structured result data.
In the scalable visual computing system, the cloud service comprises a central control module, a computational simulation module, a data analysis module, a data center module, an algorithm model repository, and a user interaction module; wherein the central control module is used to perform configuration management and resource scheduling on all nodes in the system, uniformly manage the transmission control process of the data stream, control stream, and model stream, issue device control, function definition, and model update commands to front-end devices, issue tasks to edge services, and receive and process the data reporting and status reporting instructions of edge services; the computational simulation module is used for structured analysis and processing, simulation prediction, model training, joint model optimization, and collaborative strategy generation, and outputs computation results; the data analysis module is used to receive the encoded feature stream and structured result stream, or to retrieve and fetch data from the data center module according to user instructions, aggregate big data information, perform analysis and mining, and extract high-level semantic information to return to the user; the data center module is used to retrieve the compressed video stream or pictures from the edge service on demand, and for the storage, retrieval, and output of the encoded feature stream, structured result stream, and compressed video stream or pictures retrieved on demand; the algorithm model repository is used for the storage, query, and delivery process and life-cycle management of algorithm models; the user interaction module is used to receive user-related instructions and return processing results.
In the scalable visual computing system, the data stream includes multimedia data, feature data, result information, spatiotemporal information, environmental data, device data, and algorithm models.
In the scalable visual computing system, the control stream refers to instruction data related to system operation, including device registration instructions, login instructions, logout instructions, device control instructions, function definition instructions, parameter configuration instructions, and data query/retrieval instructions.
In the scalable visual computing system, the central control module comprises a configuration management sub-module, a collaborative scheduling sub-module, and an instruction processing sub-module; wherein the configuration management sub-module is used to perform security authentication, configuration management, and status monitoring on all front-end device, edge service, and cloud service nodes; the collaborative scheduling sub-module is used to issue device control instructions, function definition instructions, and working parameter instructions to front-end devices according to the scheduling policy; the instruction processing sub-module is responsible for receiving, parsing, and processing the reporting and query instructions of the edge service, delivering model stream data, and responding to the user interaction module.
In the scalable visual computing system, the data analysis module comprises a data retrieval sub-module, a statistical analysis sub-module, and a data mining sub-module. The data retrieval sub-module is used to receive, or initiate to the data center module, data retrieval and fetch instructions to obtain the data required for the analysis task; the statistical analysis sub-module is used to perform multi-dimensional analysis on the aggregated feature and result data using classification, regression, and correlation analysis methods; the data mining sub-module is used to automatically extract hidden useful information and knowledge from large amounts of historical or real-time data by means of artificial intelligence, machine learning, and statistics.
Beneficial effects: Compared with the prior art, the core of the scalable visual computing system proposed by the present invention lies in a data transmission architecture in which three types of data streams run in parallel: the compressed video stream, the encoded feature stream, and the model stream. The transmission of the compressed video stream and the encoded feature stream is real-time; the front-end device can report both simultaneously according to the configuration, which both relieves data transmission pressure and aggregates effective information in real time for joint data analysis. The transmission of the model stream is sporadic; when a model needs to be updated, the model stream is transmitted directly or indirectly from the cloud service to the front-end device to realize dynamic model deployment and updating, thereby supporting the completion of different application tasks by defining the functions and algorithms of the front-end device. The scalable visual computing system proposed by the present invention is also scalable: the compressed video stream can be stored at the edge service, and the encoded feature stream can be aggregated to the cloud service in real time; the cloud service uses the feature information for subsequent tasks such as analysis, recognition, and retrieval, and only when the cloud service must use the original image data for business needs, and after user authorization, can it retrieve the compressed video stream from the edge service. All front-end device, edge service, and cloud service nodes in the scalable visual computing system provided by the present invention have globally unified spatiotemporal identifiers; that is, all nodes have a unified time representation and synchronized time information, a unified spatial information representation and reference system, and a globally unique device identifier, where the spatial information includes position, speed, attitude, and their accuracy information. The scalable visual computing system has an autonomous decision-making mechanism for event response: based on the definable nature of front-end device functions, the system can dynamically configure node working status and working parameters, algorithm models, output data streams, and other contents, thereby automatically completing tasks that rely on a great deal of human labor in traditional video surveillance systems.
Brief Description of the Drawings
Fig. 1 is a functional block diagram of a preferred embodiment of a scalable visual computing system according to the present invention.
Fig. 2 is a functional block diagram of the intelligent computing module of the present invention.
Fig. 3 is a functional block diagram of the computing processing module of the present invention.
Fig. 4 is a functional block diagram of the data analysis module of the present invention.
Detailed Description of the Embodiments
The present invention provides a scalable visual computing system. To make the objectives, technical solutions, and effects of the present invention clearer and more explicit, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
Those skilled in the art will understand that, unless specifically stated, the singular forms "a", "an", "the", and "said" used herein may also include plural forms. It should be further understood that the word "comprising" used in the specification of the present invention refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that when we refer to an element as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may also be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The word "and/or" as used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless specifically defined as herein, will not be interpreted in an idealized or overly formal sense.
The content of the invention is further described below through the description of embodiments with reference to the accompanying drawings.
Existing video surveillance systems consist of two parts, terminal-edge or terminal-cloud: the camera is responsible for video/image capture and compression and outputs only one compressed video stream, while the edge/cloud part is responsible for the analysis and processing of large-scale video/image data. However, constrained by system bandwidth, large amounts of raw data are difficult to aggregate, and the centralized analysis of large amounts of data also places enormous processing pressure on the edge/cloud. Moreover, the edge/cloud can only process decoded video, which causes a certain performance loss, and the single function and purpose of cameras also leads to repeated camera deployment and wasted resources. With the rapid growth of video data and the gradually increasing requirements for system intelligence, the defects of this system architecture severely restrict the industrial application of video big data. In addition, because raw video is uploaded directly to the cloud service for computation and storage, user privacy is not effectively protected.
Based on the problems existing in the prior art, the present invention provides a scalable visual computing system, as shown in Fig. 1, which includes a front-end device 10, an edge service 20, and a cloud service 30 that establish communication connections in sequence. The front-end device 10 is used to perceive and collect scene visual information to obtain image data, perform video image processing, feature encoding, and intelligent analysis on the image data, encapsulate the processing results together with front-end device identification information, time information, and spatial information to obtain a compressed video stream, an encoded feature stream, and a structured result stream, and output these streams correspondingly according to the configuration; it is also used to report its own status information to the edge service at a fixed rhythm, receive the control instructions and model stream issued by the edge service, and complete the configuration of its own working parameters and model updates, where the control instructions include device control instructions and function definition instructions. The edge service 20 is used to receive and store the compressed video stream, encoded feature stream, and structured result stream sent by the front-end device, output the encoded feature stream and structured result stream to the cloud service in real time, and output the compressed video stream to the cloud service on demand according to the cloud service's data retrieval instructions; it is also used to receive and process the node access management instructions reported by the front-end device and update the device management list, report the status information of the front-end device and edge service to the cloud service, receive the model query instructions of the front-end device and forward them to the cloud service, and receive the model stream and control instructions issued by the cloud service and deliver them to the front-end device, where the control instructions include device control instructions and function definition instructions; it is also used to complete, according to the defined functional tasks, multi-node linkage system organization plan generation, data information configuration planning, collaborative work scheduling, image data processing and analysis, collaborative data analysis, and joint optimization. The cloud service 30 is used to receive, store, and aggregate in real time the encoded feature stream and structured result stream output by the edge service, and to retrieve the compressed video stream from the edge service on demand; it is also used to store the algorithm models supporting various applications, manage the algorithm life cycle and update process, receive the model query instructions sent by the edge service or front-end device and return the model query results or model stream accordingly, send control instructions as triggered, and receive and respond to third-party user demand instructions; it is also used to perform big data analysis, mining, and simulation computation and execute multilateral collaborative tasks; and it is also used to receive the device status information reported by the edge service and perform configuration management, function definition, and coordinated resource scheduling for all nodes.
具体来讲,传统视频监控系统的问题归根结底是数据急剧膨胀而系统架构跟不上摄像机等单机设备的发展水平导致的,只有对系统架构及其内部数据交互方式进行改进,才能适应海量数据的实时处理要求,同时与实际应用需求相结合,对系统设计进行优化。因此,本实施例借鉴人眼传递到大脑的信息是经过提取与缩减的生物学原理,提出了一种可伸缩视觉计算系统,其包括前端设备、边缘服务、云服务(端、边、云)三个子系统,所述可伸缩视觉计算系统的核心在于压缩视频流、编码特征流以及模型流三类数据流并行的数据传输架构,所述压缩视频流和编码特征流的传输具有实时性,所述前端设备根据配置可同时上报压缩视频流和编码特征流,既能够缓解数据传输压力,又能够实时汇聚有效信息进行数据联合分析;所述模型流的传输具有偶发性,当需要更新模型时,模型流由云服务直接或间接地传输至前端设备,实现模型动态部署和更新,从而支持通过对前端设备功能和算法的定义完成不同的应用任务;本发明提出的可伸缩视觉计算系统还具有可伸缩性,压缩视频流可保存在边缘服务,编码特征流可实时汇聚至云服务,云服务利用特征信息进行分析、识别、检索等后续任务,当云服务因业务需要必须使用原始影像数据时,经过用户授权后,方可从边缘服务调取压缩视频流。因此,本实施例对视频监控系统的计算架构、传输架构以及各个子系统的功能特性重新定义,通过子系统之间优化分工和有机协作达到资源充分利用的效果,从而实现对海量视频数据的实时处理和有效利用。
In this embodiment, the specific functions of the front-end device can be flexibly defined by software and its algorithm models can be updated dynamically, so that one device serves multiple purposes; the front-end device also carries globally unified spatio-temporal identification information, which facilitates multi-device collaborative tasks. In summary, the scalable visual computing system has a multi-stream parallel data transmission architecture of compressed video streams, encoded feature streams and model streams. By promptly extracting high-value visual information at the front-end device, performing distributed storage of raw visual data and meso-scale data analysis at the edge service, and performing macro-scale big data analysis and mining at the cloud service, the system both reduces data transmission pressure within the system and relieves the centralized computing pressure on the cloud service, solving the problems that video big data are difficult to aggregate, store and process. Through edge-cloud (edge service and cloud service) collaboration the system supports federated learning, and can thus solve the problem of model training under data privacy protection. The edge service and cloud service also have an automatic decision-making event response mechanism: according to the task being executed, they automatically configure front-end functions, models and output content, and control the scheduling of the front end and edge during task execution, reducing the dependence of various applications on human labor.
All front-end device, edge service and cloud service nodes in the scalable visual computing system provided by this embodiment have globally unified spatio-temporal identification: all nodes share a unified time representation and synchronized time information, a unified spatial information representation and reference system, and a globally unique device identifier, where the spatial information includes position, velocity, attitude and their accuracy. The scalable visual computing system has an autonomous decision-making mechanism for event response; based on the definable nature of front-end device functions, the system can dynamically configure node working states and parameters, algorithm models, output data streams and other content, thereby automatically completing tasks that rely on extensive human labor in traditional video surveillance systems.
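The globally unified spatio-temporal identification described above can be sketched as a single record carried by every node. The field names and units are illustrative assumptions, not part of the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SpatioTemporalID:
    """Hypothetical per-node identification record: a globally unique device
    identifier, synchronized time, and spatial information (position,
    velocity, attitude) with its accuracy, in a shared reference frame."""
    device_id: str         # globally unique device identifier
    timestamp_ns: int      # synchronized time, unified representation
    position: tuple        # (lat_deg, lon_deg, alt_m) in the shared frame
    velocity: tuple        # (vx, vy, vz) in m/s
    attitude: tuple        # (roll, pitch, yaw) in degrees
    accuracy_m: float      # position accuracy estimate in meters

def tag(stid: SpatioTemporalID) -> str:
    """Compact tag attached to every output stream for joint analysis."""
    return f"{stid.device_id}@{stid.timestamp_ns}"
```

Because every stream element carries such a record, data from different nodes (and from other sensor systems) can be aligned in time and space for joint analysis.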
As an example, the front-end device in this embodiment may be a digital retina front end. A digital retina front end, by analogy with the human retina, evolves and reforms the traditional camera and even the visual computing architecture, so as to support the city brain more intelligently and serve intelligent applications such as smart security and fine-grained city management. More specifically, a traditional camera merely compresses the captured video data and uploads it to the cloud for storage before analysis and recognition, whereas the digital retina front end of this embodiment performs efficient video encoding and compact feature representation, outputting compressed video data and feature data in real time; the video data are stored at the edge service, the feature data are ultimately aggregated at the cloud service in real time, and the cloud service may retrieve the raw data according to business needs and an authorization mechanism.
In some implementations, the front-end device determines its own spatio-temporal information in real time, perceives and captures visual information of the scene to obtain image data including video and pictures, and then performs image processing and intelligent analysis on the image data, specifically including video/picture pre-processing, video compression encoding and transcoding, feature extraction and feature encoding, and structured analysis. The processing results are finally encapsulated together with the time information, spatial information and device identification information into a compressed video stream, a feature-encoded stream and a structured-result stream, which are output selectively according to configuration, while the device status information is reported at a fixed cadence. The front-end device receives the device control instructions issued by the edge service to power the front end on and off and adjust its operating parameters; receives the function definition instructions issued by the edge service to configure front-end functions and output data; and receives the model update instructions issued by the edge service to load the algorithm models and perform full or incremental updates.
In this embodiment, the video/picture pre-processing includes denoising, defogging and white-balance adjustment of the raw video/pictures to improve their quality; video compression encoding and transcoding uses codec algorithms based on orthogonal transform principles, background modeling techniques and the like to remove redundant information from the raw video data, generating a more efficient video stream in the configured encoding format.
As shown in Fig. 1, the front-end device 10 comprises a spatio-temporal determination module, an image acquisition module, an image processing module, an intelligent computing module, and a device control and data interaction module.
In this embodiment, the spatio-temporal determination module is configured to obtain unified time information and to establish and maintain time synchronization between nodes in the system. It is further configured to obtain the spatio-temporal information of the front-end device, including position, velocity and attitude, and to provide that information in real time to the other modules of the front-end device for computation and transmission; it also receives the control instructions sent by the device control and data interaction module to configure its own operating parameters.
In this embodiment, the image acquisition module is configured to acquire and convert image data and to send the image data to the image processing module.
In this embodiment, the image processing module is configured to pre-process, compress, encode and transcode the image data, to output a compressed video stream carrying timestamp information to the device control and data interaction module, and to output the pre-processed image data to the intelligent computing module; the image processing module is further configured to receive the control instructions sent by the device control and data interaction module to configure its processing parameters. In this embodiment, the image data include video data and picture data. Pre-processing the image data comprises denoising, defogging and white-balance adjustment to improve video/picture quality. Compressing, encoding and transcoding the image data comprises using codec algorithms based on orthogonal transform principles, background modeling techniques and the like to remove redundant information from the raw image data, generating a more efficient video stream in the configured encoding format.
In this embodiment, the intelligent computing module is configured to perform structured analysis, feature extraction and feature encoding on the image data, and to output an encoded feature stream and a structured-result stream to the device control and data interaction module. The intelligent computing module is further configured to receive control instructions, including parameter configuration instructions and function definition instructions, and to receive model streams and dynamically update the algorithm models. In this embodiment, feature extraction comprises extracting features, including traditional handcrafted features and deep learning features, from the image data or from targets or regions of interest therein, for aggregation at the edge service or cloud service for feature retrieval and collaborative data analysis. Feature encoding comprises encoding and compressing the handcrafted features and large volumes of deep learning features extracted from the image data into a compact encoded feature stream; feature encoding standards include but are not limited to CDVS and CDVA. Structured analysis comprises detecting, tracking, recognizing, segmenting and counting targets or events of interest in the image data to obtain structured target information, such as face recognition information, driver behavior analysis, traffic flow statistics, vehicle/pedestrian counts, license plate recognition information, road structure information and scene information, which is then encapsulated into a structured-result stream in a specified format.
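The encapsulation of structured results "in a specified format" together with device, time and space identification can be sketched as follows. The record layout and field names are illustrative assumptions; the patent does not specify a wire format.

```python
import json
import time

def pack_structured_result(device_id: str, position: tuple, results: list) -> bytes:
    """Encapsulate structured-analysis results with device identification,
    time information and spatial information, forming one element of the
    structured-result stream. Field names are hypothetical."""
    record = {
        "device_id": device_id,       # globally unique device identifier
        "timestamp_ns": time.time_ns(),  # unified, synchronized time
        "position": position,         # e.g. (lat, lon) in the shared frame
        "results": results,           # e.g. plate numbers, pedestrian counts
    }
    return json.dumps(record).encode("utf-8")
```

A downstream edge or cloud consumer can parse each element independently, which is what allows structured results from many front ends to be aggregated and jointly analyzed.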
Specifically, as shown in Fig. 2, the intelligent computing module comprises a feature extraction sub-module, a feature encoding sub-module and a structured analysis sub-module. The feature extraction sub-module is configured to extract features from the image data and the status information of the front-end device, and to send the extracted feature information to the feature encoding sub-module; the feature encoding sub-module is configured to encode the feature information and output an encoded feature stream; the structured analysis sub-module is configured to analyze and process the image data in combination with the status information of the front-end device and output a structured-result stream.
In this embodiment, the device control and data interaction module is configured to perform flow control of the digital retina front end's data processing, device control, status monitoring, model updating and transmission control; to obtain the device working status and device identification information; to receive the spatio-temporal information, compressed video stream/pictures, encoded feature stream and structured-result stream, package and encapsulate them, and send them to the edge service; and to receive and parse the model streams and control instructions issued by the edge service or cloud service and send the instructions to the processing modules of the front-end device.
In some implementations, the edge service receives and processes the node access management instructions reported by the front end, specifically including: receiving and processing the device registration, login and logout instructions sent by the front-end device, updating the device management list, and reporting the front-end device access information to the cloud service; receiving and monitoring the device status information reported by the front-end device and, if the device working status changes or an abnormality occurs, for example when the device is in an abnormal working state or spatial information such as the front end's position or attitude changes, reporting the front-end device status information to the cloud service; and receiving the model query instructions reported by the front-end device and forwarding them to the cloud service.
In some implementations, the edge service receives and aggregates the data streams reported by the front end, processes and saves them, and forwards part of the data to the cloud service. Specifically, the edge service receives the compressed video stream and picture data, the encoded feature stream and the structured-result stream reported by the front-end device; clips, packages and transcodes the compressed video stream, saves the video files and updates the video file database; parses and re-encapsulates the picture data, saves them as picture files in the specified format and updates the picture file database; parses and unpacks the encoded feature stream, saves the feature files and updates the structured feature database and feature file database; parses the structured-result stream and stores it in the structured-result database; and forwards the encoded feature stream and structured-result stream directly to the cloud service.
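The edge-side handling just described — video and pictures stored locally, features and structured results both stored and relayed to the cloud — can be sketched as a dispatch function. The handler names and in-memory "database" are illustrative assumptions; the real storage, clipping and transcoding steps are elided.

```python
def edge_dispatch(stream_type: str, payload: bytes, db: dict) -> list:
    """Route one incoming stream element at the edge service, per the flows
    described above. Returns the list of (destination, payload) forwards."""
    forwards = []
    if stream_type == "compressed_video":
        db["video_files"].append(payload)        # clip/transcode/save (elided)
    elif stream_type == "picture":
        db["picture_files"].append(payload)      # re-encapsulate and save
    elif stream_type == "encoded_feature":
        db["feature_files"].append(payload)      # parse/unpack and save
        forwards.append(("cloud", payload))      # also relayed in real time
    elif stream_type == "structured_result":
        db["structured_results"].append(payload)
        forwards.append(("cloud", payload))
    else:
        raise ValueError(f"unknown stream type: {stream_type}")
    return forwards
```

Note that the compressed video stream produces no forward: it comes to rest at the edge and only leaves later, on demand and with authorization.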
In some implementations, the edge service receives the device control instructions and function definition instructions issued by the cloud service and forwards them to the front-end device; receives the video/picture retrieval instructions issued by the cloud service, searches the database for the required video/pictures according to the query conditions, and returns the video/picture information to the cloud service; and receives the model update instructions issued by the cloud service and forwards them to the front-end device.
In some implementations, according to the defined functional tasks, the edge service completes multi-node coordinated system organization scheme generation, data/information configuration planning, collaborative work scheduling, video/picture processing and analysis, collaborative data analysis and joint optimization. Specifically, according to the functional tasks of the actual application system and in combination with predefined algorithms, the edge service autonomously determines the number of front-end nodes required to complete a task, the detailed node information and the data requirements; configures the relevant front-end nodes; schedules multiple front ends to work collaboratively; and performs joint analysis and optimization using the feature data or structured results aggregated from several front ends, completing specific tasks of interest in smart city construction — for example, monitoring the traffic parameters of interconnected roads within an area to compute the optimal signal switching allocation in real time and dynamically adjust the traffic signal control scheme, or tracking and apprehending hit-and-run vehicles and criminal suspects. Video/picture processing and analysis applies visual information processing techniques to re-analyze and re-process video/pictures, extracting useful information for post-processing or for new tasks.
In some implementations, as shown in Fig. 1, the edge service comprises an integrated control module, a streaming media module, a data storage module and a computing processing module.
In this embodiment, the integrated control module is configured to receive the data or instructions reported by the front-end device and control the corresponding response processes; to push the encoded feature stream and structured-result stream to the cloud service in real time; to receive and forward the control instructions or model streams issued by the cloud service; to manage the access procedures and states of front-end devices; to monitor front-end device status; and to schedule the collaborative working modes among multiple front-end devices.
In this embodiment, the streaming media module is configured to receive the compressed video stream and to transcode, clip and package it.
In this embodiment, the data storage module is configured to receive the compressed video stream from the streaming media module and the encoded feature stream and structured-result stream reported by the front-end device, to store and manage them by category, and to receive the compressed video stream or picture retrieval instructions issued by the cloud service, search by condition and return the compressed video stream or pictures to the cloud service.
In this embodiment, the computing processing module is configured, according to the defined functional tasks, to complete multi-node coordinated system organization scheme generation, image data processing and analysis, multi-node collaborative data analysis and joint optimization.
Specifically, as shown in Fig. 3, the computing processing module comprises an intelligent video analysis sub-module, a data association analysis sub-module and a joint optimization sub-module, which jointly perform collaborative data processing, joint analysis and optimization on the compressed video streams, feature-encoded streams and structured-result streams output by several front-end devices, completing specific tasks of interest in smart city construction — for example, monitoring the traffic parameters of interconnected roads within an area to compute the optimal signal switching allocation in real time and dynamically adjust the traffic signal control scheme, or tracking and apprehending hit-and-run vehicles and criminal suspects. The computing processing module is also used for video/picture processing and analysis, applying visual information processing techniques to re-analyze and re-process video/pictures and extract useful information for post-processing or for new tasks.
In some implementations, the data storage module comprises an access management sub-module, a data retrieval sub-module and a database sub-module; the access management sub-module is configured to support data input, saving and retrieval operations; the data retrieval sub-module is configured for data query and retrieval operations; the database sub-module is configured to store structured or unstructured data.
In some specific implementations, the database sub-module comprises a video file library, a picture file library, a feature file library, a structured feature library and a structured result library; the video file library is configured to store video stream data and their summary information; the picture file library is configured to store image-format file data and their summary information; the feature file library is configured to store unstructured feature stream data and their summary information; the structured feature library is configured to store structured feature data; the structured result library is configured to store structured result data.
In some implementations, the cloud service performs configuration management and function definition for all nodes and receives and responds to third-party user demand instructions; aggregates visual big data information, retrieves video/picture data from the edge service on demand, and stores various structured/unstructured data; stores the algorithm models supporting various applications, manages the algorithm life cycle and update workflow, and responds to model query instructions from the front end; and performs big data analysis and mining as well as simulation computing, executing broader multilateral collaborative tasks.
Specifically, the cloud service receives the front-end device status information reported by the edge service, monitors and maintains device working status and spatio-temporal information, and updates the device management list; listens for edge node access requests, connection states and device status information, and manages edge node device identifiers, permissions, states and connection relationships; and, according to the purpose of the application system or received user demand instructions, automatically generates the types, numbers, identifiers, operating parameters and scheduling strategies of the nodes to be scheduled, issues device parameter configuration instructions, function definition instructions and algorithm model update instructions, and configures the relevant front-end and edge nodes.
In some implementations, the cloud service receives and aggregates in real time the encoded feature streams and structured-result streams reported by the edge service. After parsing the encoded feature stream, it saves large unstructured features as feature files, extracts the corresponding data summary and keyword information into the unstructured feature database, and stores smaller structured feature information in the structured feature database; after parsing the structured-result stream, it stores the results in the corresponding structured-result database. When the front-end device and the cloud service are directly connected, the cloud service receives the compressed video streams/pictures reported by the front-end device in real time, stores the video data/pictures as video/picture files, and stores the video/picture summary information in the video/picture database. After user authorization, the cloud service retrieves video data/pictures from the edge service on demand and, after processing and intelligent computation, uses them for tasks such as evidence collection, video playback, accident confirmation and secondary recognition. The algorithm model repository of the cloud service stores algorithm model files supporting different software/hardware platforms, different applications and different performance levels; it manages the algorithm life cycle through the algorithm model database and its update mechanism, receives and responds to front-end algorithm model query and pull instructions, retrieves the required models and checks their software/hardware compatibility, and can also proactively issue update instructions and model stream data for the latest model versions.
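The size-based routing of parsed features at the cloud — large unstructured features to files plus a summary row, small structured features straight into the structured feature database — can be sketched as follows. The 4 KiB threshold and all names are assumptions for illustration; the patent does not give a cutoff.

```python
def ingest_feature(feature: bytes, threshold: int = 4096) -> tuple:
    """Route one parsed feature at the cloud service: features larger than
    `threshold` bytes are treated as unstructured and saved as files (with a
    summary kept in the unstructured feature database); smaller structured
    features go directly into the structured feature database."""
    if len(feature) > threshold:
        summary = {"size": len(feature)}  # data summary / keywords (elided)
        return ("feature_file", summary)
    return ("structured_feature_db", feature)
```

Keeping bulky descriptors out of the row store while indexing their summaries is what keeps the feature database searchable at city scale.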
In some implementations, the cloud service receives user demand instructions and uses video and feature big data for data analysis and mining, simulation and prediction, and joint optimization, completing multilateral collaborative tasks at large spatio-temporal scales, for example traffic efficiency analysis and signal control optimization, bus route and dispatch optimization, and moving target recognition and tracking; it uses large-scale datasets and massive computing power to train large models, and collaborates with the edge service in federated learning model training to support privacy-preserving applications.
In some implementations, as shown in Fig. 1, the cloud service 30 comprises a central control module, a computing simulation module, a data analysis module, a data center module, an algorithm model repository and a user interaction module.
In this embodiment, the central control module is configured to perform configuration management and resource scheduling for all nodes in the system, to uniformly manage the transmission control of data streams, control streams and model streams, to issue device control, function definition and model update commands to the front-end devices, to issue tasks to the edge services, and to receive and process the data reporting and status reporting instructions of the edge services, achieving secure connection and coordinated scheduling of the devices in the system.
In this embodiment, the computing simulation module is configured for structured analysis and processing, simulation and prediction, model training, joint model optimization and collaborative strategy generation, outputting computation results.
In this embodiment, the data analysis module is configured to receive encoded feature streams and structured-result streams, or to search and retrieve data from the data center module according to user instructions, to aggregate big data information, perform analysis and mining, and extract high-level semantic information to return to the user.
Specifically, as shown in Fig. 4, the data analysis module comprises a data retrieval sub-module, a statistical analysis sub-module and a data mining sub-module. The data retrieval sub-module is configured to receive, or to issue to the data center module, data search and retrieval instructions to obtain the data required by an analysis task; the statistical analysis sub-module is configured to perform multi-dimensional analysis of the aggregated feature and result data using methods such as classification, regression and correlation analysis; the data mining sub-module is configured to automatically extract implicit useful information and knowledge from large volumes of historical or real-time data using artificial intelligence, machine learning and statistics.
In this embodiment, the data center module is configured to retrieve compressed video streams from the edge service on demand, and to store, search, retrieve and output the encoded feature streams, structured-result streams and the compressed video streams retrieved on demand.
In this embodiment, the algorithm model repository is configured for algorithm model storage, query, delivery workflow and life cycle management.
In this embodiment, the user interaction module is configured to receive user instructions and return processing results.
In some specific implementations, the data streams include multimedia data, feature data, result information, spatio-temporal information, environmental data, device data and algorithm models; the control stream refers to instruction data related to system operation.
In some implementations, the central control module comprises a configuration management sub-module, a collaborative scheduling sub-module and an instruction processing sub-module.
In this embodiment, the configuration management sub-module is configured to perform security authentication, configuration management and status monitoring for all front-end device, edge service and cloud service nodes.
In this embodiment, the collaborative scheduling sub-module is configured to issue device control instructions, function definition instructions and operating parameter instructions to the front-end devices according to the scheduling strategy.
In this embodiment, the instruction processing sub-module is responsible for receiving, parsing and processing the reporting and query instructions of the edge services, issuing model stream data, and responding to the user interaction module.
The working process of the scalable visual computing system of the present invention is further explained below, taking an object tracking application in a video surveillance scenario as an example.
The scalable visual computing system comprises a large number of digital retina front ends deployed in public places such as transportation hubs, key checkpoints and community streets, several edge services distributed at different locations, and a cloud service responsible for overall monitoring and command.
The data of the object to be tracked are input through the user interaction interface of the cloud service, including multimedia data, spatio-temporal identification data, electronic data, trace data, social data, environmental data and device data. For example: a picture containing a face; the time, place, environment and status information associated with that face picture; the persons, vehicles and devices associated with the person; and the behavior information, trajectory information and status change information related to the face picture. Objects may include criminal suspects, problem vehicles, abnormal events and the like.
The digital retina front end works in normal surveillance mode; its captured frames may contain dense crowds and small targets, so the target must be searched for among large volumes of data. Taking the tracking of a specific person through face detection as an example, the front end deploys fast face detection, keypoint detection and face feature extraction algorithms: face detection models include but are not limited to YoloV4 and SSD; keypoint detection models include but are not limited to Resnet18 and MobileNetV2; feature extraction algorithms include but are not limited to sphereface and arcface.
The digital retina front end first detects the position and size of each face using the face detection model, then crops the face region from the original image and extracts keypoint information using the keypoint detection model, regressing 68 keypoints and further selecting the 5 keypoints of the facial features. The front end then filters the detected face regions under the following conditions: by facial occlusion, filtering out a face picture when the occluded proportion exceeds a threshold; by picture quality, filtering out a face picture when its blurriness exceeds a threshold; and by the face pose and pupil distance computed from the keypoints, filtering out a face picture when the pose exceeds a threshold or the pupil distance falls below a threshold.
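The three filtering rules above can be sketched as a single predicate. All threshold values here are illustrative assumptions; the patent only states that thresholds exist, not their values.

```python
def keep_face(occlusion: float, blur: float, yaw_deg: float, pupil_dist_px: float,
              max_occlusion: float = 0.3, max_blur: float = 0.5,
              max_yaw: float = 30.0, min_pupil: float = 20.0) -> bool:
    """Return True if a detected face passes the front-end quality filters:
    occlusion ratio, blurriness, and pose / pupil distance (both derived
    from the 5 selected keypoints)."""
    if occlusion > max_occlusion:      # too much of the face is covered
        return False
    if blur > max_blur:                # image too blurry for recognition
        return False
    if abs(yaw_deg) > max_yaw or pupil_dist_px < min_pupil:
        return False                   # pose too extreme or face too small
    return True
```

Filtering at the front end is what keeps low-value face crops out of the feature stream, so only usable faces consume bandwidth and cloud compute.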
For the face pictures that pass the filters, the digital retina front end performs face alignment using an affine transform or a similarity transform based on the 5 keypoints.
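The similarity-transform variant of the alignment step can be sketched as a least-squares fit of scale, rotation and translation mapping the 5 detected keypoints onto a canonical template. The template coordinates below (for a 112×112 crop) are illustrative values patterned after common face-alignment practice, not taken from the patent.

```python
import numpy as np

# Hypothetical canonical 5-point template: eyes, nose tip, mouth corners.
TEMPLATE = np.array([[38.3, 51.7], [73.5, 51.5], [56.0, 71.7],
                     [41.5, 92.4], [70.7, 92.2]], dtype=np.float64)

def similarity_transform(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares similarity transform (scale + rotation + translation)
    mapping src (n,2) keypoints onto dst (n,2); returns a 2x3 matrix M so
    that [x', y'] = M[:, :2] @ [x, y] + M[:, 2]."""
    n = src.shape[0]
    # Parameters [a, b, tx, ty] with x' = a*x - b*y + tx, y' = b*x + a*y + ty.
    A = np.zeros((2 * n, 4))
    A[0::2, 0], A[0::2, 1], A[0::2, 2] = src[:, 0], -src[:, 1], 1.0
    A[1::2, 0], A[1::2, 1], A[1::2, 3] = src[:, 1], src[:, 0], 1.0
    a, b, tx, ty = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)[0]
    return np.array([[a, -b, tx], [b, a, ty]])
```

In practice the resulting matrix would be passed to an image-warping routine (e.g. an OpenCV `warpAffine` call) to produce the aligned face crop.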
The digital retina front end extracts facial feature information from the aligned face image to obtain a feature vector (for example, 1024-dimensional), reports the feature stream together with the time, space and device identification information of the front-end node to the edge service for eventual aggregation at the cloud service, and reports the compressed video stream to the edge service for storage.
The cloud service extracts features from the face picture of the person to be tracked using the same feature extraction model as the front end, compares them with the reported features using cosine distance as the similarity measure, coarsely identifies the specific person, and records the associated front-end information and face structure information.
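The coarse cosine-similarity matching step can be sketched as follows. The decision threshold of 0.5 is an illustrative assumption; the patent specifies only that cosine distance is the similarity measure.

```python
import numpy as np

def cosine_match(query: np.ndarray, gallery: np.ndarray, threshold: float = 0.5):
    """Coarsely identify a target: compare a probe feature `query` (d,)
    against the reported features `gallery` (n, d) by cosine similarity.
    Returns (best_index, best_similarity), with best_index None when no
    gallery feature clears the threshold."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # cosine similarity per gallery entry
    best = int(np.argmax(sims))
    if sims[best] < threshold:
        return None, float(sims[best])
    return best, float(sims[best])
```

In the workflow above, a hit at index `best` would be mapped back to the reporting front-end node's spatio-temporal and device identification so that tracking and zoom instructions can be sent to that node.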
The cloud service automatically generates a target tracking control instruction, which the edge service forwards to the front-end node that discovered the specific person, to track the suspect face; tracking algorithms include but are not limited to KCF and deepsort. The cloud service also automatically generates front-end parameter control instructions, forwarded by the edge service to that node, adjusting parameters such as the angle and focal length of the front end and zooming in on the captured frame to obtain a clearer image of the specific person. Based on the face tracking results, one face picture with high clarity and a frontal angle, together with the time, space and device identification information of the front-end node, is reported to the edge service and then aggregated in real time at the cloud service for secondary recognition, while the compressed video stream is reported and saved in real time at the edge service.
The cloud service extracts and recognizes face features from the aggregated pictures of the specific person using a more complex and accurate network model (for example, Resnet50), further confirms the specific person, records the spatio-temporal identification of the associated front end, and sends an alarm signal.
The cloud service displays the suspect confirmation result of the previous step, and retrieves and displays in real time from the edge service the raw video data of the front end that discovered the specific person for further manual confirmation, thereby locking onto the suspect.
The relevant authorities take further action based on the obtained time and place of the suspect.
In the above workflow, across the different stages of searching for, tracking and confirming the suspect, the device, edge and cloud automatically select suitable algorithms for computation under the scheduling of the cloud service, and the different types of output data are effectively linked through software configuration and algorithmic processing, which both uses resources rationally and demonstrates the flexibility of the digital-retina-based device-edge-cloud collaborative computing architecture.
In summary, the core of the scalable visual computing system proposed by the present invention is a data transmission architecture in which three types of data streams run in parallel: the compressed video stream, the encoded feature stream and the model stream. The transmission of the compressed video stream and encoded feature stream is real-time; according to its configuration, the front-end device can report both simultaneously, which relieves data transmission pressure while aggregating valid information in real time for joint data analysis. The transmission of the model stream is sporadic: when a model needs to be updated, the model stream is transmitted from the cloud service directly or indirectly to the front-end device, enabling dynamic model deployment and updating and supporting different application tasks through the definition of front-end device functions and algorithms. All nodes in the proposed system share globally unified spatio-temporal information and device identification, which facilitates both the joint analysis and target optimization of visual data across different nodes and the fusion of these data with those of other sensor systems. The device, edge and cloud have a collaboration mechanism: digital retina front-end functions are software-definable, algorithm models can be updated dynamically, the edge/cloud has an autonomous task/event response decision-making mechanism and joint scheduling capability, and they accomplish user-specified tasks through the exchange of data and control instructions adjusted in real time.
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or substitute equivalents for some of the technical features therein; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

  1. A scalable visual computing system, characterized by comprising a front-end device, an edge service and a cloud service that establish communication connections in sequence;
    the front-end device being configured to perceive and capture visual information of a scene to obtain image data, to perform video/image processing, feature encoding and intelligent analysis on the image data, to encapsulate the processing results together with front-end device identification information, time information and spatial information into a compressed video stream, a feature-encoded stream and a structured-result stream, and to output the compressed video stream, feature-encoded stream and structured-result stream according to configuration; and being further configured to report its own status information to the edge service at a fixed cadence, and to receive the control instructions and model streams issued by the edge service to complete the configuration of its own operating parameters and model updates, the control instructions comprising device control instructions and function definition instructions;
    the edge service being configured to receive and store the compressed video stream, feature-encoded stream and structured-result stream sent by the front-end device, to output the feature-encoded stream and structured-result stream in real time for aggregation at the cloud service, and to output the compressed video stream to the cloud service on demand according to a data retrieval instruction of the cloud service; being further configured to receive and process node access management instructions reported by the front-end device and update a device management list, to report the status information of the front-end device and the edge service to the cloud service, to receive the model query instructions of the front-end device and forward them to the cloud service, and to receive the model streams and control instructions issued by the cloud service and deliver them to the front-end device, the control instructions comprising device control instructions and function definition instructions; and being further configured, according to defined functional tasks, to complete multi-node coordinated system organization scheme generation, data/information configuration planning, collaborative work scheduling, image data processing and analysis, collaborative data analysis and joint optimization;
    the cloud service being configured to receive, store and aggregate in real time the feature-encoded streams and structured-result streams output by the edge service, and to retrieve compressed video streams from the edge service on demand; being further configured to store algorithm models supporting various applications, to manage the algorithm life cycle and update workflow, to receive the model query instructions sent by the edge service or the front-end device and return model query results or model streams accordingly, to issue control instructions as triggered, and to receive and respond to third-party user demand instructions; being further configured to perform big data analysis and mining as well as simulation computing and execute multilateral collaborative tasks; and being further configured to receive the device status information reported by the edge service and perform configuration management, function definition and coordinated resource scheduling for all nodes.
  2. The scalable visual computing system according to claim 1, wherein the front-end device comprises a spatio-temporal determination module, an image acquisition module, an image processing module, an intelligent computing module, and a device control and data interaction module; wherein the spatio-temporal determination module is configured to obtain unified time information for the front-end devices, maintain time synchronization between front-end devices, determine the position, velocity and attitude information of the front-end device, provide the spatio-temporal information in real time to the other modules of the front-end device for computation and transmission, and receive the control instructions sent by the device control and data interaction module to configure its own operating parameters; the image acquisition module is configured to acquire and convert image data and send the image data to the image processing module; the image processing module is configured to pre-process, compress, encode and transcode the image data, output a compressed video stream carrying timestamp information to the device control and data interaction module, output the pre-processed image data to the intelligent computing module, and receive the control instructions sent by the device control and data interaction module to configure its processing parameters; the intelligent computing module is configured to perform structured analysis, feature extraction and feature encoding on the image data, output an encoded feature stream and a structured-result stream to the device control and data interaction module, receive control instructions comprising parameter configuration instructions and function definition instructions, and receive model streams and dynamically update the algorithm models; the device control and data interaction module is configured to package and encapsulate the received time information, spatial information, compressed video stream, pictures, encoded feature stream and structured-result stream and send them to the edge service, to receive and parse the model streams and control instructions issued by the edge service or cloud service and send them to the corresponding processing modules, and to perform workflow control, device control, status monitoring, model updating and transmission control of the front-end device and obtain the working status and identification information of the device.
  3. The scalable visual computing system according to claim 1, wherein the edge service comprises an integrated control module, a streaming media module, a data storage module and a computing processing module; wherein the integrated control module is configured to receive the data or instructions reported by the front-end device and control the corresponding response processes, push the encoded feature stream and structured-result stream to the cloud service in real time, receive and forward the control instructions or model streams issued by the cloud service, manage the access procedures and states of front-end devices, monitor front-end device status, and schedule the collaborative working modes among multiple front-end devices; the streaming media module is configured to receive the compressed video stream and transcode, clip and package it; the data storage module is configured to receive the compressed video stream from the streaming media module and the encoded feature stream and structured-result stream reported by the front-end device, store and manage them by category, and receive the compressed video stream or picture retrieval instructions issued by the cloud service, search by condition and return the compressed video stream or pictures to the cloud service; the computing processing module is configured, according to defined functional tasks, to complete multi-node coordinated system organization scheme generation, image data processing and analysis, multi-node collaborative data analysis and joint optimization.
  4. The scalable visual computing system according to claim 3, wherein the data storage module comprises an access management sub-module, a data retrieval sub-module and a database sub-module; wherein the access management sub-module is configured to support data input, saving and retrieval operations; the data retrieval sub-module is configured for data query and retrieval operations; and the database sub-module is configured to store structured or unstructured data.
  5. The scalable visual computing system according to claim 4, wherein the database sub-module comprises a video file library, a picture file library, a feature file library, a structured feature library and a structured result library; wherein the video file library is configured to store video stream data and their summary information; the picture file library is configured to store image-format file data and their summary information; the feature file library is configured to store unstructured feature stream data and their summary information; the structured feature library is configured to store structured feature data; and the structured result library is configured to store structured result data.
  6. The scalable visual computing system according to claim 1, wherein the cloud service comprises a central control module, a computing simulation module, a data analysis module, a data center module, an algorithm model repository and a user interaction module; wherein the central control module is configured to perform configuration management and resource scheduling for all nodes in the system, uniformly manage the transmission control of data streams, control streams and model streams, issue device control, function definition and model update commands to the front-end devices, issue tasks to the edge services, and receive and process the data reporting and status reporting instructions of the edge services; the computing simulation module is configured for structured analysis and processing, simulation and prediction, model training, joint model optimization and collaborative strategy generation, outputting computation results; the data analysis module is configured to receive encoded feature streams and structured-result streams, or to search and retrieve data from the data center module according to user instructions, aggregate big data information, perform analysis and mining, and extract high-level semantic information to return to the user; the data center module is configured to retrieve compressed video streams or pictures from the edge service on demand, and to store, search, retrieve and output the encoded feature streams, structured-result streams and the compressed video streams or pictures retrieved on demand; the algorithm model repository is configured for algorithm model storage, query, delivery workflow and life cycle management; and the user interaction module is configured to receive user instructions and return processing results.
  7. The scalable visual computing system according to claim 6, wherein the data streams comprise multimedia data, feature data, result information, spatio-temporal information, environmental data, device data and algorithm models.
  8. The scalable visual computing system according to claim 6, wherein the control stream refers to instruction data related to system operation, the instruction data comprising device registration instructions, login instructions, logout instructions, device control instructions, function definition instructions, parameter configuration instructions and data query/retrieval instructions.
  9. The scalable visual computing system according to claim 6, wherein the central control module comprises a configuration management sub-module, a collaborative scheduling sub-module and an instruction processing sub-module; wherein the configuration management sub-module is configured to perform security authentication, configuration management and status monitoring for all front-end device, edge service and cloud service nodes; the collaborative scheduling sub-module is configured to issue device control instructions, function definition instructions and operating parameter instructions to the front-end devices according to a scheduling strategy; and the instruction processing sub-module is responsible for receiving, parsing and processing the reporting and query instructions of the edge services, issuing model stream data, and responding to the user interaction module.
  10. The scalable visual computing system according to claim 6, wherein the data analysis module comprises a data retrieval sub-module, a statistical analysis sub-module and a data mining sub-module; the data retrieval sub-module is configured to receive, or to issue to the data center module, data search and retrieval instructions to obtain the data required by an analysis task; the statistical analysis sub-module is configured to perform multi-dimensional analysis of the aggregated feature and result data using classification, regression and correlation analysis methods; and the data mining sub-module is configured to automatically extract implicit useful information and knowledge from large volumes of historical or real-time data using artificial intelligence, machine learning and statistical means.
PCT/CN2021/087017 2020-12-08 2021-04-13 Scalable visual computing system WO2022121196A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/037,408 US20230412769A1 (en) 2020-12-08 2021-04-13 Scalable Visual Computing System

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011425341.5 2020-12-08
CN202011425341.5A CN112804188B (zh) 2020-12-08 2020-12-08 Scalable visual computing system

Publications (1)

Publication Number Publication Date
WO2022121196A1 true WO2022121196A1 (zh) 2022-06-16

Family

ID=75806533

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087017 WO2022121196A1 (zh) 2020-12-08 2021-04-13 Scalable visual computing system

Country Status (3)

Country Link
US (1) US20230412769A1 (zh)
CN (1) CN112804188B (zh)
WO (1) WO2022121196A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115358729A (zh) * 2022-10-21 2022-11-18 成都戎星科技有限公司 Intelligent satellite image data publishing system
CN115861775A (zh) * 2022-12-08 2023-03-28 广州市双照电子科技有限公司 Video analysis method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108512909A (zh) * 2018-03-14 2018-09-07 日照职业技术学院 Remote computer control system based on the Internet of Things
CN109451040A (zh) * 2018-12-10 2019-03-08 王顺志 Internet of Things networking system and networking method based on edge computing
US20200192741A1 (en) * 2016-01-28 2020-06-18 Intel Corporation Automatic model-based computing environment performance monitoring
WO2020209951A1 (en) * 2019-04-09 2020-10-15 FogHorn Systems, Inc. Intelligent edge computing platform with machine learning capability
CN111787321A (zh) * 2020-07-06 2020-10-16 济南浪潮高新科技投资发展有限公司 Deep-learning-based picture compression and decompression method and system for the edge
CN111901573A (zh) * 2020-08-17 2020-11-06 泽达易盛(天津)科技股份有限公司 Fine-grained real-time supervision system based on edge computing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105898313A (zh) * 2014-12-15 2016-08-24 江南大学 Novel scalable coding technique for surveillance video based on video synopsis

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115358729A (zh) * 2022-10-21 2022-11-18 成都戎星科技有限公司 Intelligent satellite image data publishing system
CN115861775A (zh) * 2022-12-08 2023-03-28 广州市双照电子科技有限公司 Video analysis method and system
CN115861775B (zh) * 2022-12-08 2024-02-23 广州市双照电子科技有限公司 Video analysis method and system

Also Published As

Publication number Publication date
CN112804188B (zh) 2021-11-26
CN112804188A (zh) 2021-05-14
US20230412769A1 (en) 2023-12-21


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 18037408

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21901912

Country of ref document: EP

Kind code of ref document: A1